
Professional-Data-Engineer Practice Test


Page 2 out of 23 Pages

Topic 5: Practice Questions

Which of the following statements about Legacy SQL and Standard SQL is not true?


A.

Standard SQL is the preferred query language for BigQuery.


B.

If you write a query in Legacy SQL, it might generate an error if you try to run it with Standard SQL.


C.

One difference between the two query languages is how you specify fully-qualified table names (i.e.
table names that include their associated project name).


D.

You need to set a query language for each dataset and the default is Standard SQL





D.
  

You need to set a query language for each dataset and the default is Standard SQL



Explanation
You do not set a query language for each dataset. It is set each time you run a query and the default query
language is Legacy SQL.
Standard SQL has been the preferred query language since BigQuery 2.0 was released.
In legacy SQL, to query a table with a project-qualified name, you use a colon, :, as a separator. In standard
SQL, you use a period, ., instead.
Due to the differences in syntax between the two query languages (such as project-qualified table names), a query written in Legacy SQL might generate an error if you try to run it with Standard SQL.
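The separator difference described above can be illustrated with two placeholder queries (project, dataset, and table names below are hypothetical):

```python
# Illustrative only: the same table referenced in each BigQuery dialect.
# Legacy SQL wraps the fully-qualified name in square brackets and uses a
# colon between project and dataset; Standard SQL uses backticks and periods.
legacy_sql = "SELECT word FROM [my-project:dataset.table] LIMIT 10"
standard_sql = "SELECT word FROM `my-project.dataset.table` LIMIT 10"
```

Running the legacy form under Standard SQL fails precisely because of this bracket-and-colon syntax.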

Which of these statements about exporting data from BigQuery is false?


A.

To export more than 1 GB of data, you need to put a wildcard in the destination filename.


B.

The only supported export destination is Google Cloud Storage.


C.

Data can only be exported in JSON or Avro format.


D.

The only compression option available is GZIP.





C.
  

Data can only be exported in JSON or Avro format.



Explanation
Data can be exported in CSV, JSON, or Avro format. If you are exporting nested or repeated data, then CSV format is not supported.
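The constraints above can be sketched as an extract (export) job configuration of the kind sent to the BigQuery REST API; the bucket and table identifiers below are placeholders, and the format/compression values reflect the feature set this question describes:

```python
# Hedged sketch of a BigQuery extract job configuration (jobs.insert,
# "extract" section). All project/dataset/bucket names are placeholders.
extract_config = {
    "sourceTable": {
        "projectId": "my-project",
        "datasetId": "my_dataset",
        "tableId": "my_table",
    },
    # A wildcard (*) in the destination URI lets BigQuery shard exports
    # larger than 1 GB across multiple files in Cloud Storage.
    "destinationUris": ["gs://my-bucket/export-*.csv"],
    # Per this question: CSV, NEWLINE_DELIMITED_JSON, or AVRO.
    "destinationFormat": "CSV",
    # GZIP is the compression option noted above; NONE is the default.
    "compression": "GZIP",
}
```

Note that the only supported destination is Cloud Storage (the `gs://` URI), matching option B.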

When creating a new Cloud Dataproc cluster with the projects.regions.clusters.create operation, these four
values are required: project, region, name, and ____.


A.

zone


B.

node


C.

label


D.

type





A.
  

zone



Explanation
At a minimum, you must specify four values when creating a new cluster with the
projects.regions.clusters.create operation:
The project in which the cluster will be created
The region to use
The name of the cluster
The zone in which the cluster will be created
You can specify many more details beyond these minimum requirements. For example, you can
also specify the number of workers, whether preemptible compute should be used, and the network settings.
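The four required values can be sketched as a `projects.regions.clusters.create` request; project and region travel in the request path, while the cluster name and zone go in the body (all identifiers below are placeholders):

```python
# Hedged sketch of a Dataproc clusters.create request. Field names follow
# the REST API's Cluster resource; values are placeholders.
project = "my-project"      # request path: projects/{project}
region = "us-central1"      # request path: regions/{region}
request_body = {
    "clusterName": "my-cluster",
    "config": {
        "gceClusterConfig": {
            # The zone in which the cluster's VMs are created.
            "zoneUri": "us-central1-a",
        },
        # Optional details beyond the minimum, e.g. worker count:
        # "workerConfig": {"numInstances": 2},
    },
}
```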

The YARN ResourceManager and the HDFS NameNode interfaces are available on a Cloud Dataproc cluster
____.


A.

application node


B.

conditional node


C.

master node


D.

worker node





C.
  

master node



Explanation
The YARN ResourceManager and the HDFS NameNode interfaces are available on a Cloud Dataproc cluster
master node. The cluster master-host-name is the name of your Cloud Dataproc cluster followed by an -m
suffix—for example, if your cluster is named "my-cluster", the master-host-name would be "my-cluster-m".
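The naming convention above is simple enough to capture in a one-line helper (a sketch, not an official API):

```python
def master_host_name(cluster_name):
    """Dataproc master host name: the cluster name plus an "-m" suffix."""
    return f"{cluster_name}-m"

# Example from the explanation above:
print(master_host_name("my-cluster"))  # my-cluster-m
```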

Which of the following is NOT true about Dataflow pipelines?


A.

Dataflow pipelines are tied to Dataflow, and cannot be run on any other runner


B.

Dataflow pipelines can consume data from other Google Cloud services


C.

Dataflow pipelines can be programmed in Java


D.

Dataflow pipelines use a unified programming model, so can work both with streaming and batch data
sources





A.
  

Dataflow pipelines are tied to Dataflow, and cannot be run on any other runner



Explanation
Dataflow pipelines are built with the Apache Beam SDKs, so they can also run on alternate runners such as Apache Spark and Apache Flink.
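In Beam, the execution engine is a pipeline option rather than part of the pipeline code, which is why the same pipeline is portable. A minimal sketch (runner names are the standard Apache Beam runner names; no actual pipeline is launched here):

```python
# Illustrative only: choosing an execution engine is just an option flag,
# so the same Beam pipeline code can target any of these runners.
runners = {
    "Dataflow": "DataflowRunner",
    "Spark": "SparkRunner",
    "Flink": "FlinkRunner",
    "Local testing": "DirectRunner",
}

# A real pipeline would pass this to beam.options.pipeline_options.PipelineOptions.
args = [f"--runner={runners['Dataflow']}"]
```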

