DAS-C01 Practice Test



A company has a business unit that uploads .csv files to an Amazon S3 bucket. The company's data platform team has set up an AWS Glue crawler to perform discovery and create tables and schemas. An AWS Glue job writes processed data from the created tables to an Amazon Redshift database. The AWS Glue job handles column mapping and creates the Amazon Redshift table appropriately. When the AWS Glue job is rerun for any reason during the day, duplicate records are introduced into the Amazon Redshift table.
Which solution will update the Redshift table without duplicates when jobs are rerun?


A.

Modify the AWS Glue job to copy the rows into a staging table. Add SQL commands to replace the
existing rows in the main table as postactions in the DynamicFrameWriter class.


B.

Load the previously inserted data into a MySQL database in the AWS Glue job. Perform an upsert operation in MySQL, and copy the results to the Amazon Redshift table.


C.

Use Apache Spark’s DataFrame dropDuplicates() API to eliminate duplicates and then write the data to Amazon Redshift.


D.

Use the AWS Glue ResolveChoice built-in transform to select the most recent value of the column.





A.


Modify the AWS Glue job to copy the rows into a staging table. Add SQL commands to replace the existing rows in the main table as postactions in the DynamicFrameWriter class.
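The staging-table pattern can be expressed directly in the Glue job script. Below is a minimal PySpark sketch; the connection name (redshift-conn), table names (stage_table, target_table), key column (id), and S3 temp path are hypothetical placeholders, and the postactions SQL runs in Redshift as one transaction after the staged rows are written.

```python
# Minimal sketch of the staging-table merge, assuming a Glue job
# environment. All names below (connection, tables, bucket) are
# illustrative placeholders, not values from the question.
from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session

# Stand-in for the job's processed output.
df = spark.createDataFrame([(1, "2023-01-01")], ["id", "event_date"])
dyf = DynamicFrame.fromDF(df, glue_context, "processed")

# Runs in Redshift after the staged load: delete matching rows from the
# main table, insert the fresh rows, then drop the staging table.
post_query = (
    "BEGIN;"
    "DELETE FROM target_table USING stage_table "
    "WHERE target_table.id = stage_table.id;"
    "INSERT INTO target_table SELECT * FROM stage_table;"
    "DROP TABLE stage_table;"
    "END;"
)

glue_context.write_dynamic_frame.from_jdbc_conf(
    frame=dyf,
    catalog_connection="redshift-conn",  # hypothetical Glue connection
    connection_options={
        "dbtable": "stage_table",
        "database": "dev",
        "postactions": post_query,
    },
    redshift_tmp_dir="s3://example-temp-bucket/glue/",
)
```

Because the delete-and-insert runs as postactions in the same job, a rerun replaces the rows it already loaded instead of appending duplicates.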



A real estate company has a mission-critical application using Apache HBase in Amazon EMR. Amazon EMR is configured with a single master node. The company has over 5 TB of data stored on a Hadoop Distributed File System (HDFS). The company wants a cost-effective solution to make its HBase data highly available. Which architectural pattern meets the company's requirements?


A.

Use Spot Instances for core and task nodes and a Reserved Instance for the EMR master node.
Configure the EMR cluster with multiple master nodes. Schedule automated snapshots using Amazon
EventBridge.


B.

Store the data on an EMR File System (EMRFS) instead of HDFS. Enable EMRFS consistent view.
Create an EMR HBase cluster with multiple master nodes. Point the HBase root directory to an Amazon S3 bucket.


C.

Store the data on an EMR File System (EMRFS) instead of HDFS and enable EMRFS consistent view. Run two separate EMR clusters in two different Availability Zones. Point both clusters to the same HBase root directory in the same Amazon S3 bucket.


D.

Store the data on an EMR File System (EMRFS) instead of HDFS and enable EMRFS consistent view. Create a primary EMR HBase cluster with multiple master nodes. Create a secondary EMR HBase read-replica cluster in a separate Availability Zone. Point both clusters to the same HBase root directory in the same Amazon S3 bucket.





D.


Store the data on an EMR File System (EMRFS) instead of HDFS and enable EMRFS consistent view. Create a primary EMR HBase cluster with multiple master nodes. Create a secondary EMR HBase read-replica cluster in a separate Availability Zone. Point both clusters to the same HBase root directory in the same Amazon S3 bucket.
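The read-replica pattern boils down to launching a second HBase-on-S3 cluster, in a different Availability Zone, whose configuration flags it as a read replica of the same S3 root directory. A hedged boto3 sketch follows; the bucket name, subnet, instance types, and release label are placeholders.

```python
# Sketch: launching an HBase-on-S3 read-replica cluster with boto3.
# Bucket, subnet, and release label below are illustrative placeholders.
import boto3

emr = boto3.client("emr", region_name="us-east-1")

hbase_on_s3 = [
    {"Classification": "hbase-site",
     "Properties": {"hbase.rootdir": "s3://example-hbase-bucket/hbase-root/"}},
    {"Classification": "hbase",
     "Properties": {
         "hbase.emr.storageMode": "s3",
         # Marks this cluster as a read replica of the primary cluster
         # that shares the same S3 root directory.
         "hbase.emr.readreplica.enabled": "true",
     }},
    {"Classification": "emrfs-site",
     "Properties": {"fs.s3.consistent": "true"}},  # EMRFS consistent view
]

emr.run_job_flow(
    Name="hbase-read-replica",
    ReleaseLabel="emr-6.9.0",
    Applications=[{"Name": "HBase"}],
    Configurations=hbase_on_s3,
    Instances={
        "InstanceGroups": [
            {"InstanceRole": "MASTER", "InstanceType": "m5.xlarge", "InstanceCount": 1},
            {"InstanceRole": "CORE", "InstanceType": "m5.xlarge", "InstanceCount": 2},
        ],
        # Subnet in a different Availability Zone than the primary cluster.
        "Ec2SubnetId": "subnet-0123456789abcdef0",
        "KeepJobFlowAliveWhenNoSteps": True,
    },
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
```

If the primary cluster's Availability Zone fails, the replica continues to serve reads from the shared S3 root directory.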



A banking company is currently using an Amazon Redshift cluster with dense storage (DS) nodes to store
sensitive data. An audit found that the cluster is unencrypted. Compliance requirements state that a database
with sensitive data must be encrypted through a hardware security module (HSM) with automated key
rotation.
Which combination of steps is required to achieve compliance? (Choose two.)


A.

Set up a trusted connection with HSM using a client and server certificate with automatic key rotation.


B.

Modify the cluster with an HSM encryption option and automatic key rotation.


C.

Create a new HSM-encrypted Amazon Redshift cluster and migrate the data to the new cluster.


D.

Enable HSM with key rotation through the AWS CLI.


E.

Enable Elliptic Curve Diffie-Hellman Ephemeral (ECDHE) encryption in the HSM.





A.


Set up a trusted connection with HSM using a client and server certificate with automatic key rotation.



C.


Create a new HSM-encrypted Amazon Redshift cluster and migrate the data to the new cluster.
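The migration path can be scripted with boto3: register a client certificate and an HSM configuration, then launch a new encrypted cluster that references both, since an unencrypted cluster cannot be switched to HSM encryption in place. All identifiers, the HSM address, and the credentials below are placeholders.

```python
# Sketch: standing up a new HSM-encrypted Redshift cluster with boto3.
# Identifiers, HSM address, and credentials are illustrative placeholders.
import boto3

redshift = boto3.client("redshift", region_name="us-east-1")

# 1. Client certificate the cluster presents to the HSM.
redshift.create_hsm_client_certificate(
    HsmClientCertificateIdentifier="redshift-hsm-cert"
)

# 2. Connection details for the HSM appliance.
redshift.create_hsm_configuration(
    HsmConfigurationIdentifier="redshift-hsm-config",
    Description="HSM for sensitive-data cluster",
    HsmIpAddress="10.0.0.50",
    HsmPartitionName="redshift-partition",
    HsmPartitionPassword="example-password",
    HsmServerPublicCertificate="-----BEGIN CERTIFICATE-----...",
)

# 3. New encrypted cluster; data from the old cluster is then migrated
#    (for example via UNLOAD/COPY through S3).
redshift.create_cluster(
    ClusterIdentifier="secure-cluster",
    NodeType="ra3.xlplus",
    NumberOfNodes=2,
    MasterUsername="admin",
    MasterUserPassword="Example-Passw0rd",
    Encrypted=True,
    HsmClientCertificateIdentifier="redshift-hsm-cert",
    HsmConfigurationIdentifier="redshift-hsm-config",
)

# 4. Key rotation can then be invoked (or scheduled) per cluster.
redshift.rotate_encryption_key(ClusterIdentifier="secure-cluster")
```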



An online retail company with millions of users around the globe wants to improve its ecommerce analytics
capabilities. Currently, clickstream data is uploaded directly to Amazon S3 as compressed files. Several times
each day, an application running on Amazon EC2 processes the data and makes search options and reports
available for visualization by editors and marketers. The company wants to make website clicks and
aggregated data available to editors and marketers in minutes to enable them to connect with users more
effectively.
Which options will help meet these requirements in the MOST efficient way? (Choose two.)


A.

Use Amazon Kinesis Data Firehose to upload compressed and batched clickstream records to Amazon Elasticsearch Service.


B.

Upload clickstream records to Amazon S3 as compressed files. Then use AWS Lambda to send data to Amazon Elasticsearch Service from Amazon S3.


C.

Use Amazon Elasticsearch Service deployed on Amazon EC2 to aggregate, filter, and process the data. Refresh content performance dashboards in near-real time.


D.

Use Kibana to aggregate, filter, and visualize the data stored in Amazon Elasticsearch Service. Refresh content performance dashboards in near-real time.


E.

Upload clickstream records from Amazon S3 to Amazon Kinesis Data Streams and use a Kinesis Data
Streams consumer to send records to Amazon Elasticsearch Service.





A.


Use Amazon Kinesis Data Firehose to upload compressed and batched clickstream records to Amazon Elasticsearch Service.



D.


Use Kibana to aggregate, filter, and visualize the data stored in Amazon Elasticsearch Service. Refresh content performance dashboards in near-real time.
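On the ingestion side, a Firehose delivery stream can batch and compress clickstream records straight into Amazon Elasticsearch Service, where Kibana dashboards read them. A hedged boto3 sketch; the ARNs, domain name, and buffering values are placeholders chosen to illustrate minute-level delivery.

```python
# Sketch: a Firehose delivery stream that batches clickstream records
# into Amazon Elasticsearch Service. ARNs and names are placeholders.
import boto3

firehose = boto3.client("firehose", region_name="us-east-1")

firehose.create_delivery_stream(
    DeliveryStreamName="clickstream-to-es",
    DeliveryStreamType="DirectPut",
    ElasticsearchDestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-es-role",
        "DomainARN": "arn:aws:es:us-east-1:123456789012:domain/clickstream",
        "IndexName": "clicks",
        "IndexRotationPeriod": "OneDay",
        # Buffer up to 5 MiB or 60 seconds, whichever comes first, so
        # records reach dashboards in minutes rather than hours.
        "BufferingHints": {"IntervalInSeconds": 60, "SizeInMBs": 5},
        "S3BackupMode": "FailedDocumentsOnly",
        "S3Configuration": {
            "RoleARN": "arn:aws:iam::123456789012:role/firehose-es-role",
            "BucketARN": "arn:aws:s3:::clickstream-backup-bucket",
        },
    },
)

# Producers then push records with put_record / put_record_batch:
firehose.put_record(
    DeliveryStreamName="clickstream-to-es",
    Record={"Data": b'{"user_id": 42, "page": "/home"}\n'},
)
```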



A marketing company wants to improve its reporting and business intelligence capabilities. During the planning phase, the company interviewed the relevant stakeholders and discovered the following:

- The operations team's reports are run hourly for the current month's data.
- The sales team wants to use multiple Amazon QuickSight dashboards to show a rolling view of the last 30 days based on several categories. The sales team also wants to view the data as soon as it reaches the reporting backend.
- The finance team's reports are run daily for last month's data and once a month for the last 24 months of data.

Currently, there is 400 TB of data in the system, with an additional 100 TB expected every month. The company is looking for a solution that is as cost-effective as possible.
Which solution meets the company’s requirements?


A.

Store the last 24 months of data in Amazon Redshift. Configure Amazon QuickSight with Amazon
Redshift as the data source.


B.

Store the last 2 months of data in Amazon Redshift and the rest of the months in Amazon S3. Set up an external schema and table for Amazon Redshift Spectrum. Configure Amazon QuickSight with Amazon Redshift as the data source.


C.

Store the last 24 months of data in Amazon S3 and query it using Amazon Redshift Spectrum.
Configure Amazon QuickSight with Amazon Redshift Spectrum as the data source.


D.

Store the last 2 months of data in Amazon Redshift and the rest of the months in Amazon S3. Use a long-running Amazon EMR cluster with Apache Spark to query the data as needed. Configure Amazon QuickSight with Amazon EMR as the data source.





B.
  

Store the last 2 months of data in Amazon Redshift and the rest of the months in Amazon S3. Set up an external schema and table for Amazon Redshift Spectrum. Configure Amazon QuickSight with Amazon Redshift as the data source.
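The Spectrum setup behind option B is a one-time DDL step that maps the S3 history into Redshift. A sketch using the Redshift Data API, where the cluster name, database, Glue catalog database, and IAM role are placeholders:

```python
# Sketch: defining a Spectrum external schema over the S3 history via
# the Redshift Data API. Cluster, database, and role are placeholders.
import boto3

rsd = boto3.client("redshift-data", region_name="us-east-1")

ddl = """
CREATE EXTERNAL SCHEMA IF NOT EXISTS spectrum_history
FROM DATA CATALOG
DATABASE 'clickstream_history'
IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-spectrum-role'
CREATE EXTERNAL DATABASE IF NOT EXISTS;
"""

rsd.execute_statement(
    ClusterIdentifier="reporting-cluster",
    Database="dev",
    DbUser="admin",
    Sql=ddl,
)

# Queries can then combine hot data in Redshift with cold data in S3, e.g.:
# SELECT ... FROM sales_current
# UNION ALL
# SELECT ... FROM spectrum_history.sales;
```

Only the hot two months occupy paid cluster storage; the older 400+ TB stays in S3 and is scanned on demand, which is what keeps this option cost-effective.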



