Topic 1, Contoso Case StudyTransactional Date
Contoso has three years of customer, transactional, operation, sourcing, and supplier data
comprised of 10 billion records stored across multiple on-premises Microsoft SQL Server
servers. The SQL server instances contain data from various operational systems. The
data is loaded into the instances by using SQL server integration Services (SSIS)
packages.
You estimate that combining all product sales transactions into a company-wide sales
transactions dataset will result in a single table that contains 5 billion rows, with one row
per transaction.
Most queries targeting the sales transactions data will be used to identify which products
were sold in retail stores and which products were sold online during different time period.
Sales transaction data that is older than three years will be removed monthly.
You plan to create a retail store table that will contain the address of each retail store. The
table will be approximately 2 MB. Queries for retail store sales will include the retail store
addresses.
You plan to create a promotional table that will contain a promotion ID. The promotion ID
will be associated to a specific product. The product will be identified by a product ID. The
table will be approximately 5 GB.
Streaming Twitter Data
The ecommerce department at Contoso develops and Azure logic app that captures
trending Twitter feeds referencing the company’s products and pushes the products to
Azure Event Hubs.
Planned Changes
Contoso plans to implement the following changes:
* Load the sales transaction dataset to Azure Synapse Analytics.
* Integrate on-premises data stores with Azure Synapse Analytics by using SSIS packages.
* Use Azure Synapse Analytics to analyze Twitter feeds to assess customer sentiments
about products.
Sales Transaction Dataset Requirements
Contoso identifies the following requirements for the sales transaction dataset:
• Partition data that contains sales transaction records. Partitions must be designed to
provide efficient loads by month. Boundary values must belong: to the partition on the right.
• Ensure that queries joining and filtering sales transaction records based on product ID
complete as quickly as possible.
• Implement a surrogate key to account for changes to the retail store addresses.
• Ensure that data storage costs and performance are predictable.
• Minimize how long it takes to remove old records.
Customer Sentiment Analytics Requirement
Contoso identifies the following requirements for customer sentiment analytics:
• Allow Contoso users to use PolyBase in an A/ure Synapse Analytics dedicated SQL pool
to query the content of the data records that host the Twitter feeds. Data must be protected
by using row-level security (RLS). The users must be authenticated by using their own
A/ureAD credentials.
• Maximize the throughput of ingesting Twitter feeds from Event Hubs to Azure Storage
without purchasing additional throughput or capacity units.
• Store Twitter feeds in Azure Storage by using Event Hubs Capture. The feeds will be
converted into Parquet files.
• Ensure that the data store supports Azure AD-based access control down to the object
level.
• Minimize administrative effort to maintain the Twitter feed data records.
• Purge Twitter feed data records;itftaitJ are older than two years.
Data Integration Requirements
Contoso identifies the following requirements for data integration:
Use an Azure service that leverages the existing SSIS packages to ingest on-premises
data into datasets stored in a dedicated SQL pool of Azure Synaps Analytics and transform
the data.
Identify a process to ensure that changes to the ingestion and transformation activities can
be version controlled and developed independently by multiple data engineers.
You need to implement the surrogate key for the retail store table. The solution must meet
the sales transaction
dataset requirements.
What should you create?
A.
a table that has an IDENTITY property
B.
a system-versioned temporal table
C.
a user-defined SEQUENCE object
D.
a table that has a FOREIGN KEY constraint
a table that has an IDENTITY property
Explanation:
Scenario: Implement a surrogate key to account for changes to the retail store addresses.
A surrogate key on a table is a column with a unique identifier for each row. The key is not
generated from the table data. Data modelers like to create surrogate keys on their tables
when they design data warehouse models. You can use the IDENTITY property to achieve
this goal simply and effectively without affecting load performance.
Reference:
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-datawarehouse-
tables-identity
What should you do to improve high availability of the real-time data processing solution?
A.
Deploy identical Azure Stream Analytics jobs to paired regions in Azure.
B.
Deploy a High Concurrency Databricks cluster.
C.
Deploy an Azure Stream Analytics job and use an Azure Automation runbook to check the status of the job and to start the job if it stops.
D.
Set Data Lake Storage to use geo-redundant storage (GRS).
Deploy identical Azure Stream Analytics jobs to paired regions in Azure.
Explanation:
Guarantee Stream Analytics job reliability during service updates
Part of being a fully managed service is the capability to introduce new service functionality
and improvements at a rapid pace. As a result, Stream Analytics can have a service update
deploy on a weekly (or more frequent) basis. No matter how much testing is done there is
still a risk that an existing, running job may break due to the introduction of a bug. If you are
running mission critical jobs, these risks need to be avoided. You can reduce this risk by
following Azure’s paired region model.
Scenario: The application development team will create an Azure event hub to receive realtime
sales data, including store number, date, time, product ID, customer loyalty number,
price, and discount amount, from the point of sale (POS) system and output the data to
data storage in Azure
Reference:
https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-job-reliability
Which Azure Data Factory components should you recommend using together to import
the daily inventory data from the SQL server to Azure Data Lake Storage? To answer,
select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
You need to create a partitioned table in an Azure Synapse Analytics dedicated SQL pool.
How should you complete the Transact-SQL statement? To answer, drag the appropriate
values to the correct targets. Each value may be used once, more than once, or not at all.
You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Note: This question is part of a series of questions that present the same scenario.
Each question in the series contains a unique solution that might meet the stated
goals. Some question sets might have more than one correct solution, while others
might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a
result, these questions will not appear in the review screen.
You are designing an Azure Stream Analytics solution that will analyze Twitter data.
You need to count the tweets in each 10-second window. The solution must ensure that
each tweet is counted only once.
Solution: You use a session window that uses a timeout size of 10 seconds.
Does this meet the goal?
A.
Yes
B.
No
No
Page 2 out of 42 Pages |
Previous |