70-475 Exam Questions - Online Test


Q1. DRAG DROP

You have data generated by sensors. The data is sent to Microsoft Azure Event Hubs.

You need to have an aggregated view of the data in near real-time by using five-minute tumbling windows to identify short-term trends. You must also have hourly and daily aggregated views of the data.

Which technology should you use for each task? To answer, drag the appropriate technologies to the correct tasks. Each technology may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.

NOTE: Each correct selection is worth one point.

Answer:
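
As background on the five-minute tumbling windows the question mentions: a tumbling window splits a stream into fixed, non-overlapping intervals, so each event falls into exactly one window. The following is a minimal sketch in plain Python, not Azure Stream Analytics or any Azure SDK. The function names `window_start` and `tumbling_average` and the sample readings are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(minutes=5)  # window size taken from the question
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def window_start(ts, size=WINDOW):
    """Return the start of the tumbling window that contains ts."""
    # Integer-divide the elapsed time by the window size, then scale back up.
    return EPOCH + ((ts - EPOCH) // size) * size


def tumbling_average(readings, size=WINDOW):
    """Average (timestamp, value) readings per non-overlapping window."""
    buckets = defaultdict(list)
    for ts, value in readings:
        buckets[window_start(ts, size)].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


# Hypothetical sensor readings: two fall in the 12:00 window, one in 12:05.
readings = [
    (datetime(2024, 1, 1, 12, 0, 30, tzinfo=timezone.utc), 10.0),
    (datetime(2024, 1, 1, 12, 4, 59, tzinfo=timezone.utc), 20.0),
    (datetime(2024, 1, 1, 12, 5, 0, tzinfo=timezone.utc), 30.0),
]
averages = tumbling_average(readings)
```

Passing `size=timedelta(hours=1)` or `size=timedelta(days=1)` to the same function produces the hourly and daily views, which shows that the three aggregations differ only in window length.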

Q2. HOTSPOT 

You have a pipeline that contains an input dataset in Microsoft Azure Table Storage and an output dataset in Azure Blob storage. You have the following JSON data.

Use the drop-down menus to select the answer choice that completes each statement based on the information presented in the JSON data.

Answer:

Q3. DRAG DROP 

Your company has a Microsoft Azure environment that contains an Azure HDInsight Hadoop cluster and an Azure SQL data warehouse. The Hadoop cluster contains text files that are formatted by using UTF-8 character encoding.

You need to implement a solution to ingest the data into the SQL data warehouse from the Hadoop cluster. The solution must provide optimal read performance for the data after ingestion.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Answer:

Q4. A company named Fabrikam, Inc. has a Microsoft Azure web app. Billions of users visit the app daily.

The web app logs all user activity by using text files in Azure Blob storage. Each day, approximately 200 GB of text files are created.

Fabrikam uses the log files from an Apache Hadoop cluster on Azure HDInsight.

You need to recommend a solution to optimize the storage of the log files for later Hive use.

What is the best property to recommend adding to the Hive table definition to achieve the goal? More than one answer choice may achieve the goal. Select the BEST answer.

A. STORED AS RCFILE

B. STORED AS GZIP

C. STORED AS ORC

D. STORED AS TEXTFILE

Answer: C

Q5. You have a Microsoft Azure Data Factory pipeline.

You discover that the pipeline fails to execute because data is missing. You need to rerun the failure in the pipeline.

Which cmdlet should you use?

A. Set-AzureAutomationJob

B. Resume-AzureDataFactoryPipeline

C. Resume-AzureAutomationJob

D. Set-AzureDataFactorySliceStatus

Answer: D

Q6. DRAG DROP

You have data generated by sensors. The data is sent to Microsoft Azure Event Hubs.

You need to have an aggregated view of the data in near real-time by using five-minute tumbling windows to identify short-term trends. You must also have hourly and daily aggregated views of the data.

Which technology should you use for each task? To answer, drag the appropriate technologies to the correct tasks. Each technology may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.

NOTE: Each correct selection is worth one point.

Answer:

Q7. HOTSPOT

You have four on-premises Microsoft SQL Server data sources as described in the following table.

You plan to create three Azure data factories that will interact with the data sources as described in the following table.

You need to deploy Microsoft Data Management Gateway to support the Azure Data Factory deployment. The solution must use new servers to host the instances of Data Management Gateway.

What is the minimum number of new servers and data management gateways you should deploy? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Answer:

Q8. You are designing a solution that will use Apache HBase on Microsoft Azure HDInsight.

You need to design the row keys for the database to ensure that client traffic is directed over all of the nodes in the cluster.

What are two possible techniques that you can use? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

A. padding

B. trimming

C. hashing

D. salting

Answer: C, D
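
Salting and hashing both add a prefix to the row key so that sequential keys stop landing on the same region. Salting assigns the prefix at random, while hashing derives it from the key itself, so the same key always gets the same prefix. The following is a minimal sketch in plain Python that only builds key strings and does not call any HBase client. `NUM_BUCKETS`, `salted_key`, `hashed_key`, and the sample key are hypothetical names chosen for illustration.

```python
import hashlib
import random

NUM_BUCKETS = 4  # hypothetical; often chosen to match the number of regions


def salted_key(row_key, buckets=NUM_BUCKETS, rng=random):
    """Salting: prepend a randomly assigned bucket prefix."""
    bucket = rng.randrange(buckets)
    return f"{bucket:02d}-{row_key}"


def hashed_key(row_key, buckets=NUM_BUCKETS):
    """Hashing: prepend a bucket prefix derived from a hash of the key."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    bucket = digest[0] % buckets
    return f"{bucket:02d}-{row_key}"


# Hypothetical monotonically increasing key, the kind that causes hotspots.
key = "20240101120000-sensor42"
print(salted_key(key))  # prefix varies from call to call
print(hashed_key(key))  # prefix is always the same for this key
```

The trade-off is on the read side: with a hashed prefix a reader can recompute the full key for a point lookup, whereas with a random salt a reader has to try every bucket prefix.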

Q9. Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

Your company has multiple databases that contain millions of sales transactions. You plan to implement a data mining solution to identify purchasing fraud.

You need to design a solution that mines 10 terabytes (TB) of sales data. The solution must meet the following requirements:

• Run the analysis to identify fraud once per week.

• Continue to receive new sales transactions while the analysis runs.

• Be able to stop computing services when the analysis is NOT running.

Solution: You create a Cloudera Hadoop cluster on Microsoft Azure virtual machines. Does this meet the goal?

A. Yes

B. No

Answer: A

Q10. HOTSPOT

A department in your company creates an Azure SQL database named DB1. DB1 is a data mart.

Each night, you need to insert new rows into 9,000 tables in DB1 from changed data in DW1. The solution must minimize costs.

What should you use to move the data from DW1 to DB1, and then to import the changed data to DB1? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Answer: