Microsoft 70-475 v.2: Which two services should you include in the recommendation?
You have an Apache Hive cluster in Microsoft Azure HDInsight. The cluster contains 10 million data files.
You plan to archive the data.
The data will be analyzed monthly.
You need to recommend a solution to move and store the data. The solution must minimize how long it takes to move the data and must minimize costs.
Which two services should you include in the recommendation? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.
A. Azure Queue storage
B. Microsoft SQL Server Integration Services (SSIS)
C. Azure Table Storage
D. Azure Data Lake
E. Azure Data Factory
Correct Answer: DE
Explanation/Reference:
D: To analyze data in an HDInsight cluster, you can store the data in Azure Storage, in Azure Data Lake Storage Gen1 or Gen2, or in a combination of these. Each of these storage options lets you safely delete the HDInsight clusters used for computation without losing user data.
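E: Azure Data Factory orchestrates data movement and transformation, so it can move the data into the archive store. For example, the Spark activity in a Data Factory pipeline runs a Spark program on your own or an on-demand HDInsight cluster; it is one of the supported data transformation activities.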
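As a rough illustration of the Spark activity mentioned in E, the sketch below builds a pipeline definition as a Python dictionary and prints it as JSON. The field layout (type "HDInsightSpark", linkedServiceName, rootPath, entryFilePath) is meant to follow the JSON shape described in the Spark activity reference listed below, and should be checked against the current documentation before use. The pipeline name, activity name, linked service name, folder, and script name are all hypothetical placeholders, and the helper functions are not part of any Azure SDK.

```python
import json


def spark_activity(name, hdinsight_linked_service, root_path, entry_file_path):
    """Return a dict shaped like a Data Factory HDInsightSpark activity.

    All argument values are caller-supplied placeholders (assumptions),
    not names taken from the question.
    """
    return {
        "name": name,
        "type": "HDInsightSpark",
        # Reference to the linked service for the HDInsight cluster.
        "linkedServiceName": {
            "referenceName": hdinsight_linked_service,
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            # Container/folder holding the Spark program, and the entry file in it.
            "rootPath": root_path,
            "entryFilePath": entry_file_path,
        },
    }


def pipeline(name, activities):
    """Wrap a list of activity dicts in a minimal pipeline definition."""
    return {"name": name, "properties": {"activities": list(activities)}}


# Hypothetical names, used only for illustration.
example = pipeline(
    "MonthlyAnalysisPipeline",
    [
        spark_activity(
            name="MonthlySparkJob",
            hdinsight_linked_service="HDInsightLinkedService",
            root_path="adfspark",
            entry_file_path="monthly_analysis.py",
        )
    ],
)

print(json.dumps(example, indent=2))
```

The dictionary only describes the activity; deploying it would still require a data factory, the referenced linked service, and the script uploaded to the storage location named in rootPath.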
References:
https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-use-data-lake-store
https://docs.microsoft.com/en-us/azure/data-factory/transform-data-using-spark