You are designing a cloud-native historical data processing system to meet the following conditions: The data being analyzed is in CSV, Avro, and PDF formats and will be accessed by multiple analysis tools including Cloud Dataproc, BigQuery, and Compute Engine. A streaming data pipeline stores new data daily. Peformance is not a factor in the solution. The solution design should maximize availability. How should you design data storage for this solution?
A) Create a Cloud Dataproc cluster with high availability. Store the data in HDFS, and peform analysis as needed.
B) Store the data in BigQuery. Access the data using the BigQuery Connector or Cloud Dataproc and Compute Engine.
C) Store the data in a regional Cloud Storage bucket. Aceess the bucket directly using Cloud Dataproc, BigQuery, and Compute Engine.
D) Store the data in a multi-regional Cloud Storage bucket. Access the data directly using Cloud Dataproc, BigQuery, and Compute Engine.
Correct Answer:
Verified
Q241: Google Cloud Bigtable indexes a single value
Q242: You architect a system to analyze seismic
Q243: When you store data in Cloud Bigtable,
Q244: If you're running a performance test that
Q245: You need to choose a database for
Q247: You have Cloud Functions written in Node.js
Q248: Cloud Bigtable is a recommended option for
Q249: As your organization expands its usage of
Q250: You have a data pipeline that writes
Q251: You work for a shipping company that
Unlock this Answer For Free Now!
View this answer and more for free by performing one of the following actions
Scan the QR code to install the App and get 2 free unlocks
Unlock quizzes for free by uploading documents