You are designing a pipeline that publishes application events to a Pub/Sub topic. Although message ordering is not important, you need to aggregate events across disjoint hourly intervals before loading the results into BigQuery for analysis. Which technology should you use to process and load this data into BigQuery while ensuring it scales to large volumes of events?
A) Create a Cloud Function that performs the necessary data processing, executed via a Pub/Sub trigger every time a new message is published to the topic.
B) Schedule a Cloud Function to run hourly, pulling all available messages from the Pub/Sub topic and performing the necessary aggregations.
C) Schedule a batch Dataflow job to run hourly, pulling all available messages from the Pub/Sub topic and performing the necessary aggregations.
D) Create a streaming Dataflow job that reads continually from the Pub/Sub topic and performs aggregations using tumbling windows.
Correct Answer: D
A streaming Dataflow job that reads continually from the Pub/Sub topic and aggregates with tumbling (fixed, non-overlapping) windows produces exactly the disjoint hourly intervals required, autoscales with event volume, and can write results directly to BigQuery. The scheduled options (B and C) risk missing or double-counting messages at interval boundaries, and a per-message Cloud Function (A) cannot perform hourly aggregation on its own.
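For reference, a minimal Apache Beam (Python) sketch of the approach in option D is shown below. The project, subscription, and table names are hypothetical placeholders, and the pipeline assumes the destination BigQuery table already exists; it counts events per type within disjoint one-hour tumbling windows.

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.transforms.window import FixedWindows

# Hypothetical resource names for illustration only.
SUBSCRIPTION = "projects/my-project/subscriptions/app-events-sub"
TABLE = "my-project:analytics.hourly_event_counts"


def run():
    # streaming=True makes this a continuously running streaming job.
    options = PipelineOptions(streaming=True)

    with beam.Pipeline(options=options) as p:
        (
            p
            # Read the raw event stream from Pub/Sub.
            | "ReadEvents" >> beam.io.ReadFromPubSub(subscription=SUBSCRIPTION)
            # Key each event by its (decoded) payload so it can be counted.
            | "DecodeAndKey" >> beam.Map(lambda msg: (msg.decode("utf-8"), 1))
            # Tumbling (fixed, non-overlapping) one-hour windows give the
            # disjoint hourly intervals the question asks for.
            | "HourlyWindows" >> beam.WindowInto(FixedWindows(60 * 60))
            # Aggregate per event type within each window.
            | "CountPerEvent" >> beam.CombinePerKey(sum)
            | "ToRow" >> beam.Map(lambda kv: {"event": kv[0], "count": kv[1]})
            # Append one row per event type per hourly window; the table is
            # assumed to exist with matching columns.
            | "WriteToBQ" >> beam.io.WriteToBigQuery(
                TABLE,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            )
        )


if __name__ == "__main__":
    run()
```

FixedWindows is Beam's tumbling window: each element falls into exactly one non-overlapping hour, which matches the "disjoint hourly intervals" requirement, and running on Dataflow lets the job autoscale with event volume.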