You are creating a new pipeline in Google Cloud to stream IoT data from Cloud Pub/Sub through Cloud Dataflow to BigQuery. While previewing the data, you notice that roughly 2% of the data appears to be corrupt. You need to modify the Cloud Dataflow pipeline to filter out this corrupt data. What should you do?
A) Add a SideInput that returns a Boolean if the element is corrupt.
B) Add a ParDo transform in Cloud Dataflow to discard corrupt elements.
C) Add a Partition transform in Cloud Dataflow to separate valid data from corrupt data.
D) Add a GroupByKey transform in Cloud Dataflow to group all of the valid data together and discard the rest.
Correct Answer:
Verified
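For reference, the kind of element-wise filter that option B describes can be sketched as below. This is a minimal illustration rather than a definitive pipeline: it assumes the Pub/Sub messages are UTF-8 JSON objects, and the field names in `REQUIRED_FIELDS`, the `parse_or_drop` function, and the `build_pipeline` helper are all hypothetical. `beam.FlatMap` is Beam's shorthand for a ParDo over a plain function, so a function that yields nothing for a bad element effectively discards it.

```python
import json

# Hypothetical schema: the fields a well-formed IoT record is assumed to carry.
REQUIRED_FIELDS = ("device_id", "timestamp", "temperature")


def parse_or_drop(message: bytes):
    """Yield the decoded record if it is well formed; yield nothing otherwise."""
    try:
        record = json.loads(message.decode("utf-8"))
    except ValueError:
        # Covers both invalid UTF-8 and invalid JSON (both subclass ValueError).
        return  # corrupt payload: no output, so the element is discarded
    if isinstance(record, dict) and all(f in record for f in REQUIRED_FIELDS):
        yield record


def build_pipeline(pipeline, subscription):
    """Attach the filter directly after the Pub/Sub read (sketch only)."""
    # Imported here so parse_or_drop can be exercised without Beam installed.
    import apache_beam as beam

    return (
        pipeline
        | "ReadFromPubSub" >> beam.io.ReadFromPubSub(subscription=subscription)
        | "DropCorrupt" >> beam.FlatMap(parse_or_drop)
    )
```

The write to BigQuery is left out of the sketch; it would follow the `DropCorrupt` step. What counts as "corrupt" depends on the actual data, so the validity check here is only a placeholder for whatever rule the real pipeline needs.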