You are building a new real-time data warehouse for your company and will use Google BigQuery streaming inserts. There is no guarantee that data will be sent only once, but you do have a unique ID for each row of data and an event timestamp. You want to ensure that duplicates are not included while interactively querying data. Which query type should you use?
A) Include ORDER BY DESC on timestamp column and LIMIT to 1.
B) Use GROUP BY on the unique ID column and timestamp column and SUM on the values.
C) Use the LAG window function with PARTITION by unique ID along with WHERE LAG IS NOT NULL.
D) Use the ROW_NUMBER window function with PARTITION by unique ID along with WHERE row equals 1.
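
To make the pattern in option D concrete, here is a minimal sketch of deduplicating at query time with ROW_NUMBER partitioned by the unique ID. It runs against an in-memory SQLite database as a stand-in for BigQuery (this assumes a SQLite build of 3.25 or later, which is when window functions were added). The table name `events` and the columns `unique_id`, `event_ts`, and `value` are hypothetical and do not come from the question.

```python
import sqlite3

# In-memory SQLite database used only as a local stand-in for BigQuery.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: a unique ID per logical row, an event timestamp, a payload.
conn.execute("CREATE TABLE events (unique_id TEXT, event_ts TEXT, value INTEGER)")

# Simulate at-least-once delivery: rows "a" and "c" arrive twice.
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("a", "2024-01-01T00:00:00", 10),
        ("a", "2024-01-01T00:00:00", 10),  # duplicate delivery
        ("b", "2024-01-01T00:05:00", 20),
        ("c", "2024-01-01T00:07:00", 30),
        ("c", "2024-01-01T00:07:00", 30),  # duplicate delivery
    ],
)

# Number the rows within each unique_id partition, then keep only the first.
DEDUP_SQL = """
SELECT unique_id, event_ts, value
FROM (
  SELECT unique_id, event_ts, value,
         ROW_NUMBER() OVER (
           PARTITION BY unique_id
           ORDER BY event_ts DESC
         ) AS row_num
  FROM events
)
WHERE row_num = 1
ORDER BY unique_id
"""

deduped = conn.execute(DEDUP_SQL).fetchall()
for row in deduped:
    print(row)
```

Each unique ID appears once in the result however many times it was delivered. By contrast, a plain LIMIT 1 returns a single row for the whole table, and SUM over grouped rows adds the duplicated values together instead of discarding them.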
Correct Answer:
Verified