You are working on a regression problem in a natural language processing domain, and you have 100M labeled examples in your dataset. You have randomly shuffled your data and split it into train and test samples (in a 90/10 ratio). After training the neural network and evaluating your model on the test set, you discover that the root-mean-squared error (RMSE) of your model is twice as high on the train set as on the test set. How should you improve the performance of your model?
A) Increase the share of the test sample in the train-test split.
B) Try to collect more data and increase the size of your dataset.
C) Try out regularization techniques (e.g., dropout or batch normalization) to avoid overfitting.
D) Increase the complexity of your model by, e.g., introducing an additional layer or increasing the size of the vocabularies or n-grams used.
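The train-versus-test comparison described in the question can be reproduced on a small scale. The following is a minimal sketch, not the question author's setup: the helper names `rmse` and `train_test_rmse_ratio` are hypothetical, and the toy values are made up purely to produce a train RMSE that is twice the test RMSE.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def train_test_rmse_ratio(train_true, train_pred, test_true, test_pred):
    """Return (train RMSE, test RMSE, train/test ratio).

    Hypothetical helper: assumes the test RMSE is non-zero.
    """
    train_err = rmse(train_true, train_pred)
    test_err = rmse(test_true, test_pred)
    return train_err, test_err, train_err / test_err


# Made-up toy data: every train prediction is off by 2.0,
# every test prediction is off by 1.0.
train_true = [1.0, 2.0, 3.0, 4.0]
train_pred = [3.0, 4.0, 5.0, 6.0]
test_true = [1.0, 2.0]
test_pred = [2.0, 3.0]

train_err, test_err, ratio = train_test_rmse_ratio(
    train_true, train_pred, test_true, test_pred
)
print(f"train RMSE={train_err}, test RMSE={test_err}, ratio={ratio}")
```

A ratio well above 1 means the model fits the data it was trained on worse than the held-out data, which is the situation the question describes.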
Correct Answer: