Deck 2: Data Science

1
Point out the wrong statement.

A)Hadoop processing capabilities are huge and its real advantage lies in the ability to process terabytes and petabytes of data
B)Hadoop processing capabilities are huge and its real advantage lies in the ability to process terabytes and petabytes of data.
C)The programming model, MapReduce, used by Hadoop is difficult to write and test
D)All of these
Answer: The programming model, MapReduce, used by Hadoop is difficult to write and test
2
__________ can best be described as a programming model used to develop Hadoop-based applications that can process massive amounts of data.

A)MapReduce
B)Mahout
C)Oozie
D)All of the mentioned
Answer: MapReduce
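Note: the programming model named in this card can be sketched in a few lines of plain Python. This is a single-process illustration of the map, group-by-key, and reduce steps only; it does not use Hadoop or its API, and every function name here is an assumption made for the example.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run map over every document, then shuffle and reduce the pairs."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle(pairs))


# Hypothetical input: two tiny "documents".
counts = word_count(["big data big clusters", "data science"])
```

On a real cluster the map and reduce functions would run on many machines and the framework would handle the shuffle; here everything runs in one process to keep the idea visible.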
3
__________ has the world's largest Hadoop cluster.

A)Apple
B)Datamatics
C)Facebook
D)None of the mentioned
Answer: Facebook
4
Facebook Tackles Big Data With _______ based on Hadoop.

A)'Project Prism'
B)'Prism'
C)'Project Big'
D)'Project Data'
5
Data science is the process of working with diverse sets of data through which of the following?

A)organizing data
B)processing data
C)analysing data
D)All of the above
6
The modern conception of data science as an independent discipline is sometimes attributed to whom?

A)William S.
B)John McCarthy
C)Arthur Samuel
D)Satoshi Nakamoto
7
Which of the following languages is used in data science?

A)C
B)C++
C)R
D)Ruby
8
Which of the following is false?

A)Subsetting can be used to select and exclude variables and observations
B)Raw data should be processed only one time.
C)Merging concerns combining datasets on the same observations to produce a result with more variables
D)None of the above
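Note: the terms subsetting and merging used in this card can be illustrated with a small plain-Python sketch. The data, the variable names, and the helper functions are all hypothetical and chosen only for the example; a data-frame library would normally do this work.

```python
# Hypothetical toy data: one dict per observation, sharing an "id" variable.
people = [
    {"id": 1, "name": "Ana", "age": 34},
    {"id": 2, "name": "Ben", "age": 19},
    {"id": 3, "name": "Cy", "age": 52},
]
scores = [
    {"id": 1, "score": 0.9},
    {"id": 2, "score": 0.4},
    {"id": 3, "score": 0.7},
]


def subset(rows, keep_vars, predicate):
    """Subsetting: keep chosen variables and only observations passing predicate."""
    return [{k: row[k] for k in keep_vars} for row in rows if predicate(row)]


def merge(left, right, key):
    """Merging: join rows about the same observation, yielding more variables."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup[row[key]]} for row in left if row[key] in lookup]


# Select two variables and exclude observations under 21.
adults = subset(people, ["id", "name"], lambda r: r["age"] >= 21)

# Same three observations, now with the extra "score" variable.
merged = merge(people, scores, "id")
```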
9
What is the work of a data architect?

A)Utilize large data sets to gather information that meets their company's needs
B)Work with businesses to determine the best usage of the information yielded from data
C)Build data solutions that are optimized for performance and design applications
D)All of the above
10
Which of the following are correct skills for a data scientist?

A)Probability & Statistics
B)Machine Learning / Deep Learning
C)Data Wrangling
D)All of the above
11
Which of the following are components of data science?

A)Data Engineering
B)Advanced Computing
C)Domain expertise
D)All of the above
12
Which of the following is not a part of the data science process?

A)Discovery
B)Model Planning
C)Communication Building
D)Operationalize
13
Which of the following are data sources in data science?

A)Structured
B)Unstructured
C)Both A and B
D)None of the above
14
Which of the following is not an application of data science?

A)Recommendation Systems
B)Image & Speech Recognition
C)Online Price Comparison
D)Privacy Checker
15
Point out the correct statement.

A)Raw data is the original source of data
B)Preprocessed data is the original source of data
C)Raw data is the data obtained after processing steps
D)None of the above
16
Which of the following is one of the key data science skills?

A)Statistics
B)Machine Learning
C)Data Visualization
D)All of the above
17
Which of the following is a key characteristic of a hacker?

A)Afraid to say they don't know the answer
B)Willing to find answers on their own
C)Not willing to find answers on their own
D)All of the above
18
Raw data should be processed only one time.

A)True
B)False
C)Can be true or false
D)Cannot say
19
Which of the following is the common goal of statistical modelling?

A)Inference
B)Summarizing
C)Subsetting
D)None of the above
20
Causal analysis is commonly applied to census data.

A)True
B)False
C)Can be true or false
D)Cannot say
21
Which of the following models is usually the gold standard for data analysis?

A)Inferential
B)Descriptive
C)Causal
D)All of the above
22
Which of the following is a revision control system?

A)Git
B)Numpy
C)Scipy
D)Slidify
23
Which of the following steps is performed by a data scientist after acquiring the data?

A)Data Cleaning
B)Data Integration
C)Data Replication
D)All of the above
24
Which of the following focuses on the discovery of (previously) unknown properties in the data?

A)Data mining
B)Big data
C)Data wrangling
D)Machine Learning
25
Which of the following can be used to create sub-samples using a maximum dissimilarity approach?

A)minDissim
B)maxDissim
C)inmaxDissim
D)All of the mentioned
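Note: the maximum dissimilarity approach mentioned in this card can be sketched as a greedy loop: start from a seed point, then repeatedly add the remaining point whose nearest already-selected point is farthest away. The sketch below is plain Python under those assumptions; it is not the implementation of any particular library, and the function name, the Euclidean distance choice, and the sample points are all hypothetical.

```python
import math


def max_dissimilarity_sample(points, seed_index, n):
    """Greedily pick n extra points, each maximizing its minimum
    Euclidean distance to the points selected so far."""
    selected = [seed_index]
    pool = [i for i in range(len(points)) if i != seed_index]
    for _ in range(min(n, len(pool))):
        # Score each candidate by its distance to the closest selected point,
        # then take the candidate with the largest such score.
        best = max(
            pool,
            key=lambda i: min(math.dist(points[i], points[j]) for j in selected),
        )
        selected.append(best)
        pool.remove(best)
    return selected


# Hypothetical 2-D points lying on a line, to make the choice easy to follow.
points = [(0, 0), (1, 0), (10, 0), (5, 0), (0.5, 0)]
picked = max_dissimilarity_sample(points, seed_index=0, n=2)
```

With the seed at the origin, the loop first takes the far end of the line and then the point midway between the two, which is the spread-out sub-sample this approach aims for.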