Deck 4: Data Analysis and Visualization

1
Which of the following is not a major data analysis approach?

A)Data Mining
B)Predictive Intelligence
C)Business Intelligence
D)Text Analytics
Answer: B) Predictive Intelligence
2
How many main statistical methodologies are used in data analysis?

A)2
B)3
C)4
D)5
Answer: A) 2
3
In descriptive statistics, data from the entire population or a sample are summarized with:

A)integer descriptors
B)floating descriptors
C)numerical descriptors
D)decimal descriptors
Answer: C) numerical descriptors
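
As a small illustration of numerical descriptors, the sketch below summarizes a sample with Python's standard `statistics` module. The sample values and the choice of descriptors are assumptions made for the example; they do not come from the deck.

```python
import statistics

# Hypothetical sample; the values are illustrative only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

# A few common numerical descriptors of the sample.
summary = {
    "mean": statistics.mean(sample),
    "median": statistics.median(sample),
    "population_sd": statistics.pstdev(sample),  # treats the data as a full population
    "min": min(sample),
    "max": max(sample),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Use `statistics.stdev` instead of `statistics.pstdev` when the data are a sample drawn from a larger population.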
4
Data analysis was defined by which statistician?

A)William S.
B)Hans Peter Luhn
C)Gregory Piatetsky-Shapiro
D)John Tukey
5
Which of the following is true about hypothesis testing?

A)answering yes/no questions about the data
B)estimating numerical characteristics of the data
C)describing associations within the data
D)modeling relationships within the data
6
The goal of business intelligence is to allow easy interpretation of large volumes of data to identify new opportunities.

A)TRUE
B)FALSE
C)Can be true or false
D)Cannot say
7
The branch of statistics which deals with the development of particular statistical methods is classified as:

A)industry statistics
B)economic statistics
C)applied statistics.
D)applied statistics
8
Which of the following is true about regression analysis?

A)answering yes/no questions about the data
B)estimating numerical characteristics of the data
C)modeling relationships within the data
D)describing associations within the data
9
Text analytics is also referred to as text mining.

A)TRUE
B)FALSE
C)Can be true or false
D)Cannot say
10
What is true about Data Visualization?

A)Data Visualization is used to communicate information clearly and efficiently to users by the usage of information graphics such as tables and charts.
B)Data Visualization helps users in analyzing a large amount of data in a simpler way.
C)Data Visualization makes complex data more accessible, understandable, and usable.
D)All of the above
11
Data can be visualized using?

A)graphs
B)charts
C)maps
D)All of the above
12
Data visualization is also an element of the broader _____________.

A)deliver presentation architecture
B)data presentation architecture
C)dataset presentation architecture
D)data process architecture
13
Which method shows hierarchical data in a nested format?

A)Treemaps
B)Scatter plots
C)Population pyramids
D)Area charts
14
Which function is used for inference on one proportion using the normal approximation?

A)fisher.test()
B)chisq.test()
C)Lm.test()
D)prop.test()
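
To make "inference on one proportion using the normal approximation" concrete, here is a minimal Python sketch of an uncorrected two-sided z-test. It is not a port of any R function: R's `prop.test()` applies a continuity correction by default and reports a chi-squared statistic, so its numbers will differ somewhat. The function name `one_proportion_z_test` and the example counts are hypothetical.

```python
import math

def one_proportion_z_test(successes, n, p0):
    """Two-sided z-test of H0: p = p0 using the normal approximation.

    Hypothetical helper for illustration; no continuity correction is applied.
    """
    p_hat = successes / n
    # Standard error of the sample proportion under the null hypothesis.
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Illustrative data: 60 successes in 100 trials, tested against p0 = 0.5.
z, p_value = one_proportion_z_test(60, 100, 0.5)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

The approximation is usually considered reasonable only when both `n * p0` and `n * (1 - p0)` are not too small.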
15
Which is used to find the factor congruence coefficients?

A)factor.mosaicplot
B)factor.xyplot
C)factor.congruence
D)factor.cumsum
16
Which of the following is a tool for checking normality?

A)qqline()
B)qline()
C)anova()
D)lm()
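
In R, `qqline()` adds a reference line to a normal Q-Q plot drawn by `qqnorm()`. The Python sketch below only computes the points such a plot would show, pairing sorted sample values with standard normal quantiles. The helper name `normal_qq_points`, the sample, and the plotting positions `(i - 0.5) / n` are assumptions for illustration; other plotting-position conventions exist.

```python
from statistics import NormalDist

def normal_qq_points(sample):
    """Return (theoretical quantile, observed value) pairs for a normal Q-Q plot.

    Hypothetical helper; uses plotting positions (i - 0.5) / n.
    """
    n = len(sample)
    ordered = sorted(sample)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    theoretical = [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

# Illustrative sample only.
points = normal_qq_points([4.1, 5.3, 4.8, 6.0, 5.1, 4.5, 5.6])
for quantile, value in points:
    print(f"{quantile:+.3f}  {value}")
```

If the sample is roughly normal, the points fall close to a straight line; strong curvature suggests skewness or heavy tails.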
17
Which of the following is false?

A)Data visualization includes the ability to absorb information quickly
B)Data visualization is another form of visual art
C)Data visualization decreases insights and leads to slower decisions
D)None of the above
18
Common use cases for data visualization include?

A)Politics
B)Sales and marketing
C)Healthcare
D)All of the above
19
Which of the following plots is often used for checking randomness in time series?

A)Autocausation
B)Autorank
C)Autocorrelation
D)None of the above
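
For readers unfamiliar with the quantity an autocorrelation plot displays, the following sketch computes the sample autocorrelation of a series at a given lag. The function name `autocorrelation` and the alternating example series are assumptions chosen for illustration.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative `lag`.

    Hypothetical helper; uses the common estimator that divides the lagged
    sum of products by the total sum of squared deviations.
    """
    n = len(series)
    mean = sum(series) / n
    denominator = sum((x - mean) ** 2 for x in series)
    numerator = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return numerator / denominator

# Illustrative series: a strictly alternating pattern is far from random.
series = [1, -1, 1, -1, 1, -1]
for lag in range(3):
    print(lag, round(autocorrelation(series, lag), 3))
```

For a random series, the autocorrelations at non-zero lags should all be close to zero; large values at some lags point to structure such as trend or seasonality.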
20
To find the minimum or the maximum of a function, we set the gradient to zero because:

A)The value of the gradient at extrema of a function is always zero
B)Depends on the type of problem
C)Both A and B
D)None of the above
21
Which of the following techniques cannot be used for normalization in text mining?

A)Stemming
B)Lemmatization
C)Stop Word Removal
D)None of the above
22
In which of the following cases will K-means clustering fail to give good results?
1) Data points with outliers
2) Data points with different densities
3) Data points with nonconvex shapes

A)1 and 2
B)2 and 3
C)1 and 3
D)All of the above
23
Which of the following is a reasonable way to select the number of principal components "k"?

A)Choose k to be the smallest value so that at least 99% of the variance is retained.
B)Choose k to be 99% of m (k = 0.99*m, rounded to the nearest integer).
C)Choose k to be the largest value so that 99% of the variance is retained.
D)Use the elbow method.
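
A retained-variance rule like the one mentioned in the options can be sketched as follows, assuming the eigenvalues of the data's covariance matrix are already available. The function name `smallest_k_for_variance` and the eigenvalues are hypothetical and serve only to show the arithmetic.

```python
def smallest_k_for_variance(eigenvalues, threshold=0.99):
    """Smallest number of components whose eigenvalues retain `threshold` of the variance.

    Hypothetical helper; `eigenvalues` are assumed to be non-negative.
    """
    ordered = sorted(eigenvalues, reverse=True)  # largest variance first
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k
    return len(ordered)

# Illustrative eigenvalues only.
eigenvalues = [6.0, 3.0, 0.95, 0.04, 0.01]
k = smallest_k_for_variance(eigenvalues, threshold=0.99)
print(k)
```

With these example eigenvalues the first three components carry 99.5% of the total variance, so the function returns 3.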
24
Which of the following is false?

A)Subsetting can be used to select and exclude variables and observations
B)Raw data should be processed only one time.
C)Merging concerns combining datasets on the same observations to produce a result with more variables
D)None of the above
25
According to analysts, for what can traditional IT systems provide a foundation when they're integrated with big data technologies like Hadoop?

A)Big data management and data mining
B)Data warehousing and business intelligence
C)Management of Hadoop clusters
D)Collecting and storing unstructured data
26
The ________ programming language is a dialect of S.

A)B
B)C
C)R
D)None of the above
27
Files containing R scripts end with the extension _______.

A).R
B).S
C).bigdata
D)All of the above
28
Which of the following is a subset of machine learning?

A)Numpy
B)SciPy
C)Deep Learning
D)All of the above
29
With how many layers are deep learning algorithms constructed?

A)2
B)3
C)4
D)5