Deck 3: Finding Relationships Among Variables

1
An example of a joint category of two variables is the count of all non-drinkers who are also nonsmokers.
True
2
Comparing a numerical variable across two or more subpopulations is known as a comparison problem.
True
3
Correlation is not useful for describing the strength and direction of nonlinear relationships.
True
4
The correlation between two variables is unitless and always between -1 and +1.
5
Correlation is a single-number summary of a scatterplot.
6
Strongly related variables may have a correlation close to zero if the relationship is nonlinear.
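A minimal sketch of this point, using made-up data in which y is fully determined by x through a quadratic rule. The helper `pearson_r` is a hypothetical name written for this illustration, not a library function.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length paired sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y depends on x exactly, but not linearly.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]
r_quadratic = pearson_r(xs, ys)

# For contrast, an exactly linear relationship on the same x values.
r_linear = pearson_r(xs, [2 * x + 1 for x in xs])

print(r_quadratic, r_linear)  # 0.0 1.0
```

The symmetric x values make the positive and negative cross-products cancel, so the correlation is zero even though the relationship is perfect.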
7
Counts for a categorical variable are often expressed as percentages of the total.
8
The scatterplot is a graphical technique used to display the relationship between two numerical variables.
9
A trend line on a scatterplot is a line or a curve that "fits" the scatter as well as possible.
10
We cannot attempt to interpret correlations numerically, with the one possible exception of indicating whether they are positive or negative.
11
If the standard deviations of X and Y are 15.5 and 10.8, respectively, and the covariance of X and Y is 128.8, then the correlation coefficient is approximately 0.77.
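The arithmetic in this card can be checked with a short sketch, assuming the usual definition of correlation as covariance divided by the product of the two standard deviations. The function name is hypothetical.

```python
def correlation_from_covariance(cov_xy, sd_x, sd_y):
    """Correlation = covariance / (SD of X * SD of Y)."""
    return cov_xy / (sd_x * sd_y)

# Values taken from the card.
r = correlation_from_covariance(128.8, 15.5, 10.8)
print(round(r, 2))  # 0.77
```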
12
We can use side-by-side boxplots to compare at most 2 distributions of numeric data.
13
The advantage that correlation has over covariance is that correlation has a set lower and upper limit.
14
Correlation is affected by the measurement scales applied to the X and Y variables.
15
If the standard deviation of X is 15, the covariance of X and Y is 94.5, and the correlation is 0.90, then the variance of Y is 7.0.
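The card's numbers can be worked through by rearranging the definition of correlation, assuming the standard relation between correlation, covariance, and the standard deviations. The symbols s_X and s_Y are notation introduced here for those standard deviations.

```latex
\[
r = \frac{\operatorname{Cov}(X,Y)}{s_X\, s_Y}
\;\Longrightarrow\;
s_Y = \frac{\operatorname{Cov}(X,Y)}{r\, s_X}
    = \frac{94.5}{0.90 \times 15}
    = \frac{94.5}{13.5}
    = 7.0,
\qquad
s_Y^2 = 7.0^2 = 49.
\]
```

The value 7.0 is therefore the standard deviation of Y; the variance is its square.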
16
The cutoff for defining a large correlation is 0.5.
17
To form a scatterplot of X versus Y, X and Y must be paired variables.
18
Side-by-side box plots allow you to quickly see how two or more categories of a numerical variable compare.
19
If the coefficient of correlation r = 0.80 and the standard deviations of X and Y are 20 and 25, respectively, then Cov(X, Y) must be 400.
20
Correlation and covariance can be used to examine relationships between numeric variables as well as for categorical variables that have been coded numerically.
21
To examine relationships between two categorical variables, we can use

A) counts and corresponding charts of the counts.
B) scatter plots.
C) histograms.
D) boxplots.
22
Tables used to display counts of a categorical variable are called

A) crosstabs.
B) contingency tables.
C) either crosstabs or contingency tables.
D) neither crosstabs nor contingency tables.
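A table of counts of this kind can be sketched in a few lines, assuming one paired observation per person. The data and the `crosstab` helper are hypothetical, written only for this illustration.

```python
from collections import Counter

def crosstab(rows, cols):
    """Count how often each (row category, column category) pair occurs."""
    return Counter(zip(rows, cols))

# Hypothetical paired observations, one entry per person.
drinker = ["no", "no", "yes", "yes", "no", "yes"]
smoker = ["no", "yes", "yes", "no", "no", "yes"]

table = crosstab(drinker, smoker)

# A joint category: non-drinkers who are also nonsmokers.
print(table[("no", "no")])  # 2
```

Each key of `table` is one joint category of the two variables, and its value is the count for that cell.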
23
Relationships between two variables are less evident when counts are expressed as percentages of row totals or column totals.
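The effect of expressing counts as row percentages can be seen in a small sketch with made-up counts and unequal group sizes. The `row_percentages` helper and the category labels are assumptions for illustration.

```python
def row_percentages(counts):
    """Convert {row: {column: count}} into percentages of each row total."""
    result = {}
    for row, cells in counts.items():
        total = sum(cells.values())
        result[row] = {col: 100.0 * n / total for col, n in cells.items()}
    return result

# Hypothetical counts: rows are drinking status, columns are smoking status.
counts = {
    "drinker": {"smoker": 30, "nonsmoker": 70},
    "non-drinker": {"smoker": 40, "nonsmoker": 360},
}

pct = row_percentages(counts)
print(pct["drinker"]["smoker"], pct["non-drinker"]["smoker"])  # 30.0 10.0
```

In this made-up table the raw counts of smokers (30 versus 40) are hard to compare because the groups differ in size, while the row percentages (30% versus 10%) are directly comparable.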
24
Examples of comparison problems include

A) salary broken down by male and female subpopulations.
B) cost of living broken down by region of a country.
C) recovery rate for a disease broken down by patients who have taken a drug and patients who have taken a placebo.
D) all of these choices.
25
Displaying all correlations between 0.6 and 0.999 on a scatterplot as green and all correlations between -1.0 and -0.6 as red is known as _____ formatting.

A) rank-order
B) categorical
C) conditional
D) numerical
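The rule described in this card can be sketched as a simple function that picks a display color from the value. The thresholds come from the card; the function name and the use of `None` for "no special color" are assumptions.

```python
def correlation_color(r):
    """Return a display color for a correlation using the card's thresholds."""
    if 0.6 <= r <= 0.999:
        return "green"
    if -1.0 <= r <= -0.6:
        return "red"
    return None  # outside both ranges: no special color

for r in (0.75, -0.8, 0.2, 1.0):
    print(r, correlation_color(r))
```

With the card's upper bound of 0.999, a value of exactly 1.0 is left uncolored.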
26
We study relationships among numerical variables using

A) pie charts.
B) counts.
C) scatterplot charts.
D) percentages.
27
One characteristic of "paired variables" is that

A) one variable is a negative value and the other is a positive value.
B) both variables are positive values.
C) each variable has the same number of observations.
D) each variable has a different number of observations.
28
The limitation of covariance as a descriptive measure of association is that it

A) only captures positive relationships.
B) does not capture the units of the variables.
C) is very sensitive to the units of the variables.
D) is invalid if one of the variables is categorical.
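How covariance and correlation respond to a change of units can be explored with a short sketch. The heights and weights are made-up values, and the helper functions are hypothetical names using the sample (n - 1) formulas.

```python
import math

def covariance(xs, ys):
    """Sample covariance of two equal-length paired sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Sample correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical paired data; the same heights in meters and in centimeters.
heights_m = [1.5, 1.6, 1.7, 1.8, 1.9]
weights_kg = [52.0, 58.0, 67.0, 75.0, 80.0]
heights_cm = [100 * h for h in heights_m]

print(covariance(heights_m, weights_kg), covariance(heights_cm, weights_kg))
print(correlation(heights_m, weights_kg), correlation(heights_cm, weights_kg))
```

Switching from meters to centimeters multiplies the covariance by 100, while the correlation stays the same up to floating-point rounding.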
29
Which correlation coefficient suggests the strongest relationship?

A) +1
B) -1
C) 0
D) +0.5
30
Correlation and covariance measure the

A) strength of a linear relationship between two numerical variables.
B) direction of a linear relationship between two numerical variables.
C) strength and direction of a linear relationship between two numerical variables.
D) strength and direction of a linear relationship between two categorical variables.
31
Which of the following are considered numerical summary measures?

A) Mean and variance
B) Variance and correlation
C) Correlation and covariance
D) Covariance and variance
32
Statisticians often refer to the pivot tables that display counts as contingency tables or crosstabs.
33
Scatterplots are also referred to as

A) crosstabs.
B) contingency charts.
C) X-Y charts.
D) all of these choices.
34
A useful way of comparing the distribution of a numerical variable across categories of some categorical variable is with

A) a side-by-side box plot.
B) a side-by-side pivot table.
C) a side-by-side plot or side-by-side pivot table.
D) neither a side-by-side box plot nor side-by-side pivot table.
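The numbers behind a side-by-side box plot can be sketched as one five-number summary per category. The salary data and region names are made up, and `five_number_summary` is a hypothetical helper. Quartile conventions vary between tools, so the Q1 and Q3 values from Python's `statistics.quantiles` may differ slightly from a spreadsheet's.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the ingredients of one box plot."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, median, q3, max(values)

# Hypothetical salaries (in $1000s) for three categories.
salaries = {
    "East": [30, 35, 40, 45, 60],
    "West": [28, 33, 38, 50, 55, 70],
    "South": [25, 31, 36, 42, 48],
}

# One summary per category, printed one per line for comparison.
for region, values in salaries.items():
    print(region, five_number_summary(values))
```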
35
The most common data format is

A) long.
B) short.
C) stacked.
D) unstacked.
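The difference between the stacked and unstacked formats can be illustrated with a small conversion, assuming "stacked" means one column of category labels next to one column of values and "unstacked" means one column of values per category. The data and the `unstack` helper are hypothetical.

```python
from collections import defaultdict

def unstack(categories, values):
    """Turn a (category, value) pair of columns into one list per category."""
    columns = defaultdict(list)
    for category, value in zip(categories, values):
        columns[category].append(value)
    return dict(columns)

# Stacked format: a label column and a value column of equal length.
gender = ["F", "M", "F", "M", "F"]
salary = [50, 45, 62, 58, 55]

unstacked = unstack(gender, salary)
print(unstacked)  # {'F': [50, 62, 55], 'M': [45, 58]}
```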
36
If the correlation of variables is close to 0, then we expect to see a(n) _____ on the scatterplot.

A) upward sloping cluster of points
B) downward sloping cluster of points
C) cluster of points around a trendline
D) random scatter of points with no apparent relationship
37
A line or curve superimposed on a scatterplot to quantify an apparent relationship is known as a(n)

A) average.
B) trend line.
C) slope.
D) function.
38
Correlation is useful only for

A) assessing the weakness of a linear relationship.
B) conveying the same information in a simpler format than a scatterplot.
C) measuring the strength of a linear relationship.
D) measuring the strength of a nonlinear relationship.
39
We can infer that there is a strong relationship between two numeric variables when the points on a scatterplot

A) cluster tightly around a straight line.
B) are randomly scattered in no clear pattern.
C) display a positive relationship.
D) display a negative relationship.
40
The filters field of a pivot table contains the data that you want summarized.
41
Approximate the percentage of these Internet users who are in the 58-71 age group.
42
The tables of counts that result from pivot tables are often called

A) samples.
B) sub-tables.
C) populations.
D) crosstabs.
43
What is the average annual salary of the employed Internet users in this sample?
44
Approximate the percentage of these Internet users who are married.
45
Approximate the percentage of these Internet users who are single with no formal education beyond high school.
46
Which two variables have the strongest linear relationship with annual salary?
47
Changing the location of fields in a pivot table is known as

A) slicing.
B) dicing.
C) sorting.
D) pivoting.
48
Approximate the percentage of these Internet users who are currently employed.
49
The four areas of a pivot table are

A) Crosstabs, Fields, Rows, and Columns.
B) Data, Count, Contingency, and Percentage.
C) Filters, Rows, Columns, and Values.
D) Sort, Rows, Columns, and Count.
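The roles that filter, row, column, and value fields play in a pivot table can be sketched in plain Python. This is an illustration under stated assumptions, not how any spreadsheet implements it: the `pivot` function, its parameter names, and the records are all hypothetical, and averaging is only one possible summary (counts are another).

```python
from collections import defaultdict

def pivot(records, rows, columns, values, keep=lambda rec: True):
    """Average the `values` field for each (row, column) pair among kept records."""
    cells = defaultdict(list)
    for rec in records:
        if keep(rec):  # filter step: drop records that fail the condition
            cells[(rec[rows], rec[columns])].append(rec[values])
    return {key: sum(vals) / len(vals) for key, vals in cells.items()}

# Hypothetical records; the field names are illustrative only.
records = [
    {"gender": "F", "married": "yes", "employed": True, "salary": 60000},
    {"gender": "F", "married": "no", "employed": True, "salary": 50000},
    {"gender": "M", "married": "yes", "employed": True, "salary": 55000},
    {"gender": "M", "married": "yes", "employed": False, "salary": 0},
    {"gender": "F", "married": "yes", "employed": True, "salary": 70000},
]

table = pivot(records, rows="gender", columns="married", values="salary",
              keep=lambda rec: rec["employed"])
print(table[("F", "yes")])  # 65000.0
```

Calling `pivot` again with the `rows` and `columns` arguments swapped rearranges the same summary, which is the idea behind moving fields between areas.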
50
Approximate the percentage of these Internet users who are married with formal education beyond high school.
51
What does a scatterplot illustrate?

A) The median of the variables
B) What type of relationship there is between two variables
C) The percentage of values that fall in a particular category
D) The variability of the middle 50% of the data
52
What percentage of these Internet users has formal education beyond high school?
53
Approximate the percentage of these Internet users who are women in the 30-43 age group.
54
Approximate the percentage of these Internet users who are women.
55
Approximate the percentage of these Internet users who are men under the age of 30.
56
The tool that provides useful information about a data set by breaking it down into categories is a

A) histogram.
B) scatterplot.
C) pivot table.
D) spreadsheet.