Deck 7: Linear Regression
1
In a simple linear regression model, y = β0 + β1x + ε, the parameter β1 represents the
A)intercept.
B)slope of the true regression line.
C)mean value of x.
D)error term.
Answer: B) slope of the true regression line.
2
In the simple linear regression model, the ____________ accounts for the variability in the dependent variable that cannot be explained by the linear relationship between the variables.
A)constant term
B)error term
C)model parameter
D)residual
Answer: B) error term
3
The __________ is a measure of the error that results from using the estimated regression equation to predict the values of the dependent variable in the sample.
A)sum of squares due to regression (SSR)
B)error term
C)sum of squares due to error (SSE)
D)residual
Answer: C) sum of squares due to error (SSE)
4
In a linear regression model, the variable that is being predicted or explained is known as the _____________. It is denoted by y and is often referred to as the response variable.
A)dependent variable
B)independent variable
C)residual variable
D)regression variable
5
Prediction of the value of the dependent variable outside the experimental region is called
A)interpolation.
B)forecasting.
C)averaging.
D)extrapolation.
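A minimal sketch of how one might flag a prediction that falls outside the experimental region, assuming a single independent variable. The sample values and the helper name `is_extrapolation` are hypothetical and not part of the deck.

```python
# Hypothetical x values used to estimate a regression model.
x_sample = [2.0, 3.5, 5.0, 7.5, 10.0]


def is_extrapolation(x_new, x_values):
    """Return True when x_new lies outside the range of the sampled x values."""
    return x_new < min(x_values) or x_new > max(x_values)


print(is_extrapolation(6.0, x_sample))   # inside the sampled range
print(is_extrapolation(12.0, x_sample))  # outside the sampled range
```

Predictions for which the check returns True rest on the assumption that the fitted relationship continues to hold where no data were observed.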
6
The least squares regression line minimizes the sum of the
A)differences between actual and predicted y values.
B)absolute deviations between actual and predicted y values.
C)absolute deviations between actual and predicted x values.
D)squared differences between actual and predicted y values.
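A minimal sketch of the least squares criterion for one independent variable, using the standard closed-form estimates for the slope and intercept. The data and function names are hypothetical. The loop checks that nudging either estimate increases the sum of squared differences.

```python
# Hypothetical sample data (not from the deck).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def least_squares(x, y):
    """Return (b0, b1) that minimize the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def sum_squared_errors(b0, b1, x, y):
    """Sum of squared differences between actual and predicted y values."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


b0, b1 = least_squares(x, y)
best = sum_squared_errors(b0, b1, x, y)

# Perturbing the intercept or the slope gives a larger sum of squares.
for d0, d1 in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.05), (0.0, -0.05)]:
    assert sum_squared_errors(b0 + d0, b1 + d1, x, y) > best

print(b0, b1, best)
```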
7
A __________ is used to visualize sample data graphically and to draw preliminary conclusions about the possible relationship between the variables.
A)contingency table
B)scatter chart
C)Gantt chart
D)pie chart
8
What would be the value of the sum of squares due to regression (SSR) if the total sum of squares (SST) is 25.32 and the sum of squares due to error (SSE) is 6.89?
A)31.89
B)19.32
C)18.43
D)15.32
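Worked arithmetic for this card, relying on the identity SST = SSR + SSE, so SSR = SST − SSE = 25.32 − 6.89 = 18.43. The variable names are illustrative only.

```python
sst = 25.32  # total sum of squares, from the question
sse = 6.89   # sum of squares due to error, from the question

# SST = SSR + SSE, so the regression sum of squares is the difference.
ssr = sst - sse
print(round(ssr, 2))
```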
9
The graph of the simple linear regression equation is a(n)
A)ellipse.
B)hyperbola.
C)parabola.
D)straight line.
10
The difference between the observed value of the dependent variable and the value predicted using the estimated regression equation is known as the
A)constant term.
B)error term.
C)residual.
D)model parameter.
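A short illustration of residuals as observed minus predicted values, and of SSE as the sum of their squares. The estimated equation (b0 = 1.0, b1 = 2.0) and the data are hypothetical, chosen so the arithmetic is easy to follow.

```python
# Hypothetical estimated regression equation: y_hat = 1.0 + 2.0 * x
b0, b1 = 1.0, 2.0

x = [1.0, 2.0, 3.0]
y = [3.5, 4.5, 7.0]  # hypothetical observed values

predicted = [b0 + b1 * xi for xi in x]
residuals = [yi - yhat for yi, yhat in zip(y, predicted)]

# Sum of squares due to error (SSE) is the sum of squared residuals.
sse = sum(e ** 2 for e in residuals)

print(residuals)
print(sse)
```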
11
In a linear regression model, the variable (or variables) used for predicting or explaining values of the response variable are known as the __________, denoted by x.
A)dependent variable
B)independent variable
C)residual variable
D)regression variable
12
A regression analysis involving one independent variable and one dependent variable is referred to as a
A)factor analysis.
B)time series analysis.
C)simple linear regression.
D)data mining.
13
The __________ is the range of values of the independent variables in the data used to estimate the regression model.
A)confidence interval
B)codomain
C)experimental region
D)validation set
14
__________ is a statistical procedure used to develop an equation showing how two variables are related.
A)Regression analysis
B)Data mining
C)Time series analysis
D)Factor analysis
15
A procedure for using sample data to find the estimated regression equation is
A)point estimation.
B)interval estimation.
C)the least squares method.
D)extrapolation.
16
When the mean value of the dependent variable is independent of variation in the independent variable, the slope of the regression line is
A)positive.
B)zero.
C)negative.
D)infinite.
17
In the graph of the simple linear regression equation, the parameter β0 represents the ___________ of the true regression line.
A)slope
B)x-intercept
C)y-intercept
D)end-point
18
Prediction of the mean value of the dependent variable y for values of the independent variables x1, x2, ..., xq that are outside the experimental range is called
A)dummy variable.
B)overfitting.
C)extrapolation.
D)interaction.
19
In a simple linear regression analysis the quantity that gives the amount by which the dependent variable changes for a unit change in the independent variable is called the
A)coefficient of determination.
B)slope of the regression line.
C)correlation coefficient.
D)standard error.
20
In the graph of the simple linear regression equation, the parameter β1 is the ___________ of the true regression line.
A)slope
B)x-intercept
C)y-intercept
D)end-point
21
A variable used to model the effect of categorical independent variables in a regression model is known as a
A)dependent variable.
B)response.
C)dummy variable.
D)predictor variable.
22
The ___________ is a measure of the goodness of fit of the estimated regression equation. It can be interpreted as the proportion of the variability in the dependent variable y that is explained by the estimated regression equation.
A)residual
B)coefficient of determination
C)dummy variable
D)interaction variable
23
The prespecified value of the independent variable at which its relationship with the dependent variable changes in a piecewise linear regression model is referred to as the
A)milestone.
B)knot.
C)tipping point.
D)watchpoint.
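One common way to write a piecewise linear model uses a 0/1 indicator that switches on once x exceeds the prespecified value, so the slope changes there. The sketch below assumes that formulation. The value 90.0 and the coefficients are hypothetical.

```python
KNOT = 90.0  # hypothetical prespecified value where the slope changes


def piecewise_predict(x, b0, b1, b2, knot=KNOT):
    """y_hat = b0 + b1*x + b2*(x - knot)*indicator, with indicator = 1 when x > knot."""
    above = 1 if x > knot else 0
    return b0 + b1 * x + b2 * (x - knot) * above


# Hypothetical coefficients: slope 2.0 up to the knot, 2.0 - 1.5 = 0.5 beyond it.
print(piecewise_predict(80.0, 10.0, 2.0, -1.5))
print(piecewise_predict(100.0, 10.0, 2.0, -1.5))
```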
24
The degree of correlation among independent variables in a regression model is called
A)multicollinearity.
B)interaction.
C)the coefficient of determination.
D)the sum of squared errors (SSE).
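A quick way to gauge correlation between two independent variables is the sample (Pearson) correlation coefficient. The sketch below computes it by hand for two hypothetical predictors. The data and the function name `pearson` are illustrative assumptions.

```python
from math import sqrt


def pearson(a, b):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    return sab / sqrt(saa * sbb)


# Two hypothetical independent variables that move almost in lockstep.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson(x1, x2)
print(r)  # a value near 1 indicates the two predictors are strongly correlated
```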
25
A variable used to model the effect of categorical independent variables in a regression model which generally takes only the value zero or one is called
A)a residual.
B)the coefficient of determination.
C)a dummy variable.
D)interaction.
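A small illustration of coding a two-level categorical variable as zero or one. The category labels and the choice of "rail" as the level coded 1 are hypothetical.

```python
# Hypothetical categorical independent variable with two levels.
shipping_method = ["truck", "rail", "truck", "rail"]

# Code "rail" as 1 and the other level as 0.
is_rail = [1 if m == "rail" else 0 for m in shipping_method]

print(is_rail)
```

The coded column can then enter the regression alongside the quantitative independent variables.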
26
A normally distributed error term with a mean of zero would
A)have values that are symmetric about the variance.
B)allow more accurate modeling.
C)yield biased regression estimates.
D)be a hyperbolic curve.
27
Which of the following regression models is used to model a nonlinear relationship between the independent and dependent variables by including the independent variable and the square of the independent variable in the model?
A)Multiple regression model
B)Quadratic regression model
C)Simple regression model
D)Least squares regression model
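A sketch of the idea behind including both x and its square: the model stays linear in its coefficients while the fitted curve bends. The coefficients below are hypothetical and the helper names are illustrative.

```python
def squared_term_features(x):
    """Feature row [1, x, x^2] for a model that includes x and its square."""
    return [1.0, x, x ** 2]


def predict(coeffs, x):
    """y_hat = b0 + b1*x + b2*x^2, linear in the coefficients b0, b1, b2."""
    return sum(b * f for b, f in zip(coeffs, squared_term_features(x)))


coeffs = [2.0, 3.0, -0.5]  # hypothetical b0, b1, b2

# The curve rises and then falls, which a straight line cannot do.
for x in [0.0, 2.0, 3.0, 4.0, 6.0]:
    print(x, predict(coeffs, x))
```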
28
The process of making estimates and drawing conclusions about one or more characteristics of a population through analysis of sample data drawn from the population is known as
A)inductive inference.
B)deductive inference.
C)statistical inference.
D)Bayesian inference.
29
The scatter chart below displays the residuals versus the independent variable, x. Which of the following conclusions can be drawn based upon this scatter chart? [Scatter chart not shown]
A)The residuals have a constant variance.
B)The model captures the relationship between the variables accurately.
C)The model underpredicts the value of the dependent variable for intermediate values of the independent variable.
D)The residual distribution is not normally distributed.

30
The scatter chart below displays the residuals versus the independent variable, x. Which of the following conclusions can be drawn based upon this scatter chart? [Scatter chart not shown]
A)The residuals have a constant variance.
B)The model fails to capture the relationship between the variables accurately.
C)The model overpredicts the value of the dependent variable for small values and large values of the independent variable.
D)The residuals are normally distributed.

31
__________ is used to test the hypothesis that the values of the regression parameters β1, β2, ..., βq are all zero.
A)An F test
B)A t test
C)The least squares method
D)Extrapolation
32
The coefficient of determination
A)takes values between -1 and +1.
B)is equal to zero for a perfect fit.
C)is equal to negative one for the poorest fit.
D)is used to evaluate the goodness of fit.
33
The scatter chart below displays the residuals versus the independent variable, x. Which of the following conclusions can be drawn from this scatter chart? [Scatter chart not shown]
A)The residuals have an increasing variance as the dependent variable increases.
B)The model captures the relationship between the variables accurately.
C)The regression model follows the standard normal probability distribution.
D)The residual distribution is consistently scattered about zero.

34
The process of making a conjecture about the value of a population parameter, collecting sample data that can be used to assess this conjecture, measuring the strength of the evidence against the conjecture that is provided by the sample, and using these results to draw a conclusion about the conjecture is known as
A)postulation.
B)hypothesis testing.
C)statistical inference.
D)empirical research.
35
The scatter chart below displays the residuals versus the independent variable, t. Which of the following conclusions can be drawn based upon this scatter chart? [Scatter chart not shown]
A)The model is time-invariant.
B)The model captures the relationship between the variables accurately.
C)The residuals are not independent.
D)The residuals are normally distributed.

36
Regression analysis involving one dependent variable and more than one independent variable is known as
A)simple regression.
B)linear regression.
C)multiple regression.
D)None of these are correct.
37
_________ refers to the use of sample data to calculate a range of values that is believed to include the value of the population parameter.
A)Interval estimation
B)Hypothesis testing
C)Statistical inference
D)Point estimation
38
What would be the coefficient of determination if the total sum of squares (SST) is 23.29 and the sum of squares due to regression (SSR) is 10.03?
A)2.32
B)0.43
C)0.19
D)0.89
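Worked arithmetic for this card, using the definition of the coefficient of determination as r² = SSR / SST, so r² = 10.03 / 23.29, which rounds to 0.43. The variable names are illustrative only.

```python
sst = 23.29  # total sum of squares, from the question
ssr = 10.03  # sum of squares due to regression, from the question

# Proportion of the variability in y explained by the estimated equation.
r_squared = ssr / sst
print(round(r_squared, 2))
```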
39
The __________ is an indication of how frequently interval estimates based on samples of the same size taken from the same population using identical sampling techniques will contain the true value of the parameter we are estimating.
A)residual
B)tolerance factor
C)confidence level
D)accuracy level
40
__________ refers to the degree of correlation among independent variables in a regression model.
A)Multicollinearity
B)Tolerance
C)Rank
D)Confidence level
41
__________ is the data set used to build the candidate models.
A)Range
B)Codomain
C)Validation set
D)Training set
42
Fitting a model too closely to sample data, resulting in a model that does not accurately reflect the population, is termed
A)approximation.
B)hypothesizing.
C)overfitting.
D)postulating.
43
Assessing the regression model on data other than the sample data that was used to generate the model is known as
A)approximation.
B)cross-validation.
C)graphical validation.
D)postulation.
44
__________ refers to the data set used to compare model forecasts and ultimately pick a model for predicting values of the dependent variable.
A)Codomain
B)Training set
C)Validation set
D)Range
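A minimal sketch of how the two data sets are used together: candidate models are built on one set, then compared on held-out data. The data, the two candidate models (a mean-only model and a straight line), and the use of sum of squared errors as the comparison measure are all hypothetical choices for illustration.

```python
# Hypothetical data: one set to build the candidates, one set to compare them.
x_train, y_train = [1.0, 2.0, 3.0, 4.0], [2.0, 4.1, 5.9, 8.0]
x_valid, y_valid = [5.0, 6.0], [10.1, 11.9]


def fit_mean(x, y):
    """Candidate 1: always predict the mean of y."""
    m = sum(y) / len(y)
    return lambda xi: m


def fit_line(x, y):
    """Candidate 2: least squares straight line."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar
    return lambda xi: b0 + b1 * xi


def sum_squared_errors(model, x, y):
    return sum((yi - model(xi)) ** 2 for xi, yi in zip(x, y))


candidates = {
    "mean_only": fit_mean(x_train, y_train),
    "linear": fit_line(x_train, y_train),
}

# Compare the candidates on data that was not used to fit them.
scores = {name: sum_squared_errors(m, x_valid, y_valid)
          for name, m in candidates.items()}
chosen = min(scores, key=scores.get)
print(scores, chosen)
```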
45
__________ refers to the scenario in which the relationship between the dependent variable and one independent variable is different at different values of a second independent variable.
A)Interaction
B)Multicollinearity
C)Autocorrelation
D)Covariance
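A sketch of this scenario using a product term x1·x2 in the model, a common way to represent it. With that term, the change in the prediction for a one-unit change in x1 is b1 + b3·x2, so it depends on x2. The coefficients are hypothetical.

```python
def predict(x1, x2, b=(1.0, 2.0, 0.5, 0.25)):
    """y_hat = b0 + b1*x1 + b2*x2 + b3*x1*x2 with hypothetical coefficients."""
    b0, b1, b2, b3 = b
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def effect_of_x1(x2):
    """Change in y_hat when x1 goes from 0 to 1, holding x2 fixed."""
    return predict(1.0, x2) - predict(0.0, x2)


print(effect_of_x1(0.0))  # equals b1 when x2 = 0
print(effect_of_x1(4.0))  # equals b1 + b3 * 4 when x2 = 4
```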
46
Given the partial Excel output from a multiple regression, formulate the regression model. [Excel output not shown]
