Deck 12: Multiple Regression and Model Building

Question
Which of the following assumptions appears violated based on this plot?

A) The variance of the errors is constant
B) The errors are independent
C) The mean of the errors is zero
D) The errors are normally distributed
Question
Consider the second-order model
Question
What relationship between x and y is suggested by the scattergram?

A) a quadratic relationship with downward concavity
B) a linear relationship with negative slope
C) a linear relationship with positive slope
D) a quadratic relationship with upward concavity
Question
A study of the top MBA programs attempted to predict the average starting salary (in $1000's) of graduates of the program based on the amount of tuition (in $1000's) charged by the program and the average GMAT score of the program's students. The results of a regression analysis based on a sample of 75 MBA programs are shown below:

Least Squares Linear Regression of Salary

\[
\begin{array}{lrrrr}
\text{Predictor Variables} & \text{Coefficient} & \text{Std Error} & \text{T} & \text{P} \\
\text{Constant} & 169.910 & 26.5350 & 6.40 & 0.0000 \\
\text{Tuition} & -3.37373 & 0.81171 & -4.16 & 0.0001 \\
\text{TxT} & 0.03563 & 0.00590 & 6.03 & 0.0000
\end{array}
\]

\[
\begin{array}{lrlr}
\text{R-Squared} & 0.7361 & \text{Resid. Mean Square (MSE)} & 358.887 \\
\text{Adjusted R-Squared} & 0.7288 & \text{Standard Deviation} & 18.9443
\end{array}
\]

\[
\begin{array}{lrrrrr}
\text{Source} & \text{DF} & \text{SS} & \text{MS} & \text{F} & \text{P} \\
\text{Regression} & 2 & 72081.8 & 36040.9 & 100.42 & 0.0000 \\
\text{Residual} & 72 & 25839.8 & 358.9 & & \\
\text{Total} & 74 & 97921.7 & & &
\end{array}
\]

Cases Included 75    Missing Cases 0

The global F-test statistic is shown on the printout to be the value F = 100.42. Interpret this value.

A) There is insufficient evidence, at α = 0.05, to indicate that at least one of the variables proposed in the interaction model is useful at predicting the average starting salary of graduates of MBA programs.
B) There is sufficient evidence, at α = 0.05, to indicate that at least one of the variables proposed in the interaction model is useful at predicting the average starting salary of graduates of MBA programs.
C) There is sufficient evidence, at α = 0.05, to indicate that there is a linear relationship between average starting salary of graduates of MBA programs and the tuition of the MBA program.
D) There is sufficient evidence, at α = 0.05, to indicate that there is a curvilinear relationship between average starting salary of graduates of MBA programs and the tuition of the MBA program.
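
The summary figures on this printout can be recomputed from the ANOVA sums of squares. The following is a minimal sketch, assuming the standard definitions F = MSR/MSE, R-squared = SSR/(SSR + SSE), and the usual adjusted R-squared formula with n = 75 cases and k = 2 predictors (Tuition and TxT). The function names are illustrative and do not come from the deck.

```python
# Sums of squares and degrees of freedom copied from the ANOVA table above.
ss_regression = 72081.8
ss_residual = 25839.8
df_regression = 2
df_residual = 72


def global_f(ssr, sse, df_r, df_e):
    """Global F statistic: mean square for regression over mean square error."""
    return (ssr / df_r) / (sse / df_e)


def r_squared(ssr, sse):
    """Share of total variation explained by the model."""
    return ssr / (ssr + sse)


def adjusted_r_squared(r2, n, k):
    """R-squared penalized for the number of predictors k."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


f_stat = global_f(ss_regression, ss_residual, df_regression, df_residual)
r2 = r_squared(ss_regression, ss_residual)
adj_r2 = adjusted_r_squared(r2, n=75, k=2)

print(round(f_stat, 2), round(r2, 4), round(adj_r2, 4))
```

The recomputed values should agree with the printout up to rounding, since the printed sums of squares are themselves rounded.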
Question
A study of the top MBA programs attempted to predict the average starting salary (in $1000's) of graduates of the program based on the amount of tuition (in $1000's) charged by the program and the average GMAT score of the program's students. The results of a regression analysis based on a sample of 75 MBA programs are shown below:

Least Squares Linear Regression of Salary

\[
\begin{array}{lrrrrr}
\text{Predictor Variables} & \text{Coefficient} & \text{Std Error} & \text{T} & \text{P} & \text{VIF} \\
\text{Constant} & -203.402 & 51.6573 & -3.94 & 0.0002 & 0.0 \\
\text{Gmat} & 0.39412 & 0.09039 & 4.36 & 0.0000 & 2.0 \\
\text{Tuition} & 0.92012 & 0.17875 & 5.15 & 0.0000 & 2.0
\end{array}
\]

\[
\begin{array}{lrrrrr}
\text{Source} & \text{DF} & \text{SS} & \text{MS} & \text{F} & \text{P} \\
\text{Regression} & 2 & 67140.9 & 33570.5 & 78.53 & 0.0000 \\
\text{Residual} & 72 & 30780.8 & 427.5 & & \\
\text{Total} & 74 & 97921.7 & & &
\end{array}
\]

Interpret the p-value for the global F-test shown on the printout.

A) At α = 0.05, there is sufficient evidence to indicate that something in the regression model is useful for predicting the average starting salary of the graduates of an MBA program.
B) At α = 0.05, there is insufficient evidence to indicate that the average GMAT score of the MBA program's students is useful for predicting the average starting salary of the graduates of an MBA program.
C) At α = 0.05, there is sufficient evidence to indicate that the average GMAT score of the MBA program's students is useful for predicting the average starting salary of the graduates of an MBA program.
D) At α = 0.05, there is insufficient evidence to indicate that something in the regression model is useful for predicting the average starting salary of the graduates of an MBA program.
Question
A study of the top MBA programs attempted to predict the average starting salary (in $1000's) of graduates of the program based on the amount of tuition (in $1000's) charged by the program and the average GMAT score of the program's students. The results of a regression analysis based on a sample of 75 MBA programs are shown below:

Least Squares Linear Regression of Salary

\[
\begin{array}{lrrrrr}
\text{Predictor Variables} & \text{Coefficient} & \text{Std Error} & \text{T} & \text{P} & \text{VIF} \\
\text{Constant} & -203.402 & 51.6573 & -3.94 & 0.0002 & 0.0 \\
\text{Gmat} & 0.39412 & 0.09039 & 4.36 & 0.0000 & 2.0 \\
\text{Tuition} & 0.92012 & 0.17875 & 5.15 & 0.0000 & 2.0
\end{array}
\]

\[
\begin{array}{lrlr}
\text{R-Squared} & 0.6857 & \text{Resid. Mean Square (MSE)} & 427.511 \\
\text{Adjusted R-Squared} & 0.6769 & \text{Standard Deviation} & 20.6763
\end{array}
\]

Identify the test statistic that should be used to determine whether the amount of tuition charged by a program is a useful predictor of the average starting salary of the graduates of the program.

A) t = 5.15
B) t = 20.67
C) t = -3.94
D) t = 4.36
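
As a worked check of where an individual t statistic on such a printout comes from, the sketch below assumes the usual definition (estimated coefficient divided by its standard error) and uses the Tuition coefficient 0.92012 and standard error 0.17875 reported for this model in the deck. The symbols are the standard textbook ones, not notation taken from the printout.

```latex
\[
t = \frac{\hat{\beta}_{\text{Tuition}}}{s_{\hat{\beta}_{\text{Tuition}}}}
  = \frac{0.92012}{0.17875}
  \approx 5.15,
\qquad \text{with } n - (k + 1) = 75 - 3 = 72 \text{ degrees of freedom.}
\]
```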
Question
A public health researcher wants to use regression to predict the sun safety knowledge of pre-school children. The researcher randomly sampled 35 preschoolers, assigned them to one of two groups, and then measured the following three variables:
SUNSCORE: \( y = \) Score on sun-safety comprehension test
READING: \( x_1 = \) Reading comprehension score
GROUP: \( x_2 = 1 \) if child received a Be Sun Safe demonstration, 0 if not

The following two models were hypothesized:
Model 1: \( E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \beta_3 x_2 + \beta_4 x_1 x_2 + \beta_5 x_1^2 x_2 \)
Model 2: \( E(y) = \beta_0 + \beta_1 x_1 + \beta_3 x_2 + \beta_4 x_1 x_2 \)

A partial F-test was conducted to compare the two models and the resulting p-value was found to be 0.0023. Fill in the blank. The results lead us to conclude that there is _____ (at α = 0.05).

A) insufficient evidence of a quadratic relationship between sun-safety score and reading score.
B) sufficient evidence of a statistically useful model for sun-safety score.
C) sufficient evidence of interaction between sun-safety score and reading score.
D) sufficient evidence of a quadratic relationship between sun-safety score and reading score.
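
The nested-model comparison above can be written out explicitly. This is a sketch under the standard partial F-test setup: Model 2 is Model 1 with the two terms involving \( x_1^2 \) removed, \( SSE_R \) and \( SSE_C \) denote the error sums of squares of the reduced and complete models (symbols assumed here; the deck does not report these values), and n = 35.

```latex
\[
H_0:\ \beta_2 = \beta_5 = 0
\qquad \text{versus} \qquad
H_a:\ \text{at least one of } \beta_2,\ \beta_5 \text{ is nonzero}
\]
\[
F = \frac{(SSE_R - SSE_C)/2}{SSE_C / \bigl(35 - (5 + 1)\bigr)}
  = \frac{(SSE_R - SSE_C)/2}{SSE_C / 29}
\]
```

The numerator has 2 degrees of freedom because two parameters are tested, and the denominator has 29. With a reported p-value of 0.0023, which is below 0.05, the null hypothesis that both squared-term coefficients are zero would be rejected.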
Question
We decide to conduct a multiple regression analysis to predict the attendance at a major league baseball game. We use the size of the stadium as a quantitative independent variable and the type of game as a qualitative variable (with two levels - day game or night game). We hypothesize the following model:
\[ E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 \]
where \( x_1 = \) size of the stadium
\( x_2 = 1 \) if a day game, 0 if a night game

A plot of the relationship between \( y \) and \( x_1 \) would show:

A) Two non-parallel curves
B) Two parallel lines
C) Two parallel curves
D) Two non-parallel lines
Question
Which equation represents a complete second-order model for two quantitative independent variables?

A) \( E(y) = \beta_0 + \beta_1 x_1^2 + \beta_2 x_2^2 + \beta_3 x_1^2 x_2 + \beta_4 x_1 x_2^2 + \beta_5 x_1^2 x_2^2 \)
B) \( E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \beta_4 x_1^2 + \beta_5 x_2^2 \)
C) \( E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2 \)
D) \( E(y) = \beta_0 + \beta_1 x_1 x_2 + \beta_2 x_1^2 + \beta_3 x_2^2 \)
Question

A) 11
B) .9286
C) 5.5
D) .9405
Question
Which of the following is not a possible indicator of multicollinearity?

A) significant correlations between pairs of independent variables
B) non-significant t-tests for individual β parameters when the F-test for overall model adequacy is significant
C) signs opposite from what is expected in the estimated β parameters
D) non-random patterns in the plot of the residuals versus the fitted values
Question
A study of the top MBA programs attempted to predict the average starting salary (in $1000's) of graduates of the program based on the amount of tuition (in $1000's) charged by the program and the average GMAT score of the program's students. The results of a regression analysis based on a sample of 75 MBA programs are shown below:

Least Squares Linear Regression of Salary

The model was then used to create 95% confidence and prediction intervals for y and for E(Y) when the tuition charged by the MBA program was $75,000 and the GMAT score was 675. The results are shown here:

95% confidence interval for E(Y): ($126,610, $136,640)
95% prediction interval for Y: ($90,113, $173,160)

Which of the following interpretations is correct if you want to use the model to estimate E(Y) for all MBA programs?

A) We are 95% confident that the average starting salary for graduates of a single MBA program that charges $75,000 in tuition and has an average GMAT score of 675 will fall between $90,113 and $173,160.
B) We are 95% confident that the average starting salary for graduates of a single MBA program that charges $75,000 in tuition and has an average GMAT score of 675 will fall between $126,610 and $136,640.
C) We are 95% confident that the average of all starting salaries for graduates of all MBA programs that charge $75,000 in tuition and have an average GMAT score of 675 will fall between $126,610 and $136,640.
D) We are 95% confident that the average of all starting salaries for graduates of all MBA programs that charge $75,000 in tuition and have an average GMAT score of 675 will fall between $90,113 and $173,160.
Question
A study of the top MBA programs attempted to predict the average starting salary (in $1000's) of graduates of the program based on the amount of tuition (in $1000's) charged by the program and the average GMAT score of the program's students. The results of a regression analysis based on a sample of 75 MBA programs are shown below:

Least Squares Linear Regression of Salary

\[
\begin{array}{lrrrrr}
\text{Predictor Variables} & \text{Coefficient} & \text{Std Error} & \text{T} & \text{P} & \text{VIF} \\
\text{Constant} & -203.402 & 51.6573 & -3.94 & 0.0002 & 0.0 \\
\text{Gmat} & 0.39412 & 0.09039 & 4.36 & 0.0000 & 2.0 \\
\text{Tuition} & 0.92012 & 0.17875 & 5.15 & 0.0000 & 2.0
\end{array}
\]

\[
\begin{array}{lrlr}
\text{R-Squared} & 0.6857 & \text{Resid. Mean Square (MSE)} & 427.511 \\
\text{Adjusted R-Squared} & 0.6769 & \text{Standard Deviation} & 20.6763
\end{array}
\]

Interpret the coefficient for the tuition variable shown on the printout.

A) For every $1000 increase in the tuition charged by the MBA program, we estimate that the average starting salary will increase by $920.12, holding the GMAT score constant.
B) For every $1000 increase in the tuition charged by the MBA program, we estimate that the average starting salary will increase by $394.12, holding the GMAT score constant.
C) For every $1000 increase in the tuition charged by the MBA program, we estimate that the average starting salary will decrease by $203,402, holding the GMAT score constant.
D) For every $1000 increase in the average starting salary, we estimate that the tuition charged by the MBA program will increase by $920.12.
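
To make the "holding the GMAT score constant" reading concrete, here is a small sketch that evaluates the fitted first-order equation from the printout at two tuition levels one unit ($1000) apart. The function name `predicted_salary` is hypothetical, and the inputs (GMAT 675, tuition 75) are simply the values used in the interval questions elsewhere in this deck.

```python
# Coefficients copied from the printout above.
# Salary and tuition are both measured in $1000s.
B0 = -203.402
B_GMAT = 0.39412
B_TUITION = 0.92012


def predicted_salary(gmat, tuition):
    """Least squares prediction of average starting salary, in $1000s."""
    return B0 + B_GMAT * gmat + B_TUITION * tuition


# Same GMAT score, tuition raised by one unit ($1000).
base = predicted_salary(gmat=675, tuition=75)
bumped = predicted_salary(gmat=675, tuition=76)

# Convert the change from $1000s to dollars.
change_in_dollars = (bumped - base) * 1000

print(f"prediction at tuition 75: {base:.3f} (in $1000s)")
print(f"change for +$1000 tuition: ${change_in_dollars:.2f}")
```

Because the model has no interaction term, the change is the same at any GMAT score; only the tuition coefficient enters the difference.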
Question
A study of the top MBA programs attempted to predict the average starting salary (in $1000's) of graduates of the program based on the amount of tuition (in $1000's) charged by the program and the average GMAT score of the program's students. The results of a regression analysis based on a sample of 75 MBA programs are shown below:

Least Squares Linear Regression of Salary

\[
\begin{array}{lrrrr}
\text{Predictor Variables} & \text{Coefficient} & \text{Std Error} & \text{T} & \text{P} \\
\text{Constant} & 169.910 & 26.5350 & 6.40 & 0.0000 \\
\text{Tuition} & -3.37373 & 0.81171 & -4.16 & 0.0001 \\
\text{TxT} & 0.03563 & 0.00590 & 6.03 & 0.0000
\end{array}
\]

\[
\begin{array}{lrlr}
\text{R-Squared} & 0.7361 & \text{Resid. Mean Square (MSE)} & 358.887 \\
\text{Adjusted R-Squared} & 0.7288 & \text{Standard Deviation} & 18.9443
\end{array}
\]

\[
\begin{array}{lrr}
\text{Source} & \text{DF} & \text{SS} \\
\text{Regression} & 2 & 72081.8 \\
\text{Residual} & 72 & 25839.8 \\
\text{Total} & 74 & 97921.7
\end{array}
\]

Cases Included 75    Missing Cases 0

One of the t-test statistics is shown on the printout to be the value t = 6.03. Interpret this value.

A) There is sufficient evidence, at α = 0.05, to indicate that at least one of the variables proposed in the interaction model is useful at predicting the average starting salary of graduates of MBA programs.
B) There is sufficient evidence, at α = 0.05, to indicate that there is a linear relationship between average starting salary of graduates of MBA programs and the tuition of the MBA program.
C) There is insufficient evidence, at α = 0.05, to indicate that at least one of the variables proposed in the interaction model is useful at predicting the average starting salary of graduates of MBA programs.
D) There is sufficient evidence, at α = 0.05, to indicate that there is a curvilinear relationship between average starting salary of graduates of MBA programs and the tuition of the MBA program.
Question
Consider the partial printout below.

\[
\begin{array}{lrrrrrr}
\hline
 & \text{Coefficients} & \text{Standard Error} & t\ \text{Stat} & \text{P-value} & \text{Lower 95\%} & \text{Upper 95\%} \\
\hline
\text{Intercept} & -63.14873931 & 25.09115112 & -2.516773304 & 0.045484943 & -124.5446192 & -1.752859365 \\
\text{X1} & 14.72507864 & 8.113581741 & 1.814867849 & 0.119466699 & -5.128155197 & 34.57831248 \\
\text{X2} & 12.48784546 & 4.686063743 & 2.664890224 & 0.037279879 & 1.021452165 & 23.95423875 \\
\text{X1X2} & -1.886935135 & 1.344999834 & -1.402925924 & 0.210210141 & -5.178033575 & 1.404163305 \\
\hline
\end{array}
\]

Is there evidence (at α = .05) that \( x_1 \) and \( x_2 \) interact? Explain.
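
One way to organize the reasoning for the interaction question is to recompute the X1X2 row of the printout. The sketch below assumes 6 residual degrees of freedom (not shown on the partial printout, but consistent with the width of the printed 95% limits), for which the two-sided critical value is about 2.447. Variable names are illustrative only.

```python
# X1X2 row copied from the printout above.
coef = -1.886935135
std_err = 1.344999834
p_value = 0.210210141
alpha = 0.05

# Assumption: 6 residual degrees of freedom, so t(.025) is about 2.447.
T_CRIT = 2.447

# Test statistic for H0: the interaction coefficient equals 0.
t_stat = coef / std_err

# 95% confidence limits for the interaction coefficient.
lower = coef - T_CRIT * std_err
upper = coef + T_CRIT * std_err

# The two-sided test rejects H0 only when the p-value is below alpha.
interaction_detected = p_value < alpha

print(round(t_stat, 3), round(lower, 3), round(upper, 3), interaction_detected)
```

A p-value above α and a confidence interval that contains 0 point the same way, so either can be cited in the explanation.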
Question

A) 4.2
B) 10.8
C) 11.4
D) 1.8
Question
It is dangerous to predict outside the range of the data collected in a regression analysis. For instance, we shouldn't predict the price of a 5000 square foot home if all our sample homes were smaller than 4500 square feet. Which of the following multiple regression pitfalls does this example describe?

A) Estimability
B) Multicollinearity
C) Stepwise Regression
D) Extrapolation
Question
A study of the top MBA programs attempted to predict the average starting salary (in $1000's) of graduates of the program based on the amount of tuition (in $1000's) charged by the program and the average GMAT score of the program's students. The results of a regression analysis based on a sample of 75 MBA programs are shown below:

Least Squares Linear Regression of Salary

\[
\begin{array}{lrrrrr}
\text{Predictor Variables} & \text{Coefficient} & \text{Std Error} & \text{T} & \text{P} & \text{VIF} \\
\text{Constant} & -203.402 & 51.6573 & -3.94 & 0.0002 & 0.0 \\
\text{Gmat} & 0.39412 & 0.09039 & 4.36 & 0.0000 & 2.0 \\
\text{Tuition} & 0.92012 & 0.17875 & 5.15 & 0.0000 & 2.0
\end{array}
\]

\[
\begin{array}{lrlr}
\text{R-Squared} & 0.6857 & \text{Resid. Mean Square (MSE)} & 427.511 \\
\text{Adjusted R-Squared} & 0.6769 & \text{Standard Deviation} & 20.6763
\end{array}
\]

\[
\begin{array}{lrrrrr}
\text{Source} & \text{DF} & \text{SS} & \text{MS} & \text{F} & \text{P} \\
\text{Regression} & 2 & 67140.9 & 33570.5 & 78.53 & 0.0000 \\
\text{Residual} & 72 & 30780.8 & 427.5 & & \\
\text{Total} & 74 & 97921.7 & & &
\end{array}
\]

A) At α = 0.05, there is insufficient evidence to indicate that something in the regression model is useful for predicting the average starting salary of the graduates of an MBA program.
B) We expect most of the average starting salaries to fall within $20,676 of their least squares predicted values.
C) We expect most of the average starting salaries to fall within $41,353 of their least squares predicted values.
D) We can explain 68.57% of the variation in the average starting salaries around their mean using the model that includes the average GMAT score and the tuition for the MBA program.
Question
Retail price data for n = 60 hard disk drives were recently reported in a computer magazine. Three variables were recorded for each hard disk drive:
\( y = \) Retail PRICE (measured in dollars)
\( x_1 = \) Microprocessor SPEED (measured in megahertz)
(Values in sample range from 10 to 40)
\( x_2 = \) CHIP size (measured in computer processing units)
(Values in sample range from 286 to 486)

A first-order regression model was fit to the data. Part of the printout follows:

Parameter Estimates

\[
\begin{array}{lrrrrr}
\text{VARIABLE} & \text{DF} & \text{PARAMETER ESTIMATE} & \text{STANDARD ERROR} & \text{T FOR H0: PARAMETER} = 0 & \text{PROB} > |T| \\
\text{INTERCEPT} & 1 & -373.526392 & 1258.1243396 & -0.297 & 0.7676 \\
\text{SPEED} & 1 & 104.838940 & 22.36298195 & 4.688 & 0.0001 \\
\text{CHIP} & 1 & 3.571850 & 3.89422935 & 0.917 & 0.3629
\end{array}
\]

Identify and interpret the estimate for the SPEED β-coefficient, \( \hat{\beta}_1 \).

A) \( \hat{\beta}_1 = 3.57 \); For every 1-megahertz increase in SPEED, we estimate PRICE to increase $3.57, holding CHIP fixed.
B) \( \hat{\beta}_1 = 105 \); For every 1-megahertz increase in SPEED, we estimate PRICE (y) to increase $105, holding CHIP fixed.
C) \( \hat{\beta}_1 = 105 \); For every $1 increase in PRICE, we estimate SPEED to increase 105 megahertz, holding CHIP fixed.
D) \( \hat{\beta}_1 = 3.57 \); For every $1 increase in PRICE, we estimate SPEED to increase by about 4 megahertz, holding CHIP fixed.
Question
The first-order model below was fit to a set of data. Explain how to determine if the constant variance assumption is satisfied.
Question
Twenty colleges each recommended one of its graduating seniors for a prestigious graduate fellowship. The process to determine which student will receive the fellowship includes several interviews. The gender of each student and his or her score on the first interview are shown below.

\[
\begin{array}{clc|clc}
\hline
\text{Student} & \text{Gender} & \text{Score} & \text{Student} & \text{Gender} & \text{Score} \\
\hline
1 & \text{Male} & 18 & 11 & \text{Female} & 17 \\
2 & \text{Female} & 17 & 12 & \text{Male} & 16 \\
3 & \text{Female} & 19 & 13 & \text{Male} & 16 \\
4 & \text{Female} & 16 & 14 & \text{Female} & 19 \\
5 & \text{Male} & 12 & 15 & \text{Female} & 16 \\
6 & \text{Female} & 15 & 16 & \text{Male} & 15 \\
7 & \text{Female} & 18 & 17 & \text{Female} & 12 \\
8 & \text{Male} & 16 & 18 & \text{Male} & 14 \\
9 & \text{Male} & 18 & 19 & \text{Female} & 16 \\
10 & \text{Female} & 20 & 20 & \text{Female} & 18 \\
\hline
\end{array}
\]

a. Suppose you want to use gender to model the score on the interview y. Create the appropriate number of dummy variables for gender and write the model.
b. Fit the model to the data.
c. Give the null hypothesis for testing whether gender is a useful predictor of the score y.
d. Conduct the test and give the appropriate conclusion. Use α = .05.
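
Parts a through d can be sketched numerically as follows. This is one possible setup, not the deck's official solution: it assumes a single dummy variable coded x = 1 for Male and 0 for Female (the opposite coding would flip the sign of the slope), and it compares the t statistic with the tabled two-sided critical value 2.101 for 18 degrees of freedom instead of computing a p-value. With one dummy predictor, the least squares intercept is the base-level mean and the slope is the difference in group means.

```python
from math import sqrt

# (gender, score) pairs copied from the table above, students 1 through 20.
data = [
    ("Male", 18), ("Female", 17), ("Female", 19), ("Female", 16), ("Male", 12),
    ("Female", 15), ("Female", 18), ("Male", 16), ("Male", 18), ("Female", 20),
    ("Female", 17), ("Male", 16), ("Male", 16), ("Female", 19), ("Female", 16),
    ("Male", 15), ("Female", 12), ("Male", 14), ("Female", 16), ("Female", 18),
]

# Assumed coding: x = 1 if Male, 0 if Female (Female is the base level).
male = [score for gender, score in data if gender == "Male"]
female = [score for gender, score in data if gender == "Female"]


def mean(values):
    return sum(values) / len(values)


# Model: E(y) = b0 + b1 * x
b0 = mean(female)               # mean score at the base level
b1 = mean(male) - mean(female)  # difference between the two group means

# Residual sum of squares: deviations of each score from its group mean.
sse = sum((s - mean(male)) ** 2 for s in male)
sse += sum((s - mean(female)) ** 2 for s in female)
df_error = len(data) - 2
s = sqrt(sse / df_error)

# Standard error of b1 and the t statistic for H0: beta1 = 0.
se_b1 = s * sqrt(1 / len(male) + 1 / len(female))
t_stat = b1 / se_b1

T_CRIT = 2.101  # two-sided critical value, alpha = .05, 18 df
reject_h0 = abs(t_stat) > T_CRIT

print(round(b0, 4), round(b1, 4), round(t_stat, 3), reject_h0)
```

The same fit could be obtained from any regression routine; the closed-form group means are used here only to keep the sketch free of external libraries.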
Question
Retail price data for n = 60 hard disk drives were recently reported in a computer magazine. Three variables were recorded for each hard disk drive:
\( y = \) Retail PRICE (measured in dollars)
\( x_1 = \) Microprocessor SPEED (measured in megahertz)
(Values in sample range from 10 to 40)
\( x_2 = \) CHIP size (measured in computer processing units)
(Values in sample range from 286 to 486)

A first-order regression model was fit to the data. Part of the printout follows:

Parameter Estimates

\[
\begin{array}{lrrrrr}
\text{VARIABLE} & \text{DF} & \text{PARAMETER ESTIMATE} & \text{STANDARD ERROR} & \text{T FOR H0: PARAMETER} = 0 & \text{PROB} > |T| \\
\text{INTERCEPT} & 1 & -373.526392 & 1258.1243396 & -0.297 & 0.7676 \\
\text{SPEED} & 1 & 104.838940 & 22.36298195 & 4.688 & 0.0001 \\
\text{CHIP} & 1 & 3.571850 & 3.89422935 & 0.917 & 0.3629
\end{array}
\]

Identify and interpret the estimate of \( \beta_2 \).
Question
A public health researcher wants to use regression to predict the sun safety knowledge of pre-school children. The researcher randomly sampled 35 preschoolers, assigned them to one of two groups, and then measured the following three variables:
SUNSCORE: \( y = \) Score on sun-safety comprehension test
READING: \( x_1 = \) Reading comprehension score
GROUP: \( x_2 = 1 \) if child received a Be Sun Safe demonstration, 0 if not

A regression model was fit and the following residual plot was observed.
[Residual plot; horizontal axis: predicted value of y]
Which of the following assumptions appears violated based on this plot?

A) The errors are normally distributed
B) The errors are independent
C) The mean of the errors is zero
D) The variance of the errors is constant
Question
Consider the partial printout for an interaction regression analysis of the relationship between a dependent variable \( y \) and two independent variables \( x_1 \) and \( x_2 \).
ANOVA
\[
\begin{array}{llllll}
\hline
 & \text{df} & \text{SS} & \text{MS} & F & \text{Significance F} \\
\hline
\text{Regression} & 3 & 3393.677324 & 1131.225775 & 9391.974782 & 2.11084\mathrm{E}-11 \\
\text{Residual} & 6 & 0.722675987 & 0.120445998 & & \\
\text{Total} & 9 & 3394.4 & & & \\
\hline
\end{array}
\]

\[
\begin{array}{lllllll}
 & \text{Coefficients} & \text{Standard Error} & t\ \text{Stat} & \text{P-value} & \text{Lower 95\%} & \text{Upper 95\%} \\
\hline
\text{Intercept} & 16.72197014 & 8.283997219 & 2.018587126 & 0.09007654 & -3.548255659 & 36.99219593 \\
\text{X1} & -3.037317759 & 2.678748705 & -1.133856921 & 0.300116382 & -9.591984506 & 3.517348987 \\
\text{X2} & -1.046522754 & 1.547132645 & -0.676427297 & 0.523973988 & -4.832222727 & 2.73917722 \\
\text{X1X2} & 4.071685147 & 0.444059933 & 9.169224345 & 9.47663\mathrm{E}-05 & 2.98510884 & 5.158261454
\end{array}
\]

a. Write the prediction equation for the interaction model.
b. Test the overall utility of the interaction model using the global \( F \)-test at \( \alpha = .05 \).
c. Test the hypothesis (at \( \alpha = .05 \)) that \( x_1 \) and \( x_2 \) interact positively.
d. Estimate the change in \( y \) for each additional 1-unit increase in \( x_1 \) when \( x_2 = 6 \).
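
Parts a and d can be checked numerically. The sketch below is one way to do that, hard-coding the coefficient estimates and mean squares from the printout; the names `predict` and `slope_x1` are illustrative and not part of the original question.

```python
# Coefficient estimates copied from the printout (intercept, x1, x2, x1*x2).
b0, b1, b2, b3 = 16.72197014, -3.037317759, -1.046522754, 4.071685147


def predict(x1, x2):
    """Prediction equation for the interaction model."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def slope_x1(x2):
    """Estimated change in y per 1-unit increase in x1, holding x2 fixed."""
    return b1 + b3 * x2


# Global F statistic recomputed from the ANOVA mean squares.
f_stat = 1131.225775 / 0.120445998

print(f"slope in x1 at x2 = 6: {slope_x1(6):.4f}")
print(f"global F: {f_stat:.2f}")
```

In an interaction model the slope in \( x_1 \) is not a single number; it is \( \hat\beta_1 + \hat\beta_3 x_2 \), which is why part d fixes \( x_2 = 6 \).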
Question
A study of the top MBA programs attempted to predict the average starting salary (in $1000's) of graduates of the program based on the amount of tuition (in $1000's) charged by the program and the average GMAT score of the program's students. The results of a regression analysis based on a sample of 75 MBA programs are shown below:
Least Squares Linear Regression of Salary
\[
\begin{array}{lccccc}
\text{Predictor Variables} & \text{Coefficient} & \text{Std Error} & \text{T} & \text{P} & \text{VIF} \\
\text{Constant} & -203.402 & 51.6573 & -3.94 & 0.0002 & 0.0 \\
\text{Gmat} & 0.39412 & 0.09039 & 4.36 & 0.0000 & 2.0 \\
\text{Tuition} & 0.92012 & 0.17875 & 5.15 & 0.0000 & 2.0
\end{array}
\]
The model was then used to create 95% confidence and prediction intervals for Y and for E(Y) when the tuition charged by the MBA program was $75,000 and the GMAT score was 675. The results are shown here:

95% confidence interval for E(Y): ($126,610, $136,640)
95% prediction interval for Y: ($90,113, $173,160)

Which of the following interpretations is correct if you want to use the model to estimate Y for a single MBA program?

A) We are 95% confident that the average of all starting salaries for graduates of all MBA programs that charge $75,000 in tuition and have an average GMAT score of 675 will fall between $126,610 and $136,640.
B) We are 95% confident that the average starting salary for graduates of a single MBA program that charges $75,000 in tuition and has an average GMAT score of 675 will fall between $126,610 and $136,640.
C) We are 95% confident that the average starting salary for graduates of a single MBA program that charges $75,000 in tuition and has an average GMAT score of 675 will fall between $90,113 and $173,160.
D) We are 95% confident that the average of all starting salaries for graduates of all MBA programs that charge $75,000 in tuition and have an average GMAT score of 675 will fall between $90,113 and $173,160.
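
Both intervals quoted above are built around the same point estimate; only their widths differ. The following sketch, which assumes the coefficients and interval endpoints exactly as printed (in $1000's), recomputes the fitted value and compares the two intervals. The helper names are illustrative.

```python
# Coefficients from the printout; salary and tuition are in $1000's.
b0, b_gmat, b_tuition = -203.402, 0.39412, 0.92012

# Point estimate at GMAT = 675 and tuition = 75 (i.e. $75,000).
y_hat = b0 + b_gmat * 675 + b_tuition * 75

ci_mean = (126.610, 136.640)   # 95% confidence interval for E(Y)
pi_single = (90.113, 173.160)  # 95% prediction interval for Y


def midpoint(interval):
    return (interval[0] + interval[1]) / 2


def width(interval):
    return interval[1] - interval[0]


print(f"fitted value: {y_hat:.3f}")
print(f"CI midpoint: {midpoint(ci_mean):.3f}, width: {width(ci_mean):.3f}")
print(f"PI midpoint: {midpoint(pi_single):.3f}, width: {width(pi_single):.3f}")
```

The prediction interval is wider because it must cover the variation of an individual program around the mean, in addition to the uncertainty in the estimated mean itself.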
Question
During its manufacture, a product is subjected to four different tests in sequential order. An efficiency expert claims that the fourth (and last) test is unnecessary since its results can be predicted based on the first three tests. To test this claim, multiple regression will be used to model Test4 score \( (y) \) as a function of Test1 score \( (x_1) \), Test2 score \( (x_2) \), and Test3 score \( (x_3) \). [Note: All test scores range from 200 to 800, with higher scores indicative of a higher quality product.] Consider the model:
\[ E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 \]
The first-order model was fit to the data for each of 12 units sampled from the production line. The results are summarized in the printout.
\[
\begin{array}{lrrrrr}
\text{SOURCE} & \text{DF} & \text{SS} & \text{MS} & \text{F VALUE} & \text{PROB} > F \\
\text{MODEL} & 3 & 151417 & 50472 & 18.16 & .0075 \\
\text{ERROR} & 8 & 22231 & 2779 & & \\
\text{TOTAL} & 11 & 173648 & & &
\end{array}
\]

\[
\begin{array}{llll}
\text{ROOT MSE} & 52.72 & \text{R-SQUARE} & 0.872 \\
\text{DEP MEAN} & 645.8 & \text{ADJ R-SQ} & 0.824
\end{array}
\]

\[
\begin{array}{lrrrr}
\text{VARIABLE} & \text{PARAMETER ESTIMATE} & \text{STANDARD ERROR} & \text{T FOR } H_0\text{: PARAMETER} = 0 & \text{PROB} > |T| \\
\text{INTERCEPT} & 11.98 & 80.50 & 0.15 & 0.885 \\
\text{X1 (TEST1)} & 0.2745 & 0.1111 & 2.47 & 0.039 \\
\text{X2 (TEST2)} & 0.3762 & 0.0986 & 3.82 & 0.005 \\
\text{X3 (TEST3)} & 0.3265 & 0.0808 & 4.04 & 0.004
\end{array}
\]

Suppose the 95% confidence interval for \( \beta_3 \) is \( (.15, .47) \). Which of the following statements is incorrect?

A) We are 95% confident that Test3 is a useful linear predictor of Test4 score, holding Test1 and Test2 fixed.
B) At \( \alpha = .05 \), there is insufficient evidence to reject \( H_0: \beta_3 = 0 \) in favor of \( H_a: \beta_3 \neq 0 \).
C) We are 95% confident that the increase in Test4 score for every 1-point increase in Test3 score falls between .15 and .47, holding Test1 and Test2 fixed.
D) We are 95% confident that the estimated slope for the Test4-Test3 line falls between .15 and .47, holding Test1 and Test2 fixed.
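
A 95% confidence interval and a two-sided test at \( \alpha = .05 \) lead to the same decision: the null value is rejected exactly when it lies outside the interval. A minimal sketch of that check, with an illustrative function name:

```python
def rejects_null(ci, null_value=0.0):
    """True when the null value lies outside the confidence interval."""
    lower, upper = ci
    return not (lower <= null_value <= upper)


ci_beta3 = (0.15, 0.47)  # 95% CI for beta_3 given in the question
print(rejects_null(ci_beta3))
```

Because the interval for \( \beta_3 \) sits entirely above zero, a statement claiming insufficient evidence against \( H_0: \beta_3 = 0 \) would contradict it.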
Question
The confidence interval for the mean E(y) is narrower than the prediction interval for y.
Question
In regression, it is desired to predict the dependent variable based on values of other related independent variables. Occasionally, there are relationships that exist between the independent variables. Which of the following multiple regression pitfalls does this example describe?

A) Multicollinearity
B) Extrapolation
C) Stepwise Regression
D) Estimability
Question

A) 1
B) 10
C) 16
D) 13
Question
A study of the top MBA programs attempted to predict the average starting salary (in $1000's) of graduates of the program based on the amount of tuition (in $1000's) charged by the program and the average GMAT score of the program's students. The results of a regression analysis based on a sample of 75 MBA programs are shown below:
Least Squares Linear Regression of Salary

\[
\begin{array}{lcccc}
\text{Predictor Variables} & \text{Coefficient} & \text{Std Error} & \text{T} & \text{P} \\
\text{Constant} & -687.851 & 165.406 & -4.16 & 0.0001 \\
\text{Tuition} & -11.3197 & 2.19724 & -5.15 & 0.0000 \\
\text{GMAT} & -0.96727 & 0.25535 & -3.79 & 0.0003 \\
\text{TxG} & 0.01850 & 0.00331 & 5.58 & 0.0000
\end{array}
\]

\[
\begin{array}{lccc}
\text{R-Squared} & 0.7816 & \text{Resid. Mean Square (MSE)} & 301.251 \\
\text{Adjusted R-Squared} & 0.7723 & \text{Standard Deviation} & 17.3566
\end{array}
\]

\[
\begin{array}{lccccc}
\text{Source} & \text{DF} & \text{SS} & \text{MS} & \text{F} & \text{P} \\
\text{Regression} & 3 & 76523.8 & 25510.9 & 84.68 & 0.0000 \\
\text{Residual} & 71 & 21388.8 & 301.3 & & \\
\text{Total} & 74 & 97921.7 & & &
\end{array}
\]

Cases Included 75, Missing Cases 0

One of the t-test statistics is shown on the printout to be the value \( t = 5.58 \). Interpret this value.

A) There is sufficient evidence, at \( \alpha = 0.05 \), to indicate that the interaction between average tuition and average GMAT score is a useful predictor of the average starting salary of graduates of MBA programs.
B) There is insufficient evidence, at \( \alpha = 0.05 \), to indicate that at least one of the variables proposed in the interaction model is useful at predicting the average starting salary of graduates of MBA programs.
C) There is sufficient evidence, at \( \alpha = 0.05 \), to indicate that at least one of the variables proposed in the interaction model is useful at predicting the average starting salary of graduates of MBA programs.
D) There is insufficient evidence, at \( \alpha = 0.05 \), to indicate that the interaction between average tuition and average GMAT score is a useful predictor of the average starting salary of graduates of MBA programs.
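
The statistic \( t = 5.58 \) belongs to the TxG row, so it tests the interaction term only. The sketch below recomputes it from the printed estimate and standard error. The critical value 1.994 is an assumed approximate table value for a two-sided test at \( \alpha = 0.05 \) with 71 degrees of freedom, not something taken from the printout.

```python
# TxG row of the printout.
estimate, std_error = 0.01850, 0.00331
t_stat = estimate / std_error

# Error degrees of freedom: n - (k + 1) with n = 75 programs, k = 3 terms.
df_error = 75 - (3 + 1)

# Assumption: approximate two-sided critical value for alpha = 0.05, 71 df.
t_crit = 1.994

significant = abs(t_stat) > t_crit
print(f"t = {t_stat:.2f}, df = {df_error}, significant: {significant}")
```

The small difference from the printed 5.58 comes from rounding in the displayed estimate and standard error.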
Question
A study of the top MBA programs attempted to predict the average starting salary (in $1000's) of graduates of the program based on the amount of tuition (in $1000's) charged by the program and the average GMAT score of the program's students. The results of a regression analysis based on a sample of 75 MBA programs are shown below:
Least Squares Linear Regression of Salary

\[
\begin{array}{lcccc}
\text{Predictor Variables} & \text{Coefficient} & \text{Std Error} & \text{T} & \text{P} \\
\text{Constant} & -687.851 & 165.406 & -4.16 & 0.0001 \\
\text{Tuition} & -11.3197 & 2.19724 & -5.15 & 0.0000 \\
\text{GMAT} & -0.96727 & 0.25535 & -3.79 & 0.0003 \\
\text{TxG} & 0.01850 & 0.00331 & 5.58 & 0.0000
\end{array}
\]

\[
\begin{array}{lccc}
\text{R-Squared} & 0.7816 & \text{Resid. Mean Square (MSE)} & 301.251 \\
\text{Adjusted R-Squared} & 0.7723 & \text{Standard Deviation} & 17.3566
\end{array}
\]

\[
\begin{array}{lccccc}
\text{Source} & \text{DF} & \text{SS} & \text{MS} & \text{F} & \text{P} \\
\text{Regression} & 3 & 76523.8 & 25510.9 & 84.68 & 0.0000 \\
\text{Residual} & 71 & 21388.8 & 301.3 & & \\
\text{Total} & 74 & 97921.7 & & &
\end{array}
\]

Cases Included 75, Missing Cases 0

The global F-test statistic is shown on the printout to be the value \( F = 84.68 \). Interpret this value.

A) There is sufficient evidence, at \( \alpha = 0.05 \), to indicate that at least one of the variables proposed in the interaction model is useful at predicting the average starting salary of graduates of MBA programs.
B) There is insufficient evidence, at \( \alpha = 0.05 \), to indicate that at least one of the variables proposed in the interaction model is useful at predicting the average starting salary of graduates of MBA programs.
C) There is insufficient evidence, at \( \alpha = 0.05 \), to indicate that the interaction between average tuition and average GMAT score is a useful predictor of the average starting salary of graduates of MBA programs.
D) There is sufficient evidence, at \( \alpha = 0.05 \), to indicate that the interaction between average tuition and average GMAT score is a useful predictor of the average starting salary of graduates of MBA programs.
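
The global F statistic tests all model terms at once, and it can be recovered from the printout in two ways: as a ratio of mean squares, or from \( R^2 \). The sketch below shows both, using only the printed values; the variable names are illustrative.

```python
n, k = 75, 3  # sample size and number of model terms (Tuition, GMAT, TxG)

# Route 1: ratio of mean squares from the printout.
ms_regression, mse = 25510.9, 301.251
f_from_ms = ms_regression / mse

# Route 2: the equivalent formula based on R-squared.
r2 = 0.7816
f_from_r2 = (r2 / k) / ((1 - r2) / (n - (k + 1)))

print(f"F from mean squares: {f_from_ms:.2f}")
print(f"F from R-squared:    {f_from_r2:.2f}")
```

The two routes agree up to rounding in the printed \( R^2 \).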
Question
A fast food chain test marketing a new sandwich chose 18 of its stores in one major metropolitan area. Nine of the stores were in malls and nine were free standing. The sandwich was offered at three different introductory prices. The table shows the number of new sandwiches sold at each location for each location type and price combination.

[Table: Number of New Sandwiches Sold, by location type and price]

a. Write a model for the mean number of sandwiches sold, \( E(y) \), assuming that the relationship between \( E(y) \) and price, \( x_1 \), is first-order.
b. Fit the model to the data.
c. Write the prediction equations for mall and free-standing stores.
d. Do the data provide sufficient evidence that the change in number of sandwiches sold with respect to price is different for mall and free-standing stores? Use \( \alpha = .01 \).
Question
The concessions manager at a beachside park recorded the high temperature, the number of people at the park, and the number of bottles of water sold for each of 12 consecutive Saturdays. The data are shown below.
\[
\begin{array}{ccc}
\hline
\text{Bottles Sold} & \text{Temperature } (^{\circ}\mathrm{F}) & \text{People} \\
\hline
341 & 73 & 1625 \\
425 & 79 & 2100 \\
457 & 80 & 2125 \\
485 & 80 & 2800 \\
469 & 81 & 2550 \\
395 & 82 & 1975 \\
511 & 83 & 2675 \\
549 & 83 & 2800 \\
543 & 85 & 2850 \\
537 & 88 & 2775 \\
621 & 89 & 2800 \\
897 & 91 & 3100 \\
\hline
\end{array}
\]
a. Fit the model \( E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 \) to the data, letting \( y \) represent the number of bottles of water sold, \( x_1 \) the temperature, and \( x_2 \) the number of people at the park.
b. Find the 95% confidence interval for the mean number of bottles of water sold when the temperature is \( 84^{\circ}\mathrm{F} \) and there are 2700 people at the park.
c. Find the 95% prediction interval for the number of bottles of water sold when the temperature is \( 84^{\circ}\mathrm{F} \) and there are 2700 people at the park.
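
Part a would normally be done with statistical software. As a rough sketch of what that software does, the code below fits the two-predictor model by least squares using centered sums of squares and cross-products. It covers only the fit, not the intervals in parts b and c, and the helper names are illustrative.

```python
# Data from the table: bottles sold (y), temperature (x1), people (x2).
bottles = [341, 425, 457, 485, 469, 395, 511, 549, 543, 537, 621, 897]
temp = [73, 79, 80, 80, 81, 82, 83, 83, 85, 88, 89, 91]
people = [1625, 2100, 2125, 2800, 2550, 1975, 2675, 2800, 2850, 2775, 2800, 3100]


def mean(values):
    return sum(values) / len(values)


def centered_cross(u, v):
    """Sum of (u - mean(u)) * (v - mean(v))."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


# Centered sums of squares and cross-products.
s11 = centered_cross(temp, temp)
s22 = centered_cross(people, people)
s12 = centered_cross(temp, people)
s1y = centered_cross(temp, bottles)
s2y = centered_cross(people, bottles)

# Solve the 2x2 normal equations for the slopes (Cramer's rule).
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s2y * s12) / det
b2 = (s2y * s11 - s1y * s12) / det
b0 = mean(bottles) - b1 * mean(temp) - b2 * mean(people)


def predict(t, p):
    return b0 + b1 * t + b2 * p


residuals = [y - predict(t, p) for y, t, p in zip(bottles, temp, people)]
print(f"y-hat = {b0:.3f} + {b1:.3f} x1 + {b2:.4f} x2")
print(f"point estimate at 84 F, 2700 people: {predict(84, 2700):.1f}")
```

Centering the predictors first keeps the arithmetic well conditioned; the fitted coefficients are the same as those from the uncentered normal equations.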
Question
In Hawaii, proceedings are under way to enable private citizens to own the property that their homes are built on. In prior years, only estates were permitted to own land, and homeowners leased the land from the estate. In order to comply with the new law, a large Hawaiian estate wants to use regression analysis to estimate the fair market value of the land. The following variables are proposed:
\( y = \) Sale price of property ($ thousands)
\( x_2 = 1 \) if property near Cove, 0 if not
Write a regression model relating the sale price of a property to the qualitative variable \( x_2 \). Interpret all the \( \beta \)s in the model.
Question
[Residual plot] Interpret the residual plot.
Question
A college admissions officer proposes to use regression to model a student's college GPA at graduation in terms of the following two variables:
\( x_1 = \) high school GPA
\( x_2 = \) SAT score
The admissions officer believes the relationship between college GPA and high school GPA is linear and the relationship between SAT score and college GPA is linear. She also believes that the relationship between college GPA and high school GPA depends on the student's SAT score. Write the regression model she should fit.
Question
A certain type of rare gem serves as a status symbol for many of its owners. In theory, for low prices, the demand decreases as the price of the gem increases. However, experts hypothesize that when the gem is valued at very high prices, the demand increases with price due to the status the owners believe they gain by obtaining the gem. Thus, the model proposed to best explain the demand for the gem by its price is the quadratic model
\[ E(y) = \beta_0 + \beta_1 x + \beta_2 x^2 \]
where \( y = \) Demand (in thousands) and \( x = \) Retail price per carat (dollars).
This model was fit to data collected for a sample of 12 rare gems. A portion of the printout is given below:
\[
\begin{array}{lrrrrr}
\text{SOURCE} & \text{DF} & \text{SS} & \text{MS} & \text{F} & \text{PR} > F \\
\text{Model} & 2 & 115145 & 57573 & 373 & .0001 \\
\text{Error} & 9 & 1388 & 154 & & \\
\text{TOTAL} & 11 & 116533 & & &
\end{array}
\]

\[
\begin{array}{llll}
\text{Root MSE} & 12.42 & \text{R-Square} & .988
\end{array}
\]

\[
\begin{array}{lrrrr}
\text{VARIABLES} & \text{PARAMETER ESTIMATES} & \text{STD. ERROR} & \text{T for } H_0\text{: PARAMETER} = 0 & \text{PR} > |T| \\
\text{INTERCEPT} & 286.42 & 9.66 & 29.64 & .0001 \\
X & -.31 & .06 & -5.14 & .0006 \\
X \cdot X & .000067 & .00007 & .95 & .3647
\end{array}
\]

Is there sufficient evidence to indicate the model is useful for predicting the demand for the gem? Use \( \alpha = .01 \).
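
Model usefulness is judged with the global F-test from the ANOVA table. The sketch below recomputes F and \( R^2 \) from the printed sums of squares and compares the printed p-value with \( \alpha \); the variable names are illustrative.

```python
# ANOVA entries from the printout.
ss_model, df_model = 115145, 2
ss_error, df_error = 1388, 9

f_stat = (ss_model / df_model) / (ss_error / df_error)
r2 = ss_model / (ss_model + ss_error)

p_value = 0.0001  # PR > F as printed
alpha = 0.01
model_useful = p_value < alpha

print(f"F = {f_stat:.1f}, R-squared = {r2:.3f}, reject H0: {model_useful}")
```

The recomputed F differs slightly from the printed 373 because the printout rounds the mean squares.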
Question
In any production process in which one or more workers are engaged in a variety of tasks, the total time spent in production varies as a function of the size of the workpool and the level of output of the various activities. In a large metropolitan department store, it is believed that the number of man-hours worked \( (y) \) per day by the clerical staff depends on the number of pieces of mail processed per day \( (x_1) \) and the number of checks cashed per day \( (x_2) \). Data collected for \( n = 20 \) working days were used to fit the model:
\[ E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 \]
A printout for the analysis follows:

Analysis of Variance
\[
\begin{array}{lrrrrr}
\text{SOURCE} & \text{DF} & \text{SS} & \text{MS} & \text{F VALUE} & \text{PROB} > F \\
\text{MODEL} & 2 & 7089.06512 & 3544.53256 & 13.267 & 0.0003 \\
\text{ERROR} & 17 & 4541.72142 & 267.16008 & & \\
\text{C TOTAL} & 19 & 11630.78654 & & &
\end{array}
\]

\[
\begin{array}{llll}
\text{ROOT MSE} & 16.34503 & \text{R-SQUARE} & 0.6095 \\
\text{DEP MEAN} & 93.92682 & \text{ADJ R-SQ} & 0.5636 \\
\text{C.V.} & 17.40188 & &
\end{array}
\]

Parameter Estimates
\[
\begin{array}{lrrrrr}
\text{VARIABLE} & \text{DF} & \text{PARAMETER ESTIMATE} & \text{STANDARD ERROR} & \text{T FOR } H_0\text{: PARAMETER} = 0 & \text{PROB} > |T| \\
\text{INTERCEPT} & 1 & 114.420972 & 18.68485744 & 6.124 & 0.0001 \\
\text{X1} & 1 & -0.007102 & 0.00171375 & -4.144 & 0.0007 \\
\text{X2} & 1 & 0.037290 & 0.02043937 & 1.824 & 0.0857
\end{array}
\]

\[
\begin{array}{rrrrrrrr}
\text{OBS} & \text{X1} & \text{X2} & \text{Actual Value} & \text{Predict Value} & \text{Residual} & \text{Lower 95\% CL Predict} & \text{Upper 95\% CL Predict} \\
1 & 7781 & 644 & 74.707 & 83.175 & -8.468 & 47.224 & 119.126 \\
\hline
\end{array}
\]

Test to determine if there is a positive linear relationship between the number of man-hours worked, \( y \), and the number of checks cashed per day, \( x_2 \). Use \( \alpha = .05 \).
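
The printout reports two-sided p-values, while this question asks for a one-sided (upper-tail) test of \( H_0: \beta_2 = 0 \) against \( H_a: \beta_2 > 0 \). A minimal sketch of the conversion, using only the X2 row of the printout:

```python
# X2 row of the parameter estimates.
estimate, std_error = 0.037290, 0.02043937
t_stat = estimate / std_error
p_two_sided = 0.0857  # PROB > |T| as printed

# For Ha: beta_2 > 0, halve the two-sided p-value when t is positive;
# otherwise the one-sided p-value is 1 minus that half.
if t_stat > 0:
    p_one_sided = p_two_sided / 2
else:
    p_one_sided = 1 - p_two_sided / 2

alpha = 0.05
reject_h0 = p_one_sided < alpha
print(f"t = {t_stat:.3f}, one-sided p = {p_one_sided:.4f}, reject H0: {reject_h0}")
```

Note that the two-sided p-value alone (.0857) would not reach significance at .05; the direction specified in the alternative is what changes the conclusion.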
Question
[Residual plot] Interpret the residual plot.
Question
The model \( E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 \) was used to relate \( E(y) \) to a single qualitative variable, where
\[
x_1 = \begin{cases} 1 & \text{if level 2} \\ 0 & \text{if not} \end{cases}
\quad
x_2 = \begin{cases} 1 & \text{if level 3} \\ 0 & \text{if not} \end{cases}
\quad
x_3 = \begin{cases} 1 & \text{if level 4} \\ 0 & \text{if not} \end{cases}
\quad
x_4 = \begin{cases} 1 & \text{if level 5} \\ 0 & \text{if not} \end{cases}
\]
This model was fit to \( n = 40 \) data points and the following result was obtained:
\[ \hat{y} = 14.5 + 3 x_1 - 4 x_2 + 10 x_3 + 8 x_4 \]
a. Use the least squares prediction equation to find the estimate of \( E(y) \) for each level of the qualitative variable.
b. Specify the null and alternative hypothesis you would use to test whether \( E(y) \) is the same for all levels of the independent variable.
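
With dummy variables coded this way, level 1 is the base level: its estimated mean is the intercept, and every other level adds its own coefficient to that. A short sketch of part a, with illustrative names:

```python
# Fitted equation: y-hat = 14.5 + 3*x1 - 4*x2 + 10*x3 + 8*x4
intercept = 14.5
dummy_coefs = {2: 3.0, 3: -4.0, 4: 10.0, 5: 8.0}  # level -> coefficient


def estimated_mean(level):
    """Estimated E(y) for a level; level 1 is the base (all dummies zero)."""
    return intercept + dummy_coefs.get(level, 0.0)


means = {level: estimated_mean(level) for level in range(1, 6)}
for level, value in means.items():
    print(f"level {level}: {value}")
```

Each dummy coefficient is therefore the estimated difference between that level's mean and the base-level mean.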
Question
Retail price data for \( n = 60 \) hard disk drives were recently reported in a computer magazine. Three variables were recorded for each hard disk drive:
\( y = \) Retail PRICE (measured in dollars)
\( x_1 = \) Microprocessor SPEED (measured in megahertz)
(Values in sample range from 10 to 40)
\( x_2 = \) CHIP size (measured in computer processing units)
(Values in sample range from 286 to 486)

A first-order regression model was fit to the data. Part of the printout follows:

\[
\begin{array}{rrrrrrrrr}
\hline
\text{OBS} & \text{SPEED} & \text{CHIP} & \text{Dep Var PRICE} & \text{Predict Value} & \text{Std Err Predict} & \text{Lower 95\% Predict} & \text{Upper 95\% Predict} & \text{Residual} \\
1 & 33 & 386 & 5099.0 & 4464.9 & 260.768 & 3942.7 & 4987.1 & 634.1 \\
\hline
\end{array}
\]

Interpret the 95% prediction interval for \( y \) when \( x_1 = 33 \) and \( x_2 = 386 \).
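
The columns of this printout row are linked by simple arithmetic, which can help when reading it. The sketch below, using only the printed values, recovers the residual and backs out the multiplier that the interval applies to the standard error; no critical value is assumed.

```python
# Observation 1 from the printout.
actual, predicted = 5099.0, 4464.9
se_predict = 260.768
lower, upper = 3942.7, 4987.1

residual = actual - predicted           # observed minus predicted price
half_width = (upper - lower) / 2        # distance from fitted value to a limit
multiplier = half_width / se_predict    # implied t-type multiplier

print(f"residual = {residual:.1f}")
print(f"half-width = {half_width:.1f}, implied multiplier = {multiplier:.4f}")
```

The implied multiplier is close to 2, which is what one would expect from a t critical value with 57 error degrees of freedom.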
Question
As part of a study at a large university, data were collected on \( n = 224 \) freshmen computer science (CS) majors in a particular year. The researchers were interested in modeling \( y \), a student's grade point average (GPA) after three semesters, as a function of the following independent variables (recorded at the time the students enrolled in the university):
\( x_1 = \) average high school grade in mathematics (HSM)
\( x_2 = \) average high school grade in science (HSS)
\( x_3 = \) average high school grade in English (HSE)
\( x_4 = \) SAT mathematics score (SATM)
\( x_5 = \) SAT verbal score (SATV)

A first-order model was fit to data with \( R_a^2 = .193 \).

Interpret the value of the adjusted coefficient of determination \( R_a^2 \).
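
The adjusted coefficient of determination penalizes \( R^2 \) for the number of terms in the model. As a sketch, the standard formula \( R_a^2 = 1 - \frac{n-1}{n-(k+1)}(1 - R^2) \) can be inverted to see what unadjusted \( R^2 \) is implied here. The unadjusted value is not given in the question, so the result below is a derived figure, not a reported one.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k predictors."""
    return 1 - (n - 1) / (n - (k + 1)) * (1 - r2)


def r2_from_adjusted(r2_adj, n, k):
    """Invert the adjustment to recover the unadjusted R-squared."""
    return 1 - (1 - r2_adj) * (n - (k + 1)) / (n - 1)


n, k = 224, 5  # 224 students, 5 predictors (HSM, HSS, HSE, SATM, SATV)
r2 = r2_from_adjusted(0.193, n, k)
print(f"implied unadjusted R-squared: {r2:.4f}")
```

With \( n \) this large relative to \( k \), the adjustment is small, so the two measures tell nearly the same story.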
Question
Consider the data given in the table below.
\[
\begin{array}{cc}
\hline
X & Y \\
\hline
1 & 4 \\
2 & 6 \\
2 & 5 \\
3 & 7 \\
4 & 7 \\
4 & 6 \\
5 & 4 \\
5 & 5 \\
6 & 3 \\
\hline
\end{array}
\]
a. Plot the data on a scattergram. Does a quadratic model seem to be a good fit for the data? Explain.
b. Use the method of least squares to find a quadratic prediction equation.
c. Graph the prediction equation on your scattergram.
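
Part b is usually done with software. The sketch below shows one way to carry out the least squares fit by hand in code: build the normal equations for the columns 1, x, x² and solve them by Gaussian elimination. The function names `solve` and `fit_quadratic` are illustrative.

```python
xs = [1, 2, 2, 3, 4, 4, 5, 5, 6]
ys = [4, 6, 5, 7, 7, 6, 4, 5, 3]


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_quadratic(xs, ys):
    """Least squares estimates (b0, b1, b2) for y = b0 + b1*x + b2*x^2."""
    rows = [[1.0, float(x), float(x) ** 2] for x in xs]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve(xtx, xty)


b0, b1, b2 = fit_quadratic(xs, ys)
residuals = [y - (b0 + b1 * x + b2 * x * x) for x, y in zip(xs, ys)]
print(f"y-hat = {b0:.4f} + {b1:.4f} x + {b2:.4f} x^2")
```

A negative estimate for the squared term corresponds to the rise-then-fall pattern in the table.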
Question
A certain type of rare gem serves as a status symbol for many of its owners. In theory, for low prices, the demand decreases as the price of the gem increases. However, experts hypothesize that when the gem is valued at very high prices, the demand increases with price due to the status the owners believe they gain by obtaining the gem. Thus, the model proposed to best explain the demand for the gem by its price is the quadratic model
\[ E(y) = \beta_0 + \beta_1 x + \beta_2 x^2 \]
where \( y = \) Demand (in thousands) and \( x = \) Retail price per carat (dollars). This model was fit to data collected for a sample of 12 rare gems. A portion of the printout is given below:

\[
\begin{array}{lrrrrr}
\text{SOURCE} & \text{DF} & \text{SS} & \text{MS} & \text{F} & \text{PR} > F \\
\text{Model} & 2 & 115145 & 57573 & 373 & .0001 \\
\text{Error} & 9 & 1388 & 154 & & \\
\text{TOTAL} & 11 & 116533 & & &
\end{array}
\]

\[
\begin{array}{llll}
\text{Root MSE} & 12.42 & \text{R-Square} & .988
\end{array}
\]

\[
\begin{array}{lrrrr}
\text{VARIABLES} & \text{PARAMETER ESTIMATES} & \text{STD. ERROR} & \text{T for } H_0\text{: PARAMETER} = 0 & \text{PR} > |T| \\
\text{INTERCEPT} & 286.42 & 9.66 & 29.64 & .0001 \\
X & -.31 & .06 & -5.14 & .0006 \\
X \cdot X & .000067 & .00007 & .95 & .3647
\end{array}
\]

Does the quadratic term contribute useful information for predicting the demand for the gem? Use \( \alpha = .10 \).
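
Whether the quadratic term helps is a test on its coefficient alone, so the relevant row of the printout is the one for the squared term. A minimal sketch using that row, with illustrative variable names:

```python
# Squared-term row of the printout.
estimate, std_error = 0.000067, 0.00007
t_stat = estimate / std_error

p_value = 0.3647  # PR > |T| as printed
alpha = 0.10
quadratic_term_useful = p_value < alpha

print(f"t = {t_stat:.2f}, p = {p_value}, reject H0: {quadratic_term_useful}")
```

A model can pass the global F-test while an individual term such as this one fails its own t-test; the two tests answer different questions.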


Question
[Printout] Is there evidence of multicollinearity in the printout? Explain.
Question
Why is the random error term \( \varepsilon \) added to a multiple regression model?
Question
A college admissions officer proposes to use regression to model a student's college GPA at graduation in terms of the following two variables:
\( x_1 = \) high school GPA
\( x_2 = \) SAT score
The admissions officer believes the relationship between college GPA and high school GPA is linear and the relationship between SAT score and college GPA is linear. She also believes that the relationship between college GPA and high school GPA depends on the student's SAT score. She proposes the regression model:
\[ E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 \]
Explain how to determine if the relationship between college GPA and SAT score depends on the high school GPA.
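
One way to see what "depends on" means in this model is to regroup the proposed equation by the SAT score term. This is a sketch of the algebra only, not a full answer:

```latex
\begin{aligned}
E(y) &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 \\
     &= (\beta_0 + \beta_1 x_1) + (\beta_2 + \beta_3 x_1)\, x_2
\end{aligned}
```

The slope of college GPA on SAT score is \( \beta_2 + \beta_3 x_1 \), which changes with high school GPA only when \( \beta_3 \neq 0 \). The question therefore comes down to a t-test of \( H_0: \beta_3 = 0 \) against \( H_a: \beta_3 \neq 0 \).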
Question
Consider the data given in the table below.
\[
\begin{array}{cc}
\hline
X & Y \\
\hline
1 & 7 \\
2 & 6 \\
2 & 5 \\
3 & 5 \\
3 & 4 \\
4 & 4 \\
4 & 3 \\
4 & 2 \\
5 & 4 \\
5 & 5 \\
6 & 6 \\
\hline
\end{array}
\]
Plot the data on a scattergram. Does a second-order model seem to be a good fit for the data? Explain.
Question
The table shows the profit y (in thousands of dollars) that a company made during a month when the price of its product was x dollars per unit.

\[
\begin{array}{cc}
\hline
\text{Profit, } y & \text{Price, } x \\
\hline
12 & 1.20 \\
17 & 1.25 \\
20 & 1.29 \\
21 & 1.30 \\
24 & 1.35 \\
26 & 1.39 \\
27 & 1.40 \\
23 & 1.45 \\
21 & 1.49 \\
20 & 1.50 \\
15 & 1.55 \\
11 & 1.59 \\
10 & 1.60 \\
5 & 1.65 \\
\hline
\end{array}
\]

a. Fit the model \( y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon \) to the data and give the least squares prediction equation.
b. Plot the fitted equation on a scattergram of the data.
c. Is there sufficient evidence of downward curvature in the relationship between profit and price? Use \( \alpha = .05 \).
Question
The model \( E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 \) was used to relate \( E(y) \) to a single qualitative variable. How many levels does the qualitative variable have?
Question
The table below shows data for \( n = 20 \) observations.

\[
\begin{array}{ccc}
\hline
y & x_1 & x_2 \\
\hline
18 & 3 & 8 \\
23 & 5 & 10 \\
15 & 2 & 7 \\
31 & 6 & 12 \\
24 & 4 & 9 \\
28 & 5 & 11 \\
17 & 2 & 7 \\
19 & 3 & 8 \\
30 & 7 & 10 \\
28 & 5 & 8 \\
14 & 3 & 6 \\
32 & 7 & 11 \\
17 & 2 & 8 \\
24 & 5 & 10 \\
26 & 6 & 11 \\
27 & 6 & 11 \\
21 & 3 & 6 \\
31 & 7 & 13 \\
19 & 2 & 8 \\
25 & 5 & 10 \\
\hline
\end{array}
\]

a. Use a first-order regression model to find a least squares prediction equation for the model.
b. Find a 95% confidence interval for the coefficient of \( x_1 \) in your model. Interpret the result.
c. Find a 95% confidence interval for the coefficient of \( x_2 \) in your model. Interpret the result.
d. Find \( R^2 \) and \( R_a^2 \) and interpret these values.
e. Test the null hypothesis \( H_0: \beta_1 = \beta_2 = 0 \) against the alternative hypothesis \( H_a: \) at least one \( \beta_i \neq 0 \). Use \( \alpha = .05 \). Interpret the result.
Question
A statistics professor gave three quizzes leading up to the first test in his class. The quiz grades and test grade for each of eight students are given in the table.

\[
\begin{array}{ccccc}
\text{Student} & \text{Test Grade} & \text{Quiz 1} & \text{Quiz 2} & \text{Quiz 3} \\
\hline
1 & 75 & 8 & 9 & 5 \\
2 & 89 & 10 & 7 & 6 \\
3 & 73 & 9 & 8 & 7 \\
4 & 91 & 8 & 7 & 10 \\
5 & 64 & 9 & 6 & 6 \\
6 & 78 & 8 & 7 & 6 \\
7 & 83 & 10 & 8 & 7 \\
8 & 71 & 9 & 4 & 6 \\
\hline
\end{array}
\]
The professor would like to use the data to find a first-order model that he might use to predict a student's grade on the first test using that student's grades on the first three quizzes.
a. Identify the dependent and independent variables for the model.
b. What is the least squares prediction equation?
c. Find the SSE and the estimator of \( \sigma^2 \) for the model.
Question
Retail price data for n=60n = 60 hard disk drives were recently reported in a computer magazine. Three variables were recorded for each hard disk drive:
y=y = Retail PRICE (measured in dollars)
x1=x _ { 1 } = Microprocessor SPEED (measured in megahertz)
(Values in sample range from 10 to 40 )
x2=x _ { 2 } = CHIP size (measured in computer processing units)
(Values in sample range from 286 to 486 )

A first-order regression model was fit to the data. Part of the printout follows:
Analysis of Variance
\[\begin{array}{lrrrrr}
\text{SOURCE} & \text{DF} & \text{SS} & \text{MS} & \text{F VALUE} & \text{PROB > F} \\
\text{MODEL} & 2 & 34593103.008 & 17296051.504 & 19.018 & 0.0001 \\
\text{ERROR} & 57 & 51840202.926 & 909477.24431 & & \\
\text{C TOTAL} & 59 & 86432305.933 & & &
\end{array}\]

\[\begin{array}{llll}
\text{ROOT MSE} & 953.66516 & \text{R-SQUARE} & 0.4002 \\
\text{DEP MEAN} & 3197.96667 & \text{ADJ R-SQ} & 0.3792 \\
\text{C.V.} & 29.82099 & &
\end{array}\]

Test to determine if the model is adequate for predicting the price of a computer. Use \(\alpha = .01\).
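The summary figures on this printout are all functions of the mean squares and sums of squares, so they can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only; the variable names are hypothetical and the inputs are copied from the ERROR and C TOTAL rows and the MS column.

```python
import math

n, k = 60, 2                      # sample size and number of predictors
ms_model = 17296051.504           # MS for MODEL from the printout
ms_error = 909477.24431           # MS for ERROR from the printout
ss_error = 51840202.926
ss_total = 86432305.933
dep_mean = 3197.96667
p_value = 0.0001                  # PROB > F from the printout
alpha = 0.01

f_stat = ms_model / ms_error                                   # global F statistic
r2 = 1 - ss_error / ss_total                                   # R-SQUARE
adj_r2 = 1 - (ss_error / (n - (k + 1))) / (ss_total / (n - 1)) # ADJ R-SQ
root_mse = math.sqrt(ms_error)                                 # ROOT MSE
cv = 100 * root_mse / dep_mean                                 # C.V.
model_is_useful = p_value < alpha                              # reject H0 when p < alpha

print(round(f_stat, 3), round(r2, 4), round(adj_r2, 4), round(root_mse, 3), round(cv, 3))
print("reject H0: beta1 = beta2 = 0 ?", model_is_useful)
```

The decision rule compares the printed p-value with \(\alpha\); the F statistic has 2 numerator and 57 denominator degrees of freedom.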
Question
The sum of squared errors (SSE) of a least squares regression model decreases when new terms are added to the model.
Question
An elections officer wants to model voter turnout (y) in a precinct as a function of type of election, national or state.

Write a model for mean voter turnout, E(y), as a function of type of election.

A) \(E(y) = \beta_0 + \beta_1 x\), where \(x = 1\) if national, 0 if state
B) \(E(y) = \beta_0 + \beta_1 x\), where \(x =\) voter turnout
C) \(E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2\), where \(x_1 = 1\) if national, 0 if not and \(x_2 = 1\) if state, 0 if not
D) \(E(y) = \beta_0 + \beta_1 x + \beta_2 x^2\), where \(x =\) voter turnout
Question
A) The model is statistically useful for predicting Test4 score.
B) The model is not statistically useful for predicting Test4 score.
C) The first three test scores are reliable predictors of Test4 score.
D) The first three test scores are poor predictors of Test4 score.
Question
In any production process in which one or more workers are engaged in a variety of tasks, the total time spent in production varies as a function of the size of the workpool and the level of output of the various activities. In a large metropolitan department store, it is believed that the number of man-hours worked \((y)\) per day by the clerical staff depends on the number of pieces of mail processed per day \((x_1)\) and the number of checks cashed per day \((x_2)\). Data collected for \(n = 20\) working days were used to fit the model:
\[E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2\]
A partial printout for the analysis follows:
\[\begin{array}{rrrrrrrr}
\hline
\text{OBS} & \text{X1} & \text{X2} & \text{Actual Value} & \text{Predict Value} & \text{Residual} & \text{Lower 95\% CL Predict} & \text{Upper 95\% CL Predict} \\
\hline
1 & 7781 & 644 & 74.707 & 83.175 & -8.468 & 47.224 & 119.126 \\
\hline
\end{array}\]

Interpret the 95% prediction interval for y shown on the printout.

A) We are 95% confident that the mean number of man-hours worked per day falls between 47.224 and 119.126 for all days in which 7,781 pieces of mail are processed and 644 checks are cashed.
B) We are 95% confident that the number of man-hours worked per day falls between 47.224 and 119.126.
C) We are 95% confident that between 47.224 and 119.126 man-hours will be worked during a single day in which 7,781 pieces of mail are processed and 644 checks are cashed.
D) We expect to predict number of man-hours worked per day to within an amount between 47.224 and 119.126 of the true value.
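The printed row can be sanity-checked with simple arithmetic: the residual is the actual value minus the predicted value, and a prediction interval of this kind is centered on the predicted value. The sketch below uses only the numbers on the printout; the variable names are hypothetical.

```python
actual, predicted = 74.707, 83.175   # Actual Value and Predict Value for OBS 1
lower, upper = 47.224, 119.126       # 95% prediction limits for OBS 1

residual = actual - predicted        # should reproduce the Residual column
midpoint = (lower + upper) / 2       # center of the prediction interval
half_width = (upper - lower) / 2     # margin around the predicted value

print("residual:", round(residual, 3))
print("interval midpoint:", round(midpoint, 3))
print("half-width:", round(half_width, 3))
```

The interval refers to a single day with \(x_1 = 7781\) and \(x_2 = 644\), which is why it is much wider than an interval for the mean would be.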
Question
Operations managers often use work sampling to estimate how much time workers spend on each operation. Work sampling, which involves observing workers at random points in time, was applied to the staff of the catalog sales department of a clothing manufacturer.
The department applied regression to the following data collected for 40 consecutive working days:
TIME: \(y =\) Time spent (in hours) taking telephone orders during the day
ORDERS: \(x_1 =\) Number of telephone orders received during the day
WEEK: \(x_2 = 1\) if weekday, 0 if Saturday or Sunday

Consider the complete second-order model:
\[E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \beta_3 x_2 + \beta_4 x_1 x_2 + \beta_5 x_1^2 x_2\]
Explain how to conduct a test to determine if a quadratic relationship between total order time and the number of orders taken is necessary in the regression model above. Specify the null and alternative hypotheses that are to be tested.
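A question like this is usually handled with a nested-model (partial) F-test: fit the complete model, fit the reduced model with the quadratic terms dropped (here that would mean testing \(H_0: \beta_2 = \beta_5 = 0\)), and compare the two SSE values. The sketch below shows the arithmetic only. The function name is hypothetical and the two SSE values are made-up placeholders, since the source gives no printout for this question.

```python
def nested_f(sse_reduced, sse_complete, n, k_complete, num_tested):
    """Partial F statistic for comparing a reduced model with a complete model.

    n           -- number of observations
    k_complete  -- number of predictors (betas excluding beta0) in the complete model
    num_tested  -- number of beta parameters set to zero under H0
    """
    numerator = (sse_reduced - sse_complete) / num_tested
    denominator = sse_complete / (n - (k_complete + 1))
    return numerator / denominator


# n = 40 days and k = 5 come from the question; the SSE values are hypothetical.
f_stat = nested_f(sse_reduced=150.0, sse_complete=120.0, n=40, k_complete=5, num_tested=2)
df_num, df_den = 2, 40 - (5 + 1)

print("F =", round(f_stat, 3), "with", df_num, "and", df_den, "degrees of freedom")
```

A large F (small p-value) would indicate that at least one of the quadratic terms contributes to the model.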
Question
A statistics professor gave three quizzes leading up to the first test in his class. The quiz grades and test grade for each of eight students are given in the table.

\[\begin{array}{ccccc}
\hline
\text{Student} & \text{Test Grade} & \text{Quiz 1} & \text{Quiz 2} & \text{Quiz 3} \\
\hline
1 & 75 & 8 & 9 & 5 \\
2 & 89 & 10 & 7 & 6 \\
3 & 73 & 9 & 8 & 7 \\
4 & 91 & 8 & 7 & 10 \\
5 & 64 & 9 & 6 & 6 \\
6 & 78 & 8 & 7 & 6 \\
7 & 83 & 10 & 8 & 7 \\
8 & 71 & 9 & 4 & 6 \\
\hline
\end{array}\]
The professor fit a first-order model to the data that he intends to use to predict a student's grade on the first test using that student's grades on the first three quizzes.

Test the null hypothesis \(H_0: \beta_1 = \beta_2 = \beta_3 = 0\) against the alternative hypothesis \(H_a\): at least one \(\beta_i \neq 0\). Use \(\alpha = .05\). Interpret the result.
Question
In stepwise regression, the probability of making one or more Type I or Type II errors is quite small.
Question
The printout shows the results of a first-order regression analysis relating the sales price \(y\) of a product to the time in hours \(x_1\) and the cost of raw materials \(x_2\) needed to make the product.
SUMMARY OUTPUT
\[\begin{array}{ll}
\hline
\text{Regression Statistics} & \\
\hline
\text{Multiple R} & 0.997578302 \\
\text{R Square} & 0.995162468 \\
\text{Adjusted R Square} & 0.990324936 \\
\text{Standard Error} & 1.185250723 \\
\text{Observations} & 5 \\
\hline
\end{array}\]

ANOVA
\[\begin{array}{llllll}
 & df & \text{SS} & \text{MS} & F & \text{Significance F} \\
\hline
\text{Regression} & 2 & 577.9903614 & 288.9952 & 205.717 & 0.004837532 \\
\text{Residual} & 2 & 2.809638554 & 1.404819 & & \\
\text{Total} & 4 & 580.8 & & & \\
\hline
\end{array}\]

\[\begin{array}{lllllll}
\hline
 & \text{Coefficients} & \text{Standard Error} & t\ \text{Stat} & P\text{-value} & \text{Lower 95\%} & \text{Upper 95\%} \\
\hline
\text{Intercept} & -26.48433735 & 3.674668773 & -7.20727 & 0.018713 & -42.29517198 & -10.67350271 \\
\text{Time} & -2.168674699 & 4.11406532 & -0.52714 & 0.650732 & -19.8700814 & 15.532732 \\
\text{Materials} & 8.142168675 & 1.094681583 & 7.437933 & 0.0176 & 3.432130693 & 12.85220666 \\
\hline
\end{array}\]

a. What is the least squares prediction equation?
b. Identify the SSE from the printout.
c. Find the estimator of \(\sigma^2\) for the model.
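Part c follows from the Residual row of the ANOVA table: the estimator is \(s^2 = \text{SSE} / (n - (k + 1))\). The short sketch below recomputes it, along with the Standard Error and Adjusted R Square entries, from the printed sums of squares. Variable names are hypothetical.

```python
import math

n, k = 5, 2                 # Observations and number of predictors (Time, Materials)
sse = 2.809638554           # Residual SS from the ANOVA table
ss_total = 580.8            # Total SS from the ANOVA table

df_error = n - (k + 1)      # error degrees of freedom
s2 = sse / df_error         # estimator of sigma^2 (the Residual MS)
s = math.sqrt(s2)           # the "Standard Error" in Regression Statistics
r2 = 1 - sse / ss_total     # R Square
adj_r2 = 1 - (1 - r2) * (n - 1) / df_error  # Adjusted R Square

print("s^2 =", round(s2, 6), " s =", round(s, 6))
print("R^2 =", round(r2, 6), " adjusted R^2 =", round(adj_r2, 6))
```

With only two error degrees of freedom, \(s^2\) is estimated from very little information, which is worth keeping in mind when reading the very high \(R^2\).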
Question
A collector of grandfather clocks believes that the price received for the clocks at an auction increases with the number of bidders, but at an increasing (rather than a constant) rate. Thus, the model proposed to best explain auction price (y, in dollars) by number of bidders (x) is the quadratic model \(E(y) = \beta_0 + \beta_1 x + \beta_2 x^2\).
This model was fit to data collected for a sample of 32 clocks sold at auction; the resulting estimate of \(\beta_1\) was \(-.31\).
Interpret this estimate of \(\beta_1\).

A) We estimate the auction price will increase $.31 for each additional bidder at the auction.
B) \(\beta_1\) is a shift parameter that has no practical interpretation.
C) We estimate the auction price will be -$.31 when there are no bidders at the auction.
D) We estimate the auction price will decrease $.31 for each additional bidder at the auction.
Question
The method of fitting first-order models is the same as that of fitting the simple straight-line model, i.e. the method of least squares.
Question
The concessions manager at a beachside park recorded the high temperature, the number of people at the park, and the number of bottles of water sold for each of 12 consecutive Saturdays. The data are shown below.

\[\begin{array}{ccc}
\hline
\text{Bottles Sold} & \text{Temperature } ({}^{\circ}\mathrm{F}) & \text{People} \\
\hline
341 & 73 & 1625 \\
425 & 79 & 2100 \\
457 & 80 & 2125 \\
485 & 80 & 2800 \\
469 & 81 & 2550 \\
395 & 82 & 1975 \\
511 & 83 & 2675 \\
549 & 83 & 2800 \\
543 & 85 & 2850 \\
537 & 88 & 2775 \\
621 & 89 & 2800 \\
897 & 91 & 3100 \\
\hline
\end{array}\]

a. Fit the model \(E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2\) to the data, letting \(y\) represent the number of bottles of water sold, \(x_1\) the temperature, and \(x_2\) the number of people at the park.
b. Identify at least two indicators of multicollinearity in the model.
c. Comment on the usefulness of the model to predict the number of bottles of water sold on a Saturday when the high temperature is \(103^{\circ}\mathrm{F}\) and there are 3500 people at the park.
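Two of the checks relevant to parts b and c need no model fitting at all: pairwise correlations among the predictors, and whether the new point lies inside the sampled range. The sketch below is a minimal plain-Python illustration; `pearson_r` and `inside_sampled_range` are hypothetical helper names, and correlation is only one of several possible multicollinearity indicators.

```python
import math

# Data copied from the table above.
temperature = [73, 79, 80, 80, 81, 82, 83, 83, 85, 88, 89, 91]
people = [1625, 2100, 2125, 2800, 2550, 1975, 2675, 2800, 2850, 2775, 2800, 3100]


def pearson_r(a, b):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    s_ab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    s_aa = sum((u - mean_a) ** 2 for u in a)
    s_bb = sum((v - mean_b) ** 2 for v in b)
    return s_ab / math.sqrt(s_aa * s_bb)


def inside_sampled_range(value, sample):
    """True when value lies between the smallest and largest sampled values."""
    return min(sample) <= value <= max(sample)


# The interaction term x1 * x2 is itself a column of the design matrix.
interaction = [t * p for t, p in zip(temperature, people)]

r_temp_people = pearson_r(temperature, people)
r_people_interaction = pearson_r(people, interaction)

# Part c: the requested prediction point.
temp_in_range = inside_sampled_range(103, temperature)
people_in_range = inside_sampled_range(3500, people)

print("r(temperature, people):", round(r_temp_people, 3))
print("r(people, temperature*people):", round(r_people_interaction, 3))
print("103 F inside sampled temperatures?", temp_in_range)
print("3500 people inside sampled attendance?", people_in_range)
```

A prediction at a point outside both sampled ranges is an extrapolation, so the fitted model gives no assurance about its accuracy there.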
1
Which of the following assumptions appears violated based on this plot?

A) The variance of the errors is constant
B) The errors are independent
C) The mean of the errors is zero
D) The errors are normally distributed
C
2
C
3
Consider the second-order model
B
4
What relationship between x and y is suggested by the scattergram?

A) a quadratic relationship with downward concavity
B) a linear relationship with negative slope
C) a linear relationship with positive slope
D) a quadratic relationship with upward concavity
5
A study of the top MBA programs attempted to predict the average starting salary (in $1000's) of graduates of the program based on the amount of tuition (in $1000's) charged by the program and the average GMAT score of the program's students. The results of a regression analysis based on a sample of 75 MBA programs are shown below:

Least Squares Linear Regression of Salary

\[\begin{array}{lrrrr}
\text{Predictor Variables} & \text{Coefficient} & \text{Std Error} & \text{T} & \text{P} \\
\text{Constant} & 169.910 & 26.5350 & 6.40 & 0.0000 \\
\text{Tuition} & -3.37373 & 0.81171 & -4.16 & 0.0001 \\
\text{TxT} & 0.03563 & 0.00590 & 6.03 & 0.0000
\end{array}\]

\[\begin{array}{lrlr}
\text{R-Squared} & 0.7361 & \text{Resid. Mean Square (MSE)} & 358.887 \\
\text{Adjusted R-Squared} & 0.7288 & \text{Standard Deviation} & 18.9443
\end{array}\]

\[\begin{array}{lrrrrr}
\text{Source} & \text{DF} & \text{SS} & \text{MS} & \text{F} & \text{P} \\
\text{Regression} & 2 & 72081.8 & 36040.9 & 100.42 & 0.0000 \\
\text{Residual} & 72 & 25839.8 & 358.9 & & \\
\text{Total} & 74 & 97921.7 & & &
\end{array}\]

Cases Included 75    Missing Cases 0

The global F-test statistic is shown on the printout to be the value \(F = 100.42\). Interpret this value.

A) There is insufficient evidence, at \(\alpha = 0.05\), to indicate that at least one of the variables proposed in the interaction model is useful at predicting the average starting salary of graduates of MBA programs.
B) There is sufficient evidence, at \(\alpha = 0.05\), to indicate that at least one of the variables proposed in the interaction model is useful at predicting the average starting salary of graduates of MBA programs.
C) There is sufficient evidence, at \(\alpha = 0.05\), to indicate that there is a linear relationship between average starting salary of graduates of MBA programs and the tuition of the MBA program.
D) There is sufficient evidence, at \(\alpha = 0.05\), to indicate that there is a curvilinear relationship between average starting salary of graduates of MBA programs and the tuition of the MBA program.
6
A study of the top MBA programs attempted to predict the average starting salary (in $1000's) of graduates of the program based on the amount of tuition (in $1000's) charged by the program and the average GMAT score of the program's students. The results of a regression analysis based on a sample of 75 MBA programs are shown below:

Least Squares Linear Regression of Salary

\[\begin{array}{lrrrrr}
\text{Predictor Variables} & \text{Coefficient} & \text{Std Error} & \text{T} & \text{P} & \text{VIF} \\
\text{Constant} & -203.402 & 51.6573 & -3.94 & 0.0002 & 0.0 \\
\text{Gmat} & 0.39412 & 0.09039 & 4.36 & 0.0000 & 2.0 \\
\text{Tuition} & 0.92012 & 0.17875 & 5.15 & 0.0000 & 2.0
\end{array}\]

\[\begin{array}{lrrrrr}
\text{Source} & \text{DF} & \text{SS} & \text{MS} & \text{F} & \text{P} \\
\text{Regression} & 2 & 67140.9 & 33570.5 & 78.53 & 0.0000 \\
\text{Residual} & 72 & 30780.8 & 427.5 & & \\
\text{Total} & 74 & 97921.7 & & &
\end{array}\]

Interpret the p-value for the global F-test shown on the printout.

A) At α = 0.05, there is sufficient evidence to indicate that something in the regression model is useful for predicting the average starting salary of the graduates of an MBA program.
B) At α = 0.05, there is insufficient evidence to indicate that the average GMAT score of the MBA program's students is useful for predicting the average starting salary of the graduates of an MBA program.
C) At α = 0.05, there is sufficient evidence to indicate that the average GMAT score of the MBA program's students is useful for predicting the average starting salary of the graduates of an MBA program.
D) At α = 0.05, there is insufficient evidence to indicate that something in the regression model is useful for predicting the average starting salary of the graduates of an MBA program.
7
8
A study of the top MBA programs attempted to predict the average starting salary (in $1000's) of graduates of the program based on the amount of tuition (in $1000's) charged by the program and the average GMAT score of the program's students. The results of a regression analysis based on a sample of 75 MBA programs are shown below:

Least Squares Linear Regression of Salary

\[\begin{array}{lrrrrr}
\text{Predictor Variables} & \text{Coefficient} & \text{Std Error} & \text{T} & \text{P} & \text{VIF} \\
\text{Constant} & -203.402 & 51.6573 & -3.94 & 0.0002 & 0.0 \\
\text{Gmat} & 0.39412 & 0.09039 & 4.36 & 0.0000 & 2.0
\end{array}\]

\[\begin{array}{lrlr}
\text{R-Squared} & 0.6857 & \text{Resid. Mean Square (MSE)} & 427.511 \\
\text{Adjusted R-Squared} & 0.6769 & \text{Standard Deviation} & 20.6763
\end{array}\]

Identify the test statistic that should be used to test to determine if the amount of tuition charged by a program is a useful predictor of the average starting salary of the graduates of the program.

A) \(t = 5.15\)
B) \(t = 20.67\)
C) \(t = -3.94\)
D) \(t = 4.36\)
9
A public health researcher wants to use regression to predict the sun safety knowledge of pre-school children. The researcher randomly sampled 35 preschoolers, assigned them to one of two groups, and then measured the following three variables:
SUNSCORE: \(y =\) Score on sun-safety comprehension test
READING: \(x_1 =\) Reading comprehension score
GROUP: \(x_2 = 1\) if child received a Be Sun Safe demonstration, 0 if not

The following two models were hypothesized:
Model 1: \(E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \beta_3 x_2 + \beta_4 x_1 x_2 + \beta_5 x_1^2 x_2\)
Model 2: \(E(y) = \beta_0 + \beta_1 x_1 + \beta_3 x_2 + \beta_4 x_1 x_2\)

A partial F-test was conducted to compare the two models and the resulting p-value was found to be 0.0023. Fill in the blank. The results lead us to conclude that there is _____ (at \(\alpha = 0.05\)).

A) insufficient evidence of a quadratic relationship between sun-safety score and reading score.
B) sufficient evidence of a statistically useful model for sun-safety score.
C) sufficient evidence of interaction between sun-safety score and reading score.
D) sufficient evidence of a quadratic relationship between sun-safety score and reading score.
10
We decide to conduct a multiple regression analysis to predict the attendance at a major league baseball game. We use the size of the stadium as a quantitative independent variable and the type of game as a qualitative variable (with two levels - day game or night game). We hypothesize the following model:
\[E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3\]
where \(x_1 =\) size of the stadium
\(x_2 = 1\) if a day game, 0 if a night game

A plot of the \(y\)-\(x_1\) relationship would show:

A) Two non-parallel curves
B) Two parallel lines
C) Two parallel curves
D) Two non-parallel lines
11
Which equation represents a complete second-order model for two quantitative independent variables?

A) \(E(y) = \beta_0 + \beta_1 x_1^2 + \beta_2 x_2^2 + \beta_3 x_1^2 x_2 + \beta_4 x_1 x_2^2 + \beta_5 x_1^2 x_2^2\)
B) \(E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \beta_4 x_1^2 + \beta_5 x_2^2\)
C) \(E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2\)
D) \(E(y) = \beta_0 + \beta_1 x_1 x_2 + \beta_2 x_1^2 + \beta_3 x_2^2\)
12
13
A) 11
B) .9286
C) 5.5
D) .9405
14
Which of the following is not a possible indicator of multicollinearity?

A) significant correlations between pairs of independent variables
B) non-significant t-tests for individual β parameters when the F-test for overall model adequacy is significant
C) signs opposite from what is expected in the estimated β parameters
D) non-random patterns in the plot of the residuals versus the fitted values
15
A study of the top MBA programs attempted to predict the average starting salary (in $1000's) of graduates of the program based on the amount of tuition (in $1000's) charged by the program and the average GMAT score of the program's students. The results of a regression analysis based on a sample of 75 MBA programs are shown below:

Least Squares Linear Regression of Salary

The model was then used to create 95% confidence and prediction intervals for y and for E(Y) when the tuition charged by the MBA program was $75,000 and the GMAT score was 675. The results are shown here:

95% confidence interval for E(Y): ($126,610, $136,640)
95% prediction interval for Y: ($90,113, $173,160)

Which of the following interpretations is correct if you want to use the model to estimate E(Y) for all MBA programs?

A) We are 95% confident that the average starting salary for graduates of a single MBA program that charges $75,000 in tuition and has an average GMAT score of 675 will fall between $90,113 and $173,160.
B) We are 95% confident that the average starting salary for graduates of a single MBA program that charges $75,000 in tuition and has an average GMAT score of 675 will fall between $126,610 and $136,640.
C) We are 95% confident that the average of all starting salaries for graduates of all MBA programs that charge $75,000 in tuition and have an average GMAT score of 675 will fall between $126,610 and $136,640.
D) We are 95% confident that the average of all starting salaries for graduates of all MBA programs that charge $75,000 in tuition and have an average GMAT score of 675 will fall between $90,113 and $173,160.
16
17
18
A study of the top MBA programs attempted to predict the average starting salary (in $1000's) of graduates of the program based on the amount of tuition (in $1000's) charged by the program and the average GMAT score of the program's students. The results of a regression analysis based on a sample of 75 MBA programs are shown below:

Least Squares Linear Regression of Salary

\[\begin{array}{lrrrrr}
\text{Predictor Variables} & \text{Coefficient} & \text{Std Error} & \text{T} & \text{P} & \text{VIF} \\
\text{Constant} & -203.402 & 51.6573 & -3.94 & 0.0002 & 0.0 \\
\text{Gmat} & 0.39412 & 0.09039 & 4.36 & 0.0000 & 2.0 \\
\text{Tuition} & 0.92012 & 0.17875 & 5.15 & 0.0000 & 2.0
\end{array}\]

\[\begin{array}{lrlr}
\text{R-Squared} & 0.6857 & \text{Resid. Mean Square (MSE)} & 427.511 \\
\text{Adjusted R-Squared} & 0.6769 & \text{Standard Deviation} & 20.6763
\end{array}\]

Interpret the coefficient for the tuition variable shown on the printout.

A) For every $1000 increase in the tuition charged by the MBA program, we estimate that the average starting salary will increase by $920.12, holding the GMAT score constant
B) For every $1000 increase in the tuition charged by the MBA program, we estimate that the average starting salary will increase by $394.12, holding the GMAT score constant
C) For every $1000 increase in the tuition charged by the MBA program, we estimate that the average starting salary will decrease by $203,402, holding the GMAT score constant.
D) For every $1000 increase in the average starting salary, we estimate that the tuition charged by the MBA program will increase by $920.12.
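The meaning of a slope in a first-order model can be checked numerically: raise one predictor by one unit, hold the other fixed, and look at the change in the prediction. The sketch below uses the coefficients from the printout; the function name and the particular GMAT and tuition values are hypothetical choices made only for illustration.

```python
# Coefficients and standard errors copied from the printout (salary and tuition in $1000s).
b0, b_gmat, b_tuition = -203.402, 0.39412, 0.92012
se_gmat, se_tuition = 0.09039, 0.17875


def predicted_salary(gmat, tuition):
    """Predicted average starting salary, in thousands of dollars."""
    return b0 + b_gmat * gmat + b_tuition * tuition


gmat = 650  # hypothetical value, held constant
change = predicted_salary(gmat, 51) - predicted_salary(gmat, 50)  # tuition up by 1 ($1000)
change_in_dollars = change * 1000

# Each T entry on the printout is the coefficient divided by its standard error.
t_gmat = b_gmat / se_gmat
t_tuition = b_tuition / se_tuition

print("change in predicted salary per $1000 of tuition: $", round(change_in_dollars, 2))
print("t for Gmat:", round(t_gmat, 2), " t for Tuition:", round(t_tuition, 2))
```

Because both variables are recorded in thousands of dollars, a one-unit change in tuition is $1000 and the resulting change in salary must be multiplied by 1000 to express it in dollars.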
19
A study of the top MBA programs attempted to predict the average starting salary (in $1000's) of graduates of the program based on the amount of tuition (in $1000's) charged by the program and the average GMAT score of the program's students. The results of a regression analysis based on a sample of 75 MBA programs are shown below:

Least Squares Linear Regression of Salary

\[\begin{array}{lrrrr}
\text{Predictor Variables} & \text{Coefficient} & \text{Std Error} & \text{T} & \text{P} \\
\text{Constant} & 169.910 & 26.5350 & 6.40 & 0.0000 \\
\text{Tuition} & -3.37373 & 0.81171 & -4.16 & 0.0001 \\
\text{TxT} & 0.03563 & 0.00590 & 6.03 & 0.0000
\end{array}\]

\[\begin{array}{lrlr}
\text{R-Squared} & 0.7361 & \text{Resid. Mean Square (MSE)} & 358.887 \\
\text{Adjusted R-Squared} & 0.7288 & \text{Standard Deviation} & 18.9443
\end{array}\]

\[\begin{array}{lrr}
\text{Source} & \text{DF} & \text{SS} \\
\text{Regression} & 2 & 72081.8 \\
\text{Residual} & 72 & 25839.8 \\
\text{Total} & 74 & 97921.7
\end{array}\]

Cases Included 75    Missing Cases 0

One of the t-test statistics is shown on the printout to be the value \(t = 6.03\). Interpret this value.

A) There is sufficient evidence, at \(\alpha = 0.05\), to indicate that at least one of the variables proposed in the interaction model is useful at predicting the average starting salary of graduates of MBA programs.
B) There is sufficient evidence, at \(\alpha = 0.05\), to indicate that there is a linear relationship between average starting salary of graduates of MBA programs and the tuition of the MBA program.
C) There is insufficient evidence, at \(\alpha = 0.05\), to indicate that at least one of the variables proposed in the interaction model is useful at predicting the average starting salary of graduates of MBA programs.
D) There is sufficient evidence, at \(\alpha = 0.05\), to indicate that there is a curvilinear relationship between average starting salary of graduates of MBA programs and the tuition of the MBA program.
20
21
Consider the partial printout below.

\[\begin{array}{lrrrrrr}
\hline
 & \text{Coefficients} & \text{Standard Error} & t\ \text{Stat} & P\text{-value} & \text{Lower 95\%} & \text{Upper 95\%} \\
\hline
\text{Intercept} & -63.14873931 & 25.09115112 & -2.516773304 & 0.045484943 & -124.5446192 & -1.752859365 \\
\text{X1} & 14.72507864 & 8.113581741 & 1.814867849 & 0.119466699 & -5.128155197 & 34.57831248 \\
\text{X2} & 12.48784546 & 4.686063743 & 2.664890224 & 0.037279879 & 1.021452165 & 23.95423875 \\
\text{X1X2} & -1.886935135 & 1.344999834 & -1.402925924 & 0.210210141 & -5.178033575 & 1.404163305 \\
\hline
\end{array}\]

Is there evidence (at \(\alpha = .05\)) that \(x_1\) and \(x_2\) interact? Explain.
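The interaction question comes down to the X1X2 row: its t statistic, its p-value, and whether its 95% interval covers zero. The sketch below recomputes those pieces from the printed numbers. Variable names are hypothetical, and the critical value is backed out of the printed interval rather than looked up in a table.

```python
# X1X2 row of the printout.
coef = -1.886935135
std_error = 1.344999834
p_value = 0.210210141
lower, upper = -5.178033575, 1.404163305
alpha = 0.05

t_stat = coef / std_error                   # reproduces the t Stat column
t_critical = (upper - coef) / std_error     # multiplier implied by the printed 95% interval
reject_h0 = p_value < alpha                 # H0: the interaction coefficient is zero
interval_contains_zero = lower < 0 < upper  # equivalent check at the same alpha

print("t =", round(t_stat, 4), " implied critical value =", round(t_critical, 4))
print("reject H0 at alpha = .05?", reject_h0)
print("95% interval contains zero?", interval_contains_zero)
```

The p-value comparison and the confidence-interval check are two views of the same two-sided test, so they lead to the same decision.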
22
A) 4.2
B) 10.8
C) 11.4
D) 1.8
23
It is dangerous to predict outside the range of the data collected in a regression analysis. For instance, we shouldn't predict the price of a 5000 square foot home if all our sample homes were smaller than 4500 square feet. Which of the following multiple regression pitfalls does this example describe?

A) Estimability
B) Multicollinearity
C) Stepwise Regression
D) Extrapolation
24
A study of the top MBA programs attempted to predict the average starting salary (in $1000's) of graduates of the program based on the amount of tuition (in $1000's) charged by the program and the average GMAT score of the program's students. The results of a regression analysis based on a sample of 75 MBA programs are shown below:

Least Squares Linear Regression of Salary

\[
\begin{array}{lrrrrr}
\text{Predictor Variables} & \text{Coefficient} & \text{Std Error} & \text{T} & \text{P} & \text{VIF} \\
\text{Constant} & -203.402 & 51.6573 & -3.94 & 0.0002 & 0.0 \\
\text{Gmat} & 0.39412 & 0.09039 & 4.36 & 0.0000 & 2.0 \\
\text{Tuition} & 0.92012 & 0.17875 & 5.15 & 0.0000 & 2.0
\end{array}
\]

\[
\begin{array}{lrlr}
\text{R-Squared} & 0.6857 & \text{Resid. Mean Square (MSE)} & 427.511 \\
\text{Adjusted R-Squared} & 0.6769 & \text{Standard Deviation} & 20.6763
\end{array}
\]

\[
\begin{array}{lrrrrr}
\text{Source} & \text{DF} & \text{SS} & \text{MS} & \text{F} & \text{P} \\
\text{Regression} & 2 & 67140.9 & 33570.5 & 78.53 & 0.0000 \\
\text{Residual} & 72 & 30780.8 & 427.5 & & \\
\text{Total} & 74 & 97921.7 & & &
\end{array}
\]

A) At \(\alpha = 0.05\), there is insufficient evidence to indicate that something in the regression model is useful for predicting the average starting salary of the graduates of an MBA program.
B) We expect most of the average starting salaries to fall within $20,676 of their least squares predicted values.
C) We expect most of the average starting salaries to fall within $41,353 of their least squares predicted values.
D) We can explain 68.57% of the variation in the average starting salaries around their mean using the model that includes the average GMAT score and the tuition for the MBA program.
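The summary figures quoted in the options can be traced back to the ANOVA table. The short Python sketch below recomputes them from the printed values; the variable names are illustrative, and the "within 2s" reading is the usual rule of thumb rather than something stated on the printout.

```python
import math

# ANOVA values copied from the printout above (salary is in $1000s).
ss_regression = 67140.9
ss_total = 97921.7
mse = 427.511

# R-squared: share of the variation in salary explained by the model.
r_squared = ss_regression / ss_total

# s = sqrt(MSE) estimates the standard deviation of the errors;
# roughly 95% of observations are expected within 2s of their predictions.
s = math.sqrt(mse)
two_s_dollars = 2 * s * 1000  # convert from $1000s to dollars

print(f"R^2 = {r_squared:.4f}, s = {s:.4f}, 2s = ${two_s_dollars:,.0f}")
```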
25
Retail price data for \(n = 60\) hard disk drives were recently reported in a computer magazine. Three variables were recorded for each hard disk drive:
\(y =\) Retail PRICE (measured in dollars)
\(x_1 =\) Microprocessor SPEED (measured in megahertz)
(Values in sample range from 10 to 40)
\(x_2 =\) CHIP size (measured in computer processing units)
(Values in sample range from 286 to 486)

A first-order regression model was fit to the data. Part of the printout follows:

Parameter Estimates

\[
\begin{array}{lrrrrr}
\text{VARIABLE} & \text{DF} & \text{PARAMETER ESTIMATE} & \text{STANDARD ERROR} & \text{T FOR H0: PARAMETER} = 0 & \text{PROB} > |T| \\
\text{INTERCEPT} & 1 & -373.526392 & 1258.1243396 & -0.297 & 0.7676 \\
\text{SPEED} & 1 & 104.838940 & 22.36298195 & 4.688 & 0.0001 \\
\text{CHIP} & 1 & 3.571850 & 3.89422935 & 0.917 & 0.3629
\end{array}
\]

Identify and interpret the estimate for the SPEED \(\beta\)-coefficient, \(\hat{\beta}_1\).

A) \(\hat{\beta}_1 = 3.57\); For every 1-megahertz increase in SPEED, we estimate PRICE to increase $3.57, holding CHIP fixed.
B) \(\hat{\beta}_1 = 105\); For every 1-megahertz increase in SPEED, we estimate PRICE (y) to increase $105, holding CHIP fixed.
C) \(\hat{\beta}_1 = 105\); For every $1 increase in PRICE, we estimate SPEED to increase 105 megahertz, holding CHIP fixed.
D) \(\hat{\beta}_1 = 3.57\); For every $1 increase in PRICE, we estimate SPEED to increase by about 4 megahertz, holding CHIP fixed.
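What "holding CHIP fixed" means for a slope can be seen numerically. The sketch below plugs the printed estimates into a small, hypothetical helper `predicted_price` and compares two predictions that differ by 1 MHz in SPEED only; the chosen SPEED and CHIP values are arbitrary points inside the sampled ranges.

```python
# Estimates copied from the printout above.
B0, B1, B2 = -373.526392, 104.838940, 3.571850

def predicted_price(speed_mhz: float, chip: float) -> float:
    """Least squares prediction of retail PRICE in dollars."""
    return B0 + B1 * speed_mhz + B2 * chip

# Hypothetical values chosen inside the sampled ranges (10-40 MHz, 286-486).
chip = 386
change = predicted_price(26, chip) - predicted_price(25, chip)
print(f"Estimated change in PRICE per 1-MHz increase: ${change:.2f}")
```

In a first-order model the difference does not depend on which CHIP value or starting SPEED is used, which is why the slope can be interpreted on its own.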
26
The first-order model below was fit to a set of data. Explain how to determine if the constant variance assumption is satisfied.
27
Twenty colleges each recommended one of its graduating seniors for a prestigious graduate fellowship. The process to determine which student will receive the fellowship includes several interviews. The gender of each student and his or her score on the first interview are shown below.

\[
\begin{array}{clc}
\hline
\text{Student} & \text{Gender} & \text{Score} \\
\hline
1 & \text{Male} & 18 \\
2 & \text{Female} & 17 \\
3 & \text{Female} & 19 \\
4 & \text{Female} & 16 \\
5 & \text{Male} & 12 \\
6 & \text{Female} & 15 \\
7 & \text{Female} & 18 \\
8 & \text{Male} & 16 \\
9 & \text{Male} & 18 \\
10 & \text{Female} & 20
\end{array}
\]

\[
\begin{array}{clc}
\hline
\text{Student} & \text{Gender} & \text{Score} \\
\hline
11 & \text{Female} & 17 \\
12 & \text{Male} & 16 \\
13 & \text{Male} & 16 \\
14 & \text{Female} & 19 \\
15 & \text{Female} & 16 \\
16 & \text{Male} & 15 \\
17 & \text{Female} & 12 \\
18 & \text{Male} & 14 \\
19 & \text{Female} & 16 \\
20 & \text{Female} & 18
\end{array}
\]

a. Suppose you want to use gender to model the score on the interview \(y\). Create the appropriate number of dummy variables for gender and write the model.
b. Fit the model to the data.
c. Give the null hypothesis for testing whether gender is a useful predictor of the score \(y\).
d. Conduct the test and give the appropriate conclusion. Use \(\alpha = .05\).
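A minimal sketch of parts a through d in Python follows. It assumes one particular coding, x = 1 if female and 0 if male, which the question leaves open; with the opposite coding the sign of the slope flips but the test is unchanged. The critical value is the standard tabled t value for 18 degrees of freedom.

```python
import math

# (gender, score) pairs copied from the table above, students 1-20.
data = [("M", 18), ("F", 17), ("F", 19), ("F", 16), ("M", 12),
        ("F", 15), ("F", 18), ("M", 16), ("M", 18), ("F", 20),
        ("F", 17), ("M", 16), ("M", 16), ("F", 19), ("F", 16),
        ("M", 15), ("F", 12), ("M", 14), ("F", 16), ("F", 18)]

# Assumed coding: x = 1 if female, 0 if male (male is the base level).
male = [s for g, s in data if g == "M"]
female = [s for g, s in data if g == "F"]

# With a single dummy variable, least squares reproduces the group means.
b0 = sum(male) / len(male)               # mean score for the base level
b1 = sum(female) / len(female) - b0      # female mean minus male mean

# Pooled error variance on n - 2 degrees of freedom.
sse = sum((s - b0) ** 2 for s in male) + sum((s - (b0 + b1)) ** 2 for s in female)
df = len(data) - 2
s = math.sqrt(sse / df)

# t statistic for H0: beta1 = 0.
se_b1 = s * math.sqrt(1 / len(male) + 1 / len(female))
t_stat = b1 / se_b1

T_CRIT = 2.101  # tabled t value for alpha/2 = .025 with 18 df
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, t = {t_stat:.3f}, reject H0: {abs(t_stat) > T_CRIT}")
```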
28
Retail price data for \(n = 60\) hard disk drives were recently reported in a computer magazine. Three variables were recorded for each hard disk drive:
\(y =\) Retail PRICE (measured in dollars)
\(x_1 =\) Microprocessor SPEED (measured in megahertz)
(Values in sample range from 10 to 40)
\(x_2 =\) CHIP size (measured in computer processing units)
(Values in sample range from 286 to 486)
A first-order regression model was fit to the data. Part of the printout follows:

Parameter Estimates

\[
\begin{array}{lrrrrr}
\text{VARIABLE} & \text{DF} & \text{PARAMETER ESTIMATE} & \text{STANDARD ERROR} & \text{T FOR H0: PARAMETER} = 0 & \text{PROB} > |T| \\
\text{INTERCEPT} & 1 & -373.526392 & 1258.1243396 & -0.297 & 0.7676 \\
\text{SPEED} & 1 & 104.838940 & 22.36298195 & 4.688 & 0.0001 \\
\text{CHIP} & 1 & 3.571850 & 3.89422935 & 0.917 & 0.3629
\end{array}
\]

Identify and interpret the estimate of \(\beta_2\).
29
A public health researcher wants to use regression to predict the sun safety knowledge of pre-school children. The researcher randomly sampled 35 preschoolers, assigned them to one of two groups, and then measured the following three variables:
SUNSCORE: \(y =\) Score on sun-safety comprehension test
READING: \(x_1 =\) Reading comprehension score
GROUP: \(x_2 = 1\) if child received a Be Sun Safe demonstration, 0 if not

A regression model was fit and the following residual plot was observed.
[Residual plot: residuals against the predicted value of \(y\)]
Which of the following assumptions appears violated based on this plot?

A) The errors are normally distributed
B) The errors are independent
C) The mean of the errors is zero
D) The variance of the errors is constant
30
Consider the partial printout for an interaction regression analysis of the relationship between a dependent variable \(y\) and two independent variables \(x_1\) and \(x_2\).

ANOVA

\[
\begin{array}{lrrrrr}
\hline
 & \text{df} & \text{SS} & \text{MS} & F & \text{Significance F} \\
\hline
\text{Regression} & 3 & 3393.677324 & 1131.225775 & 9391.974782 & 2.11084\mathrm{E}{-11} \\
\text{Residual} & 6 & 0.722675987 & 0.120445998 & & \\
\text{Total} & 9 & 3394.4 & & & \\
\hline
\end{array}
\]

\[
\begin{array}{lrrrrrr}
 & \text{Coefficients} & \text{Standard Error} & t \text{ Stat} & \text{P-value} & \text{Lower 95\%} & \text{Upper 95\%} \\
\hline
\text{Intercept} & 16.72197014 & 8.283997219 & 2.018587126 & 0.09007654 & -3.548255659 & 36.99219593 \\
\text{X1} & -3.037317759 & 2.678748705 & -1.133856921 & 0.300116382 & -9.591984506 & 3.517348987 \\
\text{X2} & -1.046522754 & 1.547132645 & -0.676427297 & 0.523973988 & -4.832222727 & 2.73917722 \\
\text{X1X2} & 4.071685147 & 0.444059933 & 9.169224345 & 9.47663\mathrm{E}{-05} & 2.98510884 & 5.158261454
\end{array}
\]

a. Write the prediction equation for the interaction model.
b. Test the overall utility of the interaction model using the global \(F\)-test at \(\alpha = .05\).
c. Test the hypothesis (at \(\alpha = .05\)) that \(x_1\) and \(x_2\) interact positively.
d. Estimate the change in \(y\) for each additional 1-unit increase in \(x_1\) when \(x_2 = 6\).
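The four parts can be worked directly from the printed values. The Python sketch below is one hedged way to do so; `y_hat` and the other names are illustrative, and the one-sided p-value is obtained by halving the printed two-sided value, which is valid here only because the t statistic is positive.

```python
# Values copied from the printout above.
b0, b1, b2, b3 = 16.72197014, -3.037317759, -1.046522754, 4.071685147
significance_f = 2.11084e-11   # p-value of the global F-test
p_two_sided_b3 = 9.47663e-05   # two-sided p-value for the X1X2 term
t_b3 = 9.169224345
alpha = 0.05

def y_hat(x1: float, x2: float) -> float:
    """Prediction equation for the interaction model."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2

# (b) Global F-test: reject H0 when the reported significance is below alpha.
model_useful = significance_f < alpha

# (c) Upper-tailed test of H0: beta3 = 0 vs Ha: beta3 > 0. With a positive
# t statistic, the one-sided p-value is half the reported two-sided one.
p_one_sided_b3 = p_two_sided_b3 / 2
positive_interaction = t_b3 > 0 and p_one_sided_b3 < alpha

# (d) Slope with respect to x1 at a fixed x2 is b1 + b3 * x2.
slope_x1_at_6 = b1 + b3 * 6

print(model_useful, positive_interaction, round(slope_x1_at_6, 4))
```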
31
A study of the top MBA programs attempted to predict the average starting salary (in $1000's) of graduates of the program based on the amount of tuition (in $1000's) charged by the program and the average GMAT score of the program's students. The results of a regression analysis based on a sample of 75 MBA programs are shown below:

Least Squares Linear Regression of Salary

\[
\begin{array}{lrrrrr}
\text{Predictor Variables} & \text{Coefficient} & \text{Std Error} & \text{T} & \text{P} & \text{VIF} \\
\text{Constant} & -203.402 & 51.6573 & -3.94 & 0.0002 & 0.0 \\
\text{Gmat} & 0.39412 & 0.09039 & 4.36 & 0.0000 & 2.0 \\
\text{Tuition} & 0.92012 & 0.17875 & 5.15 & 0.0000 & 2.0
\end{array}
\]

The model was then used to create a 95% confidence interval for E(Y) and a 95% prediction interval for Y when the tuition charged by the MBA program was $75,000 and the GMAT score was 675. The results are shown here:

95% confidence interval for E(Y): ($126,610, $136,640)
95% prediction interval for Y: ($90,113, $173,160)

Which of the following interpretations is correct if you want to use the model to estimate Y for a single MBA program?

A) We are 95% confident that the average of all starting salaries for graduates of all MBA programs that charge $75,000 in tuition and have an average GMAT score of 675 will fall between $126,610 and $136,640.
B) We are 95% confident that the average starting salary for graduates of a single MBA program that charges $75,000 in tuition and has an average GMAT score of 675 will fall between $126,610 and $136,640.
C) We are 95% confident that the average starting salary for graduates of a single MBA program that charges $75,000 in tuition and has an average GMAT score of 675 will fall between $90,113 and $173,160.
D) We are 95% confident that the average of all starting salaries for graduates of all MBA programs that charge $75,000 in tuition and have an average GMAT score of 675 will fall between $90,113 and $173,160.
32
During its manufacture, a product is subjected to four different tests in sequential order. An efficiency expert claims that the fourth (and last) test is unnecessary since its results can be predicted based on the first three tests. To test this claim, multiple regression will be used to model Test4 score \((y)\) as a function of Test1 score \((x_1)\), Test2 score \((x_2)\), and Test3 score \((x_3)\). [Note: All test scores range from 200 to 800, with higher scores indicative of a higher quality product.] Consider the model:
\(E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3\)
The first-order model was fit to the data for each of 12 units sampled from the production line. The results are summarized in the printout.

\[
\begin{array}{lrrrrr}
\text{SOURCE} & \text{DF} & \text{SS} & \text{MS} & \text{F VALUE} & \text{PROB} > \text{F} \\
\text{MODEL} & 3 & 151417 & 50472 & 18.16 & .0075 \\
\text{ERROR} & 8 & 22231 & 2779 & & \\
\text{TOTAL} & 11 & 173648 & & &
\end{array}
\]

\[
\begin{array}{llll}
\text{ROOT MSE} & 52.72 & \text{R-SQUARE} & 0.872 \\
\text{DEP MEAN} & 645.8 & \text{ADJ R-SQ} & 0.824
\end{array}
\]

\[
\begin{array}{lrrrr}
\text{VARIABLE} & \text{PARAMETER ESTIMATE} & \text{STANDARD ERROR} & \text{T FOR H0: PARAMETER} = 0 & \text{PROB} > |T| \\
\text{INTERCEPT} & 11.98 & 80.50 & 0.15 & 0.885 \\
\text{X1 (TEST1)} & 0.2745 & 0.1111 & 2.47 & 0.039 \\
\text{X2 (TEST2)} & 0.3762 & 0.0986 & 3.82 & 0.005 \\
\text{X3 (TEST3)} & 0.3265 & 0.0808 & 4.04 & 0.004
\end{array}
\]

Suppose the 95% confidence interval for \(\beta_3\) is \((.15, .47)\). Which of the following statements is incorrect?

A) We are 95% confident that Test3 is a useful linear predictor of Test4 score, holding Test1 and Test2 fixed.
B) At \(\alpha = .05\), there is insufficient evidence to reject \(H_0: \beta_3 = 0\) in favor of \(H_a: \beta_3 \neq 0\).
C) We are 95% confident that the increase in Test4 score for every 1-point increase in Test3 score falls between .15 and .47, holding Test1 and Test2 fixed.
D) We are 95% confident that the estimated slope for the Test4-Test3 line falls between .15 and .47, holding Test1 and Test2 fixed.
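The link between a confidence interval and the matching two-sided test can be stated as a one-line rule. The Python sketch below uses a hypothetical helper, `rejects_zero`, applied to the interval supposed in the question.

```python
def rejects_zero(ci_low: float, ci_high: float) -> bool:
    """A two-sided test of H0: beta = 0 at level alpha rejects H0
    exactly when the (1 - alpha) confidence interval excludes 0."""
    return not (ci_low <= 0.0 <= ci_high)

# 95% interval for beta3 supposed in the question above.
print(rejects_zero(0.15, 0.47))
```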
33
The confidence interval for the mean E(y) is narrower than the prediction interval for y.
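For reference, the standard textbook forms of the two intervals at a given setting \(\mathbf{x}_0\) of the predictors are sketched below. The matrix notation (\(\mathbf{X}\) for the design matrix, \(s\) for the root MSE) is an assumption of this note and does not appear in the deck.

\[
\hat{y} \pm t_{\alpha/2}\, s \sqrt{\mathbf{x}_0^{\top} (\mathbf{X}^{\top}\mathbf{X})^{-1} \mathbf{x}_0}
\quad \text{(confidence interval for } E(y)\text{)}
\]

\[
\hat{y} \pm t_{\alpha/2}\, s \sqrt{1 + \mathbf{x}_0^{\top} (\mathbf{X}^{\top}\mathbf{X})^{-1} \mathbf{x}_0}
\quad \text{(prediction interval for } y\text{)}
\]

The extra 1 under the square root accounts for the random error of a single new observation, so at any \(\mathbf{x}_0\) the prediction interval is the wider of the two.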
34
35
36
37
In regression, it is desired to predict the dependent variable based on values of other related independent variables. Occasionally, there are relationships that exist between the independent variables. Which of the following multiple regression pitfalls does this example describe?

A) Multicollinearity
B) Extrapolation
C) Stepwise Regression
D) Estimability
38
A) 1
B) 10
C) 16
D) 13
39
A study of the top MBA programs attempted to predict the average starting salary (in $1000's) of graduates of the program based on the amount of tuition (in $1000's) charged by the program and the average GMAT score of the program's students. The results of a regression analysis based on a sample of 75 MBA programs are shown below:

Least Squares Linear Regression of Salary

\[
\begin{array}{lrrrr}
\text{Predictor Variables} & \text{Coefficient} & \text{Std Error} & \text{T} & \text{P} \\
\text{Constant} & -687.851 & 165.406 & 4.16 & 0.0001 \\
\text{Tuition} & -11.3197 & 2.19724 & -5.15 & 0.0000 \\
\text{GMAT} & -0.96727 & 0.25535 & -3.79 & 0.0003 \\
\text{TxG} & 0.01850 & 0.00331 & 5.58 & 0.0000
\end{array}
\]

\[
\begin{array}{lrlr}
\text{R-Squared} & 0.7816 & \text{Resid. Mean Square (MSE)} & 301.251 \\
\text{Adjusted R-Squared} & 0.7723 & \text{Standard Deviation} & 17.3566
\end{array}
\]

\[
\begin{array}{lrrrrr}
\text{Source} & \text{DF} & \text{SS} & \text{MS} & \text{F} & \text{P} \\
\text{Regression} & 3 & 76523.8 & 25510.9 & 84.68 & 0.0000 \\
\text{Residual} & 71 & 21388.8 & 301.3 & & \\
\text{Total} & 74 & 97921.7 & & &
\end{array}
\]

Cases Included 75 Missing Cases 0

One of the t-test statistics shown on the printout has the value \(t = 5.58\). Interpret this value.

A) There is sufficient evidence, at \(\alpha = 0.05\), to indicate that the interaction between average tuition and average GMAT score is a useful predictor of the average starting salary of graduates of MBA programs.
B) There is insufficient evidence, at \(\alpha = 0.05\), to indicate that at least one of the variables proposed in the interaction model is useful for predicting the average starting salary of graduates of MBA programs.
C) There is sufficient evidence, at \(\alpha = 0.05\), to indicate that at least one of the variables proposed in the interaction model is useful for predicting the average starting salary of graduates of MBA programs.
D) There is insufficient evidence, at \(\alpha = 0.05\), to indicate that the interaction between average tuition and average GMAT score is a useful predictor of the average starting salary of graduates of MBA programs.
40
A study of the top MBA programs attempted to predict the average starting salary (in $1000's) of graduates of the program based on the amount of tuition (in $1000's) charged by the program and the average GMAT score of the program's students. The results of a regression analysis based on a sample of 75 MBA programs are shown below:

Least Squares Linear Regression of Salary

\[
\begin{array}{lrrrr}
\text{Predictor Variables} & \text{Coefficient} & \text{Std Error} & \text{T} & \text{P} \\
\text{Constant} & -687.851 & 165.406 & 4.16 & 0.0001 \\
\text{Tuition} & -11.3197 & 2.19724 & -5.15 & 0.0000 \\
\text{GMAT} & -0.96727 & 0.25535 & -3.79 & 0.0003 \\
\text{TxG} & 0.01850 & 0.00331 & 5.58 & 0.0000
\end{array}
\]

\[
\begin{array}{lrlr}
\text{R-Squared} & 0.7816 & \text{Resid. Mean Square (MSE)} & 301.251 \\
\text{Adjusted R-Squared} & 0.7723 & \text{Standard Deviation} & 17.3566
\end{array}
\]

\[
\begin{array}{lrrrrr}
\text{Source} & \text{DF} & \text{SS} & \text{MS} & \text{F} & \text{P} \\
\text{Regression} & 3 & 76523.8 & 25510.9 & 84.68 & 0.0000 \\
\text{Residual} & 71 & 21388.8 & 301.3 & & \\
\text{Total} & 74 & 97921.7 & & &
\end{array}
\]

Cases Included 75 Missing Cases 0

The global F-test statistic is shown on the printout to be the value \(F = 84.68\). Interpret this value.

A) There is sufficient evidence, at \(\alpha = 0.05\), to indicate that at least one of the variables proposed in the interaction model is useful for predicting the average starting salary of graduates of MBA programs.
B) There is insufficient evidence, at \(\alpha = 0.05\), to indicate that at least one of the variables proposed in the interaction model is useful for predicting the average starting salary of graduates of MBA programs.
C) There is insufficient evidence, at \(\alpha = 0.05\), to indicate that the interaction between average tuition and average GMAT score is a useful predictor of the average starting salary of graduates of MBA programs.
D) There is sufficient evidence, at \(\alpha = 0.05\), to indicate that the interaction between average tuition and average GMAT score is a useful predictor of the average starting salary of graduates of MBA programs.
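The global F statistic tests all slope coefficients at once, which is what separates it from the single-term t-test. The Python sketch below recomputes F and R-squared from the printed mean squares and sums of squares; the names are illustrative, and the decision uses the p-value as printed rather than a tabled critical value.

```python
# ANOVA values copied from the printout above.
ms_regression = 25510.9
mse = 301.251
ss_residual = 21388.8
ss_total = 97921.7
p_value = 0.0000   # printed as 0.0000, i.e. smaller than 0.00005
alpha = 0.05

# Global F statistic: mean square for regression over mean square for error.
f_stat = ms_regression / mse

# R-squared computed as 1 - SSE / SS(Total).
r_squared = 1 - ss_residual / ss_total

# H0: beta1 = beta2 = beta3 = 0 is rejected when the p-value is below alpha.
reject_h0 = p_value < alpha
print(f"F = {f_stat:.2f}, R^2 = {r_squared:.4f}, reject H0: {reject_h0}")
```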
41
A fast food chain test marketing a new sandwich chose 18 of its stores in one major metropolitan area. Nine of the stores were in malls and nine were free standing. The sandwich was offered at three different introductory prices. The table shows the number of new sandwiches sold at each location for each location type and price combination.

Number of New Sandwiches Sold
[Data table not available]

a. Write a model for the mean number of sandwiches sold, \(E(y)\), assuming that the relationship between \(E(y)\) and price, \(x_1\), is first-order.
b. Fit the model to the data.
c. Write the prediction equations for mall and free-standing stores.
d. Do the data provide sufficient evidence that the change in number of sandwiches sold with respect to price is different for mall and free-standing stores? Use \(\alpha = .01\).
42
The concessions manager at a beachside park recorded the high temperature, the number of people at the park, and the number of bottles of water sold for each of 12 consecutive Saturdays. The data are shown below.

\[
\begin{array}{ccc}
\hline
\text{Bottles Sold} & \text{Temperature } (^{\circ}\mathrm{F}) & \text{People} \\
\hline
341 & 73 & 1625 \\
425 & 79 & 2100 \\
457 & 80 & 2125 \\
485 & 80 & 2800 \\
469 & 81 & 2550 \\
395 & 82 & 1975 \\
511 & 83 & 2675 \\
549 & 83 & 2800 \\
543 & 85 & 2850 \\
537 & 88 & 2775 \\
621 & 89 & 2800 \\
897 & 91 & 3100 \\
\hline
\end{array}
\]

a. Fit the model \(E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2\) to the data, letting \(y\) represent the number of bottles of water sold, \(x_1\) the temperature, and \(x_2\) the number of people at the park.
b. Find the 95% confidence interval for the mean number of bottles of water sold when the temperature is \(84^{\circ}\mathrm{F}\) and there are 2700 people at the park.
c. Find the 95% prediction interval for the number of bottles of water sold when the temperature is \(84^{\circ}\mathrm{F}\) and there are 2700 people at the park.
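Part a can be carried out without a statistics package. The sketch below is a minimal pure-Python least squares fit, assuming the usual approach of centering the predictors and solving the two remaining normal equations by Cramer's rule; the helper `cross` and all variable names are illustrative. Parts b and c would additionally need the inverse of the full cross-product matrix and a tabled t value, which this sketch does not attempt.

```python
# Data copied from the table above.
bottles = [341, 425, 457, 485, 469, 395, 511, 549, 543, 537, 621, 897]
temp = [73, 79, 80, 80, 81, 82, 83, 83, 85, 88, 89, 91]
people = [1625, 2100, 2125, 2800, 2550, 1975, 2675, 2800, 2850, 2775, 2800, 3100]

n = len(bottles)
y_bar, x1_bar, x2_bar = sum(bottles) / n, sum(temp) / n, sum(people) / n

def cross(u, u_bar, v, v_bar):
    """Sum of cross-products of deviations from the means."""
    return sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))

s11 = cross(temp, x1_bar, temp, x1_bar)
s22 = cross(people, x2_bar, people, x2_bar)
s12 = cross(temp, x1_bar, people, x2_bar)
s1y = cross(temp, x1_bar, bottles, y_bar)
s2y = cross(people, x2_bar, bottles, y_bar)

# Solve the two centered normal equations by Cramer's rule.
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s2y * s12) / det
b2 = (s2y * s11 - s1y * s12) / det
b0 = y_bar - b1 * x1_bar - b2 * x2_bar

residuals = [y - (b0 + b1 * x1 + b2 * x2)
             for y, x1, x2 in zip(bottles, temp, people)]
print(f"b0 = {b0:.3f}, b1 = {b1:.4f}, b2 = {b2:.4f}")
```

A quick sanity check on any least squares fit with an intercept is that the residuals sum to zero and are uncorrelated with each predictor.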
43
In Hawaii, proceedings are under way to enable private citizens to own the property that their homes are built on. In prior years, only estates were permitted to own land, and homeowners leased the land from the estate. In order to comply with the new law, a large Hawaiian estate wants to use regression analysis to estimate the fair market value of the land. The following variables are proposed:
\(y =\) Sale price of property ($ thousands)
\(x_2 = 1\) if property near Cove, 0 if not
Write a regression model relating the sale price of a property to the qualitative variable x. Interpret all the \(\beta\)'s in the model.
44
45
Interpret the residual plot.
46
A college admissions officer proposes to use regression to model a student's college GPA at graduation in terms of the following two variables:
\(x_1 =\) high school GPA
\(x_2 =\) SAT score
The admissions officer believes the relationship between college GPA and high school GPA is linear and the relationship between SAT score and college GPA is linear. She also believes that the relationship between college GPA and high school GPA depends on the student's SAT score. Write the regression model she should fit.
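A model matching this description, written in the notation used elsewhere in the deck, would be the interaction model sketched below. The symbol \(y\) for college GPA is an assumption here, since the card does not name the response.

\[
E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2
\]

The cross-product term is what lets the slope for high school GPA depend on SAT score: for a fixed \(x_2\), the change in \(E(y)\) per unit of \(x_1\) is \(\beta_1 + \beta_3 x_2\).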
47
A certain type of rare gem serves as a status symbol for many of its owners. In theory, for low prices, the demand decreases as the price of the gem increases. However, experts hypothesize that when the gem is valued at very high prices, the demand increases with price due to the status the owners believe they gain by obtaining the gem. Thus, the model proposed to best explain the demand for the gem by its price is the quadratic model \(E(y) = \beta_0 + \beta_1 x + \beta_2 x^2\)
where \(y =\) Demand (in thousands) and \(x =\) Retail price per carat (dollars).
This model was fit to data collected for a sample of 12 rare gems. A portion of the printout is given below:

\[
\begin{array}{lrrrrr}
\text{SOURCE} & \text{DF} & \text{SS} & \text{MS} & \text{F} & \text{PR} > \text{F} \\
\text{Model} & 2 & 115145 & 57573 & 373 & .0001 \\
\text{Error} & 9 & 1388 & 154 & & \\
\text{TOTAL} & 11 & 116533 & & &
\end{array}
\]

\[
\begin{array}{llll}
\text{Root MSE} & 12.42 & \text{R-Square} & .988
\end{array}
\]

\[
\begin{array}{lrrrr}
\text{VARIABLES} & \text{PARAMETER ESTIMATES} & \text{STD. ERROR} & \text{T for H0: PARAMETER} = 0 & \text{PR} > |T| \\
\text{INTERCEPT} & 286.42 & 9.66 & 29.64 & .0001 \\
X & -.31 & .06 & -5.14 & .0006 \\
X \cdot X & .000067 & .00007 & .95 & .3647
\end{array}
\]

Is there sufficient evidence to indicate the model is useful for predicting the demand for the gem? Use \(\alpha = .01\).
48
In any production process in which one or more workers are engaged in a variety of tasks, the total time spent in production varies as a function of the size of the workpool and the level of output of the various activities. In a large metropolitan department store, it is believed that the number of man-hours worked (y)( y ) per day by the clerical staff depends on the number of pieces of mail processed per day (x1)\left( x _ { 1 } \right) and the number of checks cashed per day (x2)\left( x _ { 2 } \right) . Data collected for n=20n = 20 working days were used to fit the model:
E(y)=β0+β1x1+β2x2E ( y ) = \beta _ { 0 } + \beta _ { 1 } x _ { 1 } + \beta _ { 2 } x _ { 2 }
A printout for the analysis follows:

$$\begin{array}{l}\text{Analysis of Variance}\\\begin{array}{lrrrrr}\text{SOURCE} & \text{DF} & \text{SS} & \text{MS} & \text{F VALUE} & \text{PROB > F} \\ \text{MODEL} & 2 & 7089.06512 & 3544.53256 & 13.267 & 0.0003 \\ \text{ERROR} & 17 & 4541.72142 & 267.16008 & & \\ \text{C TOTAL} & 19 & 11630.78654 & & &\end{array}\end{array}$$

$$\begin{array}{llll}\text{ROOT MSE} & 16.34503 & \text{R-SQUARE} & 0.6095 \\ \text{DEP MEAN} & 93.92682 & \text{ADJ R-SQ} & 0.5636 \\ \text{C.V.} & 17.40188 & &\end{array}$$

Parameter Estimates

$$\begin{array}{lrrrrr} & & \text{PARAMETER} & \text{STANDARD} & \text{T FOR } H_0\text{:} & \\ \text{VARIABLE} & \text{DF} & \text{ESTIMATE} & \text{ERROR} & \text{PARAMETER} = 0 & \text{PROB} > |\mathrm{T}| \\ \text{INTERCEPT} & 1 & 114.420972 & 18.68485744 & 6.124 & 0.0001 \\ \text{X1} & 1 & -0.007102 & 0.00171375 & -4.144 & 0.0007 \\ \text{X2} & 1 & 0.037290 & 0.02043937 & 1.824 & 0.0857\end{array}$$

$$\begin{array}{rrrrrrrr} & & & \text{Actual} & \text{Predict} & & \text{Lower 95\% CL} & \text{Upper 95\% CL} \\ \text{OBS} & \text{X1} & \text{X2} & \text{Value} & \text{Value} & \text{Residual} & \text{Predict} & \text{Predict} \\ 1 & 7781 & 644 & 74.707 & 83.175 & -8.468 & 47.224 & 119.126 \\\hline\end{array}$$

Test to determine if there is a positive linear relationship between the number of man-hours worked, $y$, and the number of checks cashed per day, $x_2$. Use $\alpha = .05$.
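The test asked for here is one-tailed, while the printout reports a two-tailed p-value. As a minimal sketch (not part of the original deck, with illustrative variable names), the t statistic can be recomputed from the printed estimate and standard error for X2, and the p-value halved. Halving is appropriate here only because the estimate has the sign named in the alternative hypothesis.

```python
# Values copied from the Parameter Estimates printout for X2.
estimate = 0.037290
std_error = 0.02043937
two_tailed_p = 0.0857   # PROB > |T| as printed
alpha = 0.05

# t statistic for H0: beta2 = 0 versus Ha: beta2 > 0.
t_stat = estimate / std_error

# The printout gives a two-tailed p-value; for an upper-tailed test with a
# positive estimate, the one-tailed p-value is half of it.
one_tailed_p = two_tailed_p / 2

reject_h0 = one_tailed_p < alpha
print(f"t = {t_stat:.3f}, one-tailed p = {one_tailed_p:.5f}, reject H0: {reject_h0}")
```

Under these assumptions the one-tailed p-value falls below .05 even though the printed two-tailed value does not.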
49
Interpret the residual plot.
50
The model $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4$ was used to relate $E(y)$ to a single qualitative variable, where
$$x_1 = \begin{cases} 1 & \text{if level 2} \\ 0 & \text{if not} \end{cases} \quad x_2 = \begin{cases} 1 & \text{if level 3} \\ 0 & \text{if not} \end{cases}$$

$$x_3 = \begin{cases} 1 & \text{if level 4} \\ 0 & \text{if not} \end{cases} \quad x_4 = \begin{cases} 1 & \text{if level 5} \\ 0 & \text{if not} \end{cases}$$
This model was fit to $n = 40$ data points and the following result was obtained:
$$\hat{y} = 14.5 + 3x_1 - 4x_2 + 10x_3 + 8x_4$$
a. Use the least squares prediction equation to find the estimate of $E(y)$ for each level of the qualitative variable.
b. Specify the null and alternative hypotheses you would use to test whether $E(y)$ is the same for all levels of the independent variable.
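For part a, each estimated mean is the intercept plus the coefficient of the dummy variable that equals 1 for that level. The sketch below is an illustration rather than part of the deck; it assumes level 1 is the base level (all dummies 0), and the names `dummy_coefs` and `estimated_mean` are made up for the example.

```python
# Coefficients of the fitted equation y_hat = 14.5 + 3*x1 - 4*x2 + 10*x3 + 8*x4.
intercept = 14.5
dummy_coefs = {2: 3.0, 3: -4.0, 4: 10.0, 5: 8.0}  # level -> coefficient of its dummy

def estimated_mean(level):
    """Estimated E(y) for a level; level 1 is the base level (all dummies 0)."""
    return intercept + dummy_coefs.get(level, 0.0)

level_means = {level: estimated_mean(level) for level in range(1, 6)}
print(level_means)
```

Because every non-base mean differs from the base mean by exactly one dummy coefficient, equal means across all five levels corresponds to all four dummy coefficients being zero.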
51
Retail price data for $n = 60$ hard disk drives were recently reported in a computer magazine. Three variables were recorded for each hard disk drive:
$y =$ Retail PRICE (measured in dollars)
$x_1 =$ Microprocessor SPEED (measured in megahertz)
(Values in sample range from 10 to 40)
$x_2 =$ CHIP size (measured in computer processing units)
(Values in sample range from 286 to 486)

A first-order regression model was fit to the data. Part of the printout follows:

$$\begin{array}{rrrrrrrrr}\hline & & & \text{Dep Var} & \text{Predict} & \text{Std Err} & \text{Lower 95\%} & \text{Upper 95\%} & \\ \text{OBS} & \text{SPEED} & \text{CHIP} & \text{PRICE} & \text{Value} & \text{Predict} & \text{Predict} & \text{Predict} & \text{Residual} \\ 1 & 33 & 386 & 5099.0 & 4464.9 & 260.768 & 3942.7 & 4987.1 & 634.1 \\\hline\end{array}$$

Interpret the $95\%$ prediction interval for $y$ when $x_1 = 33$ and $x_2 = 386$.
52
As part of a study at a large university, data were collected on $n = 224$ freshman computer science (CS) majors in a particular year. The researchers were interested in modeling $y$, a student's grade point average (GPA) after three semesters, as a function of the following independent variables (recorded at the time the students enrolled in the university):
$x_1 =$ average high school grade in mathematics (HSM)
$x_2 =$ average high school grade in science (HSS)
$x_3 =$ average high school grade in English (HSE)
$x_4 =$ SAT mathematics score (SATM)
$x_5 =$ SAT verbal score (SATV)

A first-order model was fit to the data with $R_a^2 = .193$.

Interpret the value of the adjusted coefficient of determination $R_a^2$.
53
Consider the data given in the table below.
$$\begin{array}{cc}\hline X & Y \\\hline 1 & 4 \\ 2 & 6 \\ 2 & 5 \\ 3 & 7 \\ 4 & 7 \\ 4 & 6 \\ 5 & 4 \\ 5 & 5 \\ 6 & 3 \\\hline\end{array}$$
a. Plot the data on a scattergram. Does a quadratic model seem to be a good fit for the data? Explain.
b. Use the method of least squares to find a quadratic prediction equation.
c. Graph the prediction equation on your scattergram.
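Part b can be worked by hand or with software. As one possible sketch (not from the deck), the quadratic least squares fit can be obtained in plain Python by building the normal equations $(X'X)b = X'y$ and solving them. The helper `solve_linear_system` is a hypothetical name written for this example, not a library function.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

# Data from the table above.
x = [1, 2, 2, 3, 4, 4, 5, 5, 6]
y = [4, 6, 5, 7, 7, 6, 4, 5, 3]

# Design matrix rows [1, x, x^2] for the model E(y) = b0 + b1*x + b2*x^2.
rows = [[1.0, xi, xi ** 2] for xi in x]

# Normal equations (X'X) b = X'y.
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
b0, b1, b2 = solve_linear_system(xtx, xty)

residuals = [yi - (b0 + b1 * xi + b2 * xi ** 2) for xi, yi in zip(x, y)]
print(f"y_hat = {b0:.3f} + {b1:.3f} x + {b2:.3f} x^2")
```

The sign of `b2` indicates the direction of concavity of the fitted curve, which can be compared with the impression from the scattergram in part a.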
54
A certain type of rare gem serves as a status symbol for many of its owners. In theory, for low prices, the demand decreases as the price of the gem increases. However, experts hypothesize that when the gem is valued at very high prices, the demand increases with price due to the status the owners believe they gain by obtaining the gem. Thus, the model proposed to best explain the demand for the gem by its price is the quadratic model
$$E(y) = \beta_0 + \beta_1 x + \beta_2 x^2$$
where $y =$ Demand (in thousands) and $x =$ Retail price per carat (dollars). This model was fit to data collected for a sample of 12 rare gems. A portion of the printout is given below:

$$\begin{array}{lrrrrr}\text{SOURCE} & \text{DF} & \text{SS} & \text{MS} & \text{F} & \text{PR > F} \\ \text{Model} & 2 & 115145 & 57573 & 373 & .0001 \\ \text{Error} & 9 & 1388 & 154 & & \\ \text{TOTAL} & 11 & 116533 & & &\end{array}$$

$$\begin{array}{llll}\text{Root MSE} & 12.42 & \text{R-Square} & .988\end{array}$$

$$\begin{array}{lrrrr} & \text{PARAMETER} & & \text{T for } H_0\text{:} & \\ \text{VARIABLES} & \text{ESTIMATES} & \text{STD. ERROR} & \text{PARAMETER} = 0 & \text{PR} > |T| \\ \text{INTERCEPT} & 286.42 & 9.66 & 29.64 & .0001 \\ X & -.31 & .06 & -5.14 & .0006 \\ X \cdot X & .000067 & .00007 & .95 & .3647\end{array}$$

Does the quadratic term contribute useful information for predicting the demand for the gem? Use $\alpha = .10$.


55
Is there evidence of multicollinearity in the printout? Explain.
56
57
58
Why is the random error term $\varepsilon$ added to a multiple regression model?
59
A college admissions officer proposes to use regression to model a student's college GPA at graduation in terms of the following two variables:
$$\begin{array}{l} x_1 = \text{high school GPA} \\ x_2 = \text{SAT score}\end{array}$$
The admissions officer believes the relationship between college GPA and high school GPA is linear and the relationship between SAT score and college GPA is linear. She also believes that the relationship between college GPA and high school GPA depends on the student's SAT score. She proposes the regression model:
$$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$$
Explain how to determine if the relationship between college GPA and SAT score depends on the high school GPA.
60
61
Consider the data given in the table below.
$$\begin{array}{cc}\hline X & Y \\\hline 1 & 7 \\ 2 & 6 \\ 2 & 5 \\ 3 & 5 \\ 3 & 4 \\ 4 & 4 \\ 4 & 3 \\ 4 & 2 \\ 5 & 4 \\ 5 & 5 \\ 6 & 6 \\\hline\end{array}$$
Plot the data on a scattergram. Does a second-order model seem to be a good fit for the data? Explain.
62
The table shows the profit y (in thousands of dollars) that a company made during a month when the price of its product was x dollars per unit.

$$\begin{array}{cc}\hline \text{Profit, } y & \text{Price, } x \\\hline 12 & 1.20 \\ 17 & 1.25 \\ 20 & 1.29 \\ 21 & 1.30 \\ 24 & 1.35 \\ 26 & 1.39 \\ 27 & 1.40 \\ 23 & 1.45 \\ 21 & 1.49 \\ 20 & 1.50 \\ 15 & 1.55 \\ 11 & 1.59 \\ 10 & 1.60 \\ 5 & 1.65 \\\hline\end{array}$$

a. Fit the model $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon$ to the data and give the least squares prediction equation.
b. Plot the fitted equation on a scattergram of the data.
c. Is there sufficient evidence of downward curvature in the relationship between profit and price? Use $\alpha = .05$.
63
The model $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3$ was used to relate $E(y)$ to a single qualitative variable. How many levels does the qualitative variable have?
64
The table below shows data for $n = 20$ observations.

$$\begin{array}{ccc}\hline y & x_1 & x_2 \\\hline 18 & 3 & 8 \\ 23 & 5 & 10 \\ 15 & 2 & 7 \\ 31 & 6 & 12 \\ 24 & 4 & 9 \\ 28 & 5 & 11 \\ 17 & 2 & 7 \\ 19 & 3 & 8 \\ 30 & 7 & 10 \\ 28 & 5 & 8 \\ 14 & 3 & 6 \\ 32 & 7 & 11 \\ 17 & 2 & 8 \\ 24 & 5 & 10 \\ 26 & 6 & 11 \\ 27 & 6 & 11 \\ 21 & 3 & 6 \\ 31 & 7 & 13 \\ 19 & 2 & 8 \\ 25 & 5 & 10 \\\hline\end{array}$$

a. Use a first-order regression model to find a least squares prediction equation for the model.
b. Find a $95\%$ confidence interval for the coefficient of $x_1$ in your model. Interpret the result.
c. Find a $95\%$ confidence interval for the coefficient of $x_2$ in your model. Interpret the result.
d. Find $R^2$ and $R_a^2$ and interpret these values.
e. Test the null hypothesis $H_0: \beta_1 = \beta_2 = 0$ against the alternative hypothesis $H_a$: at least one $\beta_i \neq 0$. Use $\alpha = .05$. Interpret the result.
65
A statistics professor gave three quizzes leading up to the first test in his class. The quiz grades and test grade for each of eight students are given in the table.

$$\begin{array}{ccccc}\text{Student} & \text{Test Grade} & \text{Quiz 1} & \text{Quiz 2} & \text{Quiz 3} \\\hline 1 & 75 & 8 & 9 & 5 \\ 2 & 89 & 10 & 7 & 6 \\ 3 & 73 & 9 & 8 & 7 \\ 4 & 91 & 8 & 7 & 10 \\ 5 & 64 & 9 & 6 & 6 \\ 6 & 78 & 8 & 7 & 6 \\ 7 & 83 & 10 & 8 & 7 \\ 8 & 71 & 9 & 4 & 6 \\\hline\end{array}$$
The professor would like to use the data to find a first-order model that he might use to predict a student's grade on the first test using that student's grades on the first three quizzes.
a. Identify the dependent and independent variables for the model.
b. What is the least squares prediction equation?
c. Find the SSE and the estimator of $\sigma^2$ for the model.
66
Retail price data for $n = 60$ hard disk drives were recently reported in a computer magazine. Three variables were recorded for each hard disk drive:
$y =$ Retail PRICE (measured in dollars)
$x_1 =$ Microprocessor SPEED (measured in megahertz)
(Values in sample range from 10 to 40)
$x_2 =$ CHIP size (measured in computer processing units)
(Values in sample range from 286 to 486)

A first-order regression model was fit to the data. Part of the printout follows:
$$\begin{array}{l}\text{Analysis of Variance}\\\begin{array}{lrrrrr}\text{SOURCE} & \text{DF} & \text{SS} & \text{MS} & \text{F VALUE} & \text{PROB > F} \\ \text{MODEL} & 2 & 34593103.008 & 17296051.504 & 19.018 & 0.0001 \\ \text{ERROR} & 57 & 51840202.926 & 909477.24431 & & \\ \text{C TOTAL} & 59 & 86432305.933 & & &\end{array}\end{array}$$

$$\begin{array}{llll}\text{ROOT MSE} & 953.66516 & \text{R-SQUARE} & 0.4002 \\ \text{DEP MEAN} & 3197.96667 & \text{ADJ R-SQ} & 0.3792 \\ \text{C.V.} & 29.82099 & &\end{array}$$

Test to determine if the model is adequate for predicting the price of a computer. Use $\alpha = .01$.
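The summary figures on this printout can be rebuilt from the SS and DF columns alone. The sketch below is an illustration under that assumption, not part of the deck; it recomputes the global F statistic, $R^2$, and adjusted $R^2$, and compares the printed p-value with $\alpha$. Variable names are illustrative.

```python
# Values copied from the Analysis of Variance printout.
ss_model, df_model = 34593103.008, 2
ss_error, df_error = 51840202.926, 57
ss_total, df_total = 86432305.933, 59
p_value = 0.0001   # PROB > F as printed
alpha = 0.01

# Global F statistic: mean square for the model over mean square error.
f_stat = (ss_model / df_model) / (ss_error / df_error)

# R-square and adjusted R-square from the sums of squares.
r_square = ss_model / ss_total
adj_r_square = 1 - (ss_error / df_error) / (ss_total / df_total)

reject_h0 = p_value < alpha
print(f"F = {f_stat:.3f}, R^2 = {r_square:.4f}, adj R^2 = {adj_r_square:.4f}, reject H0: {reject_h0}")
```

Rejecting $H_0$ here says only that at least one of the two predictors contributes; the modest $R^2$ is a separate question about how much of the price variation the model accounts for.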
67
The sum of squared errors (SSE) of a least squares regression model decreases when new terms are added to the model.
68
69
An elections officer wants to model voter turnout (y) in a precinct as a function of type of election, national or state.

Write a model for mean voter turnout, E(y), as a function of type of election.

A) $E(y) = \beta_0 + \beta_1 x$, where $x = 1$ if national, 0 if state
B) $E(y) = \beta_0 + \beta_1 x$, where $x =$ voter turnout
C) $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$, where $x_1 = 1$ if national, 0 if not and $x_2 = 1$ if state, 0 if not
D) $E(y) = \beta_0 + \beta_1 x + \beta_2 x^2$, where $x =$ voter turnout
70
71
72

A) The model is statistically useful for predicting Test4 score.
B) The model is not statistically useful for predicting Test4 score.
C) The first three test scores are reliable predictors of Test4 score.
D) The first three test scores are poor predictors of Test4 score.
73
In any production process in which one or more workers are engaged in a variety of tasks, the total time spent in production varies as a function of the size of the workpool and the level of output of the various activities. In a large metropolitan department store, it is believed that the number of man-hours worked ($y$) per day by the clerical staff depends on the number of pieces of mail processed per day ($x_1$) and the number of checks cashed per day ($x_2$). Data collected for $n = 20$ working days were used to fit the model:
$$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$$
A partial printout for the analysis follows:
$$\begin{array}{rrrrrrrr}\hline & & & \text{Actual} & \text{Predict} & & \text{Lower 95\% CL} & \text{Upper 95\% CL} \\ \text{OBS} & \text{X1} & \text{X2} & \text{Value} & \text{Value} & \text{Residual} & \text{Predict} & \text{Predict} \\ 1 & 7781 & 644 & 74.707 & 83.175 & -8.468 & 47.224 & 119.126 \\\hline\end{array}$$

Interpret the 95% prediction interval for y shown on the printout.

A) We are 95% confident that the mean number of man-hours worked per day falls between 47.224 and 119.126 for all days in which 7,781 pieces of mail are processed and 644 checks are cashed.
B) We are 95% confident that the number of man-hours worked per day falls between 47.224 and 119.126.
C) We are 95% confident that between 47.224 and 119.126 man-hours will be worked during a single day in which 7,781 pieces of mail are processed and 644 checks are cashed.
D) We expect to predict number of man-hours worked per day to within an amount between 47.224 and 119.126 of the true value.
74
Operations managers often use work sampling to estimate how much time workers spend on each operation. Work sampling, which involves observing workers at random points in time, was applied to the staff of the catalog sales department of a clothing manufacturer.
The department applied regression to the following data collected for 40 consecutive working days:
TIME: $y =$ Time spent (in hours) taking telephone orders during the day
ORDERS: $x_1 =$ Number of telephone orders received during the day
WEEK: $x_2 = 1$ if weekday, 0 if Saturday or Sunday

Consider the complete 2nd-order model:
$$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 (x_1)^2 + \beta_3 x_2 + \beta_4 x_1 x_2 + \beta_5 (x_1)^2 x_2$$
Explain how to conduct a test to determine if a quadratic relationship between total order time and the number of orders taken is necessary in the regression model above. Specify the null and alternative hypotheses that are to be tested.
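One way to frame this, assuming the standard nested-model (partial) F test is the intended approach: the two terms that involve $(x_1)^2$ are dropped to form a reduced model, both models are fit to the same 40 days, and their SSEs are compared. The symbols $\mathrm{SSE}_R$ and $\mathrm{SSE}_C$ (reduced and complete model) are notation introduced here, not taken from the deck.

```latex
\begin{aligned}
H_0 &: \beta_2 = \beta_5 = 0 \\
H_a &: \text{at least one of } \beta_2,\ \beta_5 \text{ is nonzero} \\
\text{Reduced model: } E(y) &= \beta_0 + \beta_1 x_1 + \beta_3 x_2 + \beta_4 x_1 x_2 \\
F &= \frac{(\mathrm{SSE}_R - \mathrm{SSE}_C)/2}{\mathrm{SSE}_C/[40 - (5 + 1)]}
\end{aligned}
```

Under $H_0$ this statistic would be compared with an F distribution with 2 numerator and 34 denominator degrees of freedom.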
75
A statistics professor gave three quizzes leading up to the first test in his class. The quiz grades and test grade for each of eight students are given in the table.

$$\begin{array}{ccccc}\hline \text{Student} & \text{Test Grade} & \text{Quiz 1} & \text{Quiz 2} & \text{Quiz 3} \\\hline 1 & 75 & 8 & 9 & 5 \\ 2 & 89 & 10 & 7 & 6 \\ 3 & 73 & 9 & 8 & 7 \\ 4 & 91 & 8 & 7 & 10 \\ 5 & 64 & 9 & 6 & 6 \\ 6 & 78 & 8 & 7 & 6 \\ 7 & 83 & 10 & 8 & 7 \\ 8 & 71 & 9 & 4 & 6 \\\hline\end{array}$$
The professor fit a first-order model to the data that he intends to use to predict a student's grade on the first test using that student's grades on the first three quizzes.

Test the null hypothesis $H_0: \beta_1 = \beta_2 = \beta_3 = 0$ against the alternative hypothesis $H_a$: at least one $\beta_i \neq 0$. Use $\alpha = .05$. Interpret the result.
76
In stepwise regression, the probability of making one or more Type I or Type II errors is quite small.
77
The printout shows the results of a first-order regression analysis relating the sales price $y$ of a product to the time in hours $x_1$ and the cost of raw materials $x_2$ needed to make the product.
SUMMARY OUTPUT
$$\begin{array}{ll}\hline \text{Regression Statistics} & \\\hline \text{Multiple R} & 0.997578302 \\ \text{R Square} & 0.995162468 \\ \text{Adjusted R Square} & 0.990324936 \\ \text{Standard Error} & 1.185250723 \\ \text{Observations} & 5 \\\hline\end{array}$$

ANOVA
$$\begin{array}{llllll} & df & \text{SS} & \text{MS} & F & \text{Significance F} \\\hline \text{Regression} & 2 & 577.9903614 & 288.9952 & 205.717 & 0.004837532 \\ \text{Residual} & 2 & 2.809638554 & 1.404819 & & \\ \text{Total} & 4 & 580.8 & & & \\\hline\end{array}$$

$$\begin{array}{lllllll}\hline & \text{Coefficients} & \text{Standard Error} & t \text{ Stat} & P\text{-value} & \text{Lower 95\%} & \text{Upper 95\%} \\\hline \text{Intercept} & -26.48433735 & 3.674668773 & -7.20727 & 0.018713 & -42.29517198 & -10.67350271 \\ \text{Time} & -2.168674699 & 4.11406532 & -0.52714 & 0.650732 & -19.8700814 & 15.532732 \\ \text{Materials} & 8.142168675 & 1.094681583 & 7.437933 & 0.0176 & 3.432130693 & 12.85220666 \\\hline\end{array}$$

a. What is the least squares prediction equation?
b. Identify the SSE from the printout.
c. Find the estimator of $\sigma^2$ for the model.
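All three parts can be read off the printout. As a rough sketch (not part of the deck, with illustrative names such as `predict`), the code below writes the prediction equation as a function, takes SSE from the Residual row, and divides by $n - (k + 1)$ to estimate $\sigma^2$; its square root should agree with the printed Standard Error.

```python
import math

# Values copied from the ANOVA and coefficient tables of the printout.
n_obs = 5
n_predictors = 2
sse = 2.809638554   # Residual SS
coefs = {"Intercept": -26.48433735, "Time": -2.168674699, "Materials": 8.142168675}

# Estimator of sigma^2: SSE divided by n - (k + 1).
df_error = n_obs - (n_predictors + 1)
s_squared = sse / df_error
s = math.sqrt(s_squared)   # compare with the printed Standard Error

def predict(time_hours, materials_cost):
    """Least squares prediction of sales price from the printed coefficients."""
    return (coefs["Intercept"]
            + coefs["Time"] * time_hours
            + coefs["Materials"] * materials_cost)

print(f"SSE = {sse:.4f}, s^2 = {s_squared:.4f}, s = {s:.4f}")
```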
78
A collector of grandfather clocks believes that the price received for the clocks at an auction increases with the number of bidders, but at an increasing (rather than a constant) rate. Thus, the model proposed to best explain auction price ($y$, in dollars) by number of bidders ($x$) is the quadratic model
$$E(y) = \beta_0 + \beta_1 x + \beta_2 x^2$$
This model was fit to data collected for a sample of 32 clocks sold at auction; the resulting estimate of $\beta_1$ was $-.31$.
Interpret this estimate of $\beta_1$.

A) We estimate the auction price will increase $\$.31$ for each additional bidder at the auction.
B) $\beta_1$ is a shift parameter that has no practical interpretation.
C) We estimate the auction price will be $-\$.31$ when there are no bidders at the auction.
D) We estimate the auction price will decrease $\$.31$ for each additional bidder at the auction.
79
The method of fitting first-order models is the same as that of fitting the simple straight-line model, i.e. the method of least squares.
80
The concessions manager at a beachside park recorded the high temperature, the number of people at the park, and the number of bottles of water sold for each of 12 consecutive Saturdays. The data are shown below.

$$\begin{array}{ccc}\hline \text{Bottles Sold} & \text{Temperature } (^{\circ}\mathrm{F}) & \text{People} \\\hline 341 & 73 & 1625 \\ 425 & 79 & 2100 \\ 457 & 80 & 2125 \\ 485 & 80 & 2800 \\ 469 & 81 & 2550 \\ 395 & 82 & 1975 \\ 511 & 83 & 2675 \\ 549 & 83 & 2800 \\ 543 & 85 & 2850 \\ 537 & 88 & 2775 \\ 621 & 89 & 2800 \\ 897 & 91 & 3100 \\\hline\end{array}$$

a. Fit the model $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$ to the data, letting $y$ represent the number of bottles of water sold, $x_1$ the temperature, and $x_2$ the number of people at the park.
b. Identify at least two indicators of multicollinearity in the model.
c. Comment on the usefulness of the model to predict the number of bottles of water sold on a Saturday when the high temperature is $103^{\circ}\mathrm{F}$ and there are 3500 people at the park.
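Two of the checks behind parts b and c need only the raw data. The sketch below is an illustration, not part of the deck: it computes pairwise sample correlations among $x_1$, $x_2$, and the interaction term $x_1 x_2$, and tests whether the prediction point in part c lies outside the sampled ranges. The helper `pearson_r` is written for this example. High pairwise correlation is only one common indicator of multicollinearity; others, such as nonsignificant t tests alongside a significant global F test, require the fitted model from part a.

```python
import math

# Data from the table above.
temperature = [73, 79, 80, 80, 81, 82, 83, 83, 85, 88, 89, 91]
people = [1625, 2100, 2125, 2800, 2550, 1975, 2675, 2800, 2850, 2775, 2800, 3100]

def pearson_r(a, b):
    """Sample correlation coefficient between two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    saa = sum((ai - mean_a) ** 2 for ai in a)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)

interaction = [t * p for t, p in zip(temperature, people)]

# Pairwise correlations among the three terms in the interaction model.
r_temp_people = pearson_r(temperature, people)
r_temp_inter = pearson_r(temperature, interaction)
r_people_inter = pearson_r(people, interaction)

# Part c: is the requested prediction point outside the sampled ranges?
new_temp, new_people = 103, 3500
extrapolating = new_temp > max(temperature) or new_people > max(people)

print(f"r(x1, x2) = {r_temp_people:.3f}, r(x1, x1x2) = {r_temp_inter:.3f}, "
      f"r(x2, x1x2) = {r_people_inter:.3f}, extrapolating: {extrapolating}")
```

Both requested values exceed every value observed in the 12 Saturdays, so any prediction at that point relies on the fitted relationship holding outside the range of the data.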