Basic Business Statistics Study Set 4
Quiz 15: Multiple Regression Model Building
Question 61
True/False
TABLE 15-7 A chemist employed by a pharmaceutical firm has developed a muscle relaxant. She took a sample of 14 people suffering from extreme muscle constriction. She gave each a vial containing a dose (X) of the drug and recorded the time to relief (Y) measured in seconds for each. She fit a "centered" curvilinear model to this data. The results obtained by Microsoft Excel follow, where the dose (X) given has been "centered." SUMMARY OUTPUT
Regression Statistics
Multiple R           0.747
R Square             0.558
Adjusted R Square    0.478
Standard Error       863.1
Observations            14

ANOVA
             df        SS       MS     F  Significance F
Regression    2  10344797  5172399  6.94          0.0110
Residual     11   8193929   744903
Total        13  18538726

             Coeff  Std Error  t Stat  p-value
Intercept   1283.0      352.0    3.65   0.0040
CenDose     25.228      8.631    2.92   0.0140
CenDoseSq   0.8604     0.3722    2.31   0.0410
-Referring to Table 15-7, suppose the chemist decides to use an F test to determine if there is a significant curvilinear relationship between time and dose. If she chooses to use a level of significance of 0.01 she would decide that there is a significant curvilinear relationship.
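The decision above follows directly from comparing the reported Significance F to the chosen level. A quick check of the arithmetic (a sketch assuming SciPy is available; the F statistic and degrees of freedom are taken from the Table 15-7 ANOVA output):

```python
from scipy import stats

# Values from the Table 15-7 ANOVA output
f_stat = 6.94
df_reg, df_resid = 2, 11

# Upper-tail probability of the F distribution -- this is Excel's "Significance F"
p_value = stats.f.sf(f_stat, df_reg, df_resid)

# At alpha = 0.01 the curvilinear relationship is NOT significant, since p ~ 0.011 > 0.01
significant_at_01 = p_value < 0.01
```

Because 0.0110 exceeds 0.01, the chemist would fail to reject the null hypothesis at that level.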
Question 62
True/False
TABLE 15-9 Many factors determine the attendance at Major League Baseball games. These factors can include when the game is played, the weather, the opponent, whether or not the team is having a good season, and whether or not a marketing promotion is held. Data from 80 games of the Kansas City Royals are collected for the following variables:
ATTENDANCE = Paid attendance for the game
TEMP = High temperature for the day
WIN% = Team's winning percentage at the time of the game
OPWIN% = Opponent team's winning percentage at the time of the game
WEEKEND = 1 if game played on Friday, Saturday, or Sunday; 0 otherwise
PROMOTION = 1 if a promotion was held; 0 if no promotion was held
The regression results using attendance as the dependent variable and the remaining five variables as the independent variables are presented below.
Regression Statistics
Multiple R           0.5487
R Square             0.3011
Adjusted R Square    0.2538
Standard Error    6442.4456
Observations             80

ANOVA
             df               SS              MS       F  Significance F
Regression    5  1322911703.0671  264582340.6134  6.3747          0.0001
Residual     74  3071377751.1204   41505104.7449
Total        79  4394289454.1875

            Coefficients  Standard Error   t Stat  p-value
Intercept     -3862.4808       6180.9452  -0.6249   0.5340
Temp             51.7031         62.9439   0.8214   0.4140
Win%             21.1085         16.2338   1.3003   0.1975
OpWin%           11.3453          6.4617   1.7558   0.0833
Weekend         367.5377       2786.2639   0.1319   0.8954
Promotion      6927.8820       2784.3442   2.4882   0.0151
The coefficient of multiple determination (R²ⱼ) of each of the 5 predictors with all the other remaining predictors is, respectively, 0.2675, 0.3101, 0.1038, 0.7325, and 0.7308.
-Referring to Table 15-9, there is reason to suspect collinearity between some pairs of predictors.
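The standard way to assess the R²ⱼ values above is to convert them into variance inflation factors, VIFⱼ = 1/(1 − R²ⱼ). A short sketch using only the numbers given in the problem statement:

```python
# R-squared of each predictor regressed on the other four (from the problem statement)
r2_j = {"Temp": 0.2675, "Win%": 0.3101, "OpWin%": 0.1038,
        "Weekend": 0.7325, "Promotion": 0.7308}

# Variance inflation factor: VIF_j = 1 / (1 - R^2_j)
vif = {name: 1.0 / (1.0 - r2) for name, r2 in r2_j.items()}

# Weekend and Promotion come out near 3.7 -- elevated relative to the others,
# though still below the common rule-of-thumb cutoff of 5
```

The elevated VIFs for Weekend and Promotion are what give "reason to suspect collinearity between some pairs of predictors."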
Question 63
True/False
One of the consequences of collinearity in multiple regression is inflated standard errors in some or all of the estimated slope coefficients.
Question 64
True/False
TABLE 15-8 The superintendent of a school district wanted to predict the percentage of students passing a sixth-grade proficiency test. She obtained the data on percentage of students passing the proficiency test (% Passing), daily average of the percentage of students attending class (% Attendance), average teacher salary in dollars (Salaries), and instructional spending per pupil in dollars (Spending) of 47 schools in the state. Let Y = % Passing as the dependent variable, X₁ = % Attendance, X₂ = Salaries, and X₃ = Spending. The coefficient of multiple determination (R²ⱼ) of each of the 3 predictors with all the other remaining predictors is, respectively, 0.0338, 0.4669, and 0.4743. The output from the best-subset regressions is given below:
                                     Adjusted
Model  Variables     Cp  k  R Square  R Square  Std. Error
1      X1          3.05  2    0.6024    0.5936     10.5787
2      X1X2        3.66  3    0.6145    0.5970     10.5350
3      X1X2X3      4.00  4    0.6288    0.6029     10.4570
4      X1X3        2.00  3    0.6288    0.6119     10.3375
5      X2         67.35  2    0.0474    0.0262     16.3755
6      X2X3       64.30  3    0.0910    0.0497     16.1768
7      X3         62.33  2    0.0907    0.0705     15.9984
Following is the residual plot for % Attendance:
Following is the output of several multiple regression models:
Model (I):
            Coefficients  Std Error   t Stat   p-value  Lower 95%  Upper 95%
Intercept      -753.4225   101.1149  -7.4511  2.88E-09  -957.3401  -549.5050
% Attend          8.5014     1.0771   7.8929  6.73E-10     6.3292    10.6735
Salary          6.85E-07     0.0006   0.0011    0.9991    -0.0013     0.0013
Spending          0.0060     0.0046   1.2879    0.2047    -0.0034     0.0153

Model (II):
              Coefficients  Standard Error   t Stat     p-value
Intercept        -753.4086         99.1451  -7.5991  1.5291E-09
% Attendance        8.5014          1.0645   7.9862   4.223E-10
Spending            0.0060          0.0034   1.7676      0.0840

Model (III):
             df          SS         MS        F  Significance F
Regression    2   8162.9429  4081.4714  39.8708      1.3201E-10
Residual     44   4504.1635   102.3674
Total        46  12667.1064

                      Coefficients  Standard Error   t Stat  p-value
Intercept                6672.8367       3267.7349   2.0420   0.0472
% Attendance             -150.5694         69.9519  -2.1525   0.0369
% Attendance Squared        0.8532          0.3743   2.2792   0.0276
-Referring to Table 15-8, the residual plot suggests that a nonlinear model on % attendance may be a better model.
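The Cp column in the best-subsets output above can be recomputed from the R² values alone. This is a sketch assuming the textbook form Cp = (1 − R²ₖ)(n − T)/(1 − R²_T) − (n − 2k), where n = 47 schools, k counts the parameters (slopes plus intercept) in the candidate model, and T = 4 is the parameter count of the full model with R²_T = 0.6288:

```python
def cp(r2_k, k, r2_full=0.6288, n=47, t_full=4):
    """Mallows' Cp computed from R-squared values:
    Cp = (1 - R2_k)(n - T) / (1 - R2_T) - (n - 2k),
    where k is the number of parameters in the candidate model
    and T the number of parameters in the full model."""
    return (1 - r2_k) * (n - t_full) / (1 - r2_full) - (n - 2 * k)

# Model 4 (X1X3, k = 3) gives Cp = 2.00 = k - 1 + 1... more importantly Cp <= k,
# which is why it is the preferred parsimonious candidate in the table
```

Small discrepancies (e.g. 3.06 vs. the tabulated 3.05 for Model 1) come from the rounding of the R² inputs.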
Question 65
True/False
One of the consequences of collinearity in multiple regression is biased estimates on the slope coefficients.
Question 66
True/False
Only when all three of the hat matrix elements hᵢ, the Studentized deleted residuals tᵢ, and the Cook's distance statistic Dᵢ reveal consistent results should an observation be removed from the regression analysis.
Question 67
True/False
Two simple regression models were used to predict a single dependent variable. Both models were highly significant, but when the two independent variables were placed in the same multiple regression model for the dependent variable, R² did not increase substantially and the parameter estimates for the model were not significantly different from 0. This is probably an example of collinearity.
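The symptom described in this question is easy to reproduce with a small simulation (a sketch using NumPy only; the data-generating process is made up for illustration): two nearly identical predictors are each strongly significant alone, but their individual t statistics collapse when both enter the model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)   # nearly a copy of x1 -> severe collinearity
y = x1 + rng.normal(size=n)           # y truly depends only on x1

def t_stats(X, y):
    """OLS t statistics (intercept prepended); returns beta / se elementwise."""
    X = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta / se

t_simple_x1 = t_stats(x1[:, None], y)[1]              # large |t| alone
t_simple_x2 = t_stats(x2[:, None], y)[1]              # large |t| alone
t_joint = t_stats(np.column_stack([x1, x2]), y)[1:]   # standard errors blow up jointly
```

The joint fit's R² barely improves over either simple model, while the inflated standard errors shrink both slope t statistics toward zero, exactly the pattern the question describes.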
Question 68
True/False
TABLE 15-7 A chemist employed by a pharmaceutical firm has developed a muscle relaxant. She took a sample of 14 people suffering from extreme muscle constriction. She gave each a vial containing a dose (X) of the drug and recorded the time to relief (Y) measured in seconds for each. She fit a "centered" curvilinear model to this data. The results obtained by Microsoft Excel follow, where the dose (X) given has been "centered." SUMMARY OUTPUT
Regression Statistics
Multiple R           0.747
R Square             0.558
Adjusted R Square    0.478
Standard Error       863.1
Observations            14

ANOVA
             df        SS       MS     F  Significance F
Regression    2  10344797  5172399  6.94          0.0110
Residual     11   8193929   744903
Total        13  18538726

             Coeff  Std Error  t Stat  p-value
Intercept   1283.0      352.0    3.65   0.0040
CenDose     25.228      8.631    2.92   0.0140
CenDoseSq   0.8604     0.3722    2.31   0.0410
-Referring to Table 15-7, suppose the chemist decides to use an F test to determine if there is a significant curvilinear relationship between time and dose. If she chooses to use a level of significance of 0.05, she would decide that there is a significant curvilinear relationship.
Question 69
True/False
TABLE 15-3 A certain type of rare gem serves as a status symbol for many of its owners. In theory, demand is high at low prices and decreases as the price of the gem increases. However, experts hypothesize that when the gem is valued at very high prices, the demand increases with price due to the status owners believe they gain in obtaining the gem. Thus, the model proposed to best explain the demand for the gem by its price is the quadratic model:
Y = β₀ + β₁X + β₂X² + ε
where Y = demand (in thousands) and X = retail price per carat. This model was fit to data collected for a sample of 12 rare gems of this type. A portion of the computer analysis obtained from Microsoft Excel is shown below: SUMMARY OUTPUT
Regression Statistics
Multiple R        0.994
R Square          0.988
Standard Error    12.42
Observations         12

ANOVA
             df      SS     MS    F  Significance F
Regression    2  115145  57573  373          0.0001
Residual      9    1388    154
Total        11  116533

              Coeff  Std Error  t Stat  p-value
Intercept    286.42       9.66   29.64   0.0001
Price         -0.31       0.06   -5.14   0.0006
Price Sq   0.000067    0.00007    0.95   0.3647
-Referring to Table 15-3, a more parsimonious simple linear model is likely to be statistically superior to the fitted curvilinear model for predicting demand (Y).
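The key to this question is the t test on the quadratic term: the Price Sq coefficient has t = 0.95, so the squared term contributes nothing significant beyond the linear term. A quick check of the reported p-value (a sketch assuming SciPy; n = 12 observations and 3 parameters give df = 9):

```python
from scipy import stats

# t statistic for the Price Sq (quadratic) term from the Table 15-3 output
t_stat, df = 0.95, 12 - 3

# Two-tailed p-value -- matches the 0.3647 shown in the Excel output
p_value = 2 * stats.t.sf(t_stat, df)
```

Since 0.3647 is far above any conventional significance level, dropping the quadratic term in favor of the simpler linear model is supported.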
Question 70
True/False
Collinearity is present when there is a high degree of correlation between independent variables.
Question 71
True/False
The goals of model building are to find a good model with the fewest independent variables, one that is easier to interpret and has a lower probability of collinearity.
Question 72
True/False
TABLE 15-7 A chemist employed by a pharmaceutical firm has developed a muscle relaxant. She took a sample of 14 people suffering from extreme muscle constriction. She gave each a vial containing a dose (X) of the drug and recorded the time to relief (Y) measured in seconds for each. She fit a "centered" curvilinear model to this data. The results obtained by Microsoft Excel follow, where the dose (X) given has been "centered." SUMMARY OUTPUT
Regression Statistics
Multiple R           0.747
R Square             0.558
Adjusted R Square    0.478
Standard Error       863.1
Observations            14

ANOVA
             df        SS       MS     F  Significance F
Regression    2  10344797  5172399  6.94          0.0110
Residual     11   8193929   744903
Total        13  18538726

             Coeff  Std Error  t Stat  p-value
Intercept   1283.0      352.0    3.65   0.0040
CenDose     25.228      8.631    2.92   0.0140
CenDoseSq   0.8604     0.3722    2.31   0.0410
-Referring to Table 15-7, suppose the chemist decides to use a t test to determine if there is a significant difference between a linear model and a curvilinear model that includes a linear term. If she used a level of significance of 0.02, she would decide that the linear model is sufficient.
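The t test here is the test on the CenDoseSq coefficient, which measures the contribution of the quadratic term over and above the linear term. A quick check of the decision (a sketch assuming SciPy; t statistic and df taken from the Table 15-7 output):

```python
from scipy import stats

# CenDoseSq t statistic from Table 15-7; df = n - 3 = 14 - 3 = 11
t_stat, df, alpha = 2.31, 11, 0.02

# Two-tailed p-value -- matches the 0.0410 shown in the Excel output
p_value = 2 * stats.t.sf(t_stat, df)

# p ~ 0.041 > 0.02, so the quadratic term is not significant at this level
keep_quadratic = p_value < alpha
```

Because 0.0410 exceeds 0.02, the chemist would conclude the linear model is sufficient at that significance level.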
Question 73
True/False
In stepwise regression, an independent variable is not allowed to be removed from the model once it has entered into the model.
Question 74
True/False
TABLE 15-8 The superintendent of a school district wanted to predict the percentage of students passing a sixth-grade proficiency test. She obtained the data on percentage of students passing the proficiency test (% Passing), daily average of the percentage of students attending class (% Attendance), average teacher salary in dollars (Salaries), and instructional spending per pupil in dollars (Spending) of 47 schools in the state. Let Y = % Passing as the dependent variable, X₁ = % Attendance, X₂ = Salaries, and X₃ = Spending. The coefficient of multiple determination (R²ⱼ) of each of the 3 predictors with all the other remaining predictors is, respectively, 0.0338, 0.4669, and 0.4743. The output from the best-subset regressions is given below:
                                     Adjusted
Model  Variables     Cp  k  R Square  R Square  Std. Error
1      X1          3.05  2    0.6024    0.5936     10.5787
2      X1X2        3.66  3    0.6145    0.5970     10.5350
3      X1X2X3      4.00  4    0.6288    0.6029     10.4570
4      X1X3        2.00  3    0.6288    0.6119     10.3375
5      X2         67.35  2    0.0474    0.0262     16.3755
6      X2X3       64.30  3    0.0910    0.0497     16.1768
7      X3         62.33  2    0.0907    0.0705     15.9984
Following is the residual plot for % Attendance:
Following is the output of several multiple regression models:
Model (I):
            Coefficients  Std Error   t Stat   p-value  Lower 95%  Upper 95%
Intercept      -753.4225   101.1149  -7.4511  2.88E-09  -957.3401  -549.5050
% Attend          8.5014     1.0771   7.8929  6.73E-10     6.3292    10.6735
Salary          6.85E-07     0.0006   0.0011    0.9991    -0.0013     0.0013
Spending          0.0060     0.0046   1.2879    0.2047    -0.0034     0.0153

Model (II):
              Coefficients  Standard Error   t Stat     p-value
Intercept        -753.4086         99.1451  -7.5991  1.5291E-09
% Attendance        8.5014          1.0645   7.9862   4.223E-10
Spending            0.0060          0.0034   1.7676      0.0840

Model (III):
             df          SS         MS        F  Significance F
Regression    2   8162.9429  4081.4714  39.8708      1.3201E-10
Residual     44   4504.1635   102.3674
Total        46  12667.1064

                      Coefficients  Standard Error   t Stat  p-value
Intercept                6672.8367       3267.7349   2.0420   0.0472
% Attendance             -150.5694         69.9519  -2.1525   0.0369
% Attendance Squared        0.8532          0.3743   2.2792   0.0276
-Referring to Table 15-8, there is reason to suspect collinearity between some pairs of predictors.
Question 75
True/False
Collinearity is present when there is a high degree of correlation between the dependent variable and any of the independent variables.
Question 76
True/False
Collinearity is present if the dependent variable is linearly related to one of the explanatory variables.
Question 77
True/False
A high value of R², significantly above 0, in multiple regression accompanied by insignificant t-values on all parameter estimates very often indicates a high correlation between independent variables in the model.