Question 1

Complete the missing information for this table (Y is a dichotomous variable).

\begin{array}{ccc}\hline P(Y=1) & P(Y=0) & \text{Odds}(Y=1) \\\hline 0.10 & & \\0.25 & & \\0.40 & & \\0.20 & & \\0.90 & & \\0.75 & & \\0.60 & & \\\hline\end{array}

Accepted Answer

P(Y = 0) = 1 - P(Y = 1), and Odds(Y = 1) = P(Y = 1)/P(Y = 0).
The complete table can be obtained as follows.

\begin{array}{ccc}\hline P(Y=1) & 1-P(Y=1) & \text{Odds}(Y=1) \\\hline 0.10 & 0.90 & 0.11 \\0.25 & 0.75 & 0.33 \\0.40 & 0.60 & 0.67 \\0.20 & 0.80 & 0.25 \\0.90 & 0.10 & 9.00 \\0.75 & 0.25 & 3.00 \\0.60 & 0.40 & 1.50 \\\hline\end{array}
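
The two formulas in this answer can be checked with a short script. This is a minimal sketch in Python; the function name `odds` and the variable names are illustrative choices, not part of the original exercise.

```python
def odds(p):
    """Return the odds P(Y=1) / P(Y=0) for a probability p strictly between 0 and 1."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    return p / (1 - p)


# Probabilities of Y = 1 taken from the first column of the table.
probs = [0.10, 0.25, 0.40, 0.20, 0.90, 0.75, 0.60]

# Each row holds P(Y=1), P(Y=0), and Odds(Y=1), rounded to two decimals.
rows = [(p, round(1 - p, 2), round(odds(p), 2)) for p in probs]

for p1, p0, o in rows:
    print(f"{p1:.2f}  {p0:.2f}  {o:.2f}")
```

Note that odds are undefined at p = 1 and equal 0 at p = 0, which is why the sketch rejects both endpoints.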

Question 2

You are given the following data, where X₁ (high school cumulative GPA) and X₂ (having repeated grade; 0 = never repeated any grade and 1 = have repeated at least one grade; use 0 as the reference category) are used to predict Y (dropping out of high school, "1," vs. graduating high school, "0").
(α = .05)

\begin{array}{ccc}\hline X_{\mathbf{1}} & X_{\mathbf{2}} & \boldsymbol{Y} \\\hline 2.50 & 1 & 0 \\2.60 & 0 & 0 \\2.75 & 0 & 0 \\1.33 & 1 & 1 \\3.00 & 1 & 0 \\3.42 & 0 & 0 \\2.70 & 1 & 1 \\2.33 & 1 & 1 \\1.75 & 0 & 1 \\2.80 & 0 & 0 \\\hline\end{array}

Determine the following values based on simultaneous entry of the independent variables: -2LL, constant, b₁, b₂, se(b₁), se(b₂), odds ratios, Wald₁, Wald₂.

Accepted Answer

-2LL = 5.048;
b₁ (GPA) = -6.617, b₂ (Repeat) = 3.204, constant = 14.123;
se(b₁) (GPA) = 4.308, se(b₂) (Repeat) = 3.387;
odds ratio₁ (GPA) = .001, odds ratio₂ (Repeat) = 24.631;
Wald₁ (GPA) = 2.359, Wald₂ (Repeat) = .895.

Procedure:
Create a data set with three variables: GPA (X₁), Repeat (X₂), and Dropout (Y). The data set should have 10 cases.

Step 1: Go to Analyze → Regression → Binary Logistic.

Step 2: Select Dropout to the Dependent box. Select GPA and Repeat to the Covariate(s) list.

Step 3: Click Categorical. Move Repeat into the Categorical Covariates box. Select Reference Category: First. Click Change. Click Continue.

Step 4: Click Save. Check Probabilities and Group membership under Predicted Values. Check Standardized under Residuals. Check Cook's, Leverage values, and DfBeta(s) under Influences. Click Continue.

Step 5: Click Options. Check Classification plots, Hosmer-Lemeshow goodness-of-fit, Casewise listing of residuals, and CI for exp(B) under Statistics and Plots. Click Continue. Click OK.
Selected SPSS output:

\begin{array}{l}\text { Model Summary }\\\begin{array}{cccc}\hline \text { Step } & -2 \text { Log likelihood } & \text { Cox \& Snell R Square } & \text { Nagelkerke R Square } \\\hline 1 & 5.048^{\mathbf{a}} & .569 & .769 \\\hline\end{array}\end{array}

\begin{array}{l}\text { Hosmer and Lemeshow Test }\\\begin{array}{cccc}\hline \text { Step } & \text { Chi-square } & d f & \text { Sig. } \\\hline 1 & 4.288 & 8 & .830 \\\hline\end{array}\end{array}

\begin{array}{l}\text { Variables in the Equation }\\\begin{array}{llcccccccc}\hline & & & & & & & & \text{95\% C.I. for } \operatorname{Exp}(B) & \\ & & \text{B} & \text{S.E.} & \text{Wald} & df & \text{Sig.} & \operatorname{Exp}(B) & \text{Lower} & \text{Upper} \\\hline \text{Step } 1^{a} & \text{GPA} & -6.617 & \mathbf{4.308} & \mathbf{2.359} & 1 & .125 & \mathbf{.001} & .000 & 6.209 \\ & \text{Repeat(1)} & \mathbf{3.204} & \mathbf{3.387} & \mathbf{.895} & 1 & .344 & \mathbf{24.631} & .032 & 18817.875 \\ & \text{Constant} & \mathbf{14.123} & 9.936 & 2.020 & 1 & .155 & 1359604.313 & & \\\hline\end{array}\end{array}

\text { a. Variable(s) entered on step 1: GPA, Repeat. }
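
The Wald statistics and odds ratios in the output follow from the coefficients and standard errors: Wald = (b / se)² and odds ratio = e^b. The sketch below recomputes them from the rounded values reported in the table; it does not refit the model, and the dictionary and function names are illustrative assumptions. Because the inputs are rounded to three decimals, very large values such as Exp(B) for the constant will only roughly match the SPSS figure.

```python
import math

# (b, se) pairs copied from the "Variables in the Equation" table.
estimates = {
    "GPA": (-6.617, 4.308),
    "Repeat(1)": (3.204, 3.387),
    "Constant": (14.123, 9.936),
}


def wald(b, se):
    """Wald chi-square statistic for a single coefficient (1 df)."""
    return (b / se) ** 2


def odds_ratio(b):
    """Exp(B): multiplicative change in the odds per one-unit increase in the predictor."""
    return math.exp(b)


for name, (b, se) in estimates.items():
    print(f"{name:10s} Wald = {wald(b, se):.3f}  Exp(B) = {odds_ratio(b):.3f}")
```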

Question 3

Complete the missing information for Table 1, using 0.50 as the cut value. Then complete the classification table (Table 2). Compute sensitivity, specificity, false positive rate, and false negative rate.

\begin{array}{l}\text { Table } 1 .\\\begin{array}{ccc}\hline \begin{array}{c}\text { Observed group } \\\text { membership }\end{array} & \begin{array}{c}\text { Predicted } \\\text { Probability }\end{array} & \begin{array}{c}\text { Predicted group } \\\text { membership }\end{array} \\\hline 1 & 0.88 & \\1 & 0.72 & \\0 & 0.62 & \\1 & 0.49 & \\0 & 0.34 & \\1 & 0.40 & \\1 & 0.60 & \\0 & 0.21 & \\0 & 0.05 & \\1 & 0.57 & \\\hline\end{array}\end{array}

\begin{array}{l}\text { Table } 2 .\\\begin{array}{cccc}\hline & &&\underline { \text { Predicted }} \\& & .00 & &1.00 \\\hline{\text { Observed }} & .00 & & \\& 1.00 & & \\\hline\end{array}\end{array}

Accepted Answer

Assuming 0.5 is the cut value, cases with predicted probabilities at .5 or above are predicted as 1 and predicted probabilities below .5 are predicted as 0. There are four cases with observed value 1 and predicted value 1. There are three cases with observed value 0 and predicted value 0. There is one case with observed value 0 yet predicted value 1 (false positive).
There are two cases with observed value 1 yet predicted value 0 (false negative).

Sensitivity = 4/(2+4) = 0.67 = 67%
Specificity = 3/(3+1) = 0.75 = 75%
False positive rate = 1/(3+1) = 0.25 = 25%
False negative rate = 2/(2+4) = 0.33 = 33%

\begin{array}{l}\text { Table } 1 .\\\begin{array}{ccc}\hline \begin{array}{c}\text { Observed group } \\\text { membership }\end{array} & \begin{array}{c}\text { Predicted } \\\text { Probability }\end{array} & \begin{array}{c}\text { Predicted group } \\\text { membership }\end{array} \\\hline 1 & 0.88 & 1 \\1 & 0.72 & 1 \\0 & 0.62 & 1 \\1 & 0.49 & 0 \\0 & 0.34 & 0 \\1 & 0.40 & 0 \\1 & 0.60 & 1 \\0 & 0.21 & 0 \\0 & 0.05 & 0 \\1 & 0.57 & 1 \\\hline\end{array}\end{array}

\begin{array}{l}\text { Table } 2 .\\\begin{array}{cccc}\hline & &&\underline { \text { Predicted }} \\& & .00 & &1.00 \\\hline{\text { Observed }} & .00 & 3&& 1\\& 1.00 &2& & 4\\\hline\end{array}\end{array}
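
The classification steps above can be sketched in a few lines of Python. The data are copied from Table 1; the variable names and the convention that a probability exactly at the cut value is classified as 1 follow the answer text, and everything else is an illustrative assumption.

```python
observed = [1, 1, 0, 1, 0, 1, 1, 0, 0, 1]
probabilities = [0.88, 0.72, 0.62, 0.49, 0.34, 0.40, 0.60, 0.21, 0.05, 0.57]
CUT = 0.50

# Predicted group membership: 1 if the probability is at or above the cut value.
predicted = [1 if p >= CUT else 0 for p in probabilities]

pairs = list(zip(observed, predicted))
tp = sum(1 for o, p in pairs if o == 1 and p == 1)  # observed 1, predicted 1
tn = sum(1 for o, p in pairs if o == 0 and p == 0)  # observed 0, predicted 0
fp = sum(1 for o, p in pairs if o == 0 and p == 1)  # observed 0, predicted 1
fn = sum(1 for o, p in pairs if o == 1 and p == 0)  # observed 1, predicted 0

sensitivity = tp / (tp + fn)          # share of observed 1s predicted as 1
specificity = tn / (tn + fp)          # share of observed 0s predicted as 0
false_positive_rate = fp / (fp + tn)  # same denominator as specificity
false_negative_rate = fn / (fn + tp)  # same denominator as sensitivity

print(predicted)
print(tn, fp, fn, tp)
print(sensitivity, specificity, false_positive_rate, false_negative_rate)
```

The rates here use the observed group sizes as denominators, matching the computations in this answer; some texts define false positive and false negative rates with the predicted group sizes instead.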

Question 4

You are given the following data, where X₁ (high school cumulative GPA) and X₂ (having repeated grade; 0 = never repeated any grade and 1 = have repeated at least one grade; use 0 as the reference category) are used to predict Y (dropping out of high school, "1," vs. graduating high school, "0").
(α = .05)

\begin{array}{ccc}\hline X_{\mathbf{1}} & X_{\mathbf{2}} & \boldsymbol{Y} \\\hline 2.50 & 1 & 0 \\2.60 & 0 & 0 \\2.75 & 0 & 0 \\1.33 & 1 & 1 \\3.00 & 1 & 0 \\3.42 & 0 & 0 \\2.70 & 1 & 1 \\2.33 & 1 & 1 \\1.75 & 0 & 1 \\2.80 & 0 & 0 \\\hline\end{array}

Determine the following values based on simultaneous entry of the independent variables: -2LL, constant, b₁, b₂, se(b₁), se(b₂), odds ratios, Wald₁, Wald₂.

Accepted Answer

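This question repeats the data and task of Question 2, so the same values apply: -2LL = 5.048; b₁ (GPA) = -6.617, b₂ (Repeat) = 3.204, constant = 14.123; se(b₁) = 4.308, se(b₂) = 3.387; odds ratio₁ (GPA) = .001, odds ratio₂ (Repeat) = 24.631; Wald₁ = 2.359, Wald₂ = .895.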

Question 5

Professor Pruefung wanted to examine if performance in quizzes can predict whether a student will pass or fail the final exam. The independent variables are scores in two pop quizzes (Quiz1, Quiz2), and the dependent variable is a dichotomous variable (pass = 1 vs. fail = 0). Below is part of the output of the analysis.

a. Professor Pruefung assumed that the better a student performed in the quizzes (a higher score indicates better performance), the higher the odds that he/she will pass the final exam. If that is the case, what are the expected signs for b₁ and b₂? Do the results confirm the expectation?
b. Based on the tables, is there any indication of an assumption violation? If so, which assumption(s) has (have) been violated?
c. What are the possible consequences of the assumption violation?

\begin{array}{l}\text { Omnibus Tests of Model Coefficients }\\\begin{array}{ccccc}\hline & & \text { Chi-square } & d f & \text { Sig. } \\\hline \text { Step 1 } & \text { Step } & 24.055 & 2 & .000 \\ & \text { Block } & 24.055 & 2 & .000 \\ & \text { Model } & 24.055 & 2 & .000 \\\hline\end{array}\end{array}

\begin{array}{l}\text { Model Summary }\\\begin{array}{cccc}\hline \text { Step } & -2 \text { Log likelihood } & \text { Cox \& Snell R Square } & \text { Nagelkerke R Square } \\\hline 1 & 22.998 & .452 & .653 \\\hline\end{array}\end{array}

\begin{array}{l}\text { Variables in the Equation }\\\begin{array}{llcccccc}\hline & & \text { B } & \text { S.E. } & \text { Wald } & d f & \text { Sig. } & \operatorname{Exp}(B) \\\hline \text { Step 1 } & \text { Quiz1 } & 1.557 & 1.064 & 2.140 & 1 & .143 & 4.745 \\ & \text { Quiz2 } & -.535 & 1.023 & .273 & 1 & .601 & .586 \\ & \text { Constant } & -21.721 & 8.990 & 5.838 & 1 & .016 & .000 \\\hline\end{array}\end{array}

Accepted Answer

The answer of Professor Pruefung wanted to examine if performance...

Question 6

Which one of the following can be used as an appropriate dependent variable for binary logistic regression?

A) Dichotomous variable
B) Multinomial variable
C) Continuous variable
D) None of the above
E) All of the above

Accepted Answer

(The outcome for binary logistic regression must have only two categories.)

Question 7

A study was conducted to investigate variables associated with dropping out of high school.
The following logistic regression model was obtained:
Logit(Y_i) = 3.5 - 1.3X₁ + 2.3X₂.

Y: 1 = dropped out of high school; 0 = did not drop out of high school;
X₁: cumulative high school GPA obtained;
X₂: 1 = retained in at least one grade; 0 = never retained in any grade.

-If Mindy has a high school GPA of 3, and has never repeated a grade, which of the following predictions can be derived from the model?

A) Mindy has more than 50% probability of dropping out of high school.
B) Mindy has less than 50% probability of dropping out of high school.
C) Mindy has exactly 50% probability of dropping out of high school.
D) Mindy will drop out of high school.
E) Mindy will not drop out of high school.

Accepted Answer

(Logistic regression predicts the odds that a unit of analysis belongs to one of two groups defined by the dependent variable. In this case, it predicts the odds that a student belongs to the group "drop out of high school.")

Question 8

A study was conducted to investigate variables associated with dropping out of high school.
The following logistic regression model was obtained:
Logit(Y_i) = 3.5 - 1.3X₁ + 2.3X₂.

Y: 1 = dropped out of high school; 0 = did not drop out of high school;
X₁: cumulative high school GPA obtained;
X₂: 1 = retained in at least one grade; 0 = never retained in any grade.

-If Mindy has a high school GPA of 3, and has never repeated a grade, which of the following predictions can be derived from the model?

A) Mindy has more than 50% probability of dropping out of high school.
B) Mindy has less than 50% probability of dropping out of high school.
C) Mindy has exactly 50% probability of dropping out of high school.
D) Mindy will drop out of high school.
E) Mindy will not drop out of high school.

Accepted Answer

(The regression coefficient for X₂ is positive, indicating that if a student has been retained in at least one grade (X₂ = 1), the log odds of dropping out of high school will increase.)

Question 9

A study was conducted to investigate variables associated with dropping out of high school.
The following logistic regression model was obtained:
Logit(Y_i) = 3.5 - 1.3X₁ + 2.3X₂.

Y: 1 = dropped out of high school; 0 = did not drop out of high school;
X₁: cumulative high school GPA obtained;
X₂: 1 = retained in at least one grade; 0 = never retained in any grade.

-If Mindy has a high school GPA of 3, and has never repeated a grade, which of the following predictions can be derived from the model?

A) Mindy has more than 50% probability of dropping out of high school.
B) Mindy has less than 50% probability of dropping out of high school.
C) Mindy has exactly 50% probability of dropping out of high school.
D) Mindy will drop out of high school.
E) Mindy will not drop out of high school.

Accepted Answer

(For Mindy, X₁ = 3 and X₂ = 0, so logit(Y) = 3.5 - 1.3(3) + 2.3(0) = -0.4 < 0. That means the odds of Mindy dropping out of high school are smaller than 1, so Mindy has less than 50% probability of dropping out.)
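
The step from a negative logit to a probability below 50% can be made concrete. The sketch below plugs Mindy's values into the model equation stated in the question (intercept 3.5, slopes -1.3 and 2.3) and converts the logit to odds and to a probability; the function names are illustrative assumptions.

```python
import math


def logit_dropout(gpa, retained):
    """Log odds of dropping out, from Logit(Y) = 3.5 - 1.3*X1 + 2.3*X2."""
    return 3.5 - 1.3 * gpa + 2.3 * retained


def probability_from_logit(z):
    """Inverse logit: p = 1 / (1 + e^(-z))."""
    return 1 / (1 + math.exp(-z))


z = logit_dropout(gpa=3, retained=0)   # Mindy: GPA of 3, never retained
odds = math.exp(z)                     # odds of dropping out
p = probability_from_logit(z)          # probability of dropping out

print(f"logit = {z:.2f}, odds = {odds:.3f}, probability = {p:.3f}")
```

A logit of 0 corresponds to odds of 1 and a probability of exactly 50%, so the sign of the logit alone is enough to choose between the "more than" and "less than" 50% options.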

Question 10

Aaron is studying smoking behavior and has coded "smoker" as "1" and "nonsmoker" as "0." The predictor is the number of family members who smoke. Which of the following is a correct interpretation of an odds ratio of +2?

A) For every additional family member who smokes, the odds of being a smoker increase by 100%.
B) For every additional family member who smokes, the odds of being a smoker decrease by 100%.
C) For every one-unit increase in being a smoker, the odds of having a family member who smokes increase by 100%.
D) For every one-unit increase in being a smoker, the odds of having a family member who smokes decrease by 100%.

Accepted Answer

(The number of family members who smoke is the independent variable. An odds ratio of 2 is greater than 1, which indicates that the odds of being a smoker double, an increase of 100%, for each one-unit increase in the independent variable.)
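
The "100%" figure comes from converting an odds ratio to a percent change in the odds: (odds ratio - 1) × 100. A minimal sketch, with a hypothetical helper name chosen for illustration:

```python
def percent_change_in_odds(odds_ratio, units=1):
    """Percent change in the odds for a `units`-unit increase in the predictor.

    The odds are multiplied by odds_ratio for each one-unit increase,
    so a k-unit increase multiplies them by odds_ratio ** k.
    """
    return (odds_ratio ** units - 1) * 100


print(percent_change_in_odds(2))       # one additional family member who smokes
print(percent_change_in_odds(2, 2))    # two additional family members who smoke
print(percent_change_in_odds(0.5))     # an odds ratio below 1 gives a decrease
```

An odds ratio of exactly 1 gives a change of 0%, meaning the predictor is unrelated to the odds.
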
In the smoking study, Aaron has obtained the following classification table.

Question 11

What is the false positive rate?

A) 20%
B) 28%
C) 48%
D) 52%

Accepted Answer

The answer of What is the false positive rate? A)...

Question 12

What is the false negative rate?

A) 20%
B) 28%
C) 48%
D) 52%

Accepted Answer

The answer of What is the false negative rate? A)...

Question 13

If a person is predicted to be a smoker, we would expect that

A) the person has a good chance of actually being a smoker, because the model has an adequate overall predictive accuracy.
B) the person has a good chance of actually being a smoker, because the model has high specificity.
C) the person may or may not be a smoker, because the model has low sensitivity.
D) the person may or may not be a smoker, because the false positive rate is high.

Accepted Answer

The answer of If a person is predicted to be...

Question 14

The odds ratio is computed by which of the following?

A) bᵉ
B) b⁻¹/²
C) eᵇᵏ
D) eʳᵇ

Accepted Answer

The answer of The odds ratio is computed by which...

Question 15

Which one of the following can occur when the number of variables equals, or nearly equals, the number of cases in the data?

A) Extremely small regression coefficients and standard errors.
B) The dependent variable is perfectly predicted.
C) The maximum likelihood estimator reduces to zero.
D) The outcome is constant for one or more categories of a nominal independent variable.

Accepted Answer

The answer of Which one of the following can occur...

Deck 19: Logistic Regression