Question 1

Which of the following is the relevant sampling distribution for regression coefficients?

A) normal distribution
B) t-distribution with n-1 degrees of freedom
C) t-distribution with n-1-k degrees of freedom
D) F-distribution with n-1-k degrees of freedom

Accepted Answer

The sampling distribution for regression coefficients follows a t-distribution with n-1-k degrees of freedom, where n is the sample size and k is the number of predictors.
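As a minimal sketch of where these degrees of freedom enter, the snippet below fits a simple regression (k = 1) by hand and forms the t-value for the slope. The function name `slope_t_statistic` and the data are hypothetical, invented only for illustration.

```python
import math

def slope_t_statistic(x, y):
    """Return (slope, standard error of slope, t-value, degrees of freedom)
    for a simple linear regression of y on x (k = 1 predictor)."""
    n = len(x)
    k = 1
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Sum of squared residuals around the fitted line.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - k - 1                      # degrees of freedom of the t-distribution
    s_e = math.sqrt(sse / df)           # standard error of estimate
    se_slope = s_e / math.sqrt(sxx)     # standard error of the slope
    return slope, se_slope, slope / se_slope, df

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
slope, se_slope, t_value, df = slope_t_statistic(x, y)
print(f"slope={slope:.3f}, se={se_slope:.3f}, t={t_value:.2f}, df={df}")
```

The t-value would then be compared with a t-distribution on `df` degrees of freedom, here n - k - 1 = 6 - 1 - 1 = 4.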

Question 2

Suppose you run a regression of a person's height on his/her right and left foot sizes, and you suspect that there may be multicollinearity between the foot sizes. What types of problems might you see if your suspicions are true?

A) "wrong" values for the coefficients for the left and right foot size
B) large p-values for the coefficients for the left and right foot size
C) small t-values for the coefficients for the left and right foot size
D) all of these choices

Accepted Answer

Multicollinearity can lead to inflated standard errors, which result in large p-values and small t-values. It can also lead to "wrong" coefficient values, as the model may have difficulty distinguishing the independent effects of each variable. Therefore, all of the choices are possible problems when there is multicollinearity.
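One common way to quantify this inflation is the variance inflation factor (VIF), which the question itself does not mention. The sketch below assumes exactly two predictors, in which case VIF = 1 / (1 - r²) with r the correlation between them. The helper names and the foot-size values are hypothetical.

```python
import math

def pearson_r(a, b):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)

def vif_two_predictors(x1, x2):
    """Variance inflation factor when the model has exactly two predictors.
    Undefined (division by zero) if the predictors are perfectly correlated."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical left and right foot sizes (cm), invented for illustration.
left_foot = [25.0, 26.1, 27.3, 24.8, 28.0, 26.5]
right_foot = [25.2, 26.0, 27.5, 24.9, 27.9, 26.6]
vif = vif_two_predictors(left_foot, right_foot)
print(f"correlation={pearson_r(left_foot, right_foot):.4f}, VIF={vif:.1f}")
```

A VIF far above 1 means the variance of each coefficient estimate is inflated by roughly that factor relative to uncorrelated predictors, which is what drives the small t-values and large p-values.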

Question 3

The term autocorrelation refers to the observation that:

A) analyzed data refers to itself
B) sample is related too closely to the population
C) data are in a loop (values repeat themselves)
D) time series variables are usually related to their own past values

Accepted Answer

Autocorrelation refers to the relationship between a time series variable and its own past values, indicating that the observations in the time series are not independent of one another. It can be measured with the autocorrelation function (the correlation between the series and a lagged copy of itself) or the partial autocorrelation function. Options A, B, and C do not accurately describe the concept of autocorrelation.
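A rough sketch of how a lagged autocorrelation can be computed is shown below. It uses the usual sample estimator (lagged cross-products divided by the total sum of squared deviations); the function name and both series are hypothetical.

```python
def lag_autocorrelation(series, lag=1):
    """Sample autocorrelation of a series with itself shifted by `lag` periods."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    numer = sum((series[t] - mean) * (series[t - lag] - mean)
                for t in range(lag, n))
    return numer / denom

# Hypothetical series, invented for illustration only.
trending = [1, 2, 3, 4, 5, 6, 7, 8]      # each value is close to the previous one
alternating = [1, -1, 1, -1, 1, -1]      # each value flips sign

r_trend = lag_autocorrelation(trending, lag=1)
r_alt = lag_autocorrelation(alternating, lag=1)
print(f"trending: {r_trend:.3f}, alternating: {r_alt:.3f}")
```

A positive value at lag 1 means neighboring observations tend to move together; a negative value means they tend to alternate.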

Question 4

In regression analysis, the ANOVA table analyzes:

A) the variation of the response variable Y
B) the variation of the explanatory variable X
C) the total variation of all variables
D) all of these choices

Accepted Answer

The ANOVA table in regression analysis is used to analyze the variation of the response variable Y by partitioning it into components due to the regression model and due to random error.
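The partition can be checked numerically. The sketch below fits a simple least-squares line by hand and computes the total (SST), explained (SSR), and unexplained (SSE) sums of squares; the function name and data are hypothetical.

```python
def anova_partition(x, y):
    """Return (SST, SSR, SSE) for a simple least-squares regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)                 # total variation of Y
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)            # explained by the regression
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # left in the residuals
    return sst, ssr, sse

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [3.0, 4.5, 4.0, 6.5, 7.0, 9.0]
sst, ssr, sse = anova_partition(x, y)
print(f"SST={sst:.3f}, SSR={ssr:.3f}, SSE={sse:.3f}, SSR+SSE={ssr + sse:.3f}")
```

For a least-squares fit with an intercept, SSR + SSE equals SST up to floating-point rounding, so only the variation in Y is being split.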

Question 5

The value k in the number of degrees of freedom, n-k-1, for the sampling distribution of the regression coefficients represents the:

A) sample size
B) population size
C) number of coefficients in the regression equation, including the constant
D) number of independent variables included in the equation

Accepted Answer

The value k represents the number of independent variables included in the regression equation.

Question 6

In the standardized value   , the symbol   represents the:

A) mean of
B) variance of
C) standard error of
D) degrees of freedom of

Accepted Answer

The answer of In the standardized value   ,...

Question 7

The appropriate hypothesis test for an ANOVA test is:

A)
B)
C)
D)

Accepted Answer

The appropriate hypothesis test for the ANOVA table in regression is the F-test. Its null hypothesis is that all of the coefficients of the explanatory variables are zero, so that the equation has no explanatory power; the alternative is that at least one of them is nonzero. Choice B represents the F-test.

Question 8

The appropriate hypothesis test for a regression coefficient is:

A)
B)
C)
D) none of these choices

Accepted Answer

The appropriate hypothesis test for a regression coefficient is a t-test. Choice B corresponds to a t-test for a regression coefficient.

Question 9

There is evidence that the regression equation provides little explanatory power when the F-ratio:

A) is large
B) equals the regression coefficient
C) is small
D) is the constant

Accepted Answer

The F-ratio measures the overall fit of the regression equation and compares the variability explained by the model to the residual variability. When the F-ratio is small, it indicates that the model is not providing much explanatory power and that the residual variability is relatively large. Therefore, choice C is the best answer.
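To make the comparison concrete, the sketch below computes the F-ratio as the mean square for regression, SSR / k, divided by the mean square error, SSE / (n - k - 1). The function name and the sums of squares are hypothetical values chosen only to contrast a strong fit with a weak one.

```python
def f_ratio(sst, ssr, n, k):
    """F-ratio from the ANOVA table: (SSR / k) / (SSE / (n - k - 1))."""
    sse = sst - ssr              # unexplained variation
    msr = ssr / k                # mean square due to regression
    mse = sse / (n - k - 1)      # mean square error
    return msr / mse

# Hypothetical ANOVA inputs: same SST, n, and k, but different explained variation.
strong = f_ratio(sst=500.0, ssr=400.0, n=30, k=3)   # most variation explained
weak = f_ratio(sst=500.0, ssr=20.0, n=30, k=3)      # little variation explained
print(f"strong fit F={strong:.2f}, weak fit F={weak:.2f}")
```

With everything else held fixed, the less variation the equation explains, the smaller the F-ratio becomes.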

Question 10

The ANOVA table splits the total variation into two parts. They are the:

A) acceptable and unacceptable variation
B) adequate and inadequate variation
C) resolved and unresolved variation
D) explained and unexplained variation

Accepted Answer

The ANOVA table splits the total variation into two parts: explained and unexplained variation. In regression, the explained variation is the part accounted for by the explanatory variables in the regression equation, and the unexplained variation is the part left in the residuals, due to random error or to factors not included in the model.

Question 11

Which definition best describes parsimony?

A) explaining the most with the least
B) explaining the least with the most
C) being able to explain all of the change in the response variable
D) being able to predict the value of the response variable far into the future

Accepted Answer

The answer of Which definition best describes parsimony? A) explaining the...

Question 12

An error term represents the vertical distance from any point to the:

A) estimated regression line
B) population regression line
C) value of the Y's
D) mean value of the X's

Accepted Answer

The answer of An error term represents the vertical distance...

Question 13

A scatterplot that exhibits a "fan" shape (the variation of Y increases as X increases) is an example of:

A) homoscedasticity
B) heteroscedasticity
C) autocorrelation
D) multicollinearity

Accepted Answer

The answer of A scatterplot that exhibits a "fan" shape...

Question 14

In regression analysis, multicollinearity refers to the:

A) response variables being highly correlated
B) explanatory variables being highly correlated
C) response variable(s) and the explanatory variable(s) being highly correlated with one another
D) response variables being highly correlated over time

Accepted Answer

The answer of In regression analysis, multicollinearity refers to the: A)...

Question 15

Which of the following is not one of the assumptions of regression?

A) There is a population regression line that joins the SDs of all possible distributions of results.
B) The response variable is normally distributed.
C) The standard deviation of the response variable increases as the explanatory variables increase.
D) The errors are probabilistically independent.

Accepted Answer

The answer of Which of the following is not one...

Question 16

Time series data often exhibits which of the following characteristics?

A) homoscedasticity
B) heteroscedasticity
C) autocorrelation
D) multicollinearity

Accepted Answer

The answer of Time series data often exhibits which of...

Question 17

Another term for constant error variance is:

A) homoscedasticity
B) heteroscedasticity
C) autocorrelation
D) multicollinearity

Accepted Answer

The answer of Another term for constant error variance is: A)...

Question 18

What is not one of the guidelines for including/excluding variables in a regression equation?

A) Look at the t-value and associated p-value.
B) Check whether the t-value is less than or greater than 1.0.
C) The variables are logically related to one another.
D) Use economic or physical theory to make the decision.
E) All of these choices are guidelines.

Accepted Answer

The answer of What is not one of the guidelines...

Question 19

Which statement is true regarding regression error, ε?

A) It is the same as a residual.
B) It can be calculated from the observed data.
C) It cannot be calculated from the observed data.
D) It is unbiased.

Accepted Answer

The answer of Which statement is true regarding regression error,...

Question 20

The t-value for testing   is calculated using which of the following equations?

A) n - k - 1
B)
C)
D)

Accepted Answer

The answer of The t-value for testing   is...

Question 21

The objective typically used in the three types of equation-building procedures is to:

A) find the equation with a small s_e
B) find the equation with a large R²
C) find the equation with a small s_e and a large R²
D) find the equation with the smallest F-ratio

Accepted Answer

The answer of The objective typically used in the tree...

Question 22

Which of the following would be considered a definition of an outlier?

A) an extreme value for one or more variables
B) a value whose residual is abnormally large in magnitude
C) values for individual explanatory variables that fall outside the general pattern of the other observations
D) all of these choices

Accepted Answer

The answer of Which of the following would be considered...

Question 23

Determining which variables to include in regression analysis by estimating a series of regression equations by successively adding or deleting variables according to prescribed rules is referred to as:

A) elimination regression
B) forward regression
C) backward regression
D) stepwise regression

Accepted Answer

The answer of Determining which variables to include in regression...

Question 24

Which approach can be used to test for autocorrelation?

A) regression coefficient
B) correlation coefficient
C) Durbin-Watson statistic
D) F-test or t-test

Accepted Answer

The answer of Which approach can be used to test...

Question 25

When the error variance is nonconstant, it is common to see the variation increase as the explanatory variable increases (you will see a "fan shape" in the scatterplot). There are two ways you can deal with this phenomenon. These are:

A) the weighted least squares and a logarithmic transformation
B) the partial F and a logarithmic transformation
C) the weighted least squares and the partial F
D) stepwise regression and the partial F

Accepted Answer

The answer of When the error variance is nonconstant, it...

Question 26

Suppose you forecast the values of all of the independent variables and insert them into a multiple regression equation and obtain a point prediction for the dependent variable. You could then use the standard error of the estimate to obtain an approximate:

A) confidence interval
B) prediction interval
C) hypothesis test
D) independence test

Accepted Answer

The answer of Suppose you forecast the values of all...

Question 27

In regression analysis, homoscedasticity refers to constant error variance.

Accepted Answer

The answer of In regression analysis, homoscedasticity refers to constant...

Question 28

In regression analysis, extrapolation is performed when you:

A) attempt to predict beyond the limits of the sample
B) have to estimate some of the explanatory variable values
C) have to use a lag variable as an explanatory variable in the model
D) do not have observations for every period in the sample

Accepted Answer

The answer of In regression analysis, extrapolation is performed when...

Question 29

A point that "tilts" the regression line toward it is referred to as a(n):

A) magnetic point
B) influential point
C) extreme point
D) explanatory point

Accepted Answer

The answer of A point that "tilts" the regression line...

Question 30

The assumptions of regression are: 1) there is a population regression line, 2) the dependent variable is normally distributed, 3) the standard deviation of the response variable remains constant as the explanatory variables increase, and 4) the errors are probabilistically independent.

Accepted Answer

The answer of The assumptions of regression are: 1) there...

Question 31

If you can determine that the outlier is not really a member of the relevant population, then it is appropriate and probably best to:

A) average it
B) reduce it
C) delete it
D) leave it

Accepted Answer

The answer of If you can determine that the outlier...

Question 32

When determining whether to include or exclude a variable in regression analysis, if the p-value associated with the variable's t-value is above some accepted significance value, such as 0.05, then the variable:

A) is a candidate for inclusion
B) is a candidate for exclusion
C) is redundant
D) does not fit the guidelines of parsimony

Accepted Answer

The answer of When determining whether to include or exclude...

Question 33

Multiple regression represents an improvement over simple regression because it allows any number of response variables to be included in the analysis.

Accepted Answer

The answer of Multiple regression represents an improvement over simple...

Question 34

In a simple linear regression model, testing whether the slope   of the population regression line could be zero is the same as testing whether or not the linear relationship between the response variable Y and the explanatory variable X is significant.

Accepted Answer

The answer of In a simple linear regression model, testing...

Question 35

Residuals separated by one period that are autocorrelated indicate:

A) simple autocorrelation
B) redundant autocorrelation
C) time 1 autocorrelation
D) lag 1 autocorrelation

Accepted Answer

The answer of Residuals separated by one period that are...

Question 36

A researcher can check whether the errors are normally distributed by using:

A) a t-test or an F-test
B) the Durbin-Watson statistic
C) a frequency distribution or the value of the regression coefficient
D) a histogram or a Q-Q plot

Accepted Answer

The answer of A researcher can check whether the errors...

Question 37

Forward regression:

A) begins with all potential explanatory variables in the equation and deletes them one at a time until further deletion would do more harm than good
B) adds and deletes variables until an optimal equation is achieved
C) begins with no explanatory variables in the equation and successively adds one at a time until no remaining variables make a significant contribution
D) randomly selects the optimal number of explanatory variables to be used

Accepted Answer

The answer of Forward regression: A) begins with all potential explanatory...

Question 38

In time series data, errors are often not probabilistically independent.

Accepted Answer

The answer of In time series data, errors are often...

Question 39

Many statistical packages have three types of equation-building procedures. They are:

A) forward, linear, and non-linear
B) forward, backward, and stepwise
C) simple, complex, and stepwise
D) inclusion, exclusion, and linear

Accepted Answer

The answer of Many statistical packages have three types of...

Question 40

If exact multicollinearity exists, redundancy exists in the data.

Accepted Answer

The answer of If exact multicollinearity exists, redundancy exists in...

Question 41

A forward procedure is a type of equation building procedure that begins with only one explanatory variable in the regression equation and successively adds one variable at a time until no remaining variables make a significant contribution.

Accepted Answer

The answer of A forward procedure is a type of...

Question 42

In multiple regression, the problem of multicollinearity affects the t-tests of the individual coefficients as well as the F-test in the analysis of variance for regression, since the F-test combines these t-tests into a single test.

Accepted Answer

The answer of In multiple regression, the problem of multicollinearity...

Question 43

In multiple regressions, if the F-ratio is large, the explained variation is large relative to the unexplained variation.

Accepted Answer

The answer of In multiple regressions, if the F-ratio is...

Question 44

In simple linear regression, if the error variable   is normally distributed, the test statistic for testing   is t-distributed with n - 2 degrees of freedom.

Accepted Answer

The answer of In simple linear regression, if the error...

Question 45

When there is a group of explanatory variables that are in some sense logically related, all of them must be included in the regression equation.

Accepted Answer

The answer of When there is a group of explanatory...

Question 46

In a simple linear regression problem, if the standard error of estimate   = 15 and n = 8, then the sum of squares for error, SSE, is 1,350.

Accepted Answer

The answer of In a simple linear regression problem, if...

Question 47

The residuals are observations of the error variable   . Consequently, the minimized sum of squared deviations is called the sum of squared error, labeled SSE.

Accepted Answer

The answer of The residuals are observations of the error...

Question 48

Multicollinearity is a situation in which two or more of the explanatory variables are highly correlated with each other.

Accepted Answer

The answer of Multicollinearity is a situation in which two...

Question 49

In order to test the significance of a multiple regression model involving 4 explanatory variables and 40 observations, the numerator and denominator degrees of freedom for the critical value of F are 4 and 35, respectively.

Accepted Answer

The answer of In order to test the significance of...

Question 50

In multiple regression with k explanatory variables, the t-tests of the individual coefficients allow us to determine whether   (for i = 1, 2, ..., k), which tells us whether a linear relationship exists between   and Y.

Accepted Answer

The answer of In multiple regression with k explanatory variables,...

Question 51

In regression analysis, the unexplained part of the total variation in the response variable Y is referred to as the sum of squares due to regression, SSR.

Accepted Answer

The answer of In regression analysis, the unexplained part of...

Question 52

A multiple regression model involving 40 observations and 4 explanatory variables produces SST = 1000 and SSR = 804. The value of MSE is 5.6.

Accepted Answer

The answer of A multiple regression model involves 40 observations...

Question 53

In regression analysis, the total variation in the dependent variable Y, measured by   and referred to as SST, can be decomposed into two parts: the explained variation, measured by SSR, and the unexplained variation, measured by SSE.

Accepted Answer

The answer of In regression analysis, the total variation in...

Question 54

In testing the overall fit of a multiple regression model in which there are three explanatory variables, the null hypothesis is   .

Accepted Answer

The answer of In testing the overall fit of a...

Question 55

In multiple regression, if there is multicollinearity between independent variables, the t-tests of the individual coefficients may indicate that some variables are not linearly related to the dependent variable, when in fact, they are.

Accepted Answer

The answer of In multiple regression, if there is multicollinearity...

Question 56

In a multiple regression analysis involving 4 explanatory variables and 40 data points, the degrees of freedom associated with the sum of squared errors, SSE, is 35.

Accepted Answer

The answer of In a multiple regression analysis involving 4...

Question 57

In multiple regressions, if the F-ratio is small, the explained variation is small relative to the unexplained variation.

Accepted Answer

The answer of In multiple regressions, if the F-ratio is...

Question 58

Suppose that one equation has 3 explanatory variables and an F-ratio of 49. Another equation has 5 explanatory variables and an F-ratio of 38. The first equation will always be considered a better model.

Accepted Answer

The answer of Suppose that one equation has 3 explanatory...

Question 59

The value of the sum of squares due to regression, SSR, can never be larger than the value of the sum of squares total, SST.

Accepted Answer

The answer of The value of the sum of squares...

Question 60

A backward procedure is a type of equation building procedure that begins with all potential explanatory variables in the regression equation and deletes them two at a time until further deletion would reduce the percentage of variation explained to a value less than 0.50.

Accepted Answer

The answer of A backward procedure is a type of...

Question 61

A confidence interval constructed around a point prediction from a regression model is called a prediction interval, because the actual point being estimated is not a population parameter.

Accepted Answer

The answer of A confidence interval constructed around a point...

Question 62

A carpet company, which sells and installs carpet, believes that there should be a relationship between the number of carpet installations that they will have to perform in a given month and the number of building permits that have been issued within the county where they are located. Below you will find a regression model that compares the relationship between the number of monthly carpet installations (Y) and the number of building permits that have been issued in a given month (X). The data represents monthly values for the past 10 months.

(A) Estimate the regression model. How well does this model fit the given data?

(B) Yes, there is a linear relationship between the number of carpet installations and the number of building permits issued at α = 0.10. The p-value = 0.0866 for the F-ratio. You can conclude that there is a significant linear relationship between these two variables.

(C) The Durbin-Watson statistic for this data was 1.2183. Given this information what would you conclude about the data?

(D) Given your answer in (C), would you recommend modifying the original regression model? If so, how would you modify it?

Accepted Answer

The answer of A carpet company, which sells and installs...
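For reference, the Durbin-Watson statistic mentioned in part (C) is computed from the regression residuals as the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below is only an illustration of that formula; the function name and the residual lists are hypothetical and are not the carpet company's data.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences of the
    residuals divided by the sum of squared residuals."""
    numer = sum((residuals[t] - residuals[t - 1]) ** 2
                for t in range(1, len(residuals)))
    denom = sum(e ** 2 for e in residuals)
    return numer / denom

# Hypothetical residuals, invented for illustration only.
persistent = [1.0, 1.0, 1.0, 1.0]        # neighbors identical: statistic at its minimum
alternating = [1.0, -1.0, 1.0, -1.0]     # neighbors flip sign: statistic well above 2

dw_persistent = durbin_watson(persistent)
dw_alternating = durbin_watson(alternating)
print(f"persistent: {dw_persistent:.2f}, alternating: {dw_alternating:.2f}")
```

The statistic ranges from 0 to 4. Values near 2 suggest little lag 1 autocorrelation, values well below 2 suggest positive autocorrelation, and values well above 2 suggest negative autocorrelation; a formal conclusion depends on tabulated critical values for the given n and k.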

Question 63

Many companies manufacture products that are at least partially produced using chemicals (for example, paint). In many cases, the quality of the finished product is a function of the temperature and pressure at which the chemical reactions take place. Suppose that a particular manufacturer in Texas wants to model the quality (Y) of a product as a function of the temperature and the pressure at which it is produced. The table below contains data obtained from a designed experiment involving these variables. Note that the assigned quality score can range from a minimum of 0 to a maximum of 100 for each manufactured product.

(A) Estimate a multiple regression model that includes the two given explanatory variables. Assess this set of explanatory variables with an F-test, and report a p-value.

(B) Identify and interpret the percentage of variance explained for the model in (A).

(C) Identify and interpret the percentage of variance explained for the model in (B).

(D) Which regression equation is the most appropriate one for modeling the quality of the given product? Bear in mind that a good statistical model is usually parsimonious.

Accepted Answer

The answer of Many companies manufacture products that are at...

Question 64

Below you will find a scatterplot of data gathered by an online retail company. The company has been able to obtain the annual salaries of their customers and the amount that each of these customers spent on the company's site last year. Based on the scatterplot below, would you conclude that these data meet all four assumptions of regression? Explain your answer.

Accepted Answer

The answer of Below you will find a scatterplot of...

Deck 11: Regression Analysis: Statistical Inference