Deck 10: Identification and Data Assessment

Question
When making an interpolation it is possible to use which of the following functional forms?

A) Linear
B) Quadratic
C) Cubic
D) All of these choices are correct.
Question
For a parameter that is identified, what should happen to the 95% confidence interval as the sample size gets larger?

A) The right tail should get larger.
B) The left tail should get larger.
C) The width of the interval should get smaller.
D) The width of the interval should get wider.
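As a study aid for this card, here is a minimal sketch of how interval width depends on sample size, assuming the simplest textbook case: a normal-approximation interval for a sample mean with a known standard deviation. The function name `ci_width` and the numbers (sigma of 10, sample sizes of 100, 400, and 1600) are made up for illustration.

```python
import math


def ci_width(sigma: float, n: int, z: float = 1.96) -> float:
    """Width of a normal-approximation confidence interval for a sample mean.

    sigma is the (assumed known) standard deviation, n the sample size,
    and z the critical value (1.96 for roughly 95% coverage).
    """
    return 2 * z * sigma / math.sqrt(n)


# Hypothetical sample sizes: each one quadruples the previous one.
widths = [ci_width(sigma=10.0, n=n) for n in (100, 400, 1600)]
print(widths)
```

Under these assumptions, quadrupling the sample size halves the width, because the width scales with one over the square root of n.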
Question
The key distinction between extrapolation and interpolation is that interpolation:

A) involves causal estimates, while extrapolation involves correlations.
B) involves linear functional forms, while extrapolation involves non-linear functional forms.
C) fills data gaps, while extrapolation fills beyond the extent of the data.
D) can be used with control variables, while extrapolation must be used in conjunction with instrumental variables.
Question
Which of the following settings constitutes an interpolation?

A) Using past sales and price information to make a forecast for next year's sales growth.
B) Using past sales and price information to estimate last year's price elasticity of demand.
C) Using 2014 and 2016 data on GDP to estimate GDP in the third quarter of 2015, which is missing.
D) Using a randomized control trial to estimate the causal effect of the use of a cancer treatment on health outcomes.
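To make the idea of filling a gap concrete, the following is a small sketch of linear interpolation between two observed points. The helper `linear_interpolate`, the index values (100 and 110), and the way time is coded as a decimal year are all illustrative assumptions, not figures from the deck.

```python
def linear_interpolate(x0: float, y0: float, x1: float, y1: float, x: float) -> float:
    """Estimate y at x by drawing a straight line between (x0, y0) and (x1, y1).

    Raises ValueError if x lies outside [x0, x1], since that request would be
    an extrapolation rather than an interpolation.
    """
    if x0 == x1:
        raise ValueError("x0 and x1 must differ")
    if not (min(x0, x1) <= x <= max(x0, x1)):
        raise ValueError("x lies outside the observed range: that is extrapolation")
    slope = (y1 - y0) / (x1 - x0)
    return y0 + slope * (x - x0)


# Hypothetical index values observed at 2014.0 and 2016.0; the value at 2015.5 is missing.
estimate = linear_interpolate(2014.0, 100.0, 2016.0, 110.0, 2015.5)
print(estimate)
```

The straight line is itself a functional form assumption; a quadratic or cubic through additional points would generally give a different filled-in value.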
Question
When one conducts interpolation, one is:

A) imputing values that have been mismeasured.
B) correcting values that have been mismeasured.
C) drawing conclusions where there are "gaps" in the data.
D) drawing conclusions beyond the extent of the data.
Question
Which of the following is a step that can remedy an interpolation/extrapolation identification problem?

A) Use a functional form assumption to interpolate/extrapolate.
B) Using an instrumental variable to interpolate/extrapolate.
C) Use two-stage least squares to interpolate/extrapolate.
D) Use a fixed effects model.
Question
A parameter is identified in the event that it can be:

A) rejected to be zero at the 95 percent confidence level.
B) rejected to be zero at the 99 percent confidence level.
C) estimated with any level of precision given a large enough sample from the population.
D) safely assumed to not suffer from heteroscedasticity.
Question
Determining whether a parameter is identified exclusively involves an argument about the data-generating process of the:

A) observations in the sample.
B) observations in a representative sample.
C) total population.
D) control group.
Question
In the event that there is not an acceptable model of the data-generating process within which the treatment effect is identified using samples from the population, the analyst should do what?

A) Collect more data.
B) Re-run the experiment.
C) Consider alternative data populations (e.g., additional variables) before attempting to estimate the effect.
D) Run a probit/logit model.
Question
A data gap is any place where:

A) data is measured with error.
B) the data is heteroscedastic.
C) there are missing data for a variable over an interval of values, but data are not missing for at least some values on both ends of the interval.
D) there is missing data.
Question
Which of the following settings constitutes an extrapolation?

A) Using past sales and price information to make a forecast for next year's sales growth.
B) Using past sales and price information to estimate last year's price elasticity of demand.
C) Using 2014 and 2016 data on GDP to estimate GDP in the third quarter of 2015, which is missing.
D) Using a randomized control trial to estimate the causal effect of the use of a cancer treatment on health outcomes.
Question
In the event that there is an acceptable model of the data-generating process within which the treatment effect is identified using samples from the population, but the 95% confidence interval of the treatment effect contains both large negative and large positive values, what might be an appropriate step for the analyst to take?

A) Collect more data.
B) Re-run the experiment.
C) Consider alternative data populations (e.g., additional variables) before attempting to estimate the effect.
D) Run a probit/logit model.
Question
For a parameter that is identified, what should happen to the 99% confidence interval as the sample size gets larger?

A) The right tail should get larger.
B) The left tail should get larger.
C) The width of the interval should get smaller.
D) The width of the interval should get wider.
Question
For a parameter that is not identified, what should happen to the 95% confidence interval as the sample size gets larger?

A) The right tail should get larger.
B) The left tail should get larger.
C) The width of the interval should eventually capture the true parameter value.
D) None of the answers is correct.
Question
If the data-generating process you are working with has a population parameter that is not identified it will be the case that:

A) you will need more data to get a precise estimate.
B) you will always need an instrumental variable to get an unbiased estimate.
C) a fixed effect model will be required.
D) None of the answers is correct.
Question
You are collecting information on prices and attributes of electric vehicles, such as the mile range from a single charge, for a set of electric cars. Suppose your company has created an electric vehicle that will have a mile range twice as large as any existing electric vehicle on the market. Does the model you've estimated have an identification problem?

A) No, because this data gap does not exist in the population.
B) Yes, the extrapolation required exists in the population of current electric vehicles, not just your sample.
C) Yes, but you can still extrapolate the data gap.
D) No, because data gaps cannot cause identification problems.
Question
When one conducts extrapolation, one is:

A) imputing values that have been mismeasured.
B) correcting values that have been mismeasured.
C) drawing conclusions where there are "gaps" in the data.
D) drawing conclusions beyond the extent of the data.
Question
In the event that you are collecting data on how heights relate to income in the U.S. population and you notice that in your sample of 1,500 individuals you have a data gap at 5'6", will this cause an identification problem?

A) No, because this data gap does not exist in the population.
B) Yes, but you can still interpolate the data gap.
C) Yes, but you can still extrapolate the data gap.
D) No, because data gaps cannot cause identification problems.
Question
For a parameter that is identified, what is the role of having a larger sample size?

A) The estimate of the identified parameter will be more precise.
B) The estimate of the identified parameter will be distributed as a t-distribution.
C) The estimate of the identified parameter will be efficient.
D) The estimate of the identified parameter will be distributed as a chi-squared distribution.
Question
Which of the following is a step that can remedy an interpolation/extrapolation identification problem?

A) Use a probit/logit model to alleviate the data gaps.
B) Collect more/alternative data to close the data gap or widen the extent of the data.
C) Use two-stage least squares to interpolate/extrapolate.
D) Use a fixed effects model.
Question
All of the following will cause an identification challenge except for what?

A) A control variable is correlated with the error term.
B) The treatment variable is correlated with the error term.
C) Perfect multicollinearity
D) Imperfect multicollinearity
Question
In the event that your treatment variable suffers from perfect multicollinearity with one of your controls, a viable remedy is:

A) drop the control variable.
B) drop the treatment variable of interest.
C) change the population to one where the treatment and control are not exactly linearly related.
D) use the within estimator.
Question
How does making functional form choices put less burden on the data collection process of the analyst?

A) By making functional form choices, an analyst can interpolate/extrapolate measures of treatment effects within gaps/beyond the extent of the data sampled.
B) By making functional form choices, an analyst can avoid having to take random samples from the population.
C) By making functional form choices, an analyst can avoid having to ensure that the observations are carefully measured.
D) By making functional form choices, an analyst can avoid having to ensure the dataset is structured.
Question
What condition best describes the endogeneity problem?

A) The variance of the errors ($U_i$) depends on $X_i$.
B) Some variables within $X_i$ are perfectly correlated with other variables in $X_i$.
C) The distribution of the errors ($U_i$) is non-normal.
D) One of the $X_i$ variables is correlated with the error term ($U_i$).
Question
All else equal, the theoretical justifications for a particular functional form should be stronger in the cases in which you are:

A) attempting to identify correlations only.
B) running fixed effects regressions.
C) using an instrumental variable.
D) extrapolating further off the support of the data.
Question
Which of the following functional forms will be the most restrictive in terms of the shape of the resulting interpolation of data gaps in a sample data set?

A) Linear
B) Quadratic
C) Cubic
D) Piecewise linear
Question
A usual result from having a model that has imperfect multicollinearity is that:

A) coefficient estimates are large.
B) coefficient estimates are negative.
C) coefficient estimates are positive.
D) standard errors for coefficient estimates are large.
Question
Defining and omitting a base group amongst a set of fixed effects is an attempt to remedy which identification challenge?

A) A control variable is correlated with the error term.
B) The treatment variable is correlated with the error term.
C) Perfect multicollinearity
D) Imperfect multicollinearity
Question
If one were to include in a regression a binary dummy variable for each of the four regions of the country (East, West, South, North), what identification challenge would be presented?

A) Endogeneity problem
B) Heteroscedasticity
C) Perfect multicollinearity
D) Variance inflation factor
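The setup in the card above can be checked numerically. The sketch below assumes the regression also contains an intercept, and uses a made-up sample of eight observations; `matrix_rank` is a small hand-written helper (exact Gaussian elimination with fractions), not a library function.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix via exact Gauss-Jordan elimination on Fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current pivot row with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Hypothetical sample: the region of each of eight observations.
regions = ["East", "West", "South", "North", "East", "South", "West", "North"]
names = ["East", "West", "South", "North"]

# Intercept plus a dummy for every region (5 columns).
full = [[1] + [int(r == name) for name in names] for r in regions]
# Intercept plus dummies for all regions except the base group East (4 columns).
dropped = [[1] + [int(r == name) for name in names[1:]] for r in regions]

rank_full = matrix_rank(full)
rank_dropped = matrix_rank(dropped)
print(rank_full, rank_dropped)
```

With all four dummies, the five columns only have rank 4, because the dummies sum to the intercept column. Omitting one region as the base group leaves four columns of rank 4.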
Question
Perfect multicollinearity is when:

A) two or more independent variables have an exact linear relationship.
B) two independent variables are orthogonal.
C) the r-squared coefficient is zero.
D) the coefficients on two independent variables are equal.
Question
When functional form choice is used to alleviate an interpolation/extrapolation identification challenge the justification represents what sort of element in our data reasoning framework?

A) Empirically testable conclusion
B) Inductive reasoning
C) Assumption
D) Statistical reasoning
Question
A useful diagnostic measure for imperfect multicollinearity is the:

A) z-score.
B) t-test.
C) variance inflation factor.
D) likelihood ratio test.
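For reference, the variance inflation factor for regressor j is commonly defined as 1 / (1 - R_j^2), where R_j^2 comes from regressing that regressor on the other regressors. The sketch below assumes the special case of exactly two regressors, where R_j^2 is just their squared correlation. The data and the names `pearson_r` and `vif_two_regressors` are made up for illustration.

```python
def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def vif_two_regressors(x1, x2):
    """VIF when the model has exactly two regressors: 1 / (1 - r^2)."""
    r_squared = pearson_r(x1, x2) ** 2
    return 1.0 / (1.0 - r_squared)


x1 = [1, 2, 3, 4, 5, 6]
x2_near = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]   # nearly a copy of x1
x2_unrelated = [1, 0, 1, 1, 0, 1]          # uncorrelated with x1 in this sample

vif_near = vif_two_regressors(x1, x2_near)
vif_unrelated = vif_two_regressors(x1, x2_unrelated)
print(vif_near, vif_unrelated)
```

A VIF close to 1 indicates little shared variation, while a large VIF signals that the regressor is nearly a linear function of the other, which inflates its standard error.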
Question
Imperfect multicollinearity is when:

A) two or more independent variables have nearly an exact linear relationship.
B) two independent variables are nearly orthogonal.
C) the r-squared coefficient is nearly zero.
D) the coefficients on two independent variables are nearly equal.
Question
Consider the regression $\text{Sales}_i = \alpha_0 + \alpha_1 \text{Price}_i + \alpha_2 \text{Promo}_i + \alpha_3 \text{Weekend}_i + U_i$, where $\text{Promo}_i$ and $\text{Weekend}_i$ are binary variables indicating whether the particular observation was on promo (1 if promo, 0 otherwise) or comes from a weekend day (1 if weekend day, 0 otherwise). A member of your team informs you that all the promotions run for the products in your population were run on weekends. What identification challenge might you worry about?

A) Heteroscedasticity
B) Endogeneity
C) Perfect multicollinearity
D) Imperfect multicollinearity
Question
When making functional form choices to alleviate an interpolation/extrapolation identification challenge the justification will almost always take what form?

A) A calculation of difference of means across sub-samples of the data
B) Theoretical argument
C) A hypothesis test
D) A p-value from a test statistic
Question
A typical remedy for perfect multicollinearity amongst control variables is to:

A) use an instrumental variable.
B) use the within estimator.
C) drop one of the control variables that is exactly linear with other control variables.
D) use the t-test.
Question
If one suffers from the endogeneity problem for identification, it will result in your coefficient estimates being:

A) estimated with limited precision.
B) inconsistent.
C) efficient.
D) too large.
Question
Of the remedies for an interpolation/extrapolation identification problem, which requires the most trust in the assumptions surrounding the data generation process?

A) Use probit/logit to alleviate the data gaps.
B) Collect more/alternative data to close the data gap or widen the extent of the data.
C) Use two-stage least squares to interpolate/extrapolate.
D) Use functional form choices to interpolate/extrapolate.
Question
Suppose you are estimating the following model: $Y_i = \beta_0 + \beta_1 X_i + U_i$. You believe the variance of the unobserved factors ($U$) varies with $X$. If this is true, what is the consequence?

A) Your estimate for $\beta_1$ will be biased.
B) Your estimate for $\beta_0$ will be biased.
C) Your estimates for $\beta_1$ and $\beta_0$ will be biased.
D) None of the answers is correct.
Question
How does collecting more/alternative data put less burden on the modeling choices of the analyst?

A) By collecting more data, an analyst can completely avoid characterizing a determining function.
B) By collecting more data, an analyst can avoid heteroscedasticity.
C) By collecting more data, an analyst can avoid having to make functional form choices to fill in data gaps/extent of the data.
D) By collecting more data, an analyst can implement a fixed effects design.
Question
In the event that your treatment variable is imperfectly multicollinear with one of your control variables, a possible remedy would be to:

A) re-orient your base group.
B) use the within estimator.
C) gather more data, in hopes of getting more independent variation of the treatment.
D) use data reduction methods.
Question
The endogeneity problem that results from when a variable correlated with the treatment is not included as a control variable is known as the:

A) instrumental variable.
B) omitted variable bias.
C) weak instruments problem.
D) least efficient estimator.
Question
Suppose that you observed several key characteristics of a random sample of firms in your industry. You know that the semi-partial correlation of firm Productivity (Y) with R&D (Z) investment holding amount of Labor (X) fixed is positive. Furthermore, suppose you know that the covariance of R&D investment and amount of labor is positive. How will the coefficient on Labor when you run the regression of Productivity on Labor and R&D investment relate to the coefficient on Labor when you run a regression of Productivity on just Labor?

A) It'll be equal to
B) It'll be greater than
C) It'll be less than
D) There is not enough information to determine.
Question
Fortunately, even in cases where we suffer from an omitted variable bias, it is often the case that with careful reasoning it is possible to:

A) sign the bias.
B) make the size of bias not statistically significant from zero.
C) report appropriately adjusted standard errors.
D) construct alternative p-values.
Question
When might imperfect multicollinearity not require the collecting of more data to remedy the likely imprecise estimates of the affected variables?

A) When the model suffers from heteroscedasticity as well
B) When you are only conducting hypothesis tests
C) When the imperfect multicollinearity is confined to control variables only
D) When the treatment effect is positive
Question
Potential remedies for when your model suffers from the endogeneity problem include all of the following except what?

A) Gather (additional) data on a possible instrumental variable.
B) Gather (additional) longitudinal data to allow for a fixed effect approach.
C) Gather (additional) control variables that would limit the endogeneity concern.
D) Check if the residuals from your regression are uncorrelated with your treatment.
Question
The two critical elements required to sign the bias of an omitted variable include the sign of the:

A) effect of the omitted variable on the outcome and the sign of the effect of the treatment variable on the outcome.
B) effect of the omitted variable on the outcome and the sign of the correlation between the omitted variable and the outcome.
C) effect of the omitted variable on the outcome and the sign of the correlation between the omitted variable and the treatment.
D) correlation between the omitted variable and the outcome and the sign of the correlation between the treatment variable and the outcome.
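The arithmetic behind signing an omitted variable bias can be sketched with the standard in-sample identity: the coefficient from the short regression equals the long-regression coefficient plus the omitted variable's coefficient times the slope from regressing the omitted variable on the included one. The data below are made up (y is built as exactly 1 + 2x + 3z with no noise), and the variable names are illustrative assumptions.

```python
def centered_cross(a, b):
    """Sum of products of deviations from the mean (n times the sample covariance)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b))


# Hypothetical data: x is the treatment, z is the variable that gets omitted.
x = [1, 2, 3, 4, 5, 6]
z = [2, 1, 4, 3, 6, 5]                          # positively correlated with x
y = [1 + 2 * xi + 3 * zi for xi, zi in zip(x, z)]

sxx, szz, sxz = centered_cross(x, x), centered_cross(z, z), centered_cross(x, z)
sxy, szy = centered_cross(x, y), centered_cross(z, y)

# Long regression of y on x and z (normal equations on centered data).
det = sxx * szz - sxz ** 2
b1_long = (sxy * szz - szy * sxz) / det
b2_long = (szy * sxx - sxy * sxz) / det

# Short regression of y on x alone, and the slope of z on x.
b1_short = sxy / sxx
delta = sxz / sxx

bias = b2_long * delta
print(b1_long, b2_long, b1_short, bias)
```

In this made-up sample both the omitted variable's effect and its relationship with the treatment are positive, so the short-regression coefficient overstates the long-regression one; flipping the sign of either ingredient flips the sign of the bias.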
Question
Suppose that you have many employee-level observations of longevity and the wage earned by that employee that year. You run the regression $\text{Longevity}_i = \beta_0 + \beta_1 \text{Wage}_i + U_i$ and get an estimate of $\beta_1$. Now, suppose that a member of the analytics team suggests that education is an omitted variable in your regression and is likely biasing your estimate of $\beta_1$. Suppose you knew that, conditional on Wage, more educated employees tended to have shorter stints with the company (lower longevity) and that the error, $\eta_i$, in the equation $\text{Longevity}_i = \beta_0 + \beta_1 \text{Wage}_i + \beta_2 \text{Education}_i + \eta_i$ was uncorrelated with both Wage and Education. How would you sign the bias on your estimate of $\beta_1$?

A) Argue that education is positively correlated with wage and that your estimate of $\beta_1$ is an upper bound.
B) Argue that education is negatively correlated with wage and that your estimate of $\beta_1$ is a lower bound.
C) Argue that education is positively correlated with wage and that your estimate of $\beta_1$ is a lower bound.
D) Argue that education is positively correlated with longevity and that your estimate of $\beta_1$ is an upper bound.
Question
Suppose you have many observations of the scores students got on an exam, along with other characteristics of the students, including how many hours they studied for the exam that week and their current major GPA. Now, suppose you ran the standard linear regression of the exam grade on the number of hours studied, $\text{Grade}_i = \beta_0 + \beta_1 \text{Hours Studied}_i + U_i$, and got an estimate of $\beta_1$; call it $b_1$. Now, suppose a colleague tells you that the students with above-average GPAs were the students who studied more for the exam. How does your estimate of $\beta_2^*$ in the regression $\text{Grade}_i = \beta_0^* + \beta_1^* \text{Hours Studied}_i + \beta_2^* \text{Major GPA}_i + U_i$ inform you of what your estimate of $\beta_1^*$ will be for that same regression?

A) If $\beta_2^* > 0$, then $\beta_1^* > b_1$
B) If $\beta_2^* > 0$, then $\beta_1^* < b_1$
C) If $\beta_2^* < 0$, then $\beta_1^* > 0$
D) If $\beta_2^* > 0$, then $\beta_1^* > 0$
Question
Suppose you have many observations of the price of refrigerators, along with other characteristics, including their energy cost and whether or not they are stainless steel. Now, suppose you ran the standard linear regression of price on energy cost, $\text{Price}_i = \beta_0 + \beta_1 \text{Energy Cost}_i + U_i$, and got an estimate of $\beta_1$. Now, suppose you know that stainless steel refrigerators are more popular (i.e., sell for higher prices holding all other characteristics constant) and tend to be more energy-efficient (i.e., lower energy cost) refrigerator models. What should you expect your estimate of $\beta_1^*$ in the regression $\text{Price}_i = \beta_0^* + \beta_1^* \text{Energy Cost}_i + \beta_2^* \text{Stainless Steel}_i + U_i$ to be?

A) $\beta_1^* < \beta_1$
B) $\beta_1^* > \beta_1$
C) $\beta_1^* = \beta_1$
D) You cannot tell from the information given.
Deck 10: Identification and Data Assessment
1
When making an interpolation it is possible to use which of the following functional forms?

A) Linear
B) Quadratic
C) Cubic
D) All of these choices are correct.
D
2
For a parameter that is identified, what should happen to the 95% confidence interval as the sample size gets larger?

A) The right tail should get larger.
B) The left tail should get larger.
C) The width of the interval should get smaller.
D) The width of the interval should get wider.
C
3
The key distinction between extrapolation and interpolation is that interpolation:

A) involves causal estimates, while extrapolation involves correlations.
B) involves linear functional forms, while extrapolation involves non-linear functional forms.
C) fills data gaps, while extrapolation fills beyond the extent of the data.
D) can be used with control variables, while extrapolation must be used in conjunction with instrumental variables.
C
4
Which of the following settings constitutes an interpolation?

A) Using past sales and price information to make a forecast for next year's sales growth.
B) Using past sales and price information to estimate last year's price elasticity of demand.
C) Using 2014 and 2016 data on GDP to estimate GDP in the third quarter of 2015, which is missing.
D) Using a randomized control trial to estimate the causal effect of the use of a cancer treatment on health outcomes.
5
When one conducts interpolation, one is:

A) imputing values that have been mismeasured.
B) correcting values that have been mismeasured.
C) drawing conclusions where there are "gaps" in the data.
D) drawing conclusions beyond the extent of the data.
6
Which of the following is a step that can remedy an interpolation/extrapolation identification problem?

A) Use a functional form assumption to interpolate/extrapolate.
B) Using an instrumental variable to interpolate/extrapolate.
C) Use two-stage least squares to interpolate/extrapolate.
D) Use a fixed effects model.
7
A parameter is identified in the event that it can be:

A) rejected to be zero at the 95 percent confidence level.
B) rejected to be zero at the 99 percent confidence level.
C) estimated with any level of precision given a large enough sample from the population.
D) safely assumed to not suffer from heteroscedasticity.
8
Determining whether a parameter is identified exclusively involves an argument about the data-generating process of the:

A) observations in the sample.
B) observations in a representative sample.
C) total population.
D) control group.
9
In the event that there is not an acceptable model of the data-generating process within which the treatment effect is identified using samples from the population, the analyst should do what?

A) Collect more data.
B) Re-run the experiment.
C) Consider alternative data populations (e.g., additional variables) before attempting to estimate the effect.
D) Run a probit/logit model.
10
A data gap is any place where:

A) data is measured with error.
B) the data is heteroscedastic.
C) there are missing data for a variable over an interval of values, but data are not missing for at least some values on both ends of the interval.
D) there is missing data.
11
Which of the following settings constitutes an extrapolation?

A) Using past sales and price information to make a forecast for next year's sales growth.
B) Using past sales and price information to estimate last year's price elasticity of demand.
C) Using 2014 and 2016 data on GDP to estimate GDP in the third quarter of 2015, which is missing.
D) Using a randomized control trial to estimate the causal effect of the use of a cancer treatment on health outcomes.
12
In the event that there is an acceptable model of the data-generating process within which the treatment effect is identified using samples from the population, but the 95% confidence interval of the treatment effect contains both large negative and large positive values, what might be an appropriate step for the analyst to take?

A) Collect more data.
B) Re-run the experiment.
C) Consider alternative data populations (e.g., additional variables) before attempting to estimate the effect.
D) Run a probit/logit model.
13
For a parameter that is identified, what should happen to the 99% confidence interval as the sample size gets larger?

A) The right tail should get larger.
B) The left tail should get larger.
C) The width of the interval should get smaller.
D) The width of the interval should get wider.
14
For a parameter that is not identified, what should happen to the 95% confidence interval as the sample size gets larger?

A) The right tail should get larger.
B) The left tail should get larger.
C) The width of the interval should eventually capture the true parameter value.
D) None of the answers is correct.
15
If the data-generating process you are working with has a population parameter that is not identified it will be the case that:

A) you will need more data to get a precise estimate.
B) you will always need an instrumental variable to get an unbiased estimate.
C) a fixed effect model will be required.
D) None of the answers is correct.
16
You are collecting information on prices and attributes of electric vehicles, such as the mile range from a single charge, for a set of electric cars. Suppose your company has created an electric vehicle that will have a mile range twice as large as any existing electric vehicle on the market. Does the model you've estimated have an identification problem?

A) No, because this data gap does not exist in the population.
B) Yes, the extrapolation required exists in the population of current electric vehicles, not just your sample.
C) Yes, but you can still extrapolate the data gap.
D) No, because data gaps cannot cause identification problems.
17
When one conducts extrapolation, one is:

A) imputing values that have been mismeasured.
B) correcting values that have been mismeasured.
C) drawing conclusions where there are "gaps" in the data.
D) drawing conclusions beyond the extent of the data.
18
In the event that you are collecting data on how heights relate to income in the U.S. population and you notice that in your sample of 1,500 individuals you have a data gap at 5'6", will this cause an identification problem?

A) No, because this data gap does not exist in the population.
B) Yes, but you can still interpolate the data gap.
C) Yes, but you can still extrapolate the data gap.
D) No, because data gaps cannot cause identification problems.
19
For a parameter that is identified, what is the role of having a larger sample size?

A) The estimate of the identified parameter will be more precise.
B) The estimate of the identified parameter will be distributed as a t-distribution.
C) The estimate of the identified parameter will be efficient.
D) The estimate of the identified parameter will be distributed as a chi-squared distribution.
20
Which of the following is a step that can remedy an interpolation/extrapolation identification problem?

A) Use a probit/logit model to alleviate the data gaps.
B) Collect more/alternative data to close the data gap or widen the extent of the data.
C) Use two-stage least squares to interpolate/extrapolate.
D) Use a fixed effects model.
21
All of the following will cause an identification challenge except for what?

A) A control variable is correlated with the error term.
B) The treatment variable is correlated with the error term.
C) Perfect multicollinearity
D) Imperfect multicollinearity
22
In the event that your treatment variable suffers from perfect multicollinearity with one of your controls, a viable remedy is:

A) drop the control variable.
B) drop the treatment variable of interest.
C) change the population to one where the treatment and control are not exactly linearly related.
D) use the within estimator.
23
How does making functional form choices put less burden on the data collection process of the analyst?

A) By making functional form choices, an analyst can interpolate/extrapolate measures of treatment effects within gaps/beyond the extent of the data sampled.
B) By making functional form choices, an analyst can avoid having to take random samples from the population.
C) By making functional form choices, an analyst can avoid having to ensure that the observations are carefully measured.
D) By making functional form choices, an analyst can avoid having to ensure the dataset is structured.
24
What condition best describes the endogeneity problem?

A) The variance of the errors ($U_i$) depends on $X_i$.
B) Some variables within $X_i$ are perfectly correlated with other variables in $X_i$.
C) The distribution of the errors ($U_i$) is non-normal.
D) One of the $X_i$ variables is correlated with the error term ($U_i$).
25
All else equal, the theoretical justifications for a particular functional form should be stronger in the cases in which you are:

A) attempting to identify correlations only.
B) running fixed effects regressions.
C) using an instrumental variable.
D) extrapolating further off the support of the data.
26
Which of the following functional forms will be the most restrictive in terms of the shape of the resulting interpolation of data gaps in a sample data set?

A) Linear
B) Quadratic
C) Cubic
D) Piecewise linear
27
A usual result from having a model that has imperfect multicollinearity is that:

A) coefficient estimates are large.
B) coefficient estimates are negative.
C) coefficient estimates are positive.
D) standard errors for coefficient estimates are large.
28
Defining and omitting a base group amongst a set of fixed effects is an attempt to remedy which identification challenge?

A) A control variable is correlated with the error term.
B) The treatment variable is correlated with the error term.
C) Perfect multicollinearity
D) Imperfect multicollinearity
29
If one were to include in a regression a binary dummy variable for each of the four regions of the country (East, West, South, North), what identification challenge would be presented?

A) Endogeneity problem
B) Heteroscedasticity
C) Perfect multicollinearity
D) Variance inflation factor
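
As an illustration of the setting in this question, the sketch below builds a small design matrix and checks its column rank. It assumes the regression also includes an intercept, and the sample of regions and the helper `matrix_rank` are hypothetical, introduced only for this example.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank via Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Look for a nonzero pivot at or below the current pivot row.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

# Hypothetical sample: one region label per observation.
regions = ["East", "West", "South", "North", "East", "South", "West", "North"]
names = ["East", "West", "South", "North"]

# Intercept plus a dummy for every region (5 columns).
X_all = [[1] + [int(r == n) for n in names] for r in regions]
# Intercept plus dummies for every region except a base group, North (4 columns).
X_base = [[1] + [int(r == n) for n in names[:-1]] for r in regions]

rank_all = matrix_rank(X_all)
rank_base = matrix_rank(X_base)
print("all dummies:", rank_all, "of", len(X_all[0]), "columns")
print("base omitted:", rank_base, "of", len(X_base[0]), "columns")
```

With every dummy included, the four dummy columns sum to the intercept column in each row, so the rank falls short of the number of columns. Omitting one group leaves a matrix with full column rank.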
30
Perfect multicollinearity is when:

A) two or more independent variables have an exact linear relationship.
B) two independent variables are orthogonal.
C) the r-squared coefficient is zero.
D) the coefficients on two independent variables are equal.
31
When functional form choice is used to alleviate an interpolation/extrapolation identification challenge, the justification represents what sort of element in our data reasoning framework?

A) Empirically testable conclusion
B) Inductive reasoning
C) Assumption
D) Statistical reasoning
32
A useful diagnostic measure for imperfect multicollinearity is the:

A) z-score.
B) t-test.
C) variance inflation factor.
D) likelihood ratio test.
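
For readers who want to see one of these diagnostics computed, here is a minimal sketch of the variance inflation factor, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing regressor j on the other regressors. The sketch assumes only two regressors, so R_j^2 reduces to a squared correlation. The data and function names are hypothetical.

```python
def r_squared_simple(y, x):
    """R-squared from regressing y on x and an intercept (squared correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

def vif(x_j, x_other):
    """Variance inflation factor for x_j when x_other is the only other regressor."""
    return 1.0 / (1.0 - r_squared_simple(x_j, x_other))

# Hypothetical regressors.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2_near = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]     # nearly an exact linear function of x1
x2_orth = [1.0, -1.0, 0.0, 0.0, -1.0, 1.0]   # zero sample covariance with x1

vif_near = vif(x1, x2_near)
vif_orth = vif(x1, x2_orth)
print("VIF with a nearly collinear regressor:", vif_near)
print("VIF with an orthogonal regressor:", vif_orth)
```

A VIF far above 1 signals that most of the variation in a regressor is shared with the others, while a VIF of 1 means none of it is.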
33
Imperfect multicollinearity is when:

A) two or more independent variables have nearly an exact linear relationship.
B) two independent variables are nearly orthogonal.
C) the r-squared coefficient is nearly zero.
D) the coefficients on two independent variables are nearly equal.
34
Consider the regression $\text{Sales}_i = \alpha_0 + \alpha_1 \text{Price}_i + \alpha_2 \text{Promo}_i + \alpha_3 \text{Weekend}_i + U_i$, where $\text{Promo}_i$ and $\text{Weekend}_i$ are binary variables indicating whether the observation was on promo (1 if promo, 0 otherwise) or comes from a weekend day (1 if weekend day, 0 otherwise). A member of your team informs you that all the promotions run for the products in your population were run on weekends. What identification challenge might you worry about?

A) Heteroscedasticity
B) Endogeneity
C) Perfect multicollinearity
D) Imperfect multicollinearity
35
When making functional form choices to alleviate an interpolation/extrapolation identification challenge, the justification will almost always take what form?

A) A calculation of difference of means across sub-samples of the data
B) Theoretical argument
C) A hypothesis test
D) A p-value from a test statistic
36
A typical remedy for perfect multicollinearity amongst control variables is to:

A) use an instrumental variable.
B) use the within estimator.
C) drop one of the control variables that is exactly linear with other control variables.
D) use the t-test.
37
If your model suffers from the endogeneity problem for identification, your coefficient estimates will be:

A) estimated with limited precision.
B) inconsistent.
C) efficient.
D) too large.
38
Of the remedies for an interpolation/extrapolation identification problem, which requires the most trust in the assumptions surrounding the data generation process?

A) Use probit/logit to alleviate the data gaps.
B) Collect more/alternative data to close the data gap or widen the extent of the data.
C) Use two-stage least squares to interpolate/extrapolate.
D) Use functional form choices to interpolate/extrapolate.
39
Suppose you are estimating the following model: $Y_i = \beta_0 + \beta_1 X_i + U_i$. You believe the variance of the unobserved factors ($U$) varies with $X$. If this is true, what is the consequence?

A) Your estimate for $\beta_1$ will be biased.
B) Your estimate for $\beta_0$ will be biased.
C) Your estimates for $\beta_1$ and $\beta_0$ will be biased.
D) None of the answers is correct.
40
How does collecting more/alternative data put less burden on the modeling choices of the analyst?

A) By collecting more data, an analyst can completely avoid characterizing a determining function.
B) By collecting more data, an analyst can avoid heteroscedasticity.
C) By collecting more data, an analyst can avoid having to make functional form choices to fill in data gaps/extent of the data.
D) By collecting more data, an analyst can implement a fixed effects design.
41
In the event that your treatment variable is imperfectly multicollinear with one of your control variables, a possible remedy would be to:

A) re-orient your base group.
B) use the within estimator.
C) gather more data, in hopes of getting more independent variation of the treatment.
D) use data reduction methods.
42
The endogeneity problem that results when a variable correlated with the treatment is not included as a control variable is known as the:

A) instrumental variable.
B) omitted variable bias.
C) weak instruments problem.
D) least efficient estimator.
43
Suppose that you observed several key characteristics of a random sample of firms in your industry. You know that the semi-partial correlation of firm Productivity (Y) with R&D investment (Z), holding the amount of Labor (X) fixed, is positive. Furthermore, suppose you know that the covariance of R&D investment and amount of Labor is positive. How will the coefficient on Labor in the regression of Productivity on Labor and R&D investment relate to the coefficient on Labor in a regression of Productivity on just Labor?

A) It'll be equal to
B) It'll be greater than
C) It'll be less than
D) There is not enough information to determine.
44
Fortunately, even in cases where we suffer from an omitted variable bias, it is often the case that with careful reasoning it is possible to:

A) sign the bias.
B) make the size of the bias not statistically different from zero.
C) report appropriately adjusted standard errors.
D) construct alternative p-values.
45
When might imperfect multicollinearity not require collecting more data to remedy the likely imprecise estimates for the affected variables?

A) When the model suffers from heteroscedasticity as well
B) When you are only conducting hypothesis tests
C) When the imperfect multicollinearity is confined to control variables only
D) When the treatment effect is positive
46
Potential remedies for when your model suffers from the endogeneity problem include all of the following except what?

A) Gather (additional) data on a possible instrumental variable.
B) Gather (additional) longitudinal data to allow for a fixed effect approach.
C) Gather (additional) control variables that would limit the endogeneity concern.
D) Check if the residuals from your regression are uncorrelated with your treatment.
47
The two critical elements required to sign the bias of an omitted variable include the sign of the:

A) effect of the omitted variable on the outcome and the sign of the effect of the treatment variable on the outcome.
B) effect of the omitted variable on the outcome and the sign of the correlation between the omitted variable and the outcome.
C) effect of the omitted variable on the outcome and the sign of the correlation between the omitted variable and the treatment.
D) correlation between the omitted variable and the outcome and the sign of the correlation between the treatment variable and the outcome.
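
The arithmetic behind signing an omitted variable bias can be checked numerically. The sketch below uses made-up data and hypothetical helper names (`ols_one`, `ols_two`) to verify the in-sample identity that the slope from the short regression (y on x) equals the slope on x from the long regression (y on x and z) plus the long-regression slope on z times the slope from regressing z on x.

```python
def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / len(a)

def ols_one(y, x):
    """Slope from regressing y on x and an intercept."""
    return cov(x, y) / cov(x, x)

def ols_two(y, x, z):
    """Slopes on x and z from regressing y on x, z, and an intercept."""
    sxx, szz, sxz = cov(x, x), cov(z, z), cov(x, z)
    sxy, szy = cov(x, y), cov(z, y)
    det = sxx * szz - sxz ** 2
    return (szz * sxy - sxz * szy) / det, (sxx * szy - sxz * sxy) / det

# Made-up data: x is the treatment, z is the variable left out of the short regression.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
z = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]
u = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.1]
y = [1 + 2 * a + 3 * b + e for a, b, e in zip(x, z, u)]

short = ols_one(y, x)             # z omitted
long_x, long_z = ols_two(y, x, z)
delta = ols_one(z, x)             # how z moves with x in the sample

bias = long_z * delta             # equals short - long_x
print("short:", short, "long:", long_x, "bias:", bias)
```

In this made-up sample both the slope on z and the association between z and x are positive, so the short regression overstates the slope on x. Flipping the sign of either ingredient flips the sign of the bias.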
48
Suppose that you have many employee-level observations of longevity and the wage earned by that employee that year. You run the regression $\text{Longevity}_i = \beta_0 + \beta_1 \text{Wage}_i + U_i$ and get an estimate of $\beta_1$. Now, suppose that a member of the analytics team suggests that education is an omitted variable in your regression and is likely biasing your estimate of $\beta_1$. Suppose you knew that, conditional on wage, more educated employees tended to have shorter stints with the company (lower longevity) and that the error, $\eta_i$, in the equation $\text{Longevity}_i = \beta_0 + \beta_1 \text{Wage}_i + \beta_2 \text{Education}_i + \eta_i$ was uncorrelated with both wage and education. How would you sign the bias on your estimate of $\beta_1$?

A) Argue that education is positively correlated with wage and that your estimate of $\beta_1$ is an upper bound.
B) Argue that education is negatively correlated with wage and that your estimate of $\beta_1$ is a lower bound.
C) Argue that education is positively correlated with wage and that your estimate of $\beta_1$ is a lower bound.
D) Argue that education is positively correlated with longevity and that your estimate of $\beta_1$ is an upper bound.
49
Suppose you have many observations of the scores students got on an exam, along with other characteristics of the students, including how many hours they studied for the exam that week and their current major GPA. Now, suppose you ran the standard linear regression of the exam grade on the number of hours studied, $\text{Grade}_i = \beta_0 + \beta_1 \text{Hours Studied}_i + U_i$, and got an estimate of $\beta_1$; call it $b_1$. Now, suppose a colleague tells you that the students with above-average GPAs were the students who studied more for the exam. How does your estimate of $\beta_2^*$ in the following regression, $\text{Grade}_i = \beta_0^* + \beta_1^* \text{Hours Studied}_i + \beta_2^* \text{Major GPA}_i + U_i$, inform you about what your estimate of $\beta_1^*$ will be for that same regression?

A) If $\beta_2^* > 0$, then $\beta_1^* > b_1$
B) If $\beta_2^* > 0$, then $\beta_1^* < b_1$
C) If $\beta_2^* < 0$, then $\beta_1^* > 0$
D) If $\beta_2^* > 0$, then $\beta_1^* > 0$
50
Suppose you have many observations of the price of refrigerators, along with other characteristics, including their energy cost and whether or not they are stainless steel. Now, suppose you ran the standard linear regression of price on energy cost, $\text{Price}_i = \beta_0 + \beta_1 \text{Energy Cost}_i + U_i$, and got an estimate of $\beta_1$. Now, suppose you know that stainless steel refrigerators are more popular (i.e., sell for higher prices holding all other characteristics constant) and tend to be more energy-efficient (i.e., lower energy cost) models. What should you expect your estimate of $\beta_1^*$ in the following regression, $\text{Price}_i = \beta_0^* + \beta_1^* \text{Energy Cost}_i + \beta_2^* \text{Stainless Steel}_i + U_i$, to be?

A) $\beta_1^* < \beta_1$
B) $\beta_1^* > \beta_1$
C) $\beta_1^* = \beta_1$
D) You cannot tell from the information given.