Econometrics Final Exam
| Question | Answer |
|---|---|
| When is the F-statistic used? | To test joint hypotheses about regression coefficients |
| If H0: β1 = 0 and β2 = 0, how many restrictions are there? | Two |
| Homoskedasticity defn. | Variance of the error term is constant across values of the regressor: Var(Yi \| Xi = x′) = Var(Yi \| Xi = x″) for all x′, x″ |
| Heteroskedasticity defn. | Variance of the error term depends on the regressor: Var(Yi \| Xi = x′) ≠ Var(Yi \| Xi = x″) for some x′, x″ |
| How does adding additional regressors affect R2? | It inflates it: R2 = 1 - SSR/TSS, and SSR never increases when a regressor is added, so R2 never falls. We need the adj. R2, which penalizes extra regressors |
| Pitfalls of R2 and adj. R2 | 1. An increase in R2 or adj. R2 does not mean the added variable is statistically significant 2. Correlation does not mean causality 3. A high R2 or adj. R2 does not mean there is no OVB (the reverse is also true) 4. A high R2 or adj. R2 does not necessarily mean you have the most appropriate set of regressors, nor does a low one mean the set is inappropriate |
| How do we know if a regressor is statistically significant? | Perform a t-test |
| General approach for modeling nonlinear regressions | 1. Identify nonlinearity using knowledge of economics 2. Specify a nonlinear function and use OLS to estimate coefficients 3. Test the null (that the regression is linear) against the alternative (that it’s nonlinear) using t- and F-tests 4. Plot the estimated nonlinear regression function and check how well it fits the data |
| Polynomials | The regressor, X, stays the same, only the powers of X change (Yi = β0 + β1 Xi + β2 Xi^2 + . . . + βr Xi^r + ui) |
| When using polynomials, what do we know about the coefficients on the regressors? | They don’t have a simple interpretation; the effect on Y of a change in X depends on the value of X. |
| Three log models | 1. Yi = β0 + β1 ln(Xi) + ui; 1% ∆X --> ∆Y of 0.01 × β1 (in units of Y) 2. ln(Yi) = β0 + β1 Xi + ui; 1 unit ∆X --> ∆Y of 100 × β1 % 3. ln(Yi) = β0 + β1 ln(Xi) + ui; 1% ∆X --> ∆Y of β1 % (an elasticity; see the log-log sketch after the table) |
| Interaction between independent variables | Include the product X1 × X2 as a regressor; its coefficient captures the effect of X1 and X2 together, above and beyond their independent effects |
| Internal validity | Statistical inferences about causality are relevant to the population being studied |
| External validity | Statistical inferences about causality can be generalized from the population and setting studied to other populations and settings |
| How do we eliminate OVB? | Identify the likely causes of OVB, include the questionable variables as regressors, and test whether they have nonzero coefficients |
| Simultaneous causality bias | X causes Y, Y causes X |
| Threats to internal validity | 1. OVB 2. Linear regression used for nonlinear data 3. Measurement error in the regressors 4. Sample selection bias 5. Simultaneous causality |
| When are instrumental variables used? | When X is correlated with error term, ui |
| In IV regression, X has two parts | 1. Part correlated with ui 2. Part uncorrelated with ui; the instrument isolates the uncorrelated part |
| Goal using IV | To isolate movements in X uncorrelated with ui |
| Ways X is correlated with ui | 1. OVB 2. Measurement errors 3. Simultaneous causality |
| Endogenous variable | X correlated with ui |
| Exogenous variable | X uncorrelated with ui |
| Two conditions for valid instrument | 1. Instrument relevance: corr(Zi, Xi) ≠ 0; 2. Instrument exogeneity: corr(Zi, ui) = 0; **instruments that fulfill these two conditions capture movements in Xi uncorrelated with ui |
| 2SLS, what is Z? | An instrument: a variable used to isolate the variation in X that is uncorrelated with ui, so that ∆Y from a unit ∆X can be estimated consistently |
| What are the two stages in 2SLS and what do they do? | 1. Regress Xi on the instruments (Z1i, . . ., Zmi) and the included exogenous variables (W1i, . . ., Wri) using OLS and compute the predicted values (Xhat1i, . . ., Xhatki); this breaks each Xi into an exogenous component (Xhat) and an endogenous component (vi) 2. Regress Yi on (Xhat1i, . . ., Xhatki) and (W1i, . . ., Wri) using OLS |
| 2SLS estimator with one instrument | β1hat2SLS = Szy / Szx --> Szy = sample cov(Z, Y); Szx = sample cov(Z, X) (see the 2SLS sketch after the table) |
| General IV regression model, what are the four types of variables? | 1. Y, dependent 2. X, endogenous regressor 3. W, included exogenous variables (not correlated w/ ui) 4. Z, instrumental variable |
| For IV regression to work, what do we know about the relationship between m (# of IVs) and k (# of endogenous regressors)? | We need m ≥ k: exactly identified if m = k, overidentified if m > k, underidentified (IV fails) if m < k |
| General IV regression model | Yi = β0 + β1 X1i + . . . + βk Xki + βk+1 W1i + . . . + βk+r Wri + ui; X1i, . . ., Xki = k endogenous regressors, W1i, . . ., Wri = r included exogenous regressors, β0, β1, . . ., βk+r = unknown regression coefficients, Z1i, . . ., Zmi = m IVs |
| Ideal randomized controlled experiments consist of what? | 1. Randomly selected individuals 2. Randomly assigned to treatment groups 3. A comparison of the treatment and control groups |
| Causal effect on Y of treatment level, X | E(Y \| X = x) - E(Y \| X = 0), where E(Y \| X = x) = treatment group expected value and E(Y \| X = 0) = control group expected value |
| Causal effect is synonymous with what? | Treatment effect |
| Quasi-experiment | Randomization is introduced by variations in individual circumstances, as if treatment had been randomly assigned |
| Potential problems with experiments | 1. Not random 2. Subject doesn’t follow protocol 3. Subject drops out of study 4. Subject, conscious of experiment, behaves differently |
| What assumptions hold and which are violated with heteroskedasticity? | (A1) - (A3) hold, (A4-i) is violated |
| At a 5% significance level, we will reject the null (H0) if | \|βhat/SE(βhat)\| > 1.96 |
| If R2 is low then we know what about the goodness of fit? | Goodness of fit = R2 = ESS/TSS; if it is low, then many factors other than X affect Y. |
| If R2 is low, how does this affect our interpretation of the regression coefficient (the slope)? | Low R2 values do not affect the interpretation of the slope. |
| What is the regressor? | The X variable |
| What relation do the fitted residuals and the regressor have to satisfy? | Σ ûi Xi = 0 |
| What is the “true” regression model corresponding to the linear empirical model of Y and X? | Yi = α + β(Xi) + εi, where Xi and Yi need to be specifically defined (say, Xi = age and Yi = earnings) |
| What is contained in εi? | Other factors that affect Y other than X |
| Four basic assumptions about the linear model: | (A1): linearity; (A2): E(εi \| Xi) = 0 (εi cannot be predicted by Xi); (A3): (Yi, Xi) i.i.d.; (A4-i): Var(εi \| Xi) = σ²ε; (A4-ii): εi and Xi have finite fourth moments |
| What does the i.i.d assumption mean? | i.d. - identically distributed: all observations are drawn from the same population. i. - independent: observations are drawn at random (each individual has the same chance of being drawn). |
| How is covariance affected if a sample is not i.i.d.? | If the observations are not i.i.d., knowledge about one observation may convey knowledge about another, and thus the covariance of εi and εj may not necessarily be zero for two observations i and j. |
| Which is the crucial assumption that allows you to interpret the slope estimate as the causal effect of age on earnings? | The crucial assumption is E(εi \| Xi) = 0. OLS obtains estimates from the correlation of earnings and age among different individuals in the sample. Only if nothing else that affects earnings varies systematically with age does this comparison yield meaningful results. |
| Var(__) | E(__²) - [E(__)]² |
| Solution to perfect multicollinearity | Drop one of the multicollinear regressors or the constant |
| R^2 formula | 1 - SSR/TSS |
| Adj. R^2 formula | 1 - [(n - 1)/(n - k - 1)] × SSR/TSS (see the first sketch after the table) |
| Five basic assumptions of the multivariable model | The four basic assumptions about the linear model, (A1)-(A4), plus (A5): no perfect multicollinearity |
| What are the key assumptions to the multivariable model that cannot be easily relaxed or verified? | (A2) and (A3) |
| What are the key assumptions to the multivariable model that can be relaxed or verified? | (A1) and (A4a); “relaxed” means we can use heteroskedasticity-robust std. errors |
| What are the key assumptions to the multivariable model that can be verified by observing the data? | (A4b) and (A5); “checked” means we can test for linearity and heteroskedasticity |
| Our key assumptions lead us to what 4 conclusions about the estimators? | 1. Unbiased 2. Consistent as N --> infinity 3. Approximately normally distributed 4. Efficient: the OLS estimates are the most precise |
| F-Statistic formula | (SSRr - SSRu)/SSRu × (n - k - 1)/q --> k = number of regressors in the unrestricted model, q = number of restrictions (see the F-statistic sketch after the table) |
| F-Statistic special case, when TSSr = TSSu | (R^2u - R^2r)/(1 - R^2u) × (n - k - 1)/q |
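
The sketches below illustrate several of the cards above on simulated data. First, R², adj. R², the residual condition Σ ûi Xi = 0, and the t-test for significance. This is a minimal numpy sketch; the data-generating process, seed, and variable names are illustrative assumptions, not anything from the deck:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y = 2.0 + 0.5 * x + rng.normal(size=n)        # assumed DGP: true slope beta1 = 0.5

X = np.column_stack([np.ones(n), x])          # constant + one regressor (k = 1)
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

u_hat = y - X @ beta_hat                      # fitted residuals
print(np.sum(u_hat * x))                      # ~ 0: sum of u-hat_i * X_i holds by construction

SSR = np.sum(u_hat ** 2)
TSS = np.sum((y - y.mean()) ** 2)
k = 1
R2 = 1 - SSR / TSS                            # R^2 = 1 - SSR/TSS
adj_R2 = 1 - (n - 1) / (n - k - 1) * SSR / TSS

# Homoskedasticity-only SE and t-statistic for beta1
sigma2_hat = SSR / (n - k - 1)
var_beta = sigma2_hat * np.linalg.inv(X.T @ X)
t = beta_hat[1] / np.sqrt(var_beta[1, 1])
print(R2, adj_R2, t)                          # reject H0: beta1 = 0 at 5% if |t| > 1.96
```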
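Next, the F-statistic formula computed from restricted vs. unrestricted SSR, for the joint null H0: β1 = 0 and β2 = 0 (q = 2 restrictions). Again the DGP and names are assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 0.3 * x1 + 0.0 * x2 + rng.normal(size=n)   # assumed DGP: beta2 = 0 in truth

def ssr(X, y):
    """Sum of squared residuals from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ beta) ** 2)

X_u = np.column_stack([np.ones(n), x1, x2])   # unrestricted: constant, x1, x2
X_r = np.ones((n, 1))                         # restricted: beta1 = beta2 = 0

k, q = 2, 2                                   # regressors (unrestricted), restrictions
F = ((ssr(X_r, y) - ssr(X_u, y)) / q) / (ssr(X_u, y) / (n - k - 1))
print(F)                                      # compare with the F(q, n - k - 1) critical value (~3.0 at 5%)
```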
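A quick numerical check of the log-log card: the slope on ln(X) recovers the elasticity. The true elasticity of 0.8 is an assumption built into the simulated data:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000
x = np.exp(rng.normal(size=n))                # positive regressor so ln(x) is defined
y = np.exp(0.5 + 0.8 * np.log(x) + 0.1 * rng.normal(size=n))

# Log-log model: regress ln(y) on a constant and ln(x)
X = np.column_stack([np.ones(n), np.log(x)])
beta, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
print(beta[1])                                # ~ 0.8: a 1% change in X goes with a ~0.8% change in Y
```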
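Finally, the one-instrument 2SLS card: the ratio of sample covariances Szy/Szx matches the explicit two-stage procedure. The endogenous DGP (X correlated with u through a shared shock) is an assumed example, and only the coefficient is checked, since naive second-stage standard errors would be wrong:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000
z = rng.normal(size=n)                          # instrument: relevant and exogenous
u = rng.normal(size=n)                          # error term
x = 0.8 * z + 0.5 * u + rng.normal(size=n)      # endogenous: corr(x, u) != 0
y = 1.0 + 2.0 * x + u                           # assumed DGP: true beta1 = 2

# Formula version: beta1hat_2SLS = Szy / Szx
beta_iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

# Two-stage version: stage 1 regresses x on z, stage 2 regresses y on x-hat
Z = np.column_stack([np.ones(n), z])
g, *_ = np.linalg.lstsq(Z, x, rcond=None)
x_hat = Z @ g                                   # exogenous component of x
b, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), x_hat]), y, rcond=None)

# Plain OLS of y on x for comparison (biased because x is endogenous)
ols, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), x]), y, rcond=None)
print(beta_iv, b[1], ols[1])                    # both IV numbers ~ 2; OLS is biased upward
```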