HOW TO SPECIFY YOUR MODEL?
[1] What is the goal in creating a regression model?
Ans: We want to capture the underlying causal mechanisms which relate the variables in our model. This is why a regression of C on Y is correct but a regression of Y on C is not: causality runs from income to consumption, and not the other way around.
[2] Misspecification of causal relationships is VERY common (like running Y on C), but there is no easy way to detect and fix this problem -- all of the complex and confusing discussion about exogeneity and endogeneity is an attempt to tackle this problem -- but all of these attempts are failures, because they lack understanding of the root of the problem. This is why even the best definitions of exogeneity in textbooks and econometric articles are just -- plain and simple -- WRONG.
[3] This problem is too complex to handle here, so let us assume it away. ASSUME you have managed to get the right dependent variable Y on the LHS and all variables on the RHS are exogenous. Now the main issue is: IS your model correctly specified? Have you included all relevant variables? If you miss an important variable, then all your results will be wrong. For example, if you run Pakistani Consumption on Guatemala GNP you will get a very good regression with high R-squared, significant t statistics, right signs and everything. Guatemala GNP will be significant because you have omitted an important variable, Pakistan GNP, from the equation.
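The Guatemala example can be illustrated with a small simulation. This is only a sketch on MADE-UP data, not real national accounts: the series gnp_pak, gnp_gua and cons_pak, the trend slopes, and the helper ols() are all assumptions invented for the illustration. Both GNP series share a time trend, and consumption is built to depend ONLY on Pakistan GNP.

```python
import numpy as np

def ols(y, X):
    """OLS with an intercept; returns coefficients, t statistics and R-squared."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    t_stats = beta / np.sqrt(np.diag(cov))
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return beta, t_stats, r2

rng = np.random.default_rng(0)
n = 50
trend = np.arange(n)

# Simulated series (hypothetical numbers): both GNPs grow over time.
gnp_pak = 100 + 3.0 * trend + rng.normal(0, 5, n)
gnp_gua = 20 + 0.5 * trend + rng.normal(0, 1, n)
# By construction, consumption depends on Pakistan GNP only.
cons_pak = 10 + 0.8 * gnp_pak + rng.normal(0, 3, n)

# Misspecified model: Pakistan GNP omitted.
beta_short, t_short, r2_short = ols(cons_pak, gnp_gua)
# Model that includes the relevant variable.
beta_long, t_long, r2_long = ols(cons_pak, np.column_stack([gnp_pak, gnp_gua]))

print("Omitting Pakistan GNP: R2 =", round(r2_short, 3),
      " t(Guatemala) =", round(t_short[1], 2))
print("Including Pakistan GNP: coef(Pakistan) =", round(beta_long[1], 3),
      " t(Guatemala) =", round(t_long[2], 2))
```

In the first regression Guatemala GNP picks up the trend that really belongs to Pakistan GNP, so it looks important. Once Pakistan GNP is included, the Guatemala coefficient should shrink sharply toward zero, since it has no role in how the data were generated.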
[4] If you have a static model, there is a chance of dynamic misspecification -- that is, maybe the past is relevant, but you don't know about this, because you have not included any lagged variables. If you are omitting an important lagged variable X(t-1) or Y(t-1) from your regression, then your equation is misspecified, and the results cannot be trusted -- like Guatemala GNP, irrelevant variables may appear to be important.
[5] The problem to test for is DYNAMIC MIS-SPECIFICATION: Is any lagged regressor significant? The way to do this is to put all lagged regressors into the model, and do a joint F test for significance of all of them. If this F test fails to reject the null hypothesis that all of the coefficients are jointly zero, this means that there is no strong evidence for dynamic misspecification -- your model has NOT omitted a significant lagged effect.
[6] Serial correlation is just ONE special type of dynamic misspecification, which is included as a VERY special case of the general dynamic misspecification which we tested by the F test. If the model is NOT dynamically misspecified, then there can be no serial correlation. There is no need to test separately for serial correlation.
[7] If the F test in (5) rejects the null, the model IS dynamically misspecified and one needs lagged regressors in the model. There are SEVERAL different possible patterns which could occur in the lagged variables -- several SPECIAL types of dynamic misspecification. Koyck lags are one of them. Serial correlation is another: it says that the lagged variables affect the current period in ONLY one way, through the error which occurred in the last period. This is one possible case -- there is no reason to make this special case the ONLY type of dynamic misspecification that you should consider, test for, and correct.
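The joint F test on lagged regressors can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: it uses only ONE lag each of X and Y (the text says to include all lagged regressors, so in practice you would choose the lag length yourself), the data are simulated, and the names lag_f_stat and rss are hypothetical helpers. The static model is the restricted regression; the model with X(t-1) and Y(t-1) added is the unrestricted one.

```python
import numpy as np

def rss(y, X):
    """Residual sum of squares from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def lag_f_stat(y, x):
    """F statistic for H0: the coefficients on x(t-1) and y(t-1) are both zero."""
    y_t, x_t = y[1:], x[1:]          # current-period values
    y_lag, x_lag = y[:-1], x[:-1]    # one-period lags (assumed lag length)
    ones = np.ones(len(y_t))
    X_r = np.column_stack([ones, x_t])                 # static (restricted) model
    X_u = np.column_stack([ones, x_t, x_lag, y_lag])   # with lagged regressors
    q = X_u.shape[1] - X_r.shape[1]                    # number of restrictions
    dof = len(y_t) - X_u.shape[1]
    rss_r, rss_u = rss(y_t, X_r), rss(y_t, X_u)
    return ((rss_r - rss_u) / q) / (rss_u / dof)

rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)
e = rng.normal(size=n)

# Case A (simulated): a truly static model, no lagged effects.
y_static = 1.0 + 2.0 * x + e

# Case B (simulated): same model but with AR(1) errors, rho = 0.7.
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.7 * u[t - 1] + e[t]
y_ar1 = 1.0 + 2.0 * x + u

f_static = lag_f_stat(y_static, x)
f_ar1 = lag_f_stat(y_ar1, x)

# With 2 restrictions and about 200 observations, the 5% critical value is roughly 3.
print("F statistic, static model:", round(f_static, 2))
print("F statistic, AR(1) errors:", round(f_ar1, 2))
```

Case B shows why a separate serial correlation test is not needed once this F test has been done. A static model Y(t) = a + b X(t) + u(t) with u(t) = rho u(t-1) + e(t) can be rewritten as Y(t) = a(1 - rho) + rho Y(t-1) + b X(t) - rho b X(t-1) + e(t). That is a model with lagged regressors whose coefficients obey one special restriction, so the joint F test on Y(t-1) and X(t-1) should pick it up.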