Tuesday, September 12, 2006

Model Averaging and trying to get some theory behind my results

For the entirety of my Ph.D. thus far, I've been dabbling with model averaging. It's helped me get to two conferences in nice places (Oslo and Santander), but it has always left me unsatisfied. I don't like rubbishing the work of others, and destroying a modelling/forecasting technique is much easier than proposing one.

However, model averaging is gaining acceptance as a modelling methodology as well as a forecasting methodology, and this is pretty concerning if one wants econometrics to tell us what the data say, and not what any particular econometrician would like to tell us. Even were model averaging to be disproved as a viable strategy, however, it would not alter the fact that the Bible tells us this world is fallen, and as such, people will always try to manipulate and/or hide evidence to back their position up. So in this work I don't hold out some lofty ideal that all the ailments of econometrics will be solved, but I hope to make a positive contribution.

It's a concern because model averaging at its worst provides biased regression coefficients and can tell us next to nothing about the effect any particular variable has on the variable/parameter of interest. This is because model averaging takes every possible subset of variables from a K-variable dataset (2^K models in all), and averages the results of these individual models. It's reasonable to suggest that within the set of models averaged over, there will be a "best" model. The "true" model might not be in there, but there will be a model which best captures the variation in the variable of interest, and is coherently specified, with no autocorrelation or heteroskedasticity, and with normal residuals. But model averaging will give this "best" model a weight of less than one when it averages, and will give non-zero weights to very bad models.
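To make the mechanics concrete, here is a minimal sketch of averaging over every subset of regressors. It assumes OLS for each model and BIC-based weights, which is only one of several weighting schemes used in the literature; the simulated data, the function names and the choice of K = 4 are all hypothetical, chosen purely for illustration.

```python
import itertools
import numpy as np

def fit_subset(X, y, subset):
    """OLS of y on an intercept plus the columns of X listed in `subset`."""
    n = len(y)
    Z = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    rss = float(np.sum((y - Z @ beta) ** 2))
    # BIC up to an additive constant (assumed weighting criterion).
    bic = n * np.log(rss / n) + Z.shape[1] * np.log(n)
    return beta, bic

def average_all_subsets(X, y):
    """Average slope coefficients over all 2^K subset models."""
    n, K = X.shape
    subsets = [s for r in range(K + 1)
               for s in itertools.combinations(range(K), r)]
    coefs = np.zeros((len(subsets), K))
    bics = np.empty(len(subsets))
    for m, subset in enumerate(subsets):
        beta, bics[m] = fit_subset(X, y, subset)
        if subset:
            # Variables left out of a model keep a coefficient of zero.
            coefs[m, list(subset)] = beta[1:]
    # Weights proportional to exp(-BIC/2), normalised to sum to one.
    weights = np.exp(-0.5 * (bics - bics.min()))
    weights /= weights.sum()
    return subsets, weights, weights @ coefs

# Hypothetical data: only the first two of four regressors matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = 1.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=100)

subsets, weights, avg_coefs = average_all_subsets(X, y)
print("models averaged over:", len(subsets))
print("largest weight:", weights.max())
print("smallest weight:", weights.min())
print("averaged coefficients:", avg_coefs)
```

Under this weighting scheme every model, however poor, receives a strictly positive weight, and no single model receives a weight of one, so the averaged coefficients always blend in estimates from models that omit relevant variables or include irrelevant ones.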

One can see some sense in the arguments for model averaging: will we ever know which model this "best" model is? Won't there be a good few of these "good" models? Answer: yes. But one then needs to select these "best" models, average only over them, and cut out the very bad models that one is bound to include if one simply averages over every model in the space.

This is the crux of my work. Now, using Monte Carlo simulations, this is very easy to show. Even in very benign situations (i.e. perfectly nice datasets with none of the problems real-world datasets face), forecasting with model averaging appears to be a bad idea. But showing theoretically why this is the case is less straightforward, and a lot messier. However, it's vital if people are actually to read anything I write on this and put on the internet, and if I'm to have any hope of putting it towards my Ph.D.
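As a rough illustration of what such a Monte Carlo experiment might look like, the sketch below compares out-of-sample forecast errors from averaging over every subset model with those from averaging over only the few top-ranked models. Everything here is an assumption made for the example: the benign data-generating process, BIC-based weights, the sample sizes, and the `keep=3` cut-off. It is not the design behind the results described above, and whatever numbers it prints apply only to this toy setup.

```python
import itertools
import numpy as np

def subset_forecasts(X_train, y_train, X_test):
    """Return the BIC and test-set forecasts of every subset model."""
    n, K = X_train.shape
    bics, preds = [], []
    for r in range(K + 1):
        for subset in itertools.combinations(range(K), r):
            cols = list(subset)
            Z = np.column_stack([np.ones(n), X_train[:, cols]])
            Z_test = np.column_stack([np.ones(len(X_test)), X_test[:, cols]])
            beta, *_ = np.linalg.lstsq(Z, y_train, rcond=None)
            rss = float(np.sum((y_train - Z @ beta) ** 2))
            bics.append(n * np.log(rss / n) + Z.shape[1] * np.log(n))
            preds.append(Z_test @ beta)
    return np.array(bics), np.array(preds)

def bic_weights(bics):
    """Weights proportional to exp(-BIC/2), normalised to sum to one."""
    w = np.exp(-0.5 * (bics - bics.min()))
    return w / w.sum()

def monte_carlo(n_reps=200, n_train=50, n_test=20, K=4, keep=3, seed=0):
    """Mean forecast MSE: averaging all models vs. only the `keep` best."""
    rng = np.random.default_rng(seed)
    # Assumed truth: two relevant regressors, the rest irrelevant (K >= 2).
    true_beta = np.array([1.0, 0.5] + [0.0] * (K - 2))
    mse_all, mse_top = [], []
    for _ in range(n_reps):
        X = rng.normal(size=(n_train + n_test, K))
        y = X @ true_beta + rng.normal(size=n_train + n_test)
        X_tr, y_tr = X[:n_train], y[:n_train]
        X_te, y_te = X[n_train:], y[n_train:]
        bics, preds = subset_forecasts(X_tr, y_tr, X_te)
        # Forecast 1: weighted average over all 2^K models.
        f_all = bic_weights(bics) @ preds
        # Forecast 2: weighted average over the `keep` lowest-BIC models.
        top = np.argsort(bics)[:keep]
        f_top = bic_weights(bics[top]) @ preds[top]
        mse_all.append(np.mean((y_te - f_all) ** 2))
        mse_top.append(np.mean((y_te - f_top) ** 2))
    return float(np.mean(mse_all)), float(np.mean(mse_top))

mse_all, mse_top = monte_carlo()
print("mean forecast MSE, all models:", mse_all)
print("mean forecast MSE, top models only:", mse_top)
```

Changing the weighting scheme, the number of irrelevant regressors, or the cut-off is a one-line change, which makes a setup like this convenient for checking how sensitive any conclusion is to the design.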

Here's hoping...

