Predicting football match outcomes...
Well, it took long enough, as these things always do, but I now have a model that predicts football match outcomes. I'll assess its predictions in the weeks to come if I get a moment, but I'm rushing towards a thesis submission in May, so time is getting tight.
Nevertheless, the predictions this model provides are very much in line with what the bookies think, and with the predictions of Finktank, the Dectech enterprise that publishes predictions on football matches each week in the Times. What is this model? It's a very simple model which treats the number of goals each team scores in a match as a draw from a Poisson distribution. The Poisson distribution matches the arrival of goals in football matches incredibly well, and so if X is home goals in a match and Y is away goals, each can be given its own Poisson distribution, with the two treated as independent of one another. It's likely of course that there's some interaction between the two, and in that case a bivariate Poisson distribution of (X,Y) would be sought.
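To make the double Poisson idea concrete, here is a minimal sketch of how win, draw and loss probabilities fall out of two independent Poisson distributions. The scoring rates passed in at the bottom are made up purely for illustration (they are not my fitted values), and the function names and the cut-off at 15 goals are just my choices for the sketch.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """Probability of exactly k goals when goals arrive at mean rate lam."""
    return exp(-lam) * lam**k / factorial(k)


def outcome_probabilities(home_rate, away_rate, max_goals=15):
    """Return (home win, draw, away win) under two independent Poissons.

    Scorelines above max_goals per side are ignored; at football-like
    rates the probability mass out there is negligible.
    """
    home_win = draw = away_win = 0.0
    for x in range(max_goals + 1):
        p_x = poisson_pmf(x, home_rate)
        for y in range(max_goals + 1):
            p = p_x * poisson_pmf(y, away_rate)
            if x > y:
                home_win += p
            elif x == y:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win


# Hypothetical rates, for illustration only.
home_win, draw, away_win = outcome_probabilities(1.4, 1.1)
print(f"home {home_win:.1%}, draw {draw:.1%}, away {away_win:.1%}")
```

In a fitted model the two rates would come from estimated attack and defence strengths plus a home advantage term; the sketch skips estimation entirely and only shows the step from rates to outcome probabilities.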
Making predictions from such bivariate models is a trivial extension of what I've done already, but I haven't the time to do that yet. Maybe in May. Once the season's finished...
Anyhow, in the meantime, what does the model predict?
Take the currently running Liverpool v Man U match. Finktank thought beforehand Liverpool had a 43.4% probability of winning, with the probability of a draw at 29% and a Man U win at 28%. The bookies favoured a Man U win over a draw, suggesting that Liverpool had a 36% chance of winning, the draw at 31% and the Man U win at 33%. The simple double Poisson model I've run (on all matches before today this season in the Premiership) suggests that Liverpool have a 37% chance of winning, the draw at 30% and Man U at 33%.
After 65 minutes, the match is locked at 0-0, and the first half was a very tight affair, reflecting these evenly matched odds. Contrast that with the Arsenal v Reading match, which is a home banker if anyone ever saw one: Finktank see the Arsenal win at 70%, the bookies at 61%, and my model at 71%.
So what about the match I'm most bothered about? That is, Carlisle United vs Oldham Athletic. As things stand, the model can't take form into account. If it did, it would put a stronger probability on Carlisle, as Oldham are in terrible form, with four consecutive defeats severely hampering their promotion hopes. Nonetheless, Oldham have a 39% probability of winning, Carlisle a 32% chance, the draw at 29%. What's the most likely score? The probability of a 1-1 draw is 13.3%, followed by a 1-0 Oldham win at 12.6%, with finally a Carlisle 1-0 win at 11%. There's a 10.6% chance of a scoreless draw. All these orderings are the same as Finktank's for this match, although Finktank make Oldham slightly stronger favourites at 41%. As part of my summer research I'd like to look into exactly what Finktank base their forecasts on, since they're not massively different from those of a "simple" double Poisson model.
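For anyone wondering how the scoreline probabilities come about, here is a rough sketch. Under two independent Poissons, the probability of a 1-0 win divided by the probability of 0-0 is just that team's scoring rate, so the figures quoted above imply rates of roughly 11/10.6 ≈ 1.04 for Carlisle and 12.6/10.6 ≈ 1.19 for Oldham. Those two numbers are backed out from the rounded percentages, not the fitted values themselves, so treat them (and the names in the code) as assumptions.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """Probability of exactly k goals when goals arrive at mean rate lam."""
    return exp(-lam) * lam**k / factorial(k)


def score_grid(home_rate, away_rate, max_goals=6):
    """Probability of each (home goals, away goals) scoreline up to max_goals."""
    return {
        (x, y): poisson_pmf(x, home_rate) * poisson_pmf(y, away_rate)
        for x in range(max_goals + 1)
        for y in range(max_goals + 1)
    }


# Assumed rates, backed out from the rounded percentages in the post.
CARLISLE_RATE = 1.04  # home side
OLDHAM_RATE = 1.19    # away side

grid = score_grid(CARLISLE_RATE, OLDHAM_RATE)
ranked = sorted(grid.items(), key=lambda item: item[1], reverse=True)

# Print the four most likely scorelines, most likely first.
for (home, away), p in ranked[:4]:
    print(f"Carlisle {home}-{away} Oldham: {p:.1%}")
```

With these rates the four most likely scorelines come out in the same order as above (1-1, then the Oldham 1-0 win, then the Carlisle 1-0 win, then 0-0), with probabilities roughly in line with the quoted ones.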
I'm biased. Three or so years back, Oldham recovered from a poor run of defeats to post a battling 1-0 win at Gillingham. I think Oldham will do that today. No, I hope.
Finally, Finktank suggest you should believe their score predictions because they've beaten the bookies on a lot of occasions. Personally I think you should judge a model by how well it actually predicts scores. So I'll let you know how I do.
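I haven't settled on how I'll measure "predicts well", but one common option for probability forecasts is the Brier score: the squared gap between the forecast probabilities and what actually happened, with lower being better. The sketch below applies it to the three Liverpool v Man U forecasts quoted earlier. Since the match hasn't finished as I write, it simply shows the score each forecast would get under each possible result. The choice of the Brier score and the function name are my assumptions for the sketch, nothing more.

```python
def brier_score(forecast, outcome_index):
    """Multi-outcome Brier score: sum of squared errors against the result.

    forecast is a tuple of probabilities (home win, draw, away win);
    outcome_index says which of the three actually happened.
    """
    return sum(
        (p - (1.0 if i == outcome_index else 0.0)) ** 2
        for i, p in enumerate(forecast)
    )


# Forecasts quoted in the post, as (Liverpool win, draw, Man U win).
forecasts = {
    "Finktank": (0.434, 0.29, 0.28),
    "Bookies": (0.36, 0.31, 0.33),
    "Double Poisson": (0.37, 0.30, 0.33),
}

OUTCOMES = ("Liverpool win", "draw", "Man U win")

# No result is assumed: show the score under every possible outcome.
for index, label in enumerate(OUTCOMES):
    scores = {name: brier_score(f, index) for name, f in forecasts.items()}
    summary = ", ".join(f"{name} {s:.3f}" for name, s in scores.items())
    print(f"If {label}: {summary}")
```

One match says very little, of course; the idea would be to average a score like this over many matches before drawing any conclusions.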
Labels: Arsenal, bivariate Poisson distribution, Carlisle United, forecasting, Liverpool, Manchester United, Oldham Athletic, Reading