It’s World Cup crunch time. The group stages are over and it will be knockout games to the final from here on in. From the performances we’ve seen so far, there are numerous contenders for the title. Brazil, the favourites at the start of the tournament, haven’t wowed, and title holders Spain are already out of the running.
If you’re looking to predict how the tournament will progress – and perhaps how your sweepstake will fare if you’re still in it – a good look at the performance data can indicate how things might turn out. This kind of data analysis is the business of analytics, a growth field whose practitioners hold what the Harvard Business Review called the “sexiest job of the 21st century”.
Analytics can be divided into two main areas: explanatory analytics and predictive analytics. The former is backward-looking, a forensic investigation of what has happened and why. The latter, predictive analytics, is forward-looking, forecasting what will happen in the future.
The two are connected. Better understanding of the past can help improve our forecasts of the future. Predictive analytics in one way or another extrapolates from the past into the future. It assumes continuity, but uncertainty always remains because of the myriad chance events that can influence future outcomes.
Predicting who will win the World Cup is a good example of the problems faced by predictive analytics. Just look at Spain: reigning European and World champions and one of the pre-tournament favourites, yet on an early flight home after losing to the Netherlands and Chile. And who would have predicted before the World Cup started that Costa Rica would win their group ahead of three teams ranked in the top ten by FIFA, all previous World Cup winners – England, Italy and Uruguay?
What the group games tell us
Explanatory analytics applied to the team performance data from the group games can give us a clearer understanding of current form and of which teams are best placed to succeed. Statistical analysis shows that the key performance indicators (KPIs) most closely correlated with winning at this World Cup include completed passes, high-activity distance covered, shot accuracy, tackles, clearances and fouls committed. Combining these KPIs with goals scored and conceded, it is possible to produce current-form team ratings going into the knockout phase and to use these ratings to predict the likely destination of the World Cup.
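For readers curious how KPIs might be combined into a single rating, here is a minimal sketch in Python. It is not the method used to produce the ratings in this article, which is not published here. It assumes a simple approach: standardise each measure across teams, then take a weighted sum. The team names, figures and weights are all invented for illustration.

```python
from statistics import mean, pstdev

# Hypothetical per-game figures for three made-up teams; NOT real World Cup data.
teams = {
    "Team A": {"completed_passes": 450, "shot_accuracy": 0.55, "tackles": 20, "goal_difference": 2.0},
    "Team B": {"completed_passes": 380, "shot_accuracy": 0.40, "tackles": 24, "goal_difference": 0.0},
    "Team C": {"completed_passes": 300, "shot_accuracy": 0.35, "tackles": 16, "goal_difference": -2.0},
}

# Assumed weights, chosen only for illustration; the article does not state its weighting.
weights = {"completed_passes": 1.0, "shot_accuracy": 1.0, "tackles": 1.0, "goal_difference": 2.0}


def z_scores(values):
    """Standardise a list of numbers to mean 0 and standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Every team has the same value, so this measure cannot separate them.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def rate_teams(teams, weights):
    """Return a weighted sum of standardised measures for each team."""
    names = list(teams)
    ratings = {name: 0.0 for name in names}
    for kpi, weight in weights.items():
        standardised = z_scores([teams[name][kpi] for name in names])
        for name, z in zip(names, standardised):
            ratings[name] += weight * z
    return ratings


ratings = rate_teams(teams, weights)
ranking = sorted(ratings, key=ratings.get, reverse=True)

for position, name in enumerate(ranking, start=1):
    print(position, name, round(ratings[name], 2))
```

Standardising first puts measures on very different scales, such as passes in the hundreds and shot accuracy as a fraction, on a common footing before they are added together. In a real analysis the weights would more plausibly be estimated from match results than set by hand.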
Current form team rankings after the group stage:
- The Netherlands
- Costa Rica
The Netherlands top the ratings, which is no surprise after they won all three group games, including their comprehensive 5-1 defeat of Spain. Much more surprising is that Colombia rank second, also with a 100% record in the group games. The Netherlands and Colombia are in opposite halves of the draw, so they could conceivably meet in the World Cup Final. To do so, Colombia would probably have to beat Brazil in the Quarter Finals and then either France or Germany in the Semi-Finals, while the Netherlands would most likely need to triumph over Argentina in the Semi-Finals.
Brazil rank fourth and Argentina rank sixth based on their team KPIs, reflecting that the form of both teams has been a little patchy so far. Brazil could only draw with Mexico, and Argentina only sneaked past Iran with a winner in stoppage time. But with Neymar and Messi in their teams, Brazil and Argentina have the individual genius to go all the way.
The Last 16 games all look very predictable based on current form and, at least on paper, all of the group winners should progress. But the Quarter Finals look very hard to call, with one exception: the Netherlands should beat the winners of the Costa Rica-Greece game. Colombia, as already suggested, should beat Uruguay and could surprise Brazil. If France get past Nigeria and Germany beat Algeria, then a France-Germany Quarter Final might need penalties to separate the teams. And Belgium could push Argentina all the way if both teams, as expected, win their Last 16 games.
So, who will win the World Cup?
Although the ratings put the Netherlands as favourites based on their form in the group games, the main conclusion from the statistical analysis of the games is that this is possibly one of the most open and competitively balanced World Cup tournaments ever, with all of the group winners having some grounds for optimism.
The Netherlands against Colombia in the World Cup Final had long odds before the tournament started, but those odds have now shortened considerably. The bookmakers still make Brazil favourites, followed by Germany and Argentina. And you cannot discount either France or Belgium.
Predicting the World Cup, even after half the teams have been eliminated, remains very difficult. Indeed, based on the team ratings, Croatia would have been predicted to reach the Quarter Finals, but the quality of their play did not translate into results. We just do not know who will lift the trophy, and it is that uncertainty of outcome that makes this year’s World Cup so interesting. The only certainty over the next two weeks is that football fans around the world are going to be gripped and entertained by some tense and exciting games. And there will probably be some surprises, so don’t take the table above as gospel.