The Making Data Work Blog

World Cup Predictor: how to get it right

24-Jul-2018 12:30:00 / by Matteo Pilotto

Germany? Brazil? Spain?
We went for France from the start and... we got it right. How did we do it? Well, we used data.

At CDO Partners we are big on data, and we looked at players' statistics to "predict" the result of the World Cup. Many other companies attempted the same but got it wrong. So what did we do differently that let us correctly predict France's victory in Russia?

Our approach was based on individual players rather than on teams' historical performances or players' international experience. We combined two measures: the players' performance over the last year and their value on the market.

  • Players' Performance: player-level aggregated data and statistics from last season, combined into an indicator of each player's form at the start of the World Cup
  • Players' Value: each player's market value, taken at the end of last season

[Image: Alteryx World Cup Flow]

The final parameter is the output of an Alteryx workflow built to combine the two data sets into a single, consistent indicator. Each team's rating is the average of its players' parameters. To improve the model, we updated each player's score after every game, so that each team's rating adapted to performances throughout the competition.
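
For readers who want to try the idea outside Alteryx, here is a minimal Python sketch of the three steps described above: blend form and market value into a player score, average the scores per team, and nudge a player's score after each game. It is not our actual workflow. The `Player` fields, the min-max scaling, the `form_weight` blend and the `alpha` update rate are all assumptions made for illustration, and the sample players are placeholders rather than real data.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Player:
    name: str
    team: str
    form: float          # assumed: performance indicator already scaled to 0-1
    market_value: float  # assumed: market value in any single currency unit


def normalise(values):
    """Min-max scale a list of numbers to the 0-1 range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]  # all equal: place everyone mid-scale
    return [(v - lo) / (hi - lo) for v in values]


def player_scores(players, form_weight=0.5):
    """Blend form and scaled market value (assumed: a simple weighted sum)."""
    scaled_values = normalise([p.market_value for p in players])
    return {
        p.name: form_weight * p.form + (1 - form_weight) * v
        for p, v in zip(players, scaled_values)
    }


def team_ratings(players, scores):
    """Each team's rating is the average of its players' scores."""
    by_team = {}
    for p in players:
        by_team.setdefault(p.team, []).append(scores[p.name])
    return {team: mean(vals) for team, vals in by_team.items()}


def update_score(old_score, match_rating, alpha=0.2):
    """Move a score towards the latest match rating (assumed: exponential smoothing)."""
    return (1 - alpha) * old_score + alpha * match_rating


# Placeholder squad data, purely illustrative.
players = [
    Player("a1", "Team A", form=0.8, market_value=100.0),
    Player("a2", "Team A", form=0.6, market_value=50.0),
    Player("b1", "Team B", form=0.4, market_value=0.0),
    Player("b2", "Team B", form=0.6, market_value=50.0),
]

scores = player_scores(players)
print(team_ratings(players, scores))

# After a game, refresh one player's score and recompute the team ratings.
scores["b1"] = update_score(scores["b1"], match_rating=0.9)
print(team_ratings(players, scores))
```

A higher `alpha` makes ratings react faster to the latest game, at the cost of more noise from a single good or bad performance.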

Our unbiased view of the teams wasn't swayed by opinions. Even though we wanted to give England the highest rating, the data said from the beginning it wasn't coming home... but going to Paris instead.

We are not geniuses, perfect predictors or match gamblers, and we had a bit of luck in predicting France as the winner. We did get things wrong: it may be nearly impossible to predict every game, especially given the shock exits of some of the world's giants. Not even data could foresee what happened to Germany.

The difficulty in predicting football scores or results comes from the high number of variables that affect the outcome of every match, from each player's form to his mental strength to the "team chemistry". Our aim for the future is to make the model more robust by including more variables and testing them over time.

As the new Premier League season approaches, we are partnering with The Argus on a weekly predictor for the league, so keep an eye out for our posts to see if we can repeat our success!

 

Topics: "alteryx", 2018, StorytellingWithData, Making Data Work, Insider, Tableau, Visualisations

Written by Matteo Pilotto