Decoding the football world cup through data analytics

The world has been in the midst of World Cup football fever for the last few weeks. In India, late-night games with penalty shootouts have left many office workers groggy. Some big-name countries like Italy did not even make it to this World Cup (widely blamed on coach Giampiero Ventura) while others, like Germany (which lost to Mexico and South Korea), have already been knocked out.

The low frequency of goals and the history of “upsets” have led to the belief that football, unlike many other sports, cannot really be deconstructed analytically with the help of data. Sports such as baseball, American football, basketball and cricket are now studied in analytical depth by teams seeking a competitive advantage.

The 2011 movie Moneyball was based on a book that discussed an analytical system used in baseball by the Oakland Athletics to assemble a top-notch team on a limited budget. Firms like Cricket-21 provide detailed match analysis, including data, graphics, real-time video clippings, and analysis of the opposition. It is because of techniques like these that India’s Suresh Raina gets bounced out with deliveries at his ribs and the opposition bowls full and straight (an in-swinging delivery if possible) to Australian Shane Watson. Until recently, football had escaped this data-oriented approach.

First, some history for context. The inaugural football World Cup was held in 1930 and won by hosts Uruguay, who beat their neighbours, Argentina, 4-2 in the final. Brazil have won the most World Cups (5), followed by Germany and Italy (4 each). In recent times, the average number of goals per World Cup match has been in the range of 2.2-3.0. It used to be between 3 and 4.5 in the 1950s.

As defence and passing have become more systematic, and fitness a more general requirement, the number of goals has gone down. Miroslav Klose of Germany, with 16 goals, and Ronaldo of Brazil with 15, lead the individual tally for goals in World Cups. The legendary Zinedine Zidane of France leads with the most cards. These aggregate statistics, though, are of little use in predicting the outcome of a specific game.
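One way to see why low scoring makes single games hard to call is a small calculation. The sketch below is purely illustrative and is not a method described in this article: it assumes each team's goals follow independent Poisson distributions, and the scoring rates (1.6 and 1.0 goals per match, which add up to a total inside the 2.2-3.0 range cited above) are made-up numbers. The function names `poisson_pmf` and `outcome_probabilities` are hypothetical.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number of goals is lam."""
    return exp(-lam) * lam ** k / factorial(k)


def outcome_probabilities(lam_strong, lam_weak, max_goals=60):
    """Return (P(strong side wins), P(draw), P(weak side wins)).

    Assumes the two teams score independently of each other, which is a
    simplification. Scorelines above max_goals are ignored; their
    probability is negligible for the rates used here.
    """
    strong = draw = weak = 0.0
    for s in range(max_goals + 1):
        for w in range(max_goals + 1):
            p = poisson_pmf(s, lam_strong) * poisson_pmf(w, lam_weak)
            if s > w:
                strong += p
            elif s == w:
                draw += p
            else:
                weak += p
    return strong, draw, weak


# Illustrative rates only: a football-like match with about 2.6 goals in total.
low = outcome_probabilities(1.6, 1.0)
# Same 1.6 : 1.0 ratio between the sides, but ten times as much scoring.
high = outcome_probabilities(16.0, 10.0)

print("low-scoring  (strong, draw, weak):", [round(p, 3) for p in low])
print("high-scoring (strong, draw, weak):", [round(p, 3) for p in high])
```

Under these assumptions the weaker side wins noticeably more often in the low-scoring match than in the high-scoring one, even though the relative strength of the two teams is identical. That is one plausible reading of why football produces more "upsets" than higher-scoring sports.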

Unlike club football analytics, World Cup analytics is further complicated by a “small data set” problem, since the tournament is played only once every four years. Individual players can represent their countries in multiple World Cups: as recent examples, Thierry Henry of France and Xavi of Spain have each represented their countries four times. Even so, national teams are put together mere weeks before a World Cup and disbanded soon after because players have strict club commitments.
