Every four years, the World Cup sweeps across the globe, turning casual viewers into devoted fans and offering millions a brief respite from everyday worries (although for those enduring conflict in places such as Kyiv or Gaza, the tournament might feel very far away). Beneath the spectacle of the tournament lies a vast commercial enterprise, where global brands compete for attention and football's emotional appeal is transformed into economic value. Host countries justify the huge expense involved in staging the tournament by highlighting the economic benefits that will flow in return. I have never been convinced by this. But you don’t have to take my word for it: This article, by Professor Rob Wilson, makes all the points (and more) that I have been making over the last 20 or so years.
This year’s tournament, which is so big it will be hosted by
three countries, feels overblown. FIFA has expanded the tournament to allow for
48 participants in the final stages, and while this undoubtedly makes the
tournament more inclusive, it is somewhat incongruous that the likes of Curaçao
and Qatar have made it through while four-time winners Italy have not. Although
FIFA’s attempt to broaden the game’s global reach is laudable, Curaçao has a
population of approximately 158,000 – Italy has 22 individual cities with
larger populations. It is hard to make the case that a tournament is
necessarily stronger or more compelling simply because it includes more teams.
While the expansion offers opportunities to nations that would previously have
had little realistic chance of qualification, it also raises questions about
whether the quality of the competition has been diluted.
One consequence of this expansion is that the tournament
will comprise 104 matches versus 64 in 2022. At least it is being held at a
more usual time of year for this tournament, rather than just before Christmas,
but while daytime temperatures may not quite match the highs recorded in Qatar
in 2022, they are still very high. Moreover, there are justifiable concerns
around the demands on top players – many of whom already face congested club
schedules.
However, football is now big business. FIFA expects to earn
around $13bn during the 2023-26 cycle, 72% more than in the previous four-year cycle.
According to Deloitte, the world's 20 highest-earning clubs generated a record
€11.2 billion in revenue in 2023/24 with Real Madrid becoming the first
football club to generate more than €1 billion in annual revenue. Almost half
(44%) of revenue generated by the top clubs is derived from commercial
activities and sponsorships, with a further 38% derived from broadcasting. Just
18% comes from matchday income. Fans who pay to watch football on TV are
actually more valuable to clubs than those who turn up at the stadium.
For all these commercial reservations, there is no doubt
about the appeal of o jogo bonito. Despite the plethora of matches to
wade through (I can assure you I will not be watching all of them) there is
something uniquely compelling about a World Cup tournament. Perhaps we tune in
because we hope for a classic tournament along the lines of Mexico 1970. Or
maybe you hope your team will win (though if you are English or Scottish I
would suggest not raising your hopes too much). Whatever the reason, many
people who do not follow the game regularly might take more than a passing
interest this summer.
Who might win: A statistical analysis
At the outset, I should declare that I have a notoriously
bad track record of predicting the tournament winner. That has not stopped me
from having another go at running a statistical model, largely because it is a
fun programming exercise, and partly because every time I do it, each new
version of the model is an improvement on what went before. Indeed, this year’s
version – a fully coded Monte Carlo simulation exercise – is a long way from
the spreadsheet models of 20 years ago.
The model simulates the entire 2026 FIFA World Cup, from the
opening group fixtures all the way through to the final, running a virtual tournament
10,000 times. Rather than simply relying on team ratings, the model derives
each team's attacking and defensive strength from recent historical data[1].
Expected goals are nudged up or down based on the difference in the Elo world rankings between the two sides.
Scorelines are then generated using a negative binomial distribution, which
better captures the over-dispersion and inherent unpredictability of football
results than simpler alternatives. The probabilities of qualifying for the
knockout stage of the tournament are shown in the table below.
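For readers who want to see the mechanics, the scoreline step can be sketched roughly as follows. This is an illustration rather than the model's actual code: the function names, the exponential form of the Elo adjustment, the `elo_weight` constant and the `dispersion` value are all assumptions made for the example. The negative binomial draw is built as a gamma-Poisson mixture so that only the Python standard library is needed.

```python
import math
import random


def expected_goals(attack, opp_defence, elo_diff, elo_weight=0.001):
    """Expected goals for one side (hypothetical formula, not the model's own).

    attack and opp_defence are strength multipliers; elo_diff is the side's
    Elo rating minus its opponent's. elo_weight is an assumed tuning constant.
    """
    return attack * opp_defence * math.exp(elo_weight * elo_diff)


def sample_goals(mean, dispersion, rng):
    """Draw a goal count from a negative binomial with the given mean.

    Uses the gamma-Poisson mixture: variance = mean + mean**2 / dispersion,
    so smaller dispersion values give more over-dispersed scorelines.
    """
    # Gamma draw with shape=dispersion and scale=mean/dispersion has the target mean.
    rate = rng.gammavariate(dispersion, mean / dispersion)
    # Knuth's multiplication method for a Poisson draw with that rate.
    threshold = math.exp(-rate)
    goals, product = 0, rng.random()
    while product > threshold:
        goals += 1
        product *= rng.random()
    return goals


def win_probability(mean_a, mean_b, n_sims=10_000, dispersion=5.0, seed=None):
    """Share of simulated 90-minute matches that side A wins outright."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        goals_a = sample_goals(mean_a, dispersion, rng)
        goals_b = sample_goals(mean_b, dispersion, rng)
        if goals_a > goals_b:
            wins += 1
    return wins / n_sims
```

A call such as `win_probability(expected_goals(1.4, 1.0, 150), expected_goals(1.1, 1.0, -150))` would then estimate how often the stronger side wins in normal time under these assumed inputs.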
One thing to look out for is cases where there is only a small difference in probability between finishing second, which ensures automatic qualification for the knockout stages, and finishing third, which leaves a team hoping to qualify as one of the eight best third-placed teams. For example, the simulated results for Group I point to little difference between Norway and Senegal, suggesting scope for an upset. There is also not much between the first and second qualifying slots in groups A, D, F, G and I. Group C also throws up a surprise, with Morocco qualifying ahead of Brazil. I am not sure how much weight I would place on that, but it is entirely down to form over the last four years.
In line with the tournament rules, the top two teams from each group qualify automatically for the Round of 32. The remaining eight spots go to the best third-placed finishers across all 12 groups. FIFA has devised a complex matrix to determine where the third-placed teams are allocated in the draw, covering every possible combination of qualifying third-placed teams (all 495 of them), and I used it to assign their opponents. Matches level after 90 minutes were assumed to proceed to extra time, where scoring rates are reduced to reflect fatigue. If the sides still cannot be separated, penalties are decided by a weighted probability that gives a modest edge to the higher-ranked team, but with enough randomness built in to reflect the unpredictability of a shootout. The ultimate tournament outcomes are shown in the table below.
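Two of the steps above lend themselves to a short sketch. The figure of 495 is simply the number of ways of choosing eight groups out of twelve, and the shootout can be modelled as a coin toss tilted slightly towards the stronger side. The code below is illustrative only: the ranking key (points, then goal difference, then goals scored), the `edge` parameter and the use of the standard Elo expected-score formula are assumptions for the example, and FIFA's actual allocation matrix is not reproduced here.

```python
import math
import random
from itertools import combinations

GROUPS = "ABCDEFGHIJKL"

# Every way eight of the twelve groups can supply a qualifying third-placed team.
third_place_combos = list(combinations(GROUPS, 8))
assert len(third_place_combos) == math.comb(12, 8) == 495


def best_thirds(thirds, n=8):
    """Return the groups whose third-placed teams qualify, in alphabetical order.

    thirds maps a group letter to (points, goal_difference, goals_for).
    Ranking on that tuple is an assumed simplification; remaining ties are
    left in group order rather than resolved by further tiebreakers.
    """
    ranked = sorted(thirds, key=lambda group: thirds[group], reverse=True)
    return tuple(sorted(ranked[:n]))


def shootout_winner(team_a, team_b, elo_a, elo_b, rng, edge=0.1):
    """Pick a shootout winner with a modest tilt towards the higher-rated side.

    edge (hypothetical) scales how much of the Elo expected score carries
    over: 0 gives a pure coin toss, 1 gives the full Elo expectation.
    """
    expected_a = 1 / (1 + 10 ** ((elo_b - elo_a) / 400))
    p_a = 0.5 + edge * (expected_a - 0.5)
    return team_a if rng.random() < p_a else team_b
```

The tuple returned by `best_thirds` is always one of the 495 entries in `third_place_combos`, so it can serve as the lookup key into whatever table holds the Round of 32 pairings.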
As the table shows, the model points to Spain as the tournament winner, by the very shortest of heads from Argentina. The bookmakers' odds are considerably shorter than my estimate: they offer 9-2 against my estimated 9-1. But the probabilities implied by the model's odds are designed to sum to one, whereas those offered by bookmakers are designed to maximise their chances of making money depending on the weight of bets placed (see here for an explanation). Two things stand out from the model-generated results. First, Brazil is assigned a very low probability of winning, reflecting their relatively poor performance since 2023. Again, their recent results may understate their true quality. Second, Japan look to be a good outside bet. Anyone who watched them beat England earlier this year will have noted that they are a very good team. As for England, I have them down as an 11% shot to reach the semi-finals, but that is about as generous as I am prepared to be.
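The gap between the two sets of odds is easier to see once they are converted to probabilities. Fractional odds of a-b against imply a probability of b / (a + b). The sketch below applies that standard conversion to the two prices quoted above; the function names and the two-runner book used to illustrate the bookmaker's margin are invented for the example.

```python
from fractions import Fraction


def implied_probability(numerator, denominator=1):
    """Convert fractional odds of numerator-denominator against to a probability."""
    return Fraction(denominator, numerator + denominator)


def overround(book):
    """Amount by which a book's implied probabilities exceed one.

    book is a list of (numerator, denominator) fractional odds covering
    every possible outcome. A fair book has an overround of zero.
    """
    return sum(implied_probability(a, b) for a, b in book) - 1


model_spain = implied_probability(9, 1)   # 1/10
bookie_spain = implied_probability(9, 2)  # 2/11, a little over 18%

# Hypothetical two-runner book priced at 4-5 each: the implied
# probabilities are 5/9 + 5/9, so the margin is 1/9.
example_margin = overround([(4, 5), (4, 5)])
```

On these numbers, the bookmakers' price treats Spain as nearly twice as likely to win as the model does, although part of that gap is the margin built into every price in the book.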
Statistically, of course, the odds are always against any
individual team winning the tournament. Spain may be the favourites, but they
still have roughly a 90% chance of not lifting the trophy. Should they fall
short, I reserve the right to declare that the data vindicated me all along.
[1] Previously I used a time-decayed average of goals scored in World Cup final tournaments.
This time I have switched to a metric based on performance from 2023 onwards
(i.e. since the last World Cup).

