Ahead of the third round of this season’s Emirates FA Cup – a stage that has become synonymous with results that defy logic throughout the competition’s 150-year history – The FA has today released new mathematical research that reveals the 10 most improbable FA Cup third round giant-killings.
The FA teamed up with the Institute for Mathematical Innovation at the University of Bath to commission the research, which considered Opta data from 8,152 FA Cup ties. Using probability theory, the University was able to verify a historical top 10 of the least likely scorelines from the last 50 years of the competition.
In order to perform their calculations, Dr Adwaye Rambojun and Professor Andreas Kyprianou from the University of Bath built a bespoke mathematical model that takes into account the overall probability of the lower-league teams reaching the third round, the difference in league status, and the timing and sequence of the goals scored in each tie.
Goal scoring and league position trends across more than 8,000 ties from the competition since the 1959-1960 season were also considered to understand how these factors historically influence the individual outcomes.
The formula used in the study is expressed in the mathematical equation below:
The final list of results features several familiar ‘cupsets’ but now, for the very first time, the fixtures have been ranked in order of the statistical probability of each scoreline occurring.
The research has identified that Woking’s 4-2 victory over West Bromwich Albion at the Hawthorns in 1991 was by far and away the least likely third round result in the last 50 years of the competition, with just a one in 15,959,312 chance of that upset taking place.
To put that into context, the result was less probable than conceiving identical quadruplets – a one in 15 million probability.
The fixture is firmly entrenched in the competition’s folklore, with Woking, then of the Isthmian Premier League, reaching the third round for the very first time and beating second-tier West Brom away from home.
The result was made all the more improbable as the Baggies took the lead, but the visitors roared back in the second half with Tim Buzaglo registering an 11-minute hat-trick and Terry Worsfold adding a fourth. West Brom scored a late consolation, but Geoff Chapple's side held on for what is the most improbable third round victory in the last 50 years, according to the research.
Another iconic result ranks second on the newly calculated list, with Hereford United’s third round replay success over Newcastle in 1972 revealed as a one in 32,449 probability, even less likely than growing to over seven feet tall (one in 26,315) or finding a five-leaf clover (one in 24,000).
The game, made famous by Ronnie Radford’s late long-range equaliser and a young John Motson’s memorable TV commentary, saw the Southern League side overcome First Division opposition despite trailing late on in normal time.
Newcastle were again on the receiving end in 2011 when League Two Stevenage secured a 3-1 win over the Premier League side, a result that would happen only once in every 7,712 attempts, which is only marginally more likely than rolling five consecutive sixes with a die (one in 7,776).
The full top 10, with probability and real-life example
1. West Bromwich Albion 2-4 Woking (1991) – 1 in 15,959,312 (conceiving identical quadruplets is 1 in 15,000,000)
2. Hereford United 2-1 Newcastle United (1972) – 1 in 32,449 (chance of growing to over seven feet tall is 1 in 26,315)
3. Stevenage 3-1 Newcastle United (2011) – 1 in 7,712 (rolling five consecutive sixes with a die is 1 in 7,776)
4. Birmingham City 1-2 Altrincham (1986) – 1 in 4,376 (being dealt a four-of-a-kind poker hand is 1 in 4,165)
5. Oxford United 3-2 Swansea City (2016) – 1 in 3,487 (scoring a hat-trick in the Final and winning the Emirates FA Cup is 1 in 2,993)
6. Sutton United 2-1 Coventry City (1989) – 1 in 3,260 (chance of the Emirates FA Cup trophy landing in the UK if dropped from space is 1 in 2,103)
7. Burnley 0-1 Wimbledon (1975) – 1 in 2,515 (becoming a NASA astronaut is 1 in 1,525)
8. Harlow Town 1-0 Leicester City (1980) – 1 in 1,800 (being born on a leap day is 1 in 1,461)
9. Derby County 1-3 Bristol Rovers (2002) – 1 in 397 (tossing eight heads in a row with a coin is 1 in 256)
10. Newport County 2-1 Leicester City (2019) – 1 in 337 (conceiving identical twins is 1 in 250)
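Several of the real-life comparisons in the list are simple enough to check by hand. The short Python sketch below reproduces four of them, assuming a fair coin, a fair six-sided die, a standard 52-card deck and a four-year leap cycle that ignores the century rule. The variable names are illustrative and are not taken from the research.

```python
from fractions import Fraction
from math import comb

# Five consecutive sixes with a fair six-sided die: (1/6)^5
five_sixes = Fraction(1, 6) ** 5

# Eight heads in a row with a fair coin: (1/2)^8
eight_heads = Fraction(1, 2) ** 8

# Four of a kind in a five-card hand from a 52-card deck:
# 13 possible ranks for the four matching cards, 48 choices for the fifth card.
four_of_a_kind = Fraction(13 * 48, comb(52, 5))

# Born on 29 February: one leap day in each four-year cycle of 3 * 365 + 366 days
# (assumption: the century exceptions to the leap-year rule are ignored).
leap_day = Fraction(1, 3 * 365 + 366)

comparisons = {
    "five consecutive sixes": five_sixes,
    "eight heads in a row": eight_heads,
    "four of a kind": four_of_a_kind,
    "born on a leap day": leap_day,
}

for label, probability in comparisons.items():
    # 1 / probability is the "1 in N" figure quoted in the list.
    print(f"{label}: 1 in {1 / probability}")
```

Exact fractions are used rather than floating-point numbers so that each "1 in N" figure comes out as a whole number.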
Dr Adwaye Rambojun, Research Associate at the Institute for Mathematical Innovation at the University of Bath, explained how there were several aspects to the mathematical model used.
Dr Rambojun said: “Using data from over 8,000 Emirates FA Cup matches, we produced a mathematical model that takes into account not only the relative league status of the teams involved and how many matches they won to get there but also in-game scoring sequences to compute the total probability of the biggest third round giant-killings in the last 50 years. Our findings show that there is only one winner when it comes to the most improbable third round shock.
“The conditions surrounding Woking’s win had probability stacked against them but, somehow, they managed to produce a result that would likely happen only once every 15,959,312 attempts. Woking’s achievements in winning the match were remarkable enough but the fact that they went behind in the match and then scored four second-half goals before West Brom’s late consolation is what makes this result and scoreline so improbable.
“It’s an outcome that defies logic but, as our ranking proves, on any given day the unthinkable can happen. That’s the magic of the Emirates FA Cup and we have seen it time and again throughout the competition’s illustrious history.”
Woking’s manager at the time was Geoff Chapple, who is still at the club as Football Secretary and Club Ambassador.
Chapple said: “My football management career will forever be defined by this match. Before the game they all said we had no chance on paper, but thankfully we weren't playing on paper. I always told the team we had hope, even if it was a one in 16 million chance!
“I remember sitting there after the match thinking: ‘Am I dreaming? This can’t be happening. We’re from Surrey, nobodies, what’s going on here?’
“Our players were part-time. We had couriers, painters, decorators and builders. But I always used to instil in them that anything was possible, especially in the FA Cup, and this result proved it. Even 31 years later, the positive impact of this result on the club and on the community still lives on to this day – that’s how special the FA Cup is.”
Andy Ambler, Director of Professional Game Relations at The FA, added: “It’s fascinating to look back at some of the third round games that have helped define the Emirates FA Cup and elevated it to the status of the world’s most popular domestic cup competition.
“The commissioned research outlines the competition’s ability to regularly defy logic and will hopefully spark many fond memories. In its 150th anniversary season, the Emirates FA Cup retains the values of hope, opportunity and equality.
“Improbable results are part of the competition’s fabric, providing life changing moments for players, managers, fans and communities alike and I am sure there will be plenty more to come as the competition looks ahead to an exciting future.”
A 50-year period has been considered due to restrictions of available data and changes in the competition format. All probabilities have been calculated based upon the modern format of the competition, with a club’s league level at the time of these fixtures compared with the current pyramid structure.
Duncan Alexander, co-editor-in-chief of Stats Perform’s theanalyst.com and football statistician, broadcaster and author, said: “It’s incredibly interesting to see how the raw data has been transformed into an algorithm that gets to the very heart of why we love the Emirates FA Cup third round. By measuring and evaluating every aspect of football we get closer to understanding the true magic of the sport: that there’s always a chance of pulling off a sensational result, however much the odds are stacked against you. Football fans around the world will be watching to see if we have more of the same this weekend!”
You can read more about the mathematical modelling used during the research below and a more detailed report is available upon request.
Features of the mathematical model:
In-match performance: The in-game model looks at the goals scored by each team and how this impacts the goal scoring rate of their opponent. Historical data suggests that the more goals a team has conceded, the fewer goals it is likely to score. The model also accounts for the clustering of goals in short time periods. The shorter the time period in which a given number of goals is scored, the more unlikely such an event is.
Reaching the third round: The model takes into account both the number of games each team has to play before reaching the third round of the Emirates FA Cup (based upon the current structure of the competition) and what happens during the knockout match itself.
The influence of a team’s current league: Teams from lower leagues have a lower goal scoring rate than teams from higher leagues. This leads to a lower probability of a win by a lower league team over a team from a higher league. Linked to this, a team’s current league also affects the number of games that it has to play before reaching the third round. Teams from lower leagues have to play more games, increasing the chance of not reaching the third round, and thus reducing the probability of them beating a team from a higher league.
The influence of goal scoring profiles: The model divides the game into epochs in which a team has to score. These are periods of time between successive goals. For example, in the case of only one goal being conceded, the first epoch would be the start of the game to the time the goal was conceded and the second epoch would be the time that the goal was conceded to the end of the game. If no goals were conceded, then there is just one epoch. Each epoch would be characterised by one goal scoring rate for each team, which, on average, goes down with the epoch number. If a team scores many goals in a short epoch, then this will drive down the probability due to the nature of the probabilistic model that we are using.
The role of historical data: The principal role of the historical data provided is to calibrate the model, that is, to estimate the relevant parameters.
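To make the features above more concrete, here is a minimal Python sketch of how an epoch-based calculation could work. It is not the University of Bath model. It assumes that goals within each epoch follow a Poisson distribution with a constant rate, and that the probability of reaching the third round simply multiplies the in-match probability. Every rate, minute and probability in the example is hypothetical and chosen only for illustration.

```python
from math import exp, factorial


def epochs(conceded_minutes, match_length=90):
    """Split a match into epochs bounded by the goals a team concedes."""
    cuts = [0, *sorted(conceded_minutes), match_length]
    return list(zip(cuts[:-1], cuts[1:]))


def poisson_pmf(goals, mean):
    """Probability of exactly `goals` events when `mean` are expected."""
    return exp(-mean) * mean**goals / factorial(goals)


def profile_probability(goals_per_epoch, conceded_minutes, rates_per_minute,
                        match_length=90):
    """Probability of a scoring profile, one Poisson term per epoch.

    Assumption: each epoch has its own constant scoring rate (goals per
    minute) and epochs are independent of one another.
    """
    spans = epochs(conceded_minutes, match_length)
    if not (len(spans) == len(goals_per_epoch) == len(rates_per_minute)):
        raise ValueError("need one goal count and one rate per epoch")
    probability = 1.0
    for (start, end), goals, rate in zip(spans, goals_per_epoch, rates_per_minute):
        probability *= poisson_pmf(goals, rate * (end - start))
    return probability


# Hypothetical underdog: concedes once, at minute 35, then scores four goals.
conceded = [35]
print(epochs(conceded))  # two epochs: before and after the goal conceded

# Made-up rates in goals per minute; the second is lower because, per the
# text above, the rate goes down on average with the epoch number.
p_profile = profile_probability([0, 4], conceded, [0.012, 0.010])

# Made-up probability of the underdog reaching the third round at all.
p_reach = 0.05
print(p_reach * p_profile)
```

Under these assumptions, a cluster of goals in a short epoch is penalised automatically: with the same scoring rate, three goals in 11 minutes is far less probable than three goals in 45 minutes, because the expected number of goals in the shorter window is much smaller.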
You can follow all of this weekend's Emirates FA Cup third round action live with scores and stats from every tie.