So-called classical statistics, which is most people’s introduction to the subject, rests on the idea that probability is a fixed long-run relative frequency: the probability of an event is the proportion of times it occurs as the number of trials grows infinitely large. In other words, the more observations we have, the closer the observed frequency of an outcome should come to its true probability.
To illustrate how these two schools of thought differ, consider the case of horse racing. Two horses – let’s call them True Blue and Knackers Yard – have raced against each other 15 times, and True Blue has beaten Knackers Yard on 9 occasions. A classical statistician would thus assign a probability of 60% (9/15) to True Blue winning, implying a 40% chance that Knackers Yard will win. But we have additional information: on 5 of the 6 occasions when Knackers Yard won, the weather was wet, whilst True Blue won 2 of its races in the wet. The question of interest is: what are the odds that Knackers Yard wins, given that the weather ahead of the sixteenth race is wet? To answer it, we can combine two pieces of information: the head-to-head record of the two horses, and their performance conditional on weather conditions.
To combine them, we make use of Bayes’ theorem, which is written thus:
P(A|B) = P(B|A) × P(A) / P(B)
P(A|B) is the probability that event A occurs conditional on event B. In this case, we want to know the probability that Knackers Yard wins conditional on the fact that it is raining. P(B|A) is the probability of the evidence turning up, given the outcome. In this case, we want to know the likelihood that it is raining given that Knackers Yard wins. Since Knackers Yard won 6 races, 5 of them in the wet, P(B|A) = 5/6 or 83.3%. P(A) is the prior probability that the event occurs given no additional evidence. In this case, the probability that Knackers Yard wins is 40% (it has won 6 out of 15 races). P(B) is the probability of the evidence arising, without regard for the outcome – in this case, the probability of rain irrespective of which horse won. Since we know there were 7 rainy days out of 15 races, P(B) = 7/15 = 46.7%. Plugging all this information into the formula, we can calculate that P(A|B) = (5/6 × 6/15) / (7/15) = 5/7 = 71.4%.
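For readers who want to check the arithmetic, here is a minimal sketch in Python that reproduces the calculation directly from the race counts (the variable names are my own illustrative choices, not part of the original example):

```python
from fractions import Fraction

# Race record: 15 races, Knackers Yard won 6, it rained in 7,
# and 5 of Knackers Yard's 6 wins came in the wet.
races = 15
ky_wins = 6
rainy_races = 7
rainy_ky_wins = 5

prior = Fraction(ky_wins, races)               # P(A)   = 6/15 = 40%
likelihood = Fraction(rainy_ky_wins, ky_wins)  # P(B|A) = 5/6  = 83.3%
evidence = Fraction(rainy_races, races)        # P(B)   = 7/15 = 46.7%

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
posterior = likelihood * prior / evidence
print(posterior, f"= {float(posterior):.1%}")  # 5/7 = 71.4%
```

Reassuringly, the posterior reduces to 5/7 – five rainy wins for Knackers Yard out of seven rainy races – which is exactly what a direct count of the wet-weather record would give.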
Now all this might appear a bit geeky, but it is an interesting way to look at the problem of how the UK economy is likely to perform given that Brexit happens. Our variable of interest is thus P(A|B): the UK’s economic growth performance conditional on Brexit. P(B) is the likelihood of Brexit and, assuming (as the government seems to suggest) that it is set in stone, we set it to a value of 1. Moreover, assuming that Brexit will happen regardless of the economic cost (i.e. ministers are not overly concerned about accepting a hard Brexit), P(B|A) is also close to unity.
With P(B) and P(B|A) both set to 1, the formula collapses: in effect, the Bayesian statistician might suggest that P(Growth|Brexit) = 1 × P(Growth) / 1 = P(Growth). Since the only concrete information we have on economic performance is past performance, it is easy to make the case from a Bayesian perspective that the UK’s future growth prospects can be extrapolated from past evidence. Those Brexiteers who say that the UK’s post-Brexit performance will not be damaged by leaving the EU may unwittingly have statistical theory on their side. But one of the key insights of Bayesian statistics is that we change our prior beliefs as new information becomes available. If growth slows over the next year or so, then, other things being equal, it would be rational to revise down our assessment of post-Brexit growth prospects.
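To see that updating step in action rather than merely assert it, here is a toy sketch, assuming we crudely code each quarter as 1 if growth beats trend and 0 otherwise, with a conjugate Beta prior; the prior counts and the run of weak quarters are invented purely for illustration, not drawn from any data:

```python
# Toy Bayesian updating: a Beta prior over the chance of an above-trend
# quarter, revised as (hypothetical) post-Brexit quarters arrive.
prior_hits, prior_misses = 6, 4   # invented prior: 60% chance of a good quarter

new_quarters = [0, 0, 1, 0]       # hypothetical data: three weak quarters out of four

hits = prior_hits + sum(new_quarters)
misses = prior_misses + len(new_quarters) - sum(new_quarters)

# Posterior mean of Beta(hits, misses): the revised assessment.
print(f"prior mean:     {prior_hits / (prior_hits + prior_misses):.0%}")  # 60%
print(f"posterior mean: {hits / (hits + misses):.0%}")                    # 50%
```

Other things being equal, a run of weak quarters pulls the assessment down – which is precisely the rational revision described above.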
Incidentally, a joke doing the rounds of the statistics community at present suggests that although the theorem bears Bayes’ name (it was published posthumously on his behalf), it was the French mathematician Laplace who developed much of the mathematics underpinning this branch of statistics. As a result, Brexit may present a good opportunity to give due credit to the Frenchman by renaming it Laplacian statistics. It’s enough to make arch-Bayesian Nigel Farage choke on his croissant.