It is no secret that I like mathematics. It is generally known that most of the maths that I’m really into would fall under the heading of “pure mathematics”, which is a nice way of saying “mathematics with no real-world application”. I’m into topology, which has nothing to do with maps, and everything to do with drawing doughnuts and coffee cups and showing that they are, in fact, the same thing. Not only is topology one of the richest areas of mathematics in terms of potential for making bad puns, it is also very beautiful. Yes – I like maths because it is beautiful.
This doesn’t mean that I am useless at other kinds of mathematics (although I am not very good at basic arithmetic). To reach the university courses required to learn about coffee cups and Klein bottles, one generally has to do many years of tedious and ugly mathematics. Mathematics that tells you about how sand moves, and how a metal rod heats up when you hold a fire to one end (I’m not making this up). However, in hacking through all of this “base” knowledge upon which an aspiring young mathematician will eventually build his tower of topological truth, one sometimes comes across interesting theorems or results. Maths is funny like that.
So imagine this situation – you’re at a clinic getting tested for AIDS because you want to donate blood.[1] There are big posters on the wall assuring you that the test is 95% accurate. “That’s pretty good,” you think to yourself. After you’ve sifted through a dozen copies of Cosmopolitan, the nurse returns with a dour expression on her face, and your heart skips a beat as she tells you the news:
“You have AIDS”
“Oh shit”, you think to yourself.
What are you going to do? Your whole life seems illuminated in an entirely different light, or to quote the well-known Broadway musical Rent: “the fire’s out anyway”. But then a small light bulb goes off in your head. You remember back to those maths classes you enjoyed so much at school and recall Bayes’ Theorem.
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

In plain English: the probability of A given B is equal to the probability of B given A, multiplied by the probability of A, divided by the probability of B.
So how does this apply to your situation? Well… in your leisurely weekend reading, you read a World Health Organization report which explained that the prevalence of AIDS in developed western countries is actually very low. In the US, it’s just over a third of a percent. For the purposes of this exercise, let us assume that the prevalence of AIDS is one percent.
What are we trying to work out? The probability that you have AIDS (A) given your positive test result (B). So… if you get a positive test result (like you just did), what is the probability that you have AIDS?
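We know that the probability of you actually having AIDS, P(A), is one in a hundred – that’s the statistic. Now we need to figure out how this test result affects that. What’s the probability of a positive test result if you do have AIDS, P(B|A)? Well, we know that from the posters in the clinic – it’s 95%. If we take “95% accurate” to apply to healthy people as well, then the probability of a positive result when you don’t have AIDS is the remaining 5%.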
There’s a helpful expansion of the original equation that we can apply here ($\neg$ simply means “not”):

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid \neg A)\,P(\neg A)}$$
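Substituting our values in yields:

$$P(A \mid B) = \frac{0.95 \times 0.01}{0.95 \times 0.01 + 0.05 \times 0.99} = \frac{0.0095}{0.059}$$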
which comes to about 0.161, or slightly less than a one in six chance. So, if you get a positive test for AIDS, even if the test is 95% accurate, given that only one in a hundred people have AIDS, you can still reasonably expect not to have it, even with a positive test. If you substituted a more realistic value, like say a half percent chance, then it drops even further to slightly higher than a one in twelve chance.
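For anyone who would rather check the arithmetic than trust mine, here is a minimal Python sketch of the expanded formula. The function name `posterior` and its parameter names are my own labels, and it assumes, as this article does, that “95% accurate” means a 95% chance of a positive result if you have the disease and a 5% chance of a positive result if you don’t.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Return P(A|B): the chance you have the condition given a positive test.

    prior               -- P(A), the prevalence in the population
    sensitivity         -- P(B|A), chance of a positive test if you have it
    false_positive_rate -- P(B|not A), chance of a positive test if you don't
    """
    true_positive = sensitivity * prior                  # P(B|A) * P(A)
    false_positive = false_positive_rate * (1 - prior)   # P(B|not A) * P(not A)
    return true_positive / (true_positive + false_positive)


# The one percent prevalence assumed for this exercise
print(round(posterior(0.01, 0.95, 0.05), 3))   # 0.161

# The more realistic half percent prevalence
print(round(posterior(0.005, 0.95, 0.05), 3))  # 0.087
```

Changing the first argument is all it takes to see how quickly the answer shrinks as the condition gets rarer.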
This is a somewhat counterintuitive result. If you’re told that a test is 95% accurate, then you expect that what it tells you is true. However, when dealing with very small probabilities, this is not always the case. Perhaps a diagram might illustrate the point better than my clumsy words:
The rectangle above represents the population. There are two instances where you are told that you have AIDS – firstly, the 5% of the time when the test is incorrect and you don’t have AIDS, represented by the yellow area, and secondly, the 95% of the time when the test is correct and you do have AIDS, represented by the red area. It is simply a matter of dividing the red area by the sum of the yellow and red areas together. The point being – 95% of a very small number is still a very small number, and 5% of a very large number can still be a very large number.
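Perhaps putting some numbers into this diagram might help. Let’s consider a population of 10,000 people. With a prevalence of one percent, 100 of them have AIDS and 9,900 don’t. The test correctly flags 95 of the 100 and misses the other 5. It also wrongly flags 5% of the 9,900 healthy people, which is 495 of them, and correctly clears the remaining 9,405.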
Perhaps it is easier to wrap your head around it now. Of all the (590) people who are told that they have AIDS, only 95 of them actually do. To use percentages – if your test comes back and you’re told that you have AIDS, even though the test is 95% accurate, there’s an 84% chance that it’s wrong.
The converse is thankfully less worrying. Out of the 9410 people who are told that they don’t have AIDS, only 5 of them will have it. In other words, if your test comes back and you’re told that you don’t have AIDS, then there is only a 0.0005 chance that it is wrong, or a twentieth of a percent.
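The head count for the 10,000 people can be reproduced with a few lines of Python. This is only a sketch under the same assumptions as the rest of the article (one percent prevalence, and a test that is right 95% of the time for sick and healthy people alike); the function name `head_count` is mine.

```python
def head_count(population, prevalence_pct, accuracy_pct):
    """Split a population into the four possible test outcomes.

    Whole-number percentages keep the arithmetic in exact integers.
    """
    sick = population * prevalence_pct // 100
    healthy = population - sick
    true_pos = sick * accuracy_pct // 100      # sick and told so
    false_neg = sick - true_pos                # sick but told they are fine
    true_neg = healthy * accuracy_pct // 100   # healthy and told so
    false_pos = healthy - true_neg             # healthy but told they are sick
    return true_pos, false_pos, true_neg, false_neg


true_pos, false_pos, true_neg, false_neg = head_count(10_000, 1, 95)

told_positive = true_pos + false_pos
told_negative = true_neg + false_neg

print(told_positive, true_pos)    # 590 95
print(told_negative, false_neg)   # 9410 5

# Chance a positive result is wrong, and chance a negative result is wrong
print(round(false_pos / told_positive, 3))   # 0.839
print(round(false_neg / told_negative, 4))   # 0.0005
```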
So that was a fun exercise, and an interesting and counterintuitive result. But why does it matter? Obviously, it would take too long to calculate all the probabilities on every single statistic you hear, but it pays to be on the lookout for misleading statistics. An easy way to be misled is to hear only the percentages, with no numbers to give them context.
By way of example, I was once entangled in a protracted “discussion” with a conservative gun-lover on the internet. He was, naturally enough, trying to convince me that arming ordinary people would make everyone safer, an absurd claim when you examine the statistics, which I naturally made him aware of. He quoted his own statistic back at me – when an armed robber enters a shop, if the shopkeeper is armed and produces his weapon, 80% of the time the robber leaves without incident. EIGHTY PERCENT! Of course, I took the time to look up these numbers, and when I finally found them, I found that that particular statistic had been constructed on a state level, and the number of incidents was less than 50 (in a state with a population above 10 million). Why was the number so small? I imagine that statistic-gathering on such a specific event isn’t very thorough. In fact, one wonders why anyone even bothers to keep track of these numbers, but anyway. I returned to the discussion armed with this new information, but my adversary could not understand why the fact that the number was small made a difference. “But eighty percent!” was the oft-repeated refrain.
I then proceeded to explain Bayes’ Theorem to him.
I’m still waiting for a response.
So there you go, I have armed you with another tool for finding truth in the wilderness of knowledge. Use it wisely, and remember – with great power comes great responsibility.
[1] Yes, for the pedants out there, I’m well aware that it’s HIV (human immunodeficiency virus) which is being tested for, and that AIDS (acquired immunodeficiency syndrome) is caused by HIV. The reason I stuck with “AIDS” is that it makes for a catchier 1-syllable title even though it is strictly technically incorrect. Whether I used “AIDS” or “HIV” also doesn’t make a difference to the point of the article, which is to explain a counterintuitive result of Bayes’ theorem.