# Numeracy #1: Bayes Theorem

## Bayes Theorem

Thomas Bayes was an English statistician, philosopher, and Presbyterian minister, known for formulating the theorem that bears his name. Bayes’ Theorem originated as Bayes’ answer to the inverse probability question. Ordinary probability asks: given a known number of white and black balls, what is the probability of drawing a ball of a certain color? Inverse probability asks: given that you drew a ball of a certain color, what can be said about the original mix of white and black balls?

Instead of focusing on the formula, P(event|newinfo) = P(newinfo|event) * P(event) / P(newinfo), let’s focus on understanding the deeper concepts behind Bayesian thinking.

The main principle behind Bayes’ Theorem is that if you take the original odds (the prior odds) and multiply them by an evidence adjustment (the likelihood ratio), you get the new odds (the posterior odds). Imagine we are given two pieces of information.

1. There are fewer basketball players in the world than non-basketball players.
2. A large percentage of people who play basketball are tall.

Now let’s assign values to this information.

• For every person who plays basketball, ten do not. Of the people who play, 75% are tall. Of the people who don’t play basketball, 15% are tall.
• Prior odds ratio – 1 : 10
• Likelihood ratio – 75 : 15

Now for the magic. Say we run into a person on the street who happens to be tall. What is the probability that they are a basketball player? Let’s work through the reasoning.

1. What are the prior odds of the person being a basketball player, before we found out their height?
• 1 : 10, or about 9% (1 in 11)
2. What is the ratio between the chance of being tall if you play basketball and the chance of being tall if you don’t? This is the likelihood ratio for our question.
• 75 : 15, meaning a basketball player is 5 times more likely to be tall than a non-basketball player is.
3. Now we can multiply our prior odds by our likelihood ratio to find the posterior odds of the person being a basketball player.
• (1 : 10) * (75 : 15) = 75 : 150 = 1 : 2, or about 33%
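The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the helper names `posterior_odds` and `odds_to_probability` are made up for this example, and odds of a : b are represented as the fraction a/b.

```python
from fractions import Fraction


def posterior_odds(prior_odds, likelihood_ratio):
    """Odds form of Bayes' theorem: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds):
    """Convert odds a : b (stored as the fraction a/b) into the probability a / (a + b)."""
    return odds / (1 + odds)


# Numbers taken from the example in the text.
prior = Fraction(1, 10)               # prior odds 1 : 10
likelihood_ratio = Fraction(75, 15)   # 75 : 15, which simplifies to 5

posterior = posterior_odds(prior, likelihood_ratio)

print(f"prior probability:     {float(odds_to_probability(prior)):.2f}")      # 0.09
print(f"posterior odds:        {posterior.numerator} : {posterior.denominator}")  # 1 : 2
print(f"posterior probability: {float(odds_to_probability(posterior)):.2f}")  # 0.33
```

Using `Fraction` keeps the arithmetic exact, so the 1 : 2 result appears as an exact ratio rather than a rounded decimal.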

So given that the person is tall, the odds that they are a basketball player are 1 : 2, or about a 33% chance. This is a huge difference from our original 1 : 10 odds before adding the height info.

So what can we draw from this example to further our understanding of Bayesian thinking? Well, for starters, if we randomly walked into someone who is 7 ft tall, we might instantly think they are a basketball player. This would be a case of forgetting our priors. It is also known as base rate neglect, or as putting too much weight on anecdotal evidence. Say someone is acting suspiciously. We might instantly think that they are up to no good. But what if the prior probability of someone being up to no good is only 1%? When framed in this context, the suspicious behavior will make us a little more worried, but it won’t cause us to completely freak out. This is Bayesian thinking.
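To put rough numbers on the suspicious-behavior example, here is a small sketch. The 1% prior comes from the text, but the likelihood ratio of 5 is purely a hypothetical value chosen for illustration (it assumes suspicious behavior is five times more common among people who are up to no good), and the helper names are made up for this example.

```python
from fractions import Fraction


def probability_to_odds(p):
    """Convert a probability p into odds p : (1 - p), stored as a fraction."""
    return p / (1 - p)


def odds_to_probability(odds):
    """Convert odds (stored as a fraction) back into a probability."""
    return odds / (1 + odds)


prior_probability = Fraction(1, 100)   # 1% base rate, from the text
likelihood_ratio = Fraction(5)         # HYPOTHETICAL: not a figure from the text

prior_odds = probability_to_odds(prior_probability)    # 1 : 99
posterior_odds = prior_odds * likelihood_ratio         # 5 : 99
posterior_probability = odds_to_probability(posterior_odds)

print(f"before the evidence: {float(prior_probability):.1%}")      # 1.0%
print(f"after the evidence:  {float(posterior_probability):.1%}")  # 4.8%
```

Under these assumed numbers the evidence makes the worrying explanation several times more likely than before, yet the low base rate keeps it well below 5%.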

Another factor in Bayesian thinking has to do with flipping our theory. We should always ask: if we were wrong, would the world look different? In the case of the basketball player, our theory would be that since they are tall, they are a basketball player. If this theory is wrong, we are left with a tall person who is not a basketball player. How likely is this alternative world? It is actually very likely when we think about it: posterior odds of 1 : 2 mean that two out of every three tall people are not basketball players. Maybe being tall shouldn’t carry that much weight in our thinking.
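The same question can be checked with the probability form of the formula, P(event|newinfo) = P(newinfo|event) * P(event) / P(newinfo). The sketch below reads the 1 : 10 prior odds as a prior probability of 1/11, and the variable names are chosen just for this illustration.

```python
from fractions import Fraction

# Inputs from the basketball example (prior odds 1 : 10 means 1 player in 11 people).
p_player = Fraction(1, 11)
p_nonplayer = 1 - p_player
p_tall_given_player = Fraction(75, 100)
p_tall_given_nonplayer = Fraction(15, 100)

# P(newinfo): overall chance of meeting a tall person, player or not.
p_tall = p_tall_given_player * p_player + p_tall_given_nonplayer * p_nonplayer

# Bayes' theorem for each of the two competing "worlds".
p_player_given_tall = p_tall_given_player * p_player / p_tall
p_nonplayer_given_tall = p_tall_given_nonplayer * p_nonplayer / p_tall

print(f"P(player | tall)     = {p_player_given_tall}")     # 1/3
print(f"P(non-player | tall) = {p_nonplayer_given_tall}")  # 2/3
```

The world in which the theory is wrong (a tall non-player) comes out twice as likely as the world in which it is right.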

Now for the key factor in Bayesian thinking: updating incrementally. New evidence should update our prior odds. We have to shift our beliefs as we encounter new information about the world. Being tall is just one piece of new information. What if we found out that the person we ran into also owns a basketball? Would we hold on to the original probability or revise it with the new information?
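Incremental updating can be sketched as a loop in which each posterior becomes the prior for the next piece of evidence. The height numbers come from the example above. The basketball-ownership ratio of 90 : 10 is a hypothetical value invented for this sketch, and multiplying the ratios this way assumes the two pieces of evidence are independent of each other once we know whether the person plays.

```python
from fractions import Fraction


def odds_to_probability(odds):
    """Convert odds (stored as a fraction) into a probability."""
    return odds / (1 + odds)


odds = Fraction(1, 10)  # prior odds 1 : 10, from the text

evidence = [
    ("is tall", Fraction(75, 15)),            # likelihood ratio from the text
    ("owns a basketball", Fraction(90, 10)),  # HYPOTHETICAL ratio, for illustration only
]

history = [odds]
for description, likelihood_ratio in evidence:
    # Yesterday's posterior is today's prior.
    odds = odds * likelihood_ratio
    history.append(odds)
    print(
        f"after learning the person {description}: "
        f"odds {odds.numerator} : {odds.denominator}, "
        f"probability {float(odds_to_probability(odds)):.2f}"
    )
```

With these assumed numbers the odds move from 1 : 10 to 1 : 2 to 9 : 2, so the second piece of evidence shifts the belief from "probably not a player" to "probably a player".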

### In relation to investing

Base rate neglect is a common issue in investment analysis. We might analyze a company that is highly leveraged or has questionable management. These situations have a base rate of failure that we would need to include in our analysis. We might be tempted to say this time is different, but that would need extraordinary evidence.

Not updating incrementally is another issue in investment analysis. Say that same situation, a highly leveraged company, has an amazing story attached to it. This will improve the likelihood ratio for a positive outcome and change our posterior odds of success. A negative story can do the same in the other direction. The hard part is figuring out if a story is a signal or just noise. For example, a long-term positive change in a company’s cash flows would be a positive signal. A sudden drop in earnings from temporary currency fluctuations would be noise. The ultimate story is one of intelligent fanatics who have proven time and time again that they can create value even in the most difficult of situations. These stories suggest that our odds need to be revised.

However, there are limitations to this framework, as was seen in the crisis of 2008. A long chain of incremental updates can create the impression of a world where the risk of default is very low. This completely ignores black swan events, which are both hard to predict and have a high impact on outcomes. Picking up nickels on a train track might seem like a great business looking backwards, but a train hitting you might still be the only future that matters.

### Conclusion

Bayesian thinking, more than anything, should be a systematic way of changing our minds. If we hold a strong belief, measured in our prior odds, new information should be able to alter that belief. For example, if we think that spinoffs have a high probability of excess returns, but the current spinoff we are analyzing has a powerful negative story attached to it, we should be able to conclude that the results may differ from our prior belief.

References:

Latticework of Mental Models: Bayes Theorem