In past posts, we explored how the normal distribution dominates our lives. Consider the heights of people in a given population: some will be extra tall, some extra short, and most will lie somewhere in between. Since outliers carry little weight in these linear systems, we can compute a meaningful average through sampling. The normal distribution guarantees us stable long-term results around that average. If the tallest person in the world walks into a reasonably large sample, our average will change by only a fraction of an inch.
Now, consider what would happen if someone 50 billion feet tall walked into our sample. Or if someone worth $50 billion walked into a sample of people’s net worths. The average would change so much that the concept of it would become nonsensical. When values diverge rather than converge on an average, they are displaying the characteristic of a nonlinear system that follows a power law.
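The contrast can be made concrete with a little arithmetic. The sketch below uses made-up samples (1,000 people, an assumed average height of 5.6 ft, an assumed average net worth of $100,000, and an assumed record height of 8.9 ft) purely for illustration; none of these figures come from real data.

```python
def mean_after_adding(values, newcomer):
    """Return the mean of the sample once `newcomer` joins it."""
    return (sum(values) + newcomer) / (len(values) + 1)

# Hypothetical sample: 1,000 people, all at an assumed average height of 5.6 ft.
heights_ft = [5.6] * 1000
# Assumed height for a record-tall person: 8.9 ft.
height_shift = mean_after_adding(heights_ft, 8.9) - 5.6

# Hypothetical sample: 1,000 people, all with an assumed net worth of $100,000.
net_worths = [100_000] * 1000
# The $50 billion newcomer from the example above.
worth_shift = mean_after_adding(net_worths, 50_000_000_000) - 100_000

print(f"Average height moves by {height_shift:.4f} ft")
print(f"Average net worth moves by ${worth_shift:,.0f}")
```

Under these assumptions the average height moves by a few thousandths of a foot, while the average net worth jumps by roughly $50 million, about 500 times the original average.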
In statistics, a power law describes the relationship between two quantities where one quantity varies as a power of another. Mathematically it is expressed as:
ƒ(x) = ax^k + ε
a is a constant
k is the scaling exponent
ε is the error term
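One practical consequence of this form is that, ignoring the error term, taking logarithms of both sides gives a straight line: log ƒ(x) = log a + k log x. The sketch below is one possible way to recover a and k from data by fitting that line with ordinary least squares. The function name fit_power_law and the sample data are illustrative assumptions, and real, noisy data usually calls for more careful estimation than this.

```python
import math

def fit_power_law(xs, ys):
    """Estimate a and k in y = a * x**k by least squares on (log x, log y)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    # Slope of the fitted line in log-log space is the scaling exponent k.
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    k = sxy / sxx
    # The intercept is log(a), so exponentiate to get a back.
    a = math.exp(mean_y - k * mean_x)
    return a, k

# Made-up, noise-free data generated with a = 3 and k = -2.
xs = [1, 2, 4, 8, 16, 32]
ys = [3 * x ** -2 for x in xs]
a_hat, k_hat = fit_power_law(xs, ys)
print(a_hat, k_hat)
```

Because the sample data has no noise, the fit should return values very close to a = 3 and k = -2.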
This creates a J- or L-shaped distribution where most values fall below the mean and a few large values dominate. These distributions are also referred to as long-tail or fat-tail distributions. Compared to the normal distribution, a power law distribution makes extreme events much more likely and renders the concept of an average meaningless. In other words, outliers matter a lot.
Another way to think about this is that a normal distribution assumes independent, uncorrelated events. For example, flipping heads doesn’t change the likelihood of getting heads on the next flip. Being tall doesn’t change the likelihood of the next person being tall. But a power law distribution feeds on itself. Having a lot of followers increases your chance of gaining more followers. Having a lot of money increases your chance of gaining more money. And so on.
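This "rich get richer" feedback can be illustrated with a toy simulation. The sketch below is not a model of any real platform: it assumes that at each step either a new account appears with one follower (with an assumed probability of 10%), or an existing account gains a follower with probability proportional to the followers it already has. The function name, step count, and probability are all illustrative choices.

```python
import random

def simulate_followers(steps=20_000, p_new_account=0.1, seed=0):
    """Toy 'rich get richer' process; returns follower counts, largest first."""
    rng = random.Random(seed)
    followers = [1]  # followers[i] = follower count of account i
    tickets = [0]    # one entry per follower, naming the account it follows
    for _ in range(steps):
        if rng.random() < p_new_account:
            # A new account joins with a single follower.
            followers.append(1)
            tickets.append(len(followers) - 1)
        else:
            # Picking a random ticket selects an account with probability
            # proportional to its current follower count.
            winner = rng.choice(tickets)
            followers[winner] += 1
            tickets.append(winner)
    return sorted(followers, reverse=True)

counts = simulate_followers()
top_n = max(1, len(counts) // 10)
top_10_share = sum(counts[:top_n]) / sum(counts)
print(f"{len(counts)} accounts; the top 10% hold {top_10_share:.0%} of all followers")
```

In runs of this sketch, a small fraction of accounts ends up holding most of the followers, while the typical account stays at one or two.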
Examples of the power law
In the book Zero to One, Peter Thiel gives us an example of a power law in the context of venture capital returns:
The error lies in expecting that venture returns will be normally distributed: that is, bad companies will fail, mediocre ones will stay flat, and good ones will return 2x or even 4x. Assuming this bland pattern, investors assemble a diversified portfolio and hope that winners counterbalance losers.
But this ‘spray and pray’ approach usually produces an entire portfolio of flops, with no hits at all. This is because venture returns don’t follow a normal distribution overall. Rather, they follow a power law: a small handful of companies radically outperform all others. If you focus on diversification instead of single-minded pursuit of the very few companies that can become overwhelmingly valuable, you’ll miss those rare companies in the first place.
In the book Think Twice, Michael Mauboussin gives us another example:
But there are systems with heavily skewed distributions, where the idea of average holds little or no meaning. These distributions are better described by a power law, which implies that a few of the outcomes are really large (or have large impact) and most observations are small. Look at city sizes. New York City, with about 8 million inhabitants, is the largest city in the United States. The smallest town has about 50 people. So the ratio of the largest to the smallest is more than 150,000 to 1. Other social phenomena, like book or movie sales, show such extreme differences as well. City sizes have a much wider range of outcomes than human heights do.
Other examples of the power law include: customer sales, market share, earthquake magnitude, diminishing returns, premature optimization, metabolic rates of mammals, financial returns, popularity, the Pareto principle, and the network effect.
The Pareto principle
The Pareto principle states that for many events, roughly 80% of the effects come from 20% of the causes. This principle was originally proposed by management consultant Joseph Juran, who named it after the economist Vilfredo Pareto, to describe common observations in business, and it is an example of a power law. For example, 80% of a company’s profits come from 20% of its customers, 80% of a company’s complaints come from 20% of its customers, and 80% of a market is controlled by 20% of the participants in that market.
This principle also holds true for many observations outside of business. In sports, about 20% of athletes participate in 80% of big competitions. In economics, about 20% of the world’s population controls 80% of the world’s income. And so on.
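The 80/20 split corresponds to one particular power law. As a sketch, assume the quantity in question follows an idealized Pareto distribution with shape parameter alpha greater than 1. The share held by the top fraction p of the population is then p^(1 - 1/alpha), and alpha = log 5 / log 4 (about 1.16) reproduces the 80/20 rule exactly. The function name top_share is an illustrative choice, and real data will only approximate this.

```python
import math

def top_share(top_fraction, alpha):
    """Share of the total held by the top `top_fraction` of a Pareto(alpha) population.

    Uses the closed form top_fraction ** (1 - 1/alpha), which requires alpha > 1.
    """
    if alpha <= 1:
        raise ValueError("the mean is infinite for alpha <= 1")
    return top_fraction ** (1 - 1 / alpha)

# Shape parameter that makes the top 20% hold 80% of the total.
alpha_80_20 = math.log(5) / math.log(4)

for fraction in (0.20, 0.04, 0.01):
    share = top_share(fraction, alpha_80_20)
    print(f"Top {fraction:.0%} holds {share:.0%} of the total")
```

With this shape parameter the rule applies recursively: the top 20% of the top 20% (that is, the top 4%) hold 80% of 80%, or 64%, and the top 1% alone hold roughly half of the total.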
In the book The Signal and the Noise, Nate Silver describes the Pareto principle in the context of effort vs performance:
The name for the curve comes from the well-known business maxim called the Pareto principle or 80-20 rule (as in: 80 percent of your profits come from 20 percent of your customers). As I apply it here, it posits that getting a few basic things right can go a long way. In poker, for instance, simply learning to fold your worst hands, bet your best ones, and make some effort to consider what your opponent holds will substantially mitigate your losses. If you are willing to do this, then perhaps 80 percent of the time you will be making the same decision as one of the best poker players like Dwan – even if you have spent only 20 percent as much time studying the game.
The network effect
In computer science, Metcalfe’s law describes how each new computer added to a network can connect to every computer already in it, so the number of possible connections grows much faster than the number of computers. In other words, as the number of computers grows linearly, the number of possible connections grows quadratically: n computers can form n(n-1)/2 pairwise connections.
As each new computer makes the network more valuable for everyone already on it, this creates a feedback loop. The larger the network, the more valuable the network, and the more likely it will get even larger.
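Counting the possible links makes the growth rate concrete. The sketch below simply counts distinct pairs of computers, n(n-1)/2. The function name is an illustrative choice, and treating every possible pair as equally valuable is a simplifying assumption.

```python
def possible_connections(n_computers):
    """Number of distinct pairs that can be linked among n computers."""
    return n_computers * (n_computers - 1) // 2

for n in (2, 10, 100, 1000):
    print(f"{n:>5} computers -> {possible_connections(n):>7} possible connections")
```

Going from 10 to 100 computers is a 10x increase in size but takes the number of possible connections from 45 to 4,950, a 110x increase.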
The network effect is the generalization of Metcalfe’s law that applies to all networks. For example, the more users that Facebook has, the more attractive it becomes to new users, and the more users it gets. In fact, most of the leading websites are a byproduct of the network effect. Costco, financial exchanges, phone companies, etc., all have network effects.
As it relates to investing
As investors, we need to understand that 20% of our lifetime investment choices are going to provide 80% of our returns. Similarly, 20% of our time spent researching will provide us with 80% of our insights. This is just an application of the Pareto principle. When we understand this, doing the research for and owning 500 stocks becomes much less appealing. Diversification spreads our attention thin and causes us to miss out on the investments that will move the needle.
Nassim Taleb, the author of The Black Swan, once explained that investors should look for:
Payoffs that follow a power law type of statistical distribution, with big, near unlimited upside but because of optionality, limited downside.
This is the fundamental creed of value investors and is ultimately a reflection of power law distributions.
Systems that follow power laws are dominated by outliers. This makes the concept of an average meaningless. For example, on Black Monday (October 19, 1987) the stock market dropped by 22% in a single day. This was a shift of more than 20 standard deviations from the average 1% daily fluctuation. A normal distribution would put the chance of this happening at less than 10^-50, or in other words, effectively impossible.
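To get a feel for how small that probability is, the sketch below computes the upper-tail probability of a standard normal variable using the complementary error function from Python's math module. It takes the 1% daily fluctuation as one standard deviation, as the paragraph above does, and the function name normal_upper_tail is an illustrative choice.

```python
import math

def normal_upper_tail(z):
    """P(Z > z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Probability of a move at least z standard deviations above the mean.
for z in (2, 5, 20):
    print(f"{z:>2} sigma: {normal_upper_tail(z):.2e}")
```

Under the normal assumption, a 2 sigma move happens about 2% of the time, a 5 sigma move about once in a few million draws, and a 20 sigma move comes out far below 10^-50.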
Anyone who focused on that average and thought the world was normal was badly burned by this event and the many that would follow. As investors, we need to be aware of these “black swan” events and the effect they can have on our portfolios, for better and for worse.