What is Survivorship Bias?
Ben Schmidt
Author

You look at the most successful founders in the world and try to reverse engineer their path. You see that Steve Jobs dropped out of college. You see that Bill Gates dropped out of college. You see that Mark Zuckerberg dropped out of college.

The logical conclusion seems simple. To build a massive tech company, you should drop out of college.

This is a textbook example of survivorship bias. You are analyzing the data of those who made it while ignoring the massive dataset of college dropouts who never built a successful business.

In the startup world, this bias is rampant. It creates a distorted map of reality that can lead founders to take unnecessary risks based on incomplete data.

Defining the Invisible Data

Survivorship bias is a logical error where we concentrate on the people or things that made it past a selection process and overlook those that did not. We do this primarily because the failures are invisible. They do not write memoirs. They do not get interviewed on podcasts. They do not speak at conferences.

The classic example comes from World War II. The military looked at returning planes and saw they were riddled with bullet holes in the wings and tail. They decided to reinforce those areas.

A statistician named Abraham Wald pointed out the flaw. The planes they were studying were the ones that survived. The planes that were hit in the engine or cockpit never came back. The military was looking at the wrong data: the armor belonged where the returning planes showed no damage.
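Wald's point can be checked with a small simulation. The sketch below is illustrative only: the section names, the loss probabilities, and the number of hits per plane are invented assumptions, not historical figures. It spreads hits evenly across every section, then compares the holes counted on all planes with the holes counted only on planes that returned.

```python
import random
from collections import Counter

# Hypothetical chance that a single hit to each section downs the plane.
# These numbers are assumptions chosen for illustration, not real data.
LOSS_PROBABILITY = {"wings": 0.05, "tail": 0.05, "engine": 0.60, "cockpit": 0.60}


def simulate(num_planes=10_000, hits_per_plane=3, seed=0):
    """Return (hits on all planes, hits on returned planes) per section."""
    rng = random.Random(seed)
    sections = list(LOSS_PROBABILITY)
    all_hits = Counter()
    returned_hits = Counter()
    for _ in range(num_planes):
        # Every section is equally likely to be hit.
        hits = [rng.choice(sections) for _ in range(hits_per_plane)]
        # The plane returns only if it survives every individual hit.
        survived = all(rng.random() >= LOSS_PROBABILITY[s] for s in hits)
        all_hits.update(hits)
        if survived:
            returned_hits.update(hits)
    return all_hits, returned_hits


all_hits, returned_hits = simulate()
for section in LOSS_PROBABILITY:
    print(f"{section:8s} all planes: {all_hits[section]:6d}  "
          f"returned planes: {returned_hits[section]:6d}")
```

Under these assumptions, hits land about equally on every section, yet the returned planes show far more holes in the wings and tail than in the engine or cockpit. The missing holes mark the dangerous areas.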

In business, we do the exact same thing.

We study the habits of successful CEOs and assume those habits caused the success. We rarely ask if failed CEOs had the exact same habits.
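One way to make that question concrete is to compare the habit across both groups. The sketch below uses invented counts and a hypothetical habit (`wakes_at_5am`); nothing here is real data. It shows how a habit can look universal among survivors while making no difference to the odds of success.

```python
# Hypothetical founders, with counts invented purely for illustration.
founders = (
    [{"wakes_at_5am": True, "succeeded": True}] * 8
    + [{"wakes_at_5am": False, "succeeded": True}] * 2
    + [{"wakes_at_5am": True, "succeeded": False}] * 72
    + [{"wakes_at_5am": False, "succeeded": False}] * 18
)


def share_with_habit(rows):
    """Fraction of the given founders who have the habit."""
    return sum(r["wakes_at_5am"] for r in rows) / len(rows)


def success_rate(rows):
    """Fraction of the given founders who succeeded."""
    return sum(r["succeeded"] for r in rows) / len(rows)


survivors = [r for r in founders if r["succeeded"]]
failures = [r for r in founders if not r["succeeded"]]
with_habit = [r for r in founders if r["wakes_at_5am"]]
without_habit = [r for r in founders if not r["wakes_at_5am"]]

# What you see if you only study the winners.
print(f"Survivors with the habit: {share_with_habit(survivors):.0%}")
# What you see once the failures are included.
print(f"Failures with the habit:  {share_with_habit(failures):.0%}")
print(f"Success rate with habit:    {success_rate(with_habit):.0%}")
print(f"Success rate without habit: {success_rate(without_habit):.0%}")
```

In this made-up dataset most survivors share the habit, but so do most failures, and the success rate is the same with or without it. Only the second comparison tells you whether the habit matters.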

Comparison with Correlation vs Causation

It is helpful to distinguish survivorship bias from the concept of correlation versus causation. While they often overlap, they are distinct issues in data analysis.

Correlation versus causation asks if two variables are merely linked or if one actually causes the other. For example, ice cream sales and sunburns are correlated because both rise in hot, sunny weather, but neither causes the other.

Study the graveyard of failed ideas.

Survivorship bias is a data selection problem. It happens before you even get to the analysis phase. You are working with a dataset that has already been filtered by success.

If you analyze a dataset of only winning lottery tickets, you might conclude that buying a ticket guarantees a win. Every ticket in your data is a winner, so the observed win rate is 100 percent. The math is correct, but the dataset is fundamentally broken.
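The lottery example can be reproduced in a few lines. This is a minimal sketch with assumed numbers: the ticket count and the one-in-a-thousand odds are hypothetical, and winners are assigned deterministically to keep the example simple.

```python
# Hypothetical lottery: 100,000 tickets, every 1,000th ticket wins.
all_tickets = [{"id": i, "won": i % 1000 == 0} for i in range(100_000)]

# The "survivor" dataset: filtered by success before any analysis starts.
winners_only = [t for t in all_tickets if t["won"]]


def win_rate(tickets):
    """Fraction of the given tickets that won."""
    return sum(t["won"] for t in tickets) / len(tickets)


print(f"Win rate, all tickets:  {win_rate(all_tickets):.1%}")
print(f"Win rate, winners only: {win_rate(winners_only):.1%}")
```

The same `win_rate` function is applied to both lists, and the arithmetic is right both times. The only difference is which rows were allowed into the dataset.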

For a founder, this distinction matters. You need to know if you are looking at a causal relationship or if you are simply looking at a lucky subset of survivors.

When to Audit Your Assumptions

There are specific moments in building a company when this bias is most dangerous. The first is during customer discovery.

If you only survey current customers about what features they want, you are ignoring the people who visited your site and left without buying. You are optimizing for survivors. You might make the product better for people who already like it while failing to address the reasons the large majority of visitors, say 90 percent, leave.

The second moment is when you make cultural decisions based on other companies. Copying the culture of a massive corporation is dangerous for a seed-stage startup.

That corporation survived long enough to institute that culture. That does not mean the culture is what helped it survive.

Asking the Right Questions

To combat this, you have to actively look for the invisible data. You have to be willing to do the uncomfortable work of studying failure.

Here are the questions you should ask:

  • Who tried this business model and failed?
  • What variables are missing from my dataset?
  • Did the failed companies take the same risks as the successful ones?

We do not know what the next big startup will look like. But we do know that copying the past without context is a strategy based on errors rather than insights.