What is Hashing?

Ben Schmidt
Author
I am going to help you build the impossible.

You might hear your technical cofounder or lead engineer toss around the term hashing during a security audit or a database architecture meeting.

It sounds technical. It sounds abstract. But understanding hashing is fundamental to understanding how modern businesses protect user trust.

At its core, hashing is a mathematical operation. It is the process of transforming any given key or string of characters into another value. This value is usually represented by a fixed-length string of numbers and letters.

Think of it as a digital fingerprint.

No matter the size of the input data, the output remains a consistent length. You could input a single word or the entire text of War and Peace. The resulting hash would be a string of characters that is exactly the same length in both scenarios.
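To see the fixed-length property for yourself, here is a minimal sketch using SHA-256 from Python's standard library. The choice of SHA-256 and the sample inputs are assumptions made for illustration; other hash functions produce different lengths, but each one is consistent with itself.

```python
import hashlib

# Two inputs of very different sizes (both are illustrative stand-ins).
short_input = b"hello"
long_input = b"War and Peace " * 100_000  # 1.4 million bytes of filler text

short_digest = hashlib.sha256(short_input).hexdigest()
long_digest = hashlib.sha256(long_input).hexdigest()

# SHA-256 always yields 256 bits, shown here as 64 hexadecimal characters.
print(len(short_input), len(short_digest))  # 5 64
print(len(long_input), len(long_digest))    # 1400000 64
```

The inputs differ in size by a factor of hundreds of thousands, yet both fingerprints are 64 characters long.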

This concept is vital for startups building anything that touches user data.

The Mechanics of the Hash Function

A hash function takes an input (often called a message) and returns a fixed-size value, usually displayed as a string of hexadecimal characters. This value is called the hash value or digest.

There are three main properties that make this useful for business applications.

First is determinism. If you put the exact same input into the function, you will get the exact same output every single time. This reliability allows systems to verify data by comparing hashes instead of comparing, or even storing, the data itself.

Second is the avalanche effect. In a good hash function, a tiny change in the input results in a massive change in the output. If you change a single lowercase letter to an uppercase letter in a ten-page document, the resulting hash looks completely different. There is no resemblance to the previous hash.

Third is the one-way nature of the function. This is arguably the most important piece for security.

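You can easily generate a hash from data. However, with a cryptographic hash function it is computationally infeasible to reverse the process and recover the original data from the hash. You cannot feed the hash back into the machine to see the original message. The only option left to an attacker is to guess inputs and hash each guess.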
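The first two properties can be observed directly. The sketch below assumes SHA-256 and a made-up sample sentence. The helper name `sha256_hex` is hypothetical, introduced only for this example.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a string as hexadecimal text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


original = "The quick brown fox"
modified = "the quick brown fox"  # only the first letter's case changed

first = sha256_hex(original)
second = sha256_hex(original)
changed = sha256_hex(modified)

# Determinism: hashing the same input twice gives the same digest.
print(first == second)

# Avalanche effect: count how many hex positions differ after a one-letter change.
differing = sum(a != b for a, b in zip(first, changed))
print(first)
print(changed)
print(differing, "of", len(first), "hex characters differ")
```

The third property has no demonstration because there is nothing to run. The library offers no function that turns a digest back into its input.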

Hashing vs. Encryption

This is where many non-technical founders get confused. Hashing and encryption are often used in the same sentences regarding security, but they serve different purposes.

Encryption is a two-way function. It is designed to scramble data so that it is unreadable to unauthorized parties, but it is also designed to be unscrambled. If you have the correct key, you can decrypt the message and read the original text.

Encryption is like a safe. You put a document inside and lock it. It is secure. But if you have the key or the combination, you can open the safe and retrieve the document intact.

Hashing is a one-way function. It is designed to verify integrity, not to store and retrieve data.

The classic analogy here is a blender. If you put a banana and some strawberries into a blender and hit the button, you get a smoothie. It is easy to go from fruit to smoothie. It is impossible to turn that smoothie back into the original banana and strawberries.

If you need to retrieve the data later, use encryption. If you need to verify the data without retrieving it, use hashing.

Why Startups Need Hashing

The most common application for hashing in a startup environment is password storage.

You should never store user passwords in plain text in your database. If you do and you are hacked, every user account is compromised immediately. This destroys reputation and trust.

Instead, when a user creates an account, your system hashes their password. You store the hash, not the password.

When the user tries to log in later, they type their password. Your system runs that input through the same hash function. It compares the new hash with the stored hash.

If they match, the system grants access. If they do not match, access is denied.

In this scenario, your database never stores the real password. Your server sees it only for the moment it takes to compute the hash. If a hacker steals your database, they get a list of hash values that cannot be directly reversed into passwords.
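The sign-up and login flow described above can be sketched in a few lines. This is an illustration under stated assumptions, not a production implementation. It uses PBKDF2 from Python's standard library, a hash function built to be slow for password use. It also adds a per-user salt, which the Salting and Collisions section explains. The function names, the `users` dictionary standing in for a database, and the iteration count are all hypothetical choices for this example.

```python
import hashlib
import hmac
import os
from typing import Dict, Optional, Tuple

ITERATIONS = 200_000  # assumed work factor for illustration; follow current guidance in production


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest). A fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Re-hash the attempt with the stored salt and compare in constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored_digest)


# Sign-up: store only the salt and the digest, never the password itself.
users: Dict[str, Tuple[bytes, bytes]] = {}  # stand-in for a database table
users["alice"] = hash_password("correct horse battery staple")

# Login: hash the attempt the same way and compare against the stored digest.
salt, stored = users["alice"]
login_ok = verify_password("correct horse battery staple", salt, stored)
login_bad = verify_password("wrong guess", salt, stored)
print(login_ok, login_bad)
```

In practice, most teams rely on a maintained library or their framework's built-in authentication rather than writing this by hand.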

Ensuring Data Integrity

Another critical use case for founders is data integrity. This answers the question: has this file been tampered with?

This is relevant if you are dealing with software distribution, legal documents, or financial records.

When a file is created, a hash is generated for that file. If you send that file to a client, they can generate a hash on their end. If the hashes match perfectly, the file is identical to the one you sent.

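If a single byte of data was corrupted during the download, or if a malicious actor intercepted the file and inserted a virus, the hashes will not match. The tampering check only works if the client gets your hash through a channel the attacker could not also alter.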

This allows your systems to automatically reject corrupted data before it causes issues in your operations.
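Here is a minimal sketch of that check, assuming SHA-256 and a throwaway temporary file in place of a real download. The names `file_sha256`, `verify_file`, and `release.bin` are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files never need to fit in memory."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_file(path: Path, expected_hex: str) -> bool:
    """Return True only if the file's digest matches the published one."""
    return file_sha256(path) == expected_hex.lower()


with tempfile.TemporaryDirectory() as tmp:
    release = Path(tmp) / "release.bin"

    # Sender side: create the file and publish its hash.
    release.write_bytes(b"version 1.0 payload")
    published_hash = file_sha256(release)

    # Receiver side: an untouched copy passes the check.
    intact = verify_file(release, published_hash)

    # Simulate corruption or tampering by changing a single character.
    release.write_bytes(b"version 1.0 pay1oad")
    tampered_ok = verify_file(release, published_hash)

print(intact, tampered_ok)
```

A system built this way can refuse to process any file whose check returns False.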

Salting and Collisions

As you dig deeper into this with your engineering team, two other terms will likely surface.

The first is a collision. This happens when two different inputs produce the exact same hash output. Because a hash squeezes unlimited possible inputs into a fixed-length output, collisions must exist mathematically, but a good algorithm makes finding one computationally infeasible. Once researchers demonstrate a practical way to produce collisions for an algorithm, the industry stops trusting it for security. This is why you might hear engineers say that MD5 is dead and that you should use SHA-256 instead.
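Output length matters here, and a deliberately weakened hash shows why. The sketch below defines a hypothetical `tiny_hash` that keeps only the first 2 bytes of a SHA-256 digest. That leaves just 65,536 possible outputs, so a collision is guaranteed within 65,537 distinct inputs and usually turns up after a few hundred. This is a toy for illustration, not an attack on SHA-256 itself.

```python
import hashlib
from typing import Dict, Tuple


def tiny_hash(data: bytes) -> bytes:
    """A deliberately weak hash: only the first 16 bits of SHA-256."""
    return hashlib.sha256(data).digest()[:2]


def find_collision() -> Tuple[bytes, bytes]:
    """Hash the messages b"0", b"1", b"2", ... until two share a tiny_hash."""
    seen: Dict[bytes, bytes] = {}
    counter = 0
    while True:
        message = str(counter).encode("ascii")
        tag = tiny_hash(message)
        if tag in seen:
            return seen[tag], message
        seen[tag] = message
        counter += 1


first_msg, second_msg = find_collision()
print(first_msg, second_msg, tiny_hash(first_msg).hex())

# The full 256-bit digests of the same two messages still differ.
print(hashlib.sha256(first_msg).hexdigest() != hashlib.sha256(second_msg).hexdigest())
```

With the full 256-bit output, the same brute-force search would be hopelessly out of reach, which is the sense in which collisions for a good algorithm are statistically improbable.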

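The second term is salting. Attackers keep pre-computed lookup tables of hashes for common passwords like 123456 or password. Rainbow tables are one well-known form.

To prevent attackers from simply looking up the hash in a table, developers add a salt. This is a random string of characters, generated separately for each user, that is combined with the password before it is hashed. The salt is stored next to the hash so the system can repeat the calculation at login.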

This ensures that even if two users have the same password, their stored hashes will be completely different because their salts are different. It adds a necessary layer of complexity to the security architecture.
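The effect of a salt is easy to observe. The sketch below uses plain SHA-256 purely to keep the illustration short. That is an assumption made for readability, and a real system would pair the salt with a slow password-hashing function instead. The variable names are hypothetical.

```python
import hashlib
import os

password = b"123456"  # two different users happen to pick the same password

# Without a salt, identical passwords produce identical hashes,
# so one lookup-table hit exposes every user who chose that password.
unsalted_a = hashlib.sha256(password).hexdigest()
unsalted_b = hashlib.sha256(password).hexdigest()
print(unsalted_a == unsalted_b)

# With a random per-user salt, the stored hashes differ.
salt_a = os.urandom(16)
salt_b = os.urandom(16)
salted_a = hashlib.sha256(salt_a + password).hexdigest()
salted_b = hashlib.sha256(salt_b + password).hexdigest()
print(salted_a == salted_b)

# Login still works because the salt is stored and reused for that user.
recomputed_a = hashlib.sha256(salt_a + password).hexdigest()
print(recomputed_a == salted_a)
```

A pre-computed table is useless against the salted values, because the attacker would need a separate table for every possible salt.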

Questions for Founders

You do not need to be a cryptographer to lead a successful company. You do need to ask the right questions to ensure your technical foundation is solid.

Ask your team how user credentials are stored. Verify that they are hashed and salted with an algorithm designed for passwords, such as bcrypt, scrypt, or Argon2.

Ask what algorithms are being used. Ensure you are not relying on outdated standards that have known vulnerabilities.

Ask how data integrity is verified when moving large files between your servers or to your customers.

Security is often an afterthought in the early days of building a product. We rush to ship features. We rush to get to market. But technical debt in security is the most expensive debt you can accrue.

Understanding the basics of hashing allows you to have a coherent conversation about risk. It helps you protect the most valuable asset your startup has, which is the trust of your customers.