How Cryptographic Hash Functions Work — MD5 vs SHA-256 vs SHA-512 Explained

A cryptographic hash function takes any input (a single character, a 10 GB file, an empty string) and produces a fixed-length output called a hash (also known as a digest, checksum, or fingerprint). The same input always produces the same hash, but changing even a single bit of the input produces a completely different output, and the output cannot feasibly be turned back into the input. These properties make hash functions the foundation of modern computer security, from password storage and digital signatures to blockchain and file integrity verification.

If you need to generate a hash right now, use our free online hash generator — it computes MD5, SHA-1, SHA-256, SHA-384, and SHA-512 hashes for text and files entirely in your browser.

What Is a Cryptographic Hash Function?

A hash function is a mathematical algorithm that maps data of arbitrary size to a fixed-size output. A cryptographic hash function is a hash function that satisfies additional security requirements, making it suitable for use in security protocols.

The concept dates back to the 1950s, when Hans Peter Luhn at IBM proposed hashing as a technique for indexing data (he is also known for the Luhn checksum algorithm used for error-checking). Modern cryptographic hash functions emerged in the 1990s with Ronald Rivest's MD4 (1990) and MD5 (1991), followed by the NSA-designed SHA-0 (1993) and SHA-1 (1995), and eventually the SHA-2 family (2001) that includes SHA-256 and SHA-512.

The Five Properties of a Secure Hash Function

Every cryptographic hash function must satisfy five essential properties:

1. Deterministic

The same input always produces the same output. If you hash "hello" with SHA-256 today, tomorrow, or on any computer in the world, you will always get 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824. This property is what makes hash functions useful for verification.
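As a quick check of determinism, here is a minimal sketch using Python's standard hashlib module. The helper name sha256_hex is an illustrative choice, not part of any library.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Hashing the same input twice gives the same digest every time.
first = sha256_hex("hello")
second = sha256_hex("hello")
print(first)
print(first == second)
```

Note that the digest depends on the exact bytes hashed, so the text encoding (UTF-8 here) is part of the input.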

2. Fast Computation

Computing the hash of any input should be efficient. SHA-256 can process several hundred megabytes per second on modern hardware. This speed is essential for applications like TLS (where every record is authenticated) and blockchain mining (where billions of hashes per second are computed).

3. Pre-image Resistance (One-Way)

Given a hash output, it should be computationally infeasible to find any input that produces that hash. This is what makes hashing "one-way." If an attacker obtains a password hash like 5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8, they cannot reverse-engineer the original password "password" (though they could try guessing common passwords — which is why we use salting and specialized password hash functions).
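The guessing attack mentioned above can be sketched in a few lines. This is only an illustration, assuming a tiny hypothetical word list; the function name guess_password is made up for this example. The hash is never reversed: the attacker simply hashes candidates until one matches.

```python
import hashlib

def guess_password(target_hex, candidates):
    """Hash each candidate and return the one whose SHA-256 digest matches, if any."""
    for word in candidates:
        if hashlib.sha256(word.encode("utf-8")).hexdigest() == target_hex:
            return word
    return None

# The unsalted SHA-256 hash quoted in the paragraph above.
leaked = "5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8"
wordlist = ["123456", "qwerty", "letmein", "password", "dragon"]  # hypothetical list
print(guess_password(leaked, wordlist))
```

A strong, random password would not appear in any such list, which is why pre-image resistance still holds for unpredictable inputs.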

4. Collision Resistance

It should be computationally infeasible to find two different inputs that produce the same hash output. Since hash functions map an infinite input space to a finite output space, collisions must mathematically exist (this is the pigeonhole principle). The security guarantee is that finding one should take longer than the expected lifetime of the algorithm's use.

For SHA-256 with its 256-bit output, finding a collision by brute force requires approximately 2^128 operations (the birthday attack bound), or about 3.4 x 10^38, which is far beyond any existing or foreseeable computing capacity.
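The birthday bound is easier to believe on a scaled-down target. The sketch below looks for two inputs whose SHA-256 digests agree only in their first 24 bits, which should take on the order of 2^12 attempts rather than 2^24. The truncation to 24 bits and the function name are assumptions made for this demonstration; nothing here threatens full SHA-256.

```python
import hashlib
from itertools import count

def find_truncated_collision(bits: int = 24):
    """Find two distinct inputs whose SHA-256 digests share the first `bits` bits."""
    seen = {}  # truncated digest -> input that produced it
    for i in count():
        message = str(i).encode("ascii")
        digest = hashlib.sha256(message).digest()
        prefix = int.from_bytes(digest, "big") >> (256 - bits)
        if prefix in seen:
            return seen[prefix], message, i + 1
        seen[prefix] = message

a, b, tries = find_truncated_collision(24)
print(a, b, "found after", tries, "hashes")
```

Each extra bit of output adds only half a bit of collision work, which is why a 256-bit digest gives 128-bit collision security.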

5. Avalanche Effect

A small change in the input should produce a drastically different hash. This is sometimes called the "avalanche effect" or "diffusion." Here is a concrete example with SHA-256:

Input	SHA-256 Hash
hello	`2cf24dba5fb0a30e26e83b2ac5b9e29e...`
Hello	`185f8db32271fe25f561a6fc938b2e26...`
hello!	`ce06092fb948d9ffac7d1a376e404b26...`

Changing just one character (lowercase 'h' to uppercase 'H') flips roughly half of the output bits, so the two hashes look completely unrelated. There is no way to predict how the hash will change, and no feasible way to work backwards from the hash to the input.
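To put a number on the avalanche effect, one can XOR two digests and count the bits that differ. A minimal sketch (the function name differing_bits is illustrative):

```python
import hashlib

def differing_bits(a: bytes, b: bytes) -> int:
    """Count how many of the 256 output bits differ between SHA-256(a) and SHA-256(b)."""
    da = int.from_bytes(hashlib.sha256(a).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(da ^ db).count("1")

print(differing_bits(b"hello", b"Hello"), "of 256 bits differ")
```

For a well-designed hash the count should land near 128, the value expected if every output bit flipped independently with probability one half.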

How Hash Functions Work Internally

While the full internals are complex, the basic process follows a common pattern across most hash algorithms:

Step 1: Padding. The input is padded to a multiple of the algorithm's block size (512 bits for MD5/SHA-1/SHA-256, 1024 bits for SHA-512). The padding includes the original message length.
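The padding step can be sketched as follows for SHA-256 (512-bit blocks): append a single 1 bit, then zeros, then the original length as a 64-bit big-endian integer. This is an illustrative helper, not a function exposed by any library.

```python
def sha256_pad(message: bytes) -> bytes:
    """Pad a message the way SHA-256 does, to a multiple of 64 bytes (512 bits)."""
    bit_length = len(message) * 8
    padded = message + b"\x80"                      # a 1 bit followed by seven 0 bits
    padded += b"\x00" * ((56 - len(padded)) % 64)   # zeros up to 56 bytes mod 64
    padded += bit_length.to_bytes(8, "big")         # original length in bits
    return padded

print(len(sha256_pad(b"")), len(sha256_pad(b"a" * 55)), len(sha256_pad(b"a" * 56)))
```

A 55-byte message still fits in one block, while a 56-byte message spills into a second block because the length field no longer fits.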

Step 2: Initialization. An internal state is initialized with fixed constants (the "initial hash values"). For SHA-256, these are derived from the fractional parts of the square roots of the first 8 prime numbers.
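The derivation of the SHA-256 initial values can be reproduced with integer arithmetic. The sketch below takes the first 32 bits of the fractional part of each square root; the function name is an illustrative choice.

```python
from math import isqrt

def sha256_initial_values():
    """Compute the eight SHA-256 initial hash values from the first eight primes."""
    primes = [2, 3, 5, 7, 11, 13, 17, 19]
    # isqrt(p << 64) equals floor(sqrt(p) * 2**32); masking keeps the low 32 bits,
    # which are the first 32 bits of the fractional part of sqrt(p).
    return [isqrt(p << 64) & 0xFFFFFFFF for p in primes]

print([f"{value:08x}" for value in sha256_initial_values()])
```

Constants generated this way are sometimes called "nothing-up-my-sleeve" numbers, since anyone can recompute them and see that they were not hand-picked.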

Step 3: Compression. Each block of the padded input is processed through a compression function that updates the internal state. This function typically involves bitwise operations (AND, OR, XOR, rotations), modular addition, and message scheduling. SHA-256 performs 64 rounds of compression per block; SHA-512 performs 80 rounds.

Step 4: Finalization. After all blocks are processed, the final internal state is output as the hash value.

The security of the algorithm depends on the compression function making each bit of the output depend on every bit of the input in a complex, non-linear way.
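Because the state is updated block by block, a hash can be computed incrementally without holding the whole input in memory. A minimal sketch with hashlib, where the chunk size is an arbitrary assumption:

```python
import hashlib

def hash_in_chunks(data: bytes, chunk_size: int = 64) -> str:
    """Feed data to SHA-256 piece by piece and return the final hex digest."""
    hasher = hashlib.sha256()
    for start in range(0, len(data), chunk_size):
        hasher.update(data[start:start + chunk_size])
    return hasher.hexdigest()

data = b"a" * 1000
print(hash_in_chunks(data) == hashlib.sha256(data).hexdigest())
```

The same pattern works for large files: read a chunk, call update, and repeat until the file is exhausted.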

MD5: History, Vulnerabilities, and Current Usage

MD5 (Message Digest Algorithm 5) was designed by Ronald Rivest in 1991 as a successor to MD4. It produces a 128-bit (32 hex character) hash and was the dominant hash algorithm through the 1990s and early 2000s.

The Fall of MD5

1996: Hans Dobbertin found collisions in MD5's compression function, raising initial concerns.
2004: Xiaoyun Wang and colleagues demonstrated the first practical collision attack, finding two different inputs with the same MD5 hash in about one hour.
2006: Vlastimil Klima published a method to find MD5 collisions in one minute on a laptop.
2008: Alexander Sotirov and others used an MD5 collision to forge a rogue Certificate Authority certificate, enabling HTTPS interception. This was the definitive proof that MD5 was dangerous in practice.
2012: The Flame malware (attributed to state-level actors) used an MD5 chosen-prefix collision to forge a Microsoft code-signing certificate, which let it pose as a legitimate Windows Update.

Today, collision attacks on MD5 can be performed in seconds on consumer hardware.

When MD5 Is Still Acceptable

Despite being cryptographically broken, MD5 remains in use for non-security purposes:

Non-malicious file checksums: Verifying that a download was not accidentally corrupted in transit (this does not detect deliberate tampering)
Data deduplication: Identifying duplicate files in storage systems
Cache keys: Generating short, unique identifiers for cached data
Hash table distribution: Distributing keys across hash table buckets
Legacy system compatibility: Interoperating with older systems that only support MD5

The key distinction: if an attacker could benefit from creating a collision, do not use MD5. If you only need a fast checksum for non-adversarial integrity checking, MD5 is fine.

SHA-1: The Deprecated Middle Ground

SHA-1 (Secure Hash Algorithm 1) was published by the NSA in 1995 and produces a 160-bit (40 hex character) hash. It was the most widely used hash function for over a decade, powering SSL/TLS certificates, Git commits, and code signing.

SHA-1's Demise

2005: Theoretical collision attacks showed SHA-1 was weaker than expected.
2017: Google and CWI Amsterdam demonstrated the first practical SHA-1 collision ("SHAttered"), finding two different PDF files with the same SHA-1 hash. The attack required 6,500 years of CPU computation and 110 years of GPU computation — expensive but feasible for well-funded attackers.
2020: A chosen-prefix collision attack reduced the cost to roughly $45,000, making it accessible to smaller organizations.

SHA-1 Today

Deprecated by all major browsers for TLS certificates (since 2017)
Git still uses SHA-1 for commit hashes by default but is transitioning to SHA-256 (new repositories can opt in to the SHA-256 object format)
Should not be used for any new security application

SHA-256 and SHA-512: The Current Standard

SHA-256 and SHA-512 are part of the SHA-2 family, designed by the NSA and published by NIST in 2001. They are the current recommended standard for cryptographic hashing.

SHA-256

Output: 256 bits (64 hex characters)
Block size: 512 bits (operates on 32-bit words)
Rounds: 64 rounds of compression
Security: No known practical attacks on the full 64-round function (published attacks apply only to reduced-round variants). Finding a collision would require approximately 2^128 operations.
Used in: TLS 1.2/1.3 certificates, Bitcoin (double-SHA-256), HMAC in API authentication (AWS, Stripe), Docker content addressing, file integrity verification

SHA-512

Output: 512 bits (128 hex characters)
Block size: 1024 bits (operates on 64-bit words)
Rounds: 80 rounds of compression
Security: Even stronger than SHA-256, with a collision bound of approximately 2^256 operations
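Performance: Paradoxically, SHA-512 is often faster than SHA-256 on 64-bit processors because its algorithm uses 64-bit operations that map directly to native CPU instructions and it processes twice as much data per block. The exception is CPUs with dedicated SHA-256 instructions, where SHA-256 is usually faster.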
Used in: Certificate authorities, government/military applications, SSH keys, some cryptocurrency implementations

SHA-384

SHA-384 is a truncated version of SHA-512 — it uses the same algorithm but with different initial values and outputs only 384 bits. It provides a middle ground between SHA-256 and SHA-512 security levels.
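The role of the different initial values is easy to see in practice: a SHA-384 digest is not simply the first 384 bits of the SHA-512 digest of the same input. A short check with hashlib:

```python
import hashlib

data = b"hello"
sha384_digest = hashlib.sha384(data).digest()
truncated_sha512 = hashlib.sha512(data).digest()[:48]  # first 384 bits of SHA-512

print(len(sha384_digest) * 8, "bits")
print(sha384_digest == truncated_sha512)
```

The second line prints False because the two functions start from different internal states, even though the compression function is the same.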

MD5 vs SHA-256 vs SHA-512: Complete Comparison

Property	MD5	SHA-256	SHA-512
Output size	128 bits (32 hex)	256 bits (64 hex)	512 bits (128 hex)
Block size	512 bits	512 bits	1024 bits
Word size	32-bit	32-bit	64-bit
Rounds	64	64	80
Security status	Broken (2004)	Secure	Secure
Collision resistance	None (seconds)	~2^128 operations	~2^256 operations
Speed (software)	~650 MB/s	~250 MB/s	~350 MB/s*
Speed (hardware/ASIC)	Extremely fast	Fast (Bitcoin ASICs)	Moderate
Suitable for security	No	Yes	Yes
Suitable for passwords	No	No (too fast)	No (too fast)

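*SHA-512 is often faster than SHA-256 on 64-bit CPUs due to native 64-bit word operations; CPUs with dedicated SHA-256 instructions reverse this. The throughput figures are rough and vary widely by hardware.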

Bottom line: Use SHA-256 for general-purpose security (file integrity, digital signatures, HMAC). Use SHA-512 when you need maximum security or are operating on 64-bit systems where SHA-512 is actually faster. Never use MD5 for security.
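The size rows of the comparison table can be checked directly from hashlib, which exposes each algorithm's digest and block size in bytes. A small sketch (the describe helper is illustrative, and MD5 may be unavailable on systems running in a restricted FIPS mode):

```python
import hashlib

def describe(name: str) -> dict:
    """Report output size, block size, and hex length for a named hash algorithm."""
    hasher = hashlib.new(name)
    return {
        "output_bits": hasher.digest_size * 8,
        "block_bits": hasher.block_size * 8,
        "hex_chars": hasher.digest_size * 2,
    }

for name in ("md5", "sha256", "sha512"):
    print(name, describe(name))
```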

Hashing vs Encryption: What Is the Difference?

This is one of the most common confusions in cryptography:

	Hashing	Encryption
Direction	One-way (irreversible)	Two-way (reversible with key)
Purpose	Verify integrity and authenticity	Protect confidentiality
Key required?	No (except HMAC)	Yes (symmetric or asymmetric)
Output size	Fixed (e.g., always 256 bits)	Variable (proportional to input)
Same input → same output?	Always	No (with proper IV/nonce)
Examples	SHA-256, MD5, bcrypt	AES, RSA, ChaCha20

Hashing is for when you need to verify that data hasn't changed (file checksums, password verification, digital signatures) but don't need to recover the original data.

Encryption is for when you need to protect data so only authorized parties can read it (HTTPS, disk encryption, messaging).

You often use both together: TLS uses encryption (AES) to protect data in transit, and hashing (SHA-256 via HMAC) to verify data integrity.

Password Hashing: Why SHA-256 Is Not Enough

SHA-256 is a secure hash function, but it is not suitable for password hashing. The reason is counterintuitive: SHA-256 is too fast. An attacker with a GPU can compute billions of SHA-256 hashes per second, trying every common password in seconds.

Purpose-Built Password Hash Functions

Modern password hashing uses specialized algorithms designed to be intentionally slow:

Algorithm	Year	Key Feature	Resistance
bcrypt	1999	Adjustable cost factor, based on Blowfish	CPU-hard
scrypt	2009	Memory-hard (requires large RAM)	CPU + memory hard
Argon2	2015	Winner of Password Hashing Competition	CPU + memory + parallelism hard

Argon2 (specifically Argon2id) is the current recommended standard for new systems. It has three configurable parameters:

Time cost: Number of iterations (more = slower)
Memory cost: Amount of RAM required (more = harder to parallelize on GPUs)
Parallelism: Number of threads

Salting

All password hash functions use salting — adding a unique random value to each password before hashing. This means:

Two users with the same password get different hashes
Precomputed rainbow tables are useless
Attackers must crack each password individually

The salt is stored alongside the hash (it doesn't need to be secret, just unique).
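Salted, deliberately slow password hashing might look like the sketch below. Argon2 is not in Python's standard library, so this example uses scrypt from hashlib instead (which requires a Python build linked against a recent OpenSSL). The cost parameters n, r, and p and the function names are illustrative assumptions, not tuned recommendations.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random 16-byte salt is generated if none is given."""
    if salt is None:
        salt = os.urandom(16)
    # n=2**14, r=8 makes scrypt use roughly 16 MiB of memory per hash.
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, n=2**14, r=8, p=1)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)

salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

Both the salt and the digest are stored; only the password itself is never written down.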

HMAC: Hash-Based Message Authentication

HMAC (Hash-based Message Authentication Code) combines a hash function with a secret key to create an authentication code. Defined in RFC 2104, the HMAC construction is:

HMAC(K, m) = H((K ⊕ opad) || H((K ⊕ ipad) || m))

Where H is the hash function, K is the key (padded to block size), ipad is 0x36 repeated, and opad is 0x5c repeated.
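The construction can be written out by hand and compared against Python's hmac module. This sketch fixes H to SHA-256 (block size 64 bytes); the function name hmac_sha256 is an illustrative choice.

```python
import hashlib
import hmac

def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """Compute HMAC-SHA256 directly from the RFC 2104 definition."""
    block_size = 64  # SHA-256 block size in bytes
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()   # long keys are hashed first
    key = key.ljust(block_size, b"\x00")     # then padded with zeros to the block size
    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(ipad + message).digest()
    return hashlib.sha256(opad + inner).digest()

key, message = b"secret-key", b"hello"
print(hmac_sha256(key, message) == hmac.new(key, message, hashlib.sha256).digest())
```

In real code the hmac module should be used directly; the manual version is only meant to make the formula concrete.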

Why Not Just Hash the Key + Message?

A naive approach like H(key || message) is vulnerable to length extension attacks when H is a Merkle-Damgård hash such as MD5, SHA-1, or SHA-256: an attacker who knows H(key || message) and the length of the key can compute H(key || message || padding || attacker_data) without knowing the key. HMAC's nested construction prevents this.

HMAC in Practice

HMAC-SHA256 (HS256): The most common algorithm for JWT token signing
AWS Signature V4: Uses HMAC-SHA256 for API request authentication
Stripe webhooks: Signed with HMAC-SHA256 for verification
TLS 1.3: Uses HMAC as part of the HKDF key derivation function
OAuth 1.0: Request signing with HMAC-SHA1
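The services listed above each define their own signing format, but the core pattern is the same: compute an HMAC over the request body with a shared secret and compare it in constant time. A generic sketch, where the secret, the payload, and the function names are all hypothetical and do not reproduce any specific provider's scheme:

```python
import hashlib
import hmac

def sign(secret: bytes, body: bytes) -> str:
    """Return the hex HMAC-SHA256 of a message body under a shared secret."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()

def is_authentic(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Check a received signature using a constant-time comparison."""
    return hmac.compare_digest(sign(secret, body), received_signature)

secret = b"shared-webhook-secret"          # hypothetical shared secret
body = b'{"event": "payment.succeeded"}'   # hypothetical payload
signature = sign(secret, body)
print(is_authentic(secret, body, signature))
print(is_authentic(secret, body + b" ", signature))
```

Using hmac.compare_digest instead of == avoids leaking, through timing, how many leading characters of a forged signature were correct.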

You can generate HMAC values with our hash generator tool using the HMAC tab — it supports HMAC-MD5, HMAC-SHA1, HMAC-SHA256, HMAC-SHA384, and HMAC-SHA512.

Real-World Applications of Hash Functions

Blockchain and Cryptocurrency

Bitcoin's proof-of-work system is built on SHA-256. Each block header is hashed with double-SHA-256 (SHA256(SHA256(header))), and miners compete to find a header, by varying a nonce, whose hash is below a target value (in effect, a hash that starts with a certain number of zero bits). Ethereum uses Keccak-256 (the original Keccak design, which differs from the standardized SHA-3 only in its padding). The entire security model of blockchain (immutable ledgers, transaction verification) depends on the collision and pre-image resistance of these functions.
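A toy version of the mining loop is sketched below. It is a simplification under stated assumptions: the header is an arbitrary byte string rather than a real 80-byte block header, the difficulty is only 12 zero bits, and the hash is read as a big-endian number (Bitcoin itself uses a little-endian interpretation).

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256 twice, as Bitcoin does for block headers."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def mine(header: bytes, zero_bits: int = 12):
    """Try nonces until the double hash, read as an integer, falls below the target."""
    target = 1 << (256 - zero_bits)
    nonce = 0
    while True:
        candidate = header + nonce.to_bytes(4, "little")
        value = int.from_bytes(double_sha256(candidate), "big")
        if value < target:
            return nonce, value
        nonce += 1

nonce, value = mine(b"toy block header", zero_bits=12)
print(nonce, f"{value:064x}")
```

Each additional zero bit doubles the expected number of attempts, which is how the network adjusts difficulty.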

TLS/SSL Certificates

When you visit an HTTPS website, the server's certificate is signed using a hash function (typically SHA-256). Your browser verifies the certificate by computing the hash and checking it against the signature. The entire certificate chain from root CA to the website depends on hash collision resistance.

Git Version Control

Git identifies every object (commits, trees, blobs) by its SHA-1 hash. When you run git commit, Git computes the SHA-1 hash of the commit object, which becomes the commit ID. Git is actively transitioning to SHA-256 for stronger security, but the SHA-1-based system has served reliably for nearly two decades.
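Git does not hash the raw content alone: it prepends a short header with the object type and size. The sketch below reproduces the ID of a blob (file content) in a SHA-1 repository; the function name git_blob_id is an illustrative choice.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Compute the SHA-1 object ID Git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()

print(git_blob_id(b""))
```

The result for empty content is the familiar e69de29... ID that Git shows for every empty file.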

Digital Signatures

Digital signatures work by hashing a document with SHA-256, then signing the hash with the signer's private key (with RSA this is often described as encrypting the hash). The recipient computes their own hash of the document and uses the signer's public key to check that the signature matches it. If it does, the document is authentic and unmodified.

Content-Addressable Storage

Systems like Docker, IPFS, and package managers (npm, pip) use SHA-2 hashes (SHA-256 or SHA-512) as content addresses or integrity checks. A Docker image layer is identified by the SHA-256 hash of its contents, ensuring that the same content always resolves to the same identifier and cannot be tampered with.
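The idea behind content addressing fits in a few lines: the key under which data is stored is the hash of the data itself. The in-memory ContentStore class below is a hypothetical sketch, not how Docker or IPFS are implemented.

```python
import hashlib

class ContentStore:
    """A tiny in-memory store where each blob is addressed by its SHA-256 hash."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        address = "sha256:" + hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # Re-hash on read so corrupted or swapped content is detected.
        if "sha256:" + hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored content does not match its address")
        return data

store = ContentStore()
address = store.put(b"layer contents")
print(address, store.get(address))
```

Storing the same bytes twice yields the same address, which gives deduplication for free.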

Subresource Integrity (SRI)

Web browsers can verify that CDN-hosted JavaScript and CSS files haven't been tampered with using SRI hashes. The integrity attribute contains a Base64-encoded SHA-384 hash of the expected file contents:

<script src="https://cdn.example.com/lib.js"
  integrity="sha384-oqVuAfXRKap7fdgcCY5uykM6+R9GqQ8K/uxy9rx7HNQlGYl1kPzQho1wx4JwY8w"
  crossorigin="anonymous"></script>
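An integrity value of this form can be generated by hashing the file with SHA-384 and Base64-encoding the raw digest. A minimal sketch, assuming the file contents are already loaded as bytes (the function name and the sample script are illustrative):

```python
import base64
import hashlib

def sri_sha384(content: bytes) -> str:
    """Build a Subresource Integrity value: 'sha384-' plus the Base64 digest."""
    digest = hashlib.sha384(content).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")

print(sri_sha384(b"console.log('hello');"))
```

A 48-byte SHA-384 digest always encodes to exactly 64 Base64 characters with no padding.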

The Future: SHA-3 and Beyond

SHA-3 (Keccak) was standardized by NIST in 2015 as a backup in case SHA-2 is broken. It uses a completely different internal structure (the "sponge construction") compared to SHA-2's Merkle-Damgård construction. While SHA-2 remains secure and widely used, SHA-3 provides defense in depth.

BLAKE3, released in 2020, is a newer hash function that is significantly faster than SHA-256 (often 5-10x) while maintaining strong security properties. It is gaining adoption in applications where hash speed is critical, such as file integrity checking and deduplication.
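SHA-3 is available in Python's standard hashlib, so switching algorithms is usually a one-line change. The sketch below hashes the same input with SHA-256 and SHA3-256; BLAKE3 is not in the standard library and is left out here.

```python
import hashlib

message = b""
digests = {
    "sha256": hashlib.sha256(message).hexdigest(),
    "sha3_256": hashlib.sha3_256(message).hexdigest(),
}
for name, value in digests.items():
    print(name, value)
```

Both digests are 256 bits long, but they share nothing else, since the two algorithms are built on unrelated internal designs.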

Frequently Asked Questions

What is the difference between hashing and encryption?

Hashing is one-way — you cannot recover the original input from a hash. Encryption is two-way — you can decrypt the ciphertext back to plaintext with the correct key. Hashing verifies integrity; encryption protects confidentiality. They serve different purposes and are often used together in security protocols like TLS.

Can a hash be reversed or "decrypted"?

No. A hash cannot be reversed because information is lost during the hashing process (the input space is infinite but the output space is fixed). However, attackers can try to find the input by guessing (brute force) or using precomputed tables (rainbow tables). This is why passwords should use salted, slow hash functions like bcrypt or Argon2.

Is SHA-256 quantum-safe?

Partially. Grover's algorithm on a quantum computer could reduce the brute-force search for a pre-image from 2^256 to 2^128 operations, which is still infeasible. For collision finding, the impact is less significant. Current consensus is that SHA-256 provides approximately 128 bits of security against quantum attacks, which is considered sufficient for the foreseeable future.

Why does Bitcoin use double-SHA-256 instead of single SHA-256?

The Bitcoin whitepaper does not state why SHA256(SHA256(x)) was chosen. The usual explanations are that it guards against length extension attacks and adds a safety margin in case weaknesses were found in single SHA-256, as had happened with MD5 and SHA-1 by the time Satoshi Nakamoto designed Bitcoin in 2008. In practice, single SHA-256 remains secure, so the double hashing is a conservative design choice.

What is the fastest hash function that is still secure?

BLAKE3 is currently the fastest general-purpose cryptographic hash function, often achieving 5-10x the throughput of SHA-256 by leveraging parallelism and SIMD instructions. For standardized algorithms, SHA-512 is faster than SHA-256 on 64-bit systems. For non-cryptographic speed (when security is not needed), xxHash and MurmurHash are orders of magnitude faster.

How long would it take to crack a SHA-256 hash by brute force?

For a random 256-bit input, finding a pre-image would require approximately 2^256 operations, or about 1.16 x 10^77 guesses. Even if ten billion computers each performed 10^18 hashes per second, it would take approximately 3.7 x 10^41 years. SHA-256 pre-images are, for all practical purposes, impossible to find by brute force.

Conclusion

Cryptographic hash functions are one of the most fundamental building blocks of modern security. Understanding the differences between MD5 (broken), SHA-1 (deprecated), and SHA-256/SHA-512 (current standard) helps you make informed decisions about which to use in your applications. Remember: SHA-256 for general security, bcrypt/Argon2 for passwords, HMAC for authentication, and never MD5 for security.

Try our free online hash generator to compute MD5, SHA-1, SHA-256, SHA-384, SHA-512 hashes and HMAC codes for text and files — all processing happens in your browser.