Base64 Encoding Explained: How It Works, Data URIs & JWT
Learn how Base64 encoding works, why it exists, and where it's used. Covers the encoding algorithm, data URIs, JWT tokens, email attachments, and programming examples in 7 languages.
Base64 encoding is one of the most widely used encoding schemes in computing, yet many developers use it without fully understanding how it works or why it exists. Every time you embed an image in an HTML email, decode a JWT token, handle HTTP Basic Authentication, or open a PEM certificate file, you are working with Base64.
This guide explains the Base64 encoding algorithm from the ground up, explores its most common applications, and shows you how to encode and decode Base64 in seven programming languages. Try our free Base64 encoder and decoder to experiment with any of the examples below.
What Is Base64?
Base64 is a binary-to-text encoding scheme that represents binary data using 64 printable ASCII characters. It was designed to safely transmit binary data over channels that only support text, such as email (SMTP) and early HTTP.
The name "Base64" comes from the fact that it uses a 64-character alphabet:
| Index Range | Characters | Count |
|---|---|---|
| 0-25 | A-Z | 26 |
| 26-51 | a-z | 26 |
| 52-61 | 0-9 | 10 |
| 62 | + | 1 |
| 63 | / | 1 |
| Padding | = | — |
This gives exactly 64 characters (2^6), meaning each Base64 character encodes exactly 6 bits of data.
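As a quick check of the table above, the alphabet can be rebuilt in index order with Python's standard library. This is only an illustrative sketch; the names BASE64_ALPHABET and bits_per_char are made up for this example.

```python
import string

# Standard Base64 alphabet in index order: A-Z, a-z, 0-9, then '+' and '/'.
BASE64_ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
)

# 64 symbols cover the indices 0..63, which need exactly 6 bits each.
bits_per_char = (len(BASE64_ALPHABET) - 1).bit_length()

print(len(BASE64_ALPHABET), bits_per_char)
print(BASE64_ALPHABET[0], BASE64_ALPHABET[26], BASE64_ALPHABET[52])
```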
How Base64 Encoding Works
The Base64 algorithm converts binary data in three steps:
Step 1: Convert Input to Binary
Each input byte is represented as 8 bits. For text input, this means converting each character to its ASCII or UTF-8 byte value first.
Example: The text "Man" in ASCII:
| Character | ASCII Decimal | Binary (8 bits) |
|---|---|---|
| M | 77 | 01001101 |
| a | 97 | 01100001 |
| n | 110 | 01101110 |
Step 2: Split into 6-bit Groups
The combined 24 bits (3 bytes × 8 bits) are split into four groups of 6 bits:
Original (3 × 8 bits):  01001101 01100001 01101110
Regrouped (4 × 6 bits): 010011 010110 000101 101110
Step 3: Map to Base64 Characters
Each 6-bit value (0-63) maps to a character in the Base64 alphabet:
| 6-bit Value | Decimal | Base64 Character |
|---|---|---|
| 010011 | 19 | T |
| 010110 | 22 | W |
| 000101 | 5 | F |
| 101110 | 46 | u |
Result: "Man" → "TWFu"
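The three steps can be traced in a few lines of Python. This is a minimal sketch for one complete 3-byte block only (no padding), and encode_block is a hypothetical helper name, not a library function.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_block(block: bytes) -> str:
    """Encode exactly three bytes into four Base64 characters."""
    if len(block) != 3:
        raise ValueError("this sketch only handles complete 3-byte blocks")
    # Step 1: write the three bytes as one 24-bit binary string.
    bits = "".join(f"{byte:08b}" for byte in block)
    # Step 2: split the 24 bits into four 6-bit groups.
    groups = [bits[i:i + 6] for i in range(0, 24, 6)]
    # Step 3: use each 6-bit value as an index into the alphabet.
    return "".join(ALPHABET[int(group, 2)] for group in groups)


encoded = encode_block(b"Man")
print(encoded)  # TWFu
```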
Handling Padding
Since Base64 works on groups of 3 bytes (24 bits = four 6-bit groups), input that is not a multiple of 3 bytes requires padding:
| Input Length | Padding Needed | Example |
|---|---|---|
| 3 bytes (divisible) | None | "Man" → "TWFu" |
| 2 bytes (remainder 2) | One = | "Ma" → "TWE=" |
| 1 byte (remainder 1) | Two == | "M" → "TQ==" |
The = padding character tells the decoder how many bytes were actually encoded in the final group.
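One way to handle a short final group is to zero-fill it to 3 bytes, encode it, and then overwrite the characters that cover only filler with =. The sketch below assumes that approach; encode is an illustrative name, and real code would normally call base64.b64encode instead.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode(data: bytes) -> str:
    """Base64-encode data of any length, adding '=' padding as needed."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        missing = 3 - len(chunk)  # 0, 1, or 2 bytes short in the last group
        # Zero-fill the short group so it can be treated as one 24-bit number.
        n = int.from_bytes(chunk + b"\x00" * missing, "big")
        chars = [ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # Characters that cover only filler bytes are replaced by '='.
        if missing:
            chars[-missing:] = ["="] * missing
        out.extend(chars)
    return "".join(out)


for sample in (b"Man", b"Ma", b"M"):
    print(sample, encode(sample))
```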
The 33% Size Increase
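Base64 encoding increases data size by about 33%, because every 3 bytes of input (24 bits) become 4 ASCII characters (4 bytes) of output. The ratio is 4/3 = 1.333..., and padding rounds the output up to the next multiple of 4 characters, so very short inputs grow by more (a single byte becomes 4 characters).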
| Original Size | Base64 Size | Overhead |
|---|---|---|
| 1 KB | 1.33 KB | +0.33 KB |
| 10 KB | 13.3 KB | +3.3 KB |
| 100 KB | 133 KB | +33 KB |
| 1 MB | 1.33 MB | +0.33 MB |
This overhead is the fundamental trade-off of Base64: larger size in exchange for safe text-based representation. For this reason, Base64 is best suited for small payloads -- not for encoding large files unless necessary.
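The encoded size can be estimated before encoding anything. The following sketch assumes padded, unwrapped output; base64_length is a hypothetical helper name.

```python
import base64


def base64_length(n_bytes: int) -> int:
    """Length of padded Base64 output: 4 characters per started 3-byte group."""
    return 4 * ((n_bytes + 2) // 3)


for size in (1, 1024, 10 * 1024, 1024 * 1024):
    encoded = base64_length(size)
    print(f"{size} bytes -> {encoded} chars ({encoded / size - 1:.1%} overhead)")

# Cross-check the formula against the standard library for one size.
assert base64_length(1024) == len(base64.b64encode(bytes(1024)))
```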
Where Base64 Is Used
1. Data URIs (Embedding Files in HTML/CSS)
Data URIs allow you to embed file content directly into HTML or CSS, eliminating a separate HTTP request for that file.
The format is: data:[MIME type];base64,[encoded data]
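As a rough illustration of that format, the sketch below builds a data URI from raw bytes. The to_data_uri helper and the tiny placeholder SVG are assumptions made for this example.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Build a data URI of the form data:[MIME type];base64,[encoded data]."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Placeholder content; in practice the bytes would be read from an image file.
svg = b'<svg xmlns="http://www.w3.org/2000/svg" width="1" height="1"/>'
uri = to_data_uri(svg, "image/svg+xml")

print(f'<img src="{uri}" alt="">')
```

The resulting string can be used anywhere a URL is accepted, such as an img src attribute or a CSS url() value.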
When to use data URIs:
- Small images under 5-10 KB (icons, decorative elements)
- CSS background images used on every page
- Single-use images that don't benefit from caching
When NOT to use data URIs:
- Images larger than 10 KB (the 33% overhead adds up)
- Images shared across multiple pages (cannot be cached separately)
- Performance-critical pages (base64 strings increase HTML parse time)
2. JWT (JSON Web Tokens)
JWT tokens use a URL-safe variant of Base64 (called Base64URL) for their three parts:
header.payload.signature
Each part is Base64URL-encoded. For example, decoding a JWT payload:
eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ
Decodes to:
{"sub":"1234567890","name":"John Doe","iat":1516239022}
Important: JWT tokens are encoded, not encrypted. Anyone can decode and read the payload. Never store sensitive data (passwords, credit card numbers) in JWT payloads.
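To see how little stands between a token and its contents, here is a minimal sketch that decodes the payload segment shown above. decode_jwt_segment is a hypothetical helper; it only decodes and does not verify the signature, so it must not be used to decide whether a token is trustworthy.

```python
import base64
import json


def decode_jwt_segment(segment: str) -> dict:
    """Decode one Base64URL-encoded JSON segment of a JWT (no verification)."""
    # JWTs omit '=' padding, so restore it to a multiple of 4 characters.
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


segment = "eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ"
payload = decode_jwt_segment(segment)
print(payload)
```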
3. Email Attachments (MIME)
Email was originally designed for plain ASCII text. The MIME (Multipurpose Internet Mail Extensions) standard uses Base64 to encode binary attachments:
Content-Type: application/pdf
Content-Transfer-Encoding: base64
JVBERi0xLjQKMSAwIG9iago8PAovVHlwZSAvQ2F0YWxvZwov...
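Nearly every binary email attachment you have sent or received was Base64-encoded in transit. MIME also defines other transfer encodings, such as quoted-printable, which is more common for text parts.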
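Python's standard email package applies this encoding when a binary attachment is added. The sketch below uses placeholder bytes rather than a real PDF, and the subject and filename are invented for illustration.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Report"
msg.set_content("See the attached file.")

# Placeholder bytes standing in for a real PDF file.
pdf_bytes = b"%PDF-1.4\n% placeholder, not a valid document\n"
msg.add_attachment(
    pdf_bytes, maintype="application", subtype="pdf", filename="report.pdf"
)

# Inspect the headers the library chose for the attachment part.
attachment = next(msg.iter_attachments())
print(attachment["Content-Type"])
print(attachment["Content-Transfer-Encoding"])
```

Printing msg.as_string() shows the full message, including the Base64 body of the attachment.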
4. HTTP Basic Authentication
HTTP Basic Auth encodes credentials as username:password in Base64:
Authorization: Basic dXNlcm5hbWU6cGFzc3dvcmQ=
Decoding dXNlcm5hbWU6cGFzc3dvcmQ= reveals username:password. This is not secure on its own -- it must be used over HTTPS.
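The header value can be reproduced, and reversed, in a couple of lines. This is a small sketch; basic_auth_header is an illustrative helper name.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of an HTTP Basic Authorization header."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


header = basic_auth_header("username", "password")
print(header)  # Basic dXNlcm5hbWU6cGFzc3dvcmQ=

# Anyone who sees the header can reverse it without any key.
recovered = base64.b64decode(header.split(" ", 1)[1]).decode("utf-8")
print(recovered)  # username:password
```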
5. Cryptographic Keys and Certificates
PEM (Privacy Enhanced Mail) format wraps Base64-encoded binary data between header and footer lines:
-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA...
-----END PUBLIC KEY-----
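SSL/TLS certificates and many public and private keys are stored in this format. OpenSSH private keys and GPG's ASCII-armored output follow the same pattern of Base64 text between BEGIN and END lines.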
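The wrapping itself is simple enough to sketch by hand. The example below round-trips arbitrary bytes through a PEM-style block, assuming 64-character lines and a made-up label, "EXAMPLE DATA". The helpers pem_encode and pem_decode are hypothetical, and real keys and certificates should be handled with a dedicated library.

```python
import base64


def pem_encode(raw: bytes, label: str) -> str:
    """Wrap raw bytes as Base64 text between BEGIN/END lines."""
    body = base64.b64encode(raw).decode("ascii")
    lines = [body[i:i + 64] for i in range(0, len(body), 64)]
    return "\n".join(
        [f"-----BEGIN {label}-----", *lines, f"-----END {label}-----"]
    ) + "\n"


def pem_decode(pem: str) -> bytes:
    """Strip the BEGIN/END lines and decode the Base64 body."""
    lines = [line.strip() for line in pem.strip().splitlines()]
    body = "".join(line for line in lines if not line.startswith("-----"))
    return base64.b64decode(body)


raw = bytes(range(100))  # arbitrary bytes, not a real key
pem = pem_encode(raw, "EXAMPLE DATA")
print(pem)
```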
Standard Base64 vs URL-Safe Base64
Standard Base64 uses + and /, which have special meaning in URLs. URL-safe Base64 (Base64URL, defined in RFC 4648) replaces them:
| Feature | Standard Base64 | URL-Safe Base64 |
|---|---|---|
| Character 62 | + | - |
| Character 63 | / | _ |
| Padding | = (required) | Often omitted |
| Used in | Email, PEM, data URIs | URLs, JWT, filenames |
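The difference shows up only when the data happens to produce index 62 or 63. In this sketch the two input bytes are chosen to trigger both, and the padding is stripped and restored the way many URL and JWT implementations do it.

```python
import base64

data = b"\xfb\xff"  # chosen so that the encoding uses indices 62 and 63

standard = base64.b64encode(data).decode("ascii")
url_safe = base64.urlsafe_b64encode(data).decode("ascii")
print(standard, url_safe)  # +/8= -_8=

# Padding is often dropped in URLs; restore it before decoding.
unpadded = url_safe.rstrip("=")
restored = base64.urlsafe_b64decode(unpadded + "=" * (-len(unpadded) % 4))
```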
Base64 vs Other Encodings
| Encoding | Base | Size Increase | Character Set | Common Use |
|---|---|---|---|---|
| Base64 | 64 | ~33% | A-Z, a-z, 0-9, +, / | Email, data URIs, JWT |
| Base32 | 32 | ~60% | A-Z, 2-7 | TOTP secrets, onion addresses |
| Base16 (Hex) | 16 | 100% | 0-9, A-F | Hash digests, color codes |
| Base85 (Ascii85) | 85 | ~25% | ASCII 33-117 (! to u) | PDF, PostScript (Git binary patches use a different Base85 alphabet) |
Base64 offers the best balance of compactness and compatibility for most use cases.
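The size figures in the table can be checked with Python's base64 module, which implements all four encodings. The 60-byte sample is an arbitrary choice for this sketch: it is a multiple of 3, 4, and 5 bytes, so no encoding needs padding, and it contains no all-zero groups, which Ascii85 would abbreviate.

```python
import base64

data = bytes(range(1, 61))  # 60 arbitrary bytes

sizes = {
    "Base64": len(base64.b64encode(data)),
    "Base32": len(base64.b32encode(data)),
    "Base16": len(base64.b16encode(data)),
    "Ascii85": len(base64.a85encode(data)),
}

for name, size in sizes.items():
    print(f"{name}: {size} chars ({size / len(data) - 1:+.0%})")
```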
Base64 in Programming Languages
JavaScript
Python
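A minimal sketch using only the standard library's base64 module. Text is converted to UTF-8 bytes before encoding. The sample string and the is_strict_base64 helper are illustrative assumptions.

```python
import base64
import binascii

text = "Hello, wörld"

# Encode: str -> UTF-8 bytes -> Base64 bytes -> ASCII str
encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")

# Decode: Base64 str -> bytes -> str
decoded = base64.b64decode(encoded).decode("utf-8")
print(encoded, decoded)


def is_strict_base64(value: str) -> bool:
    """Return True if value decodes with strict alphabet checking."""
    try:
        # validate=True rejects characters outside the Base64 alphabet
        # instead of silently discarding them.
        base64.b64decode(value, validate=True)
        return True
    except binascii.Error:
        return False
```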
Command Line (macOS/Linux)
Java
PHP
Go
C#
Base64 Is NOT Encryption
Base64 is often mistaken for encryption, but the two are fundamentally different:
| Aspect | Base64 Encoding | Encryption |
|---|---|---|
| Purpose | Data format conversion | Data protection |
| Reversibility | Anyone can decode | Requires secret key |
| Security | None | Strong (when properly implemented) |
| Use case | Transport compatibility | Confidentiality |
Never use Base64 to "hide" sensitive data. It provides zero security. If you need to protect data, use proper encryption (AES-256, RSA) and then optionally Base64-encode the encrypted output for transport.
Frequently Asked Questions
Can I Base64 encode an image?
Yes. Read the image file as binary bytes, then Base64-encode the bytes. The result is a text string that can be embedded in HTML, CSS, JSON, or any text format. However, the encoded string will be ~33% larger than the original file. For web use, only embed small images (under 5-10 KB) as data URIs.
Why does my Base64 string end with = or ==?
The = characters are padding. They appear when the input length is not a multiple of 3 bytes. One = means the last group had 2 bytes; == means the last group had 1 byte. Padding ensures the encoded string length is always a multiple of 4.
Is Base64 encoding the same in all programming languages?
The standard Base64 algorithm (RFC 4648) is identical across all languages -- the same input always produces the same output. However, some languages default to different line-wrapping behavior (e.g., Java's getMimeEncoder() adds line breaks every 76 characters for MIME compliance).
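Python shows the same split between unwrapped and MIME-style output: base64.b64encode returns a single line, while base64.encodebytes inserts a newline after at most 76 characters. A short sketch with 100 zero bytes as arbitrary input:

```python
import base64

data = bytes(100)  # arbitrary input, long enough to need two lines

single_line = base64.b64encode(data)
wrapped = base64.encodebytes(data)

print(len(single_line))
print([len(line) for line in wrapped.splitlines()])

# Removing the line breaks gives back the unwrapped form.
assert wrapped.replace(b"\n", b"") == single_line
```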