URL Encoding Guide: Percent Encoding, RFC 3986 & Common Mistakes

Learn how URL encoding works, which characters need percent-encoding, and common pitfalls. Covers RFC 3986 rules, encodeURIComponent vs encodeURI, double encoding, and programming examples.

Published March 20, 2026
15 minute read

If you have ever seen %20 in a web address or wondered why your API request broke when you included an & in a query parameter, you have encountered URL encoding. Also known as percent encoding, it is the mechanism that makes it possible to include special characters in URLs without breaking the web.

This guide covers everything developers need to know about URL encoding: which characters need encoding, how the algorithm works, common mistakes that break applications, and encoding functions in seven programming languages. Try our free URL encoder and decoder to see percent encoding in action.

What Is URL Encoding?

URL encoding (officially called percent encoding in RFC 3986) is the process of converting characters into a format that can be safely included in a URL. Characters that have special meaning in URL syntax, or that fall outside the allowed ASCII set, are replaced by their byte values, with each byte written as a % sign followed by two hexadecimal digits.

For example:

  • Space → %20
  • Ampersand & → %26
  • Equals = → %3D
  • Slash / → %2F
  • Euro sign € → %E2%82%AC (three UTF-8 bytes)
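These mappings can be checked with JavaScript's built-in encodeURIComponent() function. The snippet below is a quick sketch for experimenting; the examples array is only an illustration:

```javascript
// Each entry is a single character from the list above.
const examples = [" ", "&", "=", "/", "€"];

for (const ch of examples) {
  // JSON.stringify makes the space visible in the output.
  console.log(JSON.stringify(ch), "→", encodeURIComponent(ch));
}
```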

Why URL Encoding Is Necessary

URLs have a strict syntax defined in RFC 3986. Certain characters serve as structural delimiters:

https://example.com:8080/path/to/page?key=value&other=data#section
\___/   \_________/ \__/\___________/ \__________________/ \_____/
scheme     host     port    path          query string     fragment

Each delimiter character has a defined role:

Character  Role in URL                                 What Happens Without Encoding
:          Separates scheme from host, host from port  key=time:30 confuses parsers
/          Separates path segments                     /search/cats/dogs looks like 3 segments
?          Starts query string                         ?query=what? has ambiguous meaning
#          Starts fragment identifier                  key=C# truncates at #
&          Separates query parameters                  name=Tom&Jerry creates 2 parameters
=          Separates key from value                    equation=2+2=4 has 2 equals signs
@          Separates userinfo from host                email=user@host looks like authentication
+          Space (in form encoding only)               Ambiguous: plus sign or space?

When these characters appear as data rather than delimiters, they must be percent-encoded.
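To see how a parser splits a URL at these delimiters, you can use the URL class available in browsers and Node.js. The following is a minimal sketch; the variable names are illustrative, and it assumes a runtime that provides the standard URL global:

```javascript
const parsedExample = new URL(
  "https://example.com:8080/path/to/page?key=value&other=data#section"
);

console.log(parsedExample.protocol); // "https:"
console.log(parsedExample.hostname); // "example.com"
console.log(parsedExample.port);     // "8080"
console.log(parsedExample.pathname); // "/path/to/page"
console.log(parsedExample.search);   // "?key=value&other=data"
console.log(parsedExample.hash);     // "#section"

// An unencoded # inside a value ends the query string early.
const truncated = new URL("https://example.com/search?key=C#");
console.log(truncated.searchParams.get("key")); // "C"

// Encoded as %23, the # survives as data.
const intact = new URL("https://example.com/search?key=C%23");
console.log(intact.searchParams.get("key")); // "C#"
```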

Which Characters Need Encoding?

RFC 3986 classifies characters into three groups:

Unreserved Characters (Never Need Encoding)

These characters can appear anywhere in a URL without encoding:

A-Z  a-z  0-9  -  _  .  ~

Total: 66 characters.
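As a sanity check, the unreserved set can be expressed as a regular expression and counted. This is only a sketch; the UNRESERVED constant is a name chosen for illustration:

```javascript
// Letters, digits, and the four marks - . _ ~
const UNRESERVED = /^[A-Za-z0-9\-._~]$/;

let unreservedCount = 0;
for (let code = 0; code < 128; code++) {
  if (UNRESERVED.test(String.fromCharCode(code))) unreservedCount++;
}
console.log(unreservedCount); // 66

// encodeURIComponent() leaves all of them untouched.
console.log(encodeURIComponent("AZaz09-_.~")); // AZaz09-_.~
```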

Reserved Characters (Encode When Used as Data)

These characters have special meaning in URL syntax. They must be encoded when used as data values:

Character  Hex Code  Role
:          %3A       Scheme/port separator
/          %2F       Path separator
?          %3F       Query start
#          %23       Fragment start
[          %5B       IPv6 brackets
]          %5D       IPv6 brackets
@          %40       Userinfo separator
!          %21       Sub-delimiter
$          %24       Sub-delimiter
&          %26       Parameter separator
'          %27       Sub-delimiter
(          %28       Sub-delimiter
)          %29       Sub-delimiter
*          %2A       Sub-delimiter
+          %2B       Sub-delimiter / space
,          %2C       Sub-delimiter
;          %3B       Sub-delimiter
=          %3D       Key-value separator

Everything Else (Always Encode)

All other characters -- including spaces, non-ASCII characters, and control characters -- must always be encoded:

Character  Encoding    Notes
Space      %20         Most common encoding
"          %22         Double quote
<          %3C         Less than
>          %3E         Greater than
{          %7B         Left brace
}          %7D         Right brace
|          %7C         Pipe
\          %5C         Backslash
^          %5E         Caret
`          %60         Backtick
Non-ASCII  Multi-byte  UTF-8 encoded, then percent-encoded

How Percent Encoding Works

The encoding algorithm is straightforward:

  1. Check if the character needs encoding (not unreserved, or is a reserved character used as data)
  2. Convert the character to its UTF-8 byte sequence
  3. Express each byte as %XX where XX is the hexadecimal value
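The three steps above can be sketched in a few lines of JavaScript. This is a simplified illustration rather than a replacement for the built-in functions: percentEncode is a hypothetical helper that treats every character outside the unreserved set as data to be encoded, and it assumes a runtime with the standard TextEncoder global.

```javascript
// Step 1: anything outside the unreserved set gets encoded.
const UNRESERVED_CHAR = /^[A-Za-z0-9\-._~]$/;

function percentEncode(text) {
  const utf8 = new TextEncoder(); // TextEncoder always produces UTF-8
  let out = "";
  for (const ch of text) { // for...of walks code points, so emoji stay whole
    if (UNRESERVED_CHAR.test(ch)) {
      out += ch;
      continue;
    }
    // Step 2: convert the character to its UTF-8 bytes.
    for (const byte of utf8.encode(ch)) {
      // Step 3: write each byte as %XX with uppercase hex digits.
      out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(percentEncode("café"));  // caf%C3%A9
console.log(percentEncode("a b&c")); // a%20b%26c
```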

Example: Encoding "café"

Character  UTF-8 Bytes  Percent-Encoded
c          63           c (unreserved)
a          61           a (unreserved)
f          66           f (unreserved)
é          C3 A9        %C3%A9

Result: caf%C3%A9

Example: Encoding a Full Query String

Original URL:

https://example.com/search?q=hello world&lang=en&tag=c#

Properly encoded:

https://example.com/search?q=hello%20world&lang=en&tag=c%23

Note: & and = are kept unencoded because they serve their structural role. Only the data values (hello world, c#) are encoded.
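One way to get this result in JavaScript is to encode each key and value separately and then join them with the structural characters. The sketch below assumes the parameters are held in a plain object; baseUrl and queryParams are illustrative names:

```javascript
const baseUrl = "https://example.com/search";
const queryParams = { q: "hello world", lang: "en", tag: "c#" };

// Encode keys and values individually; & and = are added afterwards
// so they keep their structural role.
const queryString = Object.entries(queryParams)
  .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
  .join("&");

const encodedUrl = `${baseUrl}?${queryString}`;
console.log(encodedUrl);
// https://example.com/search?q=hello%20world&lang=en&tag=c%23
```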

Spaces: %20 vs + (Plus Sign)

The two encodings for spaces cause endless confusion:

Format            Space Encoding  Standard                                       Where Used
Percent encoding  %20             RFC 3986                                       URL paths, general use
Form encoding     +               HTML spec (application/x-www-form-urlencoded)  HTML form submissions

Best practice: Use %20 for spaces in URL paths. In query strings, both %20 and + are widely accepted, but %20 is more universally compatible.
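The difference shows up in JavaScript's own APIs: encodeURIComponent() follows percent encoding, while URLSearchParams follows form encoding. A short sketch, assuming a runtime with the standard URLSearchParams global:

```javascript
// Encoding: the same space, two representations.
console.log(encodeURIComponent("hello world"));                    // hello%20world
console.log(new URLSearchParams({ q: "hello world" }).toString()); // q=hello+world

// Decoding: decodeURIComponent() leaves + alone,
// URLSearchParams treats it as a space.
console.log(decodeURIComponent("hello+world"));             // hello+world
console.log(new URLSearchParams("q=hello+world").get("q")); // hello world

// A literal plus sign in form-encoded data has to travel as %2B.
console.log(new URLSearchParams("q=1%2B1").get("q")); // 1+1
```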

encodeURI vs encodeURIComponent

JavaScript provides two URL encoding functions with critically different behavior:

encodeURI()

Encodes a complete URI and preserves the characters that have structural meaning in URLs.

Does NOT encode: A-Z a-z 0-9 ; , / ? : @ & = + $ - _ . ! ~ * ' ( ) #
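A short illustration of that behavior, using a made-up URL (pageUrl is an illustrative name):

```javascript
const pageUrl = "https://example.com/my page/café?q=a b&lang=en#top";

// Spaces and non-ASCII characters are encoded; : / ? & = # are left alone.
console.log(encodeURI(pageUrl));
// https://example.com/my%20page/caf%C3%A9?q=a%20b&lang=en#top
```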

encodeURIComponent()

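Encodes a URI component (e.g., a single query parameter value). Encodes everything except letters, digits, and the marks listed below, which means all URL delimiters such as &, =, ?, and / are encoded. Note that ! ' ( ) * are left unencoded even though RFC 3986 classifies them as reserved.

Does NOT encode: A-Z a-z 0-9 - _ . ! ~ * ' ( )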
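The sketch below shows the function on a value full of delimiters, followed by a stricter variant for cases where ! ' ( ) * also need to be encoded (they are listed as reserved sub-delimiters in the table earlier). The name encodeRfc3986 is a hypothetical helper, not a built-in:

```javascript
// Delimiters inside a value are all encoded.
console.log(encodeURIComponent("a&b=c/d?e")); // a%26b%3Dc%2Fd%3Fe

// These five characters pass through unchanged.
console.log(encodeURIComponent("it's (ok)!*")); // it's%20(ok)!*

// Hypothetical helper: additionally encodes ! ' ( ) *
function encodeRfc3986(value) {
  return encodeURIComponent(value).replace(
    /[!'()*]/g,
    (ch) => "%" + ch.charCodeAt(0).toString(16).toUpperCase()
  );
}

console.log(encodeRfc3986("it's (ok)!*")); // it%27s%20%28ok%29%21%2A
```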

When to Use Which

Scenario                                Function              Example
Encoding a complete URL                 encodeURI()           encodeURI(fullUrl)
Encoding a query parameter value        encodeURIComponent()  ?q=${encodeURIComponent(searchTerm)}
Encoding a path segment                 encodeURIComponent()  /users/${encodeURIComponent(username)}
Encoding a redirect URL as a parameter  encodeURIComponent()  ?redirect=${encodeURIComponent(returnUrl)}

Rule of thumb: If the value might contain &, =, ?, or /, use encodeURIComponent().

Common Mistakes

1. Double Encoding

The most common URL encoding bug. It happens when you encode a string that is already encoded:

Original:      hello world
First encode:  hello%20world
Double encode: hello%2520world  ← WRONG

The % in %20 gets encoded as %25, producing %2520. The server receives literal %20 instead of a space.

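How to prevent: Encode each value exactly once, at the point where the URL is built. Keep values in raw form until then, and if a string may already be encoded, decode it before re-encoding or track its state explicitly.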
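The effect is easy to reproduce. The following sketch also shows one possible way to avoid it, by keeping values raw and encoding them in a single place; buildSearchUrl is a hypothetical helper:

```javascript
const rawTerm = "hello world";
const encodedOnce = encodeURIComponent(rawTerm);      // hello%20world
const encodedTwice = encodeURIComponent(encodedOnce); // hello%2520world

// A single decode no longer restores the original value.
console.log(decodeURIComponent(encodedTwice)); // hello%20world

// Hypothetical helper: accepts raw values only and encodes them once.
function buildSearchUrl(term) {
  return `/search?q=${encodeURIComponent(term)}`;
}

console.log(buildSearchUrl(rawTerm)); // /search?q=hello%20world
```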

2. Not Encoding Query Parameter Values

If searchTerm is "cats & dogs" and it is placed in the URL without encoding, the result is /search?q=cats & dogs, which splits into two parameters: q=cats and dogs (or worse). Wrapping the value in encodeURIComponent() avoids this.
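A minimal sketch of the two variants, assuming the value is interpolated into a template string (wrongUrl and rightUrl are illustrative names, and the base URL is only needed so the relative URLs can be parsed):

```javascript
const searchTerm = "cats & dogs";

// Wrong: the raw value is interpolated, so & acts as a parameter separator.
const wrongUrl = `/search?q=${searchTerm}`;

// Right: the value is encoded before it goes into the query string.
const rightUrl = `/search?q=${encodeURIComponent(searchTerm)}`;

console.log(wrongUrl); // /search?q=cats & dogs
console.log(rightUrl); // /search?q=cats%20%26%20dogs

// Parsing both URLs shows what the receiving side sees.
const wrongParams = new URL(wrongUrl, "https://example.com").searchParams;
const rightParams = new URL(rightUrl, "https://example.com").searchParams;
console.log(JSON.stringify(wrongParams.get("q"))); // "cats "
console.log(JSON.stringify(rightParams.get("q"))); // "cats & dogs"
```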

3. Using encodeURI() for Parameter Values

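encodeURI() is meant for complete URLs, so it leaves &, =, ?, /, and # unencoded. A parameter value that contains any of these characters still breaks the query string; use encodeURIComponent() for values instead.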
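The following sketch compares the two functions on the same parameter value (userInput and the /search path are made up for illustration):

```javascript
const userInput = "Tom & Jerry=friends?";

// Wrong: encodeURI() expects a whole URL, so &, = and ? stay as they are.
const withEncodeUri = `/search?name=${encodeURI(userInput)}`;
console.log(withEncodeUri);
// /search?name=Tom%20&%20Jerry=friends?

// Right: encodeURIComponent() encodes the delimiters inside the value.
const withEncodeUriComponent = `/search?name=${encodeURIComponent(userInput)}`;
console.log(withEncodeUriComponent);
// /search?name=Tom%20%26%20Jerry%3Dfriends%3F
```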

4. Forgetting Non-ASCII Characters

URLs with non-ASCII characters (accented letters, CJK, emoji) must be encoded:

❌ https://example.com/café
✅ https://example.com/caf%C3%A9

Modern browsers display the decoded version in the address bar, but the actual HTTP request uses percent encoding.
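The URL class performs this conversion for you when it parses an address. A brief sketch, assuming a runtime with the standard URL global:

```javascript
const cafeUrl = new URL("https://example.com/café");

// The serialized form is what goes on the wire.
console.log(cafeUrl.href); // https://example.com/caf%C3%A9

// Decoding the path gives back the readable form shown in the address bar.
console.log(decodeURI(cafeUrl.pathname)); // /café

// CJK characters take three UTF-8 bytes each.
console.log(encodeURIComponent("日本語")); // %E6%97%A5%E6%9C%AC%E8%AA%9E
```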

URL Encoding in Programming Languages

Language    Encode                           Decode
JavaScript  encodeURIComponent(str)          decodeURIComponent(str)
Python      urllib.parse.quote(str)          urllib.parse.unquote(str)
Java        URLEncoder.encode(str, "UTF-8")  URLDecoder.decode(str, "UTF-8")
PHP         rawurlencode($str)               rawurldecode($str)
C#          Uri.EscapeDataString(str)        Uri.UnescapeDataString(str)
Go          url.QueryEscape(str)             url.QueryUnescape(str)
Ruby        CGI.escape(str)                  CGI.unescape(str)

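Note: Java's URLEncoder.encode(), Go's url.QueryEscape(), Ruby's CGI.escape(), and PHP's urlencode() use + for spaces (form encoding). For RFC 3986-style %20, use rawurlencode() in PHP and url.PathEscape() in Go. Python's urllib.parse.quote() leaves / unencoded by default; pass safe="" when encoding a single component.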

URL Encoding vs HTML Encoding

These two encoding systems are frequently confused:

Feature   URL Encoding              HTML Encoding
Purpose   Safe characters in URLs   Safe characters in HTML
Format    %XX (hex bytes)           &name; or &#number;
Space     %20                       Literal space (&nbsp; is a non-breaking space, a different character)
&         %26                       &amp;
<         %3C                       &lt;
"         %22                       &quot;
Standard  RFC 3986                  HTML specification
Used for  URL query strings, paths  HTML attributes, content

They serve completely different purposes and are not interchangeable.
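A link inside an HTML page often needs both, applied in order: URL encoding for the parameter value, then HTML encoding for the attribute that holds the finished URL. The sketch below assumes a small hand-written escapeHtml helper (hypothetical, covering only the four characters shown); production code would normally rely on a templating library for this step.

```javascript
// Hypothetical helper: escapes the characters that matter in HTML text and attributes.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

const linkTerm = 'R&D "2026"';

// URL encoding protects the value inside the query string.
const linkHref = `/search?q=${encodeURIComponent(linkTerm)}&lang=en`;

// HTML encoding protects the URL and the label inside the markup.
const anchorHtml = `<a href="${escapeHtml(linkHref)}">${escapeHtml(linkTerm)}</a>`;

console.log(linkHref);
// /search?q=R%26D%20%222026%22&lang=en
console.log(anchorHtml);
// <a href="/search?q=R%26D%20%222026%22&amp;lang=en">R&amp;D &quot;2026&quot;</a>
```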

Frequently Asked Questions

What does %20 mean in a URL?

%20 is the percent-encoded representation of a space character. The % indicates encoding, and 20 is the hexadecimal value of the space character (decimal 32). When a browser or server encounters %20 in a URL, it replaces it with a space.

Is URL encoding case-sensitive?

The hex digits in percent encoding (%3A vs %3a) are case-insensitive per RFC 3986, but the standard recommends uppercase. Most implementations accept both. However, the rest of the URL path may be case-sensitive depending on the server (servers on Linux file systems are typically case-sensitive, while those on Windows usually are not).

How do international domain names work?

International Domain Names (IDN) use a system called Punycode to convert Unicode domain names to ASCII. For example, münchen.de becomes xn--mnchen-3ya.de. The path and query components use standard percent encoding for non-ASCII characters.
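Both conversions can be observed with the URL class, which applies Punycode to the host and percent encoding to the rest. A small sketch, assuming a runtime whose URL implementation supports internationalized hosts (current browsers and Node.js do); the path and query are made up for illustration:

```javascript
const idnUrl = new URL("https://münchen.de/straße?q=grün");

console.log(idnUrl.hostname); // xn--mnchen-3ya.de
console.log(idnUrl.pathname); // /stra%C3%9Fe
console.log(idnUrl.search);   // ?q=gr%C3%BCn
```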


Ready to encode or decode URLs? Try our URL encoder and decoder for instant percent encoding with a character breakdown. For other encoding tools, check out our Base64 encoder and hex to text converter.
