URL Encoding Explained: Percent-Encoding, Reserved Chars, UTF-8

Why URLs Need Encoding

URLs reserve certain characters as structural delimiters: / separates path segments, ? starts the query string, & separates query parameters, and # marks a fragment. If your data contains any of these characters unencoded, browsers and servers interpret them as structure rather than content, which breaks the URL.

Example problem — a search query for "black & white":

/* Wrong: ampersand breaks the parameter parsing */
https://example.com/search?q=black & white

/* Correct: ampersand encoded */
https://example.com/search?q=black%20%26%20white
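
To see the breakage concretely, here is a small sketch that parses both forms with the standard URL API (available in browsers and Node.js). The variable names are illustrative only.

```javascript
// Parse the unencoded and the encoded form of the same search URL.
const broken = new URL("https://example.com/search?q=black & white");
const fixed = new URL("https://example.com/search?q=black%20%26%20white");

// The raw & splits the query into two parameters, so q is cut short.
const brokenQuery = broken.searchParams.get("q");   // "black "
const strayKeys = [...broken.searchParams.keys()];  // "q" plus a stray second key

// The encoded form keeps the whole value inside one parameter.
const fixedQuery = fixed.searchParams.get("q");     // "black & white"

console.log({ brokenQuery, strayKeys, fixedQuery });
```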

How Percent-Encoding Works

A character is encoded by taking its UTF-8 bytes and writing each byte as % followed by two hexadecimal digits. The common ASCII cases are a single byte each:

Character	Encoded	Why encoded
space	`%20` or `+` (in queries)	Not allowed in URLs
&	`%26`	Parameter separator
?	`%3F`	Starts query string
#	`%23`	Starts fragment
/	`%2F`	Path separator
=	`%3D`	Key-value delimiter
+	`%2B`	Means space in query context
%	`%25`	Escape character itself
:	`%3A`	Scheme separator
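
The mechanism behind the table can be sketched by hand. This is an illustration, not a replacement for the built-in functions: percentEncode is a hypothetical helper that keeps only the RFC 3986 unreserved characters and escapes every other UTF-8 byte. It assumes a runtime with a global TextEncoder (modern browsers and Node.js).

```javascript
// Hypothetical helper: escape every byte that is not an unreserved character.
function percentEncode(text) {
  const unreserved = /^[A-Za-z0-9\-._~]$/;
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes of the input
  let out = "";
  for (const byte of bytes) {
    const ch = String.fromCharCode(byte);
    if (unreserved.test(ch)) {
      out += ch; // safe in every URL context, keep as-is
    } else {
      // % followed by the byte value as two uppercase hex digits
      out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(percentEncode("black & white")); // "black%20%26%20white"
console.log(percentEncode("100%"));          // "100%25"
```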

Reserved vs Unreserved Characters

Per RFC 3986:

Unreserved (never need encoding): A-Z a-z 0-9 - _ . ~
Reserved (encode when they'd be confused with structure): : / ? # [ ] @ ! $ & ' ( ) * + , ; =
Other (always encode): spaces, non-ASCII, control characters

Unreserved characters are guaranteed safe to pass through any URL context unchanged. Reserved characters are safe only when they're not being interpreted as structure — but since you usually don't control interpretation, encoding them is the safe default.
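
One wrinkle worth knowing: encodeURIComponent leaves ! ' ( ) * unescaped even though RFC 3986 lists them as reserved. If a consumer is strict about those characters, a thin wrapper can escape them too. The name encodeRFC3986 below is an assumption made for this sketch, not a built-in.

```javascript
// Hypothetical wrapper: also escape the reserved characters that
// encodeURIComponent leaves untouched (! ' ( ) *).
function encodeRFC3986(value) {
  return encodeURIComponent(value).replace(
    /[!'()*]/g,
    (ch) => "%" + ch.charCodeAt(0).toString(16).toUpperCase()
  );
}

console.log(encodeURIComponent("it's (here)!*")); // "it's%20(here)!*"
console.log(encodeRFC3986("it's (here)!*"));      // "it%27s%20%28here%29%21%2A"
```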

encodeURI vs encodeURIComponent

JavaScript provides two built-in functions that behave differently:

// encodeURI: encodes spaces and non-ASCII but leaves URL structure intact
encodeURI("https://example.com/path with spaces?a=b&c=d");
// Result: "https://example.com/path%20with%20spaces?a=b&c=d"

// encodeURIComponent: encodes everything except A-Z a-z 0-9 - _ . ! ~ * ' ( ), including URL structure
encodeURIComponent("https://example.com/path with spaces?a=b&c=d");
// Result: "https%3A%2F%2Fexample.com%2Fpath%20with%20spaces%3Fa%3Db%26c%3Dd"

Rule of thumb: use encodeURIComponent for individual query values. Use encodeURI only when encoding a complete URL that might contain spaces, and you want to preserve the URL structure.

// Correct way to build a query URL:
const query = "black & white";
const url = `https://example.com/search?q=${encodeURIComponent(query)}`;
// Result: "https://example.com/search?q=black%20%26%20white"
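
With more than one parameter, the same rule applies to every key and value. The following is a minimal sketch; buildUrl and its arguments are hypothetical names chosen for illustration.

```javascript
// Hypothetical helper: build a query URL from a base and a plain object,
// encoding each key and value separately with encodeURIComponent.
function buildUrl(base, params) {
  const pairs = Object.entries(params).map(
    ([key, value]) =>
      `${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`
  );
  return pairs.length > 0 ? `${base}?${pairs.join("&")}` : base;
}

console.log(buildUrl("https://example.com/search", { q: "black & white", page: 2 }));
// "https://example.com/search?q=black%20%26%20white&page=2"
```

Only the & characters that the helper itself inserts act as separators; any & inside a value arrives as %26.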

UTF-8 and Non-ASCII Characters

Non-ASCII characters such as ñ, 中, 한, 日 take two to four bytes in UTF-8. Each byte gets its own %HH escape:

encodeURIComponent("ñ");      // "%C3%B1"      (2 bytes)
encodeURIComponent("中");      // "%E4%B8%AD"   (3 bytes)
encodeURIComponent("한");      // "%ED%95%9C"   (3 bytes)
encodeURIComponent("😀");      // "%F0%9F%98%80" (4 bytes — emoji)

Modern browsers display non-ASCII characters in the path and query unencoded in the address bar for readability, but the underlying HTTP request still sends the percent-encoded bytes. This is why a URL looks clean in your browser but shows %E4%B8%AD when you copy-paste it. Internationalized Domain Names (IDN) are a separate mechanism: non-ASCII hostnames are converted to Punycode rather than percent-encoded.
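
The difference between the displayed form and the transmitted form can be reproduced with the standard URL API. This is a sketch with an invented example path; the variable names are illustrative.

```javascript
// The URL parser percent-encodes non-ASCII path characters as UTF-8 bytes.
const wire = new URL("https://example.com/wiki/中").href;
// "https://example.com/wiki/%E4%B8%AD"

// decodeURI gives back a readable form similar to what an address bar shows.
const display = decodeURI(wire);
// "https://example.com/wiki/中"

console.log({ wire, display });
```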

Query String: + vs %20 for Spaces

In query strings (after ?), the application/x-www-form-urlencoded convention used by HTML forms treats + as a space. In paths, + is a literal "+". This inconsistency causes real bugs:

// encodeURIComponent always produces %20, never +
encodeURIComponent("hello world");
// "hello%20world"

// Form-style query parsers (URLSearchParams, most server frameworks) decode BOTH + and %20 as space
// search?q=hello+world and search?q=hello%20world both yield q = "hello world"

// But in path segments, + stays as literal +
// /tags/c%2B%2B → "c++"  (encoded)
// /tags/c++     → "c++"  (no special meaning)

It is safer to emit %20 everywhere when generating URLs, and to encode a literal + as %2B. The + shortcut is a legacy HTML form convention that causes ambiguity outside of pure query-string contexts.
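
The mismatch is easy to hit in JavaScript, because decodeURIComponent knows nothing about the + convention while URLSearchParams follows it. The sketch below shows both sides; decodeFormValue is a hypothetical helper for decoding a raw form-encoded value by hand.

```javascript
// decodeURIComponent leaves + alone; URLSearchParams treats it as a space.
const viaComponent = decodeURIComponent("hello+world");           // "hello+world"
const viaParams = new URLSearchParams("q=hello+world").get("q");  // "hello world"

// Hypothetical helper: decode a raw form-encoded value manually.
// Literal plus signs arrive as %2B, so replacing + first is safe.
function decodeFormValue(raw) {
  return decodeURIComponent(raw.replace(/\+/g, "%20"));
}

// URLSearchParams also emits + when serializing.
const formEncoded = new URLSearchParams({ q: "hello world" }).toString(); // "q=hello+world"

console.log(viaComponent, viaParams, decodeFormValue("c%2B%2B+tips"), formEncoded);
```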

Decoding URLs

// JavaScript decoding
decodeURIComponent("hello%20world");        // "hello world"
decodeURIComponent("black%20%26%20white");  // "black & white"
decodeURIComponent("%E4%B8%AD");            // "中"

// Reversibility: encode then decode gives you the original
const original = "Hello, 世界! & more";
const encoded = encodeURIComponent(original);
const decoded = decodeURIComponent(encoded);
decoded === original;  // true
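
One caveat when decoding input you did not generate yourself: decodeURIComponent throws a URIError on malformed escapes, such as a stray % or a truncated UTF-8 sequence. A defensive wrapper might look like the sketch below; safeDecode and its fallback parameter are hypothetical names.

```javascript
// Hypothetical helper: return a fallback instead of throwing on bad input.
function safeDecode(raw, fallback = null) {
  try {
    return decodeURIComponent(raw);
  } catch (err) {
    if (err instanceof URIError) {
      return fallback; // malformed percent-escape
    }
    throw err; // anything else is unexpected, so rethrow
  }
}

console.log(safeDecode("hello%20world")); // "hello world"
console.log(safeDecode("100%"));          // null (stray % with no hex digits)
console.log(safeDecode("%E4%B8"));        // null (truncated UTF-8 sequence)
```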

Common Pitfalls

Double-encoding: accidentally calling encodeURIComponent twice. "hello world" becomes "hello%2520world" (the % got re-encoded), and a single decode then yields "hello%20world" instead of the original. Fix it by encoding exactly once, not by decoding twice: a second decode garbles any value that legitimately contains %.
Using encodeURI for parameter values: it doesn't encode &, so an ampersand in user input breaks the URL structure. Always use encodeURIComponent for values.
Assuming form data uses %20: HTML forms encode with application/x-www-form-urlencoded, which uses + for spaces. Server-side parsers must handle both + and %20.
Encoding already-encoded URLs: if you're passing a full URL as a query parameter (e.g., a redirect URL), encode the whole thing exactly once with encodeURIComponent. Don't encode it before that call as well, or the receiver's single decode won't return the original URL.
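
The first and last pitfalls can be reproduced in a few lines. This is only a sketch: the URLs and variable names are made up for the example.

```javascript
// Double-encoding: the % from the first pass is escaped again as %25.
const once = encodeURIComponent("hello world");  // "hello%20world"
const twice = encodeURIComponent(once);          // "hello%2520world"
const decodedOnce = decodeURIComponent(twice);   // "hello%20world", not the original

// Decoding twice is not a fix: it corrupts values that contain a real %.
const label = "100%25 off";
const overDecoded = decodeURIComponent(
  decodeURIComponent(encodeURIComponent(label))
); // "100% off"

// Passing a full URL as a parameter: encode it exactly once.
const target = "https://example.com/a?x=1&y=2";
const loginUrl = `https://example.com/login?next=${encodeURIComponent(target)}`;
const recovered = new URL(loginUrl).searchParams.get("next"); // same as target

console.log({ twice, decodedOnce, overDecoded, recovered });
```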

Frequently Asked Questions

What's the maximum URL length?: No hard spec limit, but practical limits apply. Internet Explorer capped URLs at 2,083 characters (still quoted as "the limit"); current browsers accept far longer URLs, but servers and proxies often reject request lines beyond a few thousand bytes (around 8,000 is a common default). For anything beyond a few hundred characters of parameter data, use a POST request with a body instead of a GET query string.
Should I encode the full URL or just the parameters?: Encode individual parameter values with encodeURIComponent. Never run encodeURIComponent over the full URL: it would escape the :// after the scheme along with every / and ?, breaking it. Build the URL structure manually with ${encodeURIComponent(value)} interpolation.
Why does my URL look fine in the browser but fail when shared?: Browsers display UTF-8 characters unencoded for readability. When copied, some systems get the displayed version and don't re-encode properly. Always generate URLs with encodeURIComponent — don't rely on the browser's display form being portable.
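Is URL encoding case-sensitive?: The hex digits themselves are not: %3F and %3f both mean ?. RFC 3986 treats the two forms as equivalent but says producers should use uppercase, and systems that compare URLs as raw strings (cache keys, request-signing schemes) may treat differently cased escapes as different URLs. Use uppercase when generating.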
What about URL encoding vs HTML entity encoding?: They're different and not interchangeable. HTML entities (&amp;, &lt;) escape characters in HTML documents. URL encoding (%26, %3C) escapes characters in URLs. If you embed a URL inside HTML, you may need both: URL-encode the parameter values first, then HTML-entity encode the ampersands in the resulting URL string.
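
The two layers from the last answer can be sketched as follows. escapeHtmlAttribute is a hypothetical minimal helper written for this example; in a real project a templating library would normally do this step.

```javascript
// Hypothetical helper: escape the characters that matter inside a
// double-quoted HTML attribute value.
function escapeHtmlAttribute(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Layer 1: URL-encode the parameter value while building the URL.
const href = `https://example.com/search?q=${encodeURIComponent("black & white")}&page=2`;

// Layer 2: HTML-escape the finished URL when placing it in markup.
const anchor = `<a href="${escapeHtmlAttribute(href)}">Results</a>`;

console.log(anchor);
// <a href="https://example.com/search?q=black%20%26%20white&amp;page=2">Results</a>
```

The %26 belongs to the URL layer and survives untouched; only the structural & between parameters becomes &amp; in the markup.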