HTML Encoder / Decoder

Encode special HTML characters to entities (& < >) or decode entities back to characters. Named and numeric modes.

Encode HTML special characters to entities (&lt;, &amp;, &quot;) or decode an entity-encoded string back to its original form. Useful for embedding raw HTML safely in a documentation page, debugging output that has been double-encoded somewhere along the way, and decoding snippets that arrive entity-mangled (the tool's named entities plus any decimal or hex numeric character reference). Encoding and decoding both run in your browser, so any HTML containing personal data or secrets stays local during the round-trip.


Common HTML Entities

Character  Entity
&          &amp;
<          &lt;
>          &gt;
"          &quot;
'          &#39;
©          &copy;
®          &reg;
™          &trade;
€          &euro;
£          &pound;
¥          &yen;
°          &deg;
Runs right inside your browser tab. No uploads. Your files stay private.

What Is HTML Encoding?

HTML encoding (also called HTML escaping) replaces characters that have special meaning in HTML - the angle brackets that mark up tags, the ampersand that begins entities, and the quote characters that delimit attribute values - with named or numeric entity references. The browser then displays them as literal characters instead of treating them as syntax.
This encoder uses a small in-house mapping (ENTITY_MAP in the tool's source) for the most common named entities. Named mode escapes the five mandatory characters plus that curated symbol set; numeric mode walks the input by code point and escapes the five mandatory characters plus every non-ASCII or control character as a decimal reference. The five mandatory escapes for a safe HTML encoder are &amp; (in a chain of sequential replacements it must come first - escaping it last would double-escape everything else), &lt;, &gt;, &quot;, and &#39; for the apostrophe. Without &#39;, single-quoted attribute values can be broken out of by an injected single quote - a real XSS vector.
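The two modes described above can be sketched roughly as follows. This is an illustrative approximation, not the tool's actual source: the MANDATORY and SYMBOLS tables and the encodeHtml name are assumptions, and SYMBOLS holds only a subset of the curated set.

```javascript
// Hypothetical tables; the tool's real ENTITY_MAP may be shaped differently.
const MANDATORY = new Map([
  ["&", "&amp;"], ["<", "&lt;"], [">", "&gt;"], ['"', "&quot;"], ["'", "&#39;"],
]);
const SYMBOLS = new Map([
  ["©", "&copy;"], ["®", "&reg;"], ["™", "&trade;"], ["€", "&euro;"],
  ["£", "&pound;"], ["¥", "&yen;"], ["°", "&deg;"],
]);

// mode is "named" or "numeric" (assumed parameter name).
function encodeHtml(input, mode = "named") {
  let out = "";
  // for...of iterates by code point, so astral characters stay in one piece.
  for (const ch of input) {
    const cp = ch.codePointAt(0);
    if (MANDATORY.has(ch)) {
      out += MANDATORY.get(ch);            // always escaped, in both modes
    } else if (mode === "named") {
      out += SYMBOLS.get(ch) ?? ch;        // curated symbols only, rest untouched
    } else if (cp < 0x20 || cp > 0x7e) {
      out += `&#${cp};`;                   // control or non-ASCII: decimal reference
    } else {
      out += ch;
    }
  }
  return out;
}

console.log(encodeHtml(`Tom & "Jerry" ©`));   // Tom &amp; &quot;Jerry&quot; &copy;
console.log(encodeHtml("café", "numeric"));   // caf&#233;
```

Because each character is looked at once and the output is never re-scanned, the ampersand ordering problem does not arise in this single-pass shape.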
Two output modes are exposed. Named entities like &copy; are human-readable; the HTML5 spec defines more than 2,000 of them, though this tool emits only the curated set listed above. Numeric entities like &#169; (decimal) or &#xA9; (hex) work for any Unicode code point and are the only safe choice for characters outside the named set; numeric mode emits the decimal form here. The decoder understands all three styles - this tool's named entities plus any decimal or hex numeric reference - so anything you encode with this tool round-trips losslessly; named entities outside the curated set (pasted from elsewhere) are left untouched rather than decoded.
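A decoder with the behavior described above might look something like this. It is a sketch under assumptions: the NAMED table is a hypothetical subset of the tool's entity list, and decodeHtml is an illustrative name.

```javascript
// Hypothetical subset of the named entities the tool recognizes.
const NAMED = new Map([
  ["amp", "&"], ["lt", "<"], ["gt", ">"], ["quot", '"'],
  ["copy", "©"], ["reg", "®"], ["trade", "™"], ["euro", "€"],
  ["pound", "£"], ["yen", "¥"], ["deg", "°"], ["nbsp", "\u00A0"],
]);

function decodeHtml(input) {
  // One regex covers hex (&#x..;), decimal (&#..;) and named (&name;) references.
  const pattern = /&(#[xX][0-9a-fA-F]+|#[0-9]+|[a-zA-Z][a-zA-Z0-9]*);/g;
  return input.replace(pattern, (match, body) => {
    if (body[0] === "#") {
      const isHex = body[1] === "x" || body[1] === "X";
      const cp = isHex ? parseInt(body.slice(2), 16) : parseInt(body.slice(1), 10);
      // Out-of-range or surrogate values are left as written instead of throwing.
      if (cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff)) return match;
      return String.fromCodePoint(cp);
    }
    // Unknown names (outside the table) pass through untouched.
    return NAMED.has(body) ? NAMED.get(body) : match;
  });
}

console.log(decodeHtml("&copy; &#169; &#xA9;")); // © © ©
console.log(decodeHtml("&hearts;"));             // &hearts; (not in the table)
```

Note that replace() does not re-scan its own output, so a double-encoded input such as &amp;lt; decodes one level per call.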
Where this matters most: anywhere HTML is assembled from strings, on the server or in the browser. Frameworks like React (auto-escapes children), Vue (auto-escapes interpolations), and Django (auto-escapes via {{ }} unless you mark a string as |safe) handle this for you. Old-school string-concatenation rendering (PHP echo, raw template literals into innerHTML, dangerouslySetInnerHTML in React) does not - and that's where most XSS vulnerabilities live. Run any user-controlled string through HTML encoding before it touches innerHTML or its equivalent.
HTML encoding is context-specific. Inside an HTML element body, encoding the five characters above is sufficient. Inside an HTML attribute, you also need to consider the quoting style: an unquoted attribute value can be broken by a space alone, so always quote. Inside a <script> block, HTML encoding does nothing useful - you need JavaScript escaping. Inside a URL attribute (href, src), you need URL encoding for the untrusted part instead. Inside a CSS context (<style>, style=), you need CSS escaping. Treating HTML encoding as a universal "sanitizer" is a common security mistake - it isn't, it's only one of four context-specific encodings.
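To make the context distinction concrete, here is a rough sketch of three of the four encodings applied to the same kind of untrusted value. The helper names (forHtml, forUrlParam, forScriptString) are hypothetical, and CSS is left out because CSS.escape is a browser-only API; treat this as an illustration rather than a complete defense.

```javascript
const HTML_ESCAPES = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };

// HTML element body or quoted attribute value.
function forHtml(s) {
  return s.replace(/[&<>"']/g, (c) => HTML_ESCAPES[c]);
}

// One query-string component. The full URL still needs a scheme check elsewhere.
function forUrlParam(s) {
  return encodeURIComponent(s);
}

// A JavaScript string literal placed inside a <script> block.
// JSON.stringify handles quotes and backslashes; "<" is escaped as well
// so the value cannot contain a literal "</script>".
function forScriptString(s) {
  return JSON.stringify(s).replace(/</g, "\\u003c");
}

const untrusted = `"/><script>alert(1)</script>`;
console.log(`<p>${forHtml(untrusted)}</p>`);
// The URL component is also HTML-escaped because it lands in an attribute.
console.log(`<a href="/search?q=${forHtml(forUrlParam(untrusted))}">search</a>`);
console.log(`<script>const q = ${forScriptString(untrusted)};</script>`);
```

Each helper is only correct for its own context; swapping them around (for example, HTML-escaping a value that lands inside a script block) leaves the value exploitable or corrupts it.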
The decoder also handles a subtler edge case correctly: numeric references like &#x1F600; can encode characters outside the BMP (the grinning face emoji is at U+1F600). The decoder resolves each reference with String.fromCodePoint rather than the 16-bit String.fromCharCode, which would keep only the low 16 bits of the value and produce the wrong character. Code points above 0xFFFF are therefore reconstructed as a single character, so emoji and astral-plane scripts round-trip cleanly.
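The difference between the two built-ins is easy to see in isolation. The snippet below only exercises standard String methods; the variable names are illustrative.

```javascript
// Code point parsed from the reference &#x1F600;
const cp = parseInt("1F600", 16);

// fromCodePoint accepts the full Unicode range and emits a surrogate pair.
const viaCodePoint = String.fromCodePoint(cp);

// fromCharCode converts its argument to an unsigned 16-bit value,
// so 0x1F600 becomes 0xF600, an unrelated private-use character.
const viaCharCode = String.fromCharCode(cp);

console.log(viaCodePoint.length);                      // 2 (UTF-16 code units)
console.log([...viaCodePoint].length);                 // 1 (code point)
console.log(viaCharCode.codePointAt(0).toString(16));  // f600
```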
What this tool deliberately does not do: it does not parse or sanitize HTML. If you paste <script>evil()</script>, the tool encodes the angle brackets but does not reach in and remove the script element. Encoding makes it safe to display as text; if you want to remove dangerous tags entirely while keeping safe ones (a paste-from-Word workflow, say), use a real HTML sanitizer like DOMPurify, which runs in-browser and handles attribute filtering, URL scheme allowlists, and namespace coercion.

Common Use Cases

01

XSS-safe output

Encode user comments before inserting them into a server-rendered HTML page when your framework doesn't auto-escape.

02

Displaying code samples

Encode an HTML snippet so it renders as text inside a documentation page instead of being interpreted as markup.

03

Email template safety

Encode dynamic merge-fields in HTML email templates so a user's name with an angle bracket doesn't break the layout.

04

Decoding API responses

Reverse double-encoded HTML returned from legacy APIs that wrapped already-escaped output in another encoding pass.

Frequently Asked Questions

What is the difference between named and numeric entities?
Named entities (&copy;, &mdash;) use a memorable mnemonic from the HTML spec - there are more than 2,000 in HTML5. Numeric entities (&#169; decimal, &#xA9; hex) work for any Unicode code point and don't depend on the browser knowing the name. Numeric is universal; named is more readable.

Does HTML encoding prevent XSS?
Only HTML-context XSS. Encoding the five characters & < > " ' makes user content safe inside an element body and properly-quoted attributes. JavaScript contexts need JS string escaping, URL attributes need URL encoding, CSS contexts need CSS escaping. Use a context-aware framework or a library like DOMPurify for defense in depth.

What is &nbsp;?
&nbsp; is the non-breaking space (U+00A0). Browsers treat it as a space character but won't wrap a line at it and won't collapse it with adjacent whitespace, which is why it's used to keep numbers and units together (10&nbsp;kg) or as a layout shim in tables.

Why does the ampersand have to be escaped first?
Because every other HTML entity starts with &. If you escape < into &lt; first and then escape & into &amp;, your < becomes &amp;lt; - double-encoded. With sequential replacements, always escape the ampersand first, then the other characters. The encoder here avoids the problem altogether by working in a single pass - each character is examined once and converted in place, so the & of an entity it has just emitted is never re-scanned.

Which characters does each mode encode?
Named mode encodes the five HTML-mandatory characters (& < > " ') plus a curated set of common typographic and currency symbols (© ® ™ € £ ¥ ° ± × ÷ — – non-breaking space); everything else is left as-is. Numeric mode encodes those five mandatory characters plus every non-ASCII or control character as a decimal reference, so accented letters, CJK, and emoji are all escaped.

Do emoji and other characters above U+FFFF round-trip?
Yes, across the full Unicode range. Numeric mode walks the input with codePointAt, so characters above U+FFFF (emoji, ancient scripts) are emitted as a single decimal reference rather than split surrogate halves, and the decoder rebuilds them with String.fromCodePoint. Encode any text to numeric and decode it back and you get the original string, emoji included.

Should I use this tool to escape content for my application?
For one-off encoding while writing template strings or debugging, yes. For runtime user-content rendering, prefer your framework's built-in escaping (React's {} interpolation, Vue's {{ }}, Django's autoescape) - those are battle-tested across many context combinations and won't miss edge cases.

What is the difference between encoding and sanitizing?
Encoding turns dangerous characters into safe text representations - the page displays <script> as literal text. Sanitizing parses the input as HTML, removes dangerous elements and attributes, and re-serializes only the safe parts. Use encoding when you want the input shown as text; use a sanitizer like DOMPurify when you want to allow some HTML (bold, links) but block others (script, iframe).

Does the encoder handle CDATA sections?
No - the encoder operates on character-level escapes only. CDATA is a parser-level concept used in XML and XHTML to disable entity interpretation inside a block; if you're generating XHTML you may need to wrap content in <![CDATA[...]]> instead of escaping. Modern HTML5 parsing doesn't use CDATA outside SVG and MathML.

Is my input sent to a server?
No. The encoder and decoder are pure string operations running in JavaScript inside your tab. There is no network call during encoding or decoding, so user content stays in your browser.
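The ampersand-ordering point from the FAQ can be reproduced in a few lines. The three functions below are illustrative only (they cover just & < > to keep the comparison short) and are not taken from the tool's source.

```javascript
// Sequential replacement, ampersand first: correct.
function escapeAmpFirst(s) {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Sequential replacement, ampersand last: the & of each freshly written
// entity gets escaped again, so the output is double-encoded.
function escapeAmpLast(s) {
  return s.replace(/</g, "&lt;").replace(/>/g, "&gt;").replace(/&/g, "&amp;");
}

// Single pass: every input character is visited once, output is never re-scanned.
function escapeSinglePass(s) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;" };
  return s.replace(/[&<>]/g, (c) => map[c]);
}

console.log(escapeAmpFirst("a < b & c"));   // a &lt; b &amp; c
console.log(escapeAmpLast("a < b & c"));    // a &amp;lt; b &amp; c
console.log(escapeSinglePass("a < b & c")); // a &lt; b &amp; c
```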
Maintained by the WebToolVerse team