Encodings

The HTTP Content-Type header specifies the character encoding of the document being sent. Knowing the encoding tells us how we need to trick the encoder. If no charset is specified, HTTP/1.1 as originally defined (RFC 2616) defaults text types to ISO-8859-1 (Latin-1); modern browsers may instead sniff the content or fall back to a locale default such as windows-1252.

Example header:
Content-Type: text/html; charset=utf-8

Declaring the encoding from the server side or in the document itself:
- PHP: header('Content-type: text/html; charset=utf-8');
- ASP.NET: <%Response.charset="utf-8"%>
- JSP: <%@ page contentType="text/html; charset=UTF-8" %>
- HTML: <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
- HTML5: <meta charset="utf-8">
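
To check which encoding a response declares, the charset parameter can be read out of the Content-Type value. The following is only a minimal sketch: the function name charsetFromContentType is hypothetical, and the ISO-8859-1 fallback simply mirrors the default described above, which real browsers do not necessarily follow.

```javascript
// Hypothetical helper: pull the charset parameter out of a Content-Type value.
// The fallback is an assumption taken from the default described in the text.
function charsetFromContentType(headerValue, fallback = "iso-8859-1") {
  if (typeof headerValue !== "string") return fallback;
  // Parameters follow the media type, separated by semicolons.
  const params = headerValue.split(";").slice(1);
  for (const param of params) {
    const eq = param.indexOf("=");
    if (eq === -1) continue;
    const name = param.slice(0, eq).trim().toLowerCase();
    if (name !== "charset") continue;
    // Values may be quoted; charset names are case-insensitive.
    const value = param.slice(eq + 1).trim().replace(/^"(.*)"$/, "$1");
    if (value) return value.toLowerCase();
  }
  return fallback;
}

console.log(charsetFromContentType("text/html; charset=utf-8")); // utf-8
console.log(charsetFromContentType("text/html"));                // iso-8859-1
```

The result is lowercased so that UTF-8 and utf-8 compare equal when deciding how to craft a payload.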

Base conversions

PHP:
<?=base_convert("OHPE",36,10);?> //base 36 to decimal (10); swap 36 and 10 to encode
<?=base64_encode('encode this string')?> //encode
<?=base64_decode('ZW5jb2RlIHRoaXMgc3RyaW5n')?> //decode

JS:
(1142690).toString(36) //encode, decimal to base 36
1142690..toString(36) //alternative syntax
parseInt("ohpe",36) //decode, base 36 to decimal

JS base64 (window):
window.btoa('encode this string'); //encode
window.atob('ZW5jb2RlIHRoaXMgc3RyaW5n'); //decode
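
The JS conversions above can be checked as a round trip. One caveat worth sketching: btoa and atob only accept characters up to U+00FF, so text outside Latin-1 has to go through UTF-8 bytes first. The helper names base64EncodeUtf8 and base64DecodeUtf8 are hypothetical, and the sketch assumes an environment that provides btoa, atob, TextEncoder and TextDecoder as globals (modern browsers, recent Node.js).

```javascript
// Base 36 <-> decimal, using the same value as the examples above.
const encoded36 = (1142690).toString(36); // "ohpe"
const decoded36 = parseInt("ohpe", 36);   // 1142690

// Hypothetical helper: encode text to Base64 via its UTF-8 bytes,
// because btoa throws on characters above U+00FF.
function base64EncodeUtf8(text) {
  const bytes = new TextEncoder().encode(text);
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary);
}

// Hypothetical helper: reverse of base64EncodeUtf8.
function base64DecodeUtf8(b64) {
  const binary = atob(b64);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

console.log(encoded36, decoded36);
console.log(base64EncodeUtf8("encode this string")); // ZW5jb2RlIHRoaXMgc3RyaW5n
console.log(base64DecodeUtf8(base64EncodeUtf8("日本語")) === "日本語"); // true
```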

Extending an encoding

To include characters that are outside the document's character set, or to render a character such as < as literal text instead of markup, we can use character references:

HTML5:
&#D; //replace D with the character's Unicode code point in decimal
&#xH; //replace H with the character's Unicode code point in hexadecimal

HTML (the same syntax written out as code points):
U+0026 U+0023 D U+003B
U+0026 U+0023 U+0078 H U+003B (U+0058, an uppercase X, is also accepted)

Some common characters have named references that need no hex/dec number:
&lt; represents the < sign.
&gt; represents the > sign.
&amp; represents the & sign.
&quot; represents the " mark.

Reference list of named character references: https://html.spec.whatwg.org/#named-character-references
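
As an illustration of the two numeric forms and the four named references, here is a minimal sketch in JavaScript. The function names toNumericRefs and escapeHtml are hypothetical and not part of any library.

```javascript
// Hypothetical helper: rewrite every character as a numeric character reference.
function toNumericRefs(text, useHex = false) {
  let out = "";
  // for...of iterates by code point, so characters above U+FFFF stay whole.
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    out += useHex ? `&#x${cp.toString(16)};` : `&#${cp};`;
  }
  return out;
}

// Hypothetical helper: replace only the four characters that have
// the named references listed in this section.
function escapeHtml(text) {
  const named = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;" };
  return text.replace(/[&<>"]/g, (ch) => named[ch]);
}

console.log(toNumericRefs("<"));         // &#60;
console.log(toNumericRefs("<", true));   // &#x3c;
console.log(escapeHtml('<a href="x">')); // &lt;a href=&quot;x&quot;&gt;
```

The numeric form works for any character, while the named form only covers characters that appear in the reference list.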
