A test of character; ASCII silly question get a silly ANSI
1. A test of character
ASCII silly question get a silly ANSI
15. ASCII: What’s wrong with it?
• Actually called US-ASCII
• [a-zA-Z0-9] + …
• It uses 7 bits not 8 bits
• Defining 128 ‘characters’
• Not defining the other 128
• Just about covers English script
• Minimum level of interoperability
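The 7-bit limit is easy to see in practice. The following is a minimal sketch; the helper name `is_ascii_safe` is made up for illustration and is not part of the talk.

```python
# US-ASCII defines code points 0..127, which all fit in 7 bits.
ascii_chars = [chr(i) for i in range(128)]
assert all(ord(c) < 2 ** 7 for c in ascii_chars)


def is_ascii_safe(text: str) -> bool:
    """Return True if every character in text is inside US-ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        # At least one character has a code point above 127.
        return False


print(is_ascii_safe("Hello, world"))  # plain English text passes
print(is_ascii_safe("café"))          # the accented letter does not
```

Anything outside the 128 defined characters simply has no ASCII representation, which is why ASCII only "just about" covers English.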
16. CJK and font encoding for other languages
• Upper 128: every code page uses them differently
• CJK: completely different, often 2-byte, encodings
• Standards? Often not
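As a rough illustration of why the upper 128 values cause trouble, the sketch below decodes the same byte under several legacy code pages and encodes the same Japanese text under two different 2-byte schemes. The code pages chosen here are only examples picked for this sketch, not ones named in the talk.

```python
# One byte from the "upper 128" range, interpreted by different code pages.
raw = bytes([0xE9])
code_pages = ["latin-1", "cp437", "koi8-r", "cp1253"]
decoded = {name: raw.decode(name) for name in code_pages}

for name, ch in decoded.items():
    print(f"{name:8} 0x{raw.hex()} -> {ch!r} (U+{ord(ch):04X})")

# The same text in two incompatible 2-byte Japanese encodings.
text = "日本語"
sjis = text.encode("shift_jis")
eucjp = text.encode("euc_jp")
print("shift_jis:", len(sjis), "bytes", sjis.hex())
print("euc_jp:   ", len(eucjp), "bytes", eucjp.hex())
```

The bytes alone do not say which table to use, so text decoded with the wrong one turns into a different script entirely.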
26. Encoding
• Multibyte
• 1 byte: space for 256 things
• 2 bytes: space for 65,536 things
• 3 bytes: space for 16,777,216 things
• 4 bytes: space for 4,294,967,296 things
• UTF encodings
• UTF-16 – two-byte units (two or four bytes per code point)
• UTF-32 – four bytes
• Variable size
• UTF-8 – one to four bytes, can encode every code point
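The numbers on this slide can be reproduced in a few lines. This is a small sketch; the helper `encoded_sizes` and the sample characters are chosen here for illustration.

```python
# How many distinct values fit in n bytes.
for n in range(1, 5):
    print(f"{n} byte(s): {256 ** n:,} values")


def encoded_sizes(ch: str) -> dict:
    """Return the encoded length in bytes of one character per UTF encoding."""
    # The "-le" variants are used so that no byte order mark is added.
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


for ch in ["A", "é", "€", "😀"]:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

UTF-32 is always four bytes, UTF-16 is two bytes until a character falls outside the first 65,536 code points, and UTF-8 grows from one byte to four as the code point gets bigger.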
27. UTF-8
• 00000000 – 01111111 (just like ASCII)
• 11000010,10000000 – 11011111,10111111
• U+0080 – U+07FF
• 11100000,10100000,10000000 – 11101111,10111111,10111111
• U+0800 – U+FFFF
• 11110000,10010000,10000000,10000000 – …
• Bigger code points
• Now we can encode things and not break (old stuff) {much}
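The byte layout can be sketched by hand and checked against Python's built-in encoder. The functions `utf8_encode` and `show_bits` are written for this illustration only; real code should just call `str.encode("utf-8")`.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode a single Unicode code point as UTF-8 (illustrative sketch)."""
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("outside the Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogates are not encoded in UTF-8")
    if code_point <= 0x7F:
        # 0xxxxxxx: identical to ASCII
        return bytes([code_point])
    if code_point <= 0x7FF:
        # 110xxxxx 10xxxxxx
        return bytes([0b11000000 | (code_point >> 6),
                      0b10000000 | (code_point & 0b111111)])
    if code_point <= 0xFFFF:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0b11100000 | (code_point >> 12),
                      0b10000000 | ((code_point >> 6) & 0b111111),
                      0b10000000 | (code_point & 0b111111)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0b11110000 | (code_point >> 18),
                  0b10000000 | ((code_point >> 12) & 0b111111),
                  0b10000000 | ((code_point >> 6) & 0b111111),
                  0b10000000 | (code_point & 0b111111)])


def show_bits(data: bytes) -> str:
    """Format bytes as comma-separated binary, as on the slide."""
    return ",".join(f"{b:08b}" for b in data)


# First and last code point of each length class.
for cp in (0x41, 0x7F, 0x80, 0x7FF, 0x800, 0xFFFF, 0x10000):
    assert utf8_encode(cp) == chr(cp).encode("utf-8")
    print(f"U+{cp:04X} -> {show_bits(utf8_encode(cp))}")

# Old ASCII data is already valid UTF-8, byte for byte.
assert b"plain ASCII".decode("utf-8") == "plain ASCII"
```

Every continuation byte starts with the bits 10, and every lead byte announces the sequence length, so a decoder can always tell where a character begins.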
29. The good stuff
• Good stuff
• Room for everything
• CJK fits
• Hieroglyphs fit
• Emojis fit
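A quick way to see that everything fits is to round-trip one character of each kind through UTF-8. The sample characters and the helper `describe` are picked here for illustration.

```python
samples = {"CJK": "中", "Hieroglyph": "𓀀", "Emoji": "😀"}


def describe(ch: str) -> tuple:
    """Return (code point label, UTF-8 length in bytes, UTF-8 bytes as hex)."""
    data = ch.encode("utf-8")
    # Decoding the bytes must give back exactly the same character.
    assert data.decode("utf-8") == ch
    return f"U+{ord(ch):04X}", len(data), data.hex()


for kind, ch in samples.items():
    label, size, hex_bytes = describe(ch)
    print(f"{kind:10} {ch} {label} -> {size} bytes ({hex_bytes})")
```

The CJK character takes three bytes, while the hieroglyph and the emoji sit above U+FFFF and take four.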