Very interesting question, looking forward to going through the responses
anonymous
2016-03-16 02:18:33 UTC
What you are finding are extensions to the original 7-bit ASCII code. It is called 7-bit because there are only 128 characters in the set (codes 0 to 127). Codes from 128 upward can vary depending on who made the character set, the software in use, or a number of other factors. Most of the early internet worked on 7-bit ASCII, and 8-bit data was treated as binary. You can still do an FTP transfer in either text (ASCII) mode or binary mode. Email is also typically limited to 7 bits, and the ways around this limitation are generally hidden from most people. Part of the reason may be that the "char" type in C is signed on many compilers, so the 8th bit acts as a sign bit. The reason it is all stored in 8 bits is that most computers use 8-bit bytes these days. With plain ASCII you simply ignore the 8th bit, or it may be used as a sign bit or even a parity bit in some instances. Shadow Wolf
Jallan
2012-10-04 17:17:05 UTC
Strict ASCII is a character set in which each character is 7 bits. Accordingly it contains 128 characters. See http://www.ecma-international.org/publications/standards/Ecma-006.htm for one standard defining it. Normally ASCII occurs in an 8-bit environment, and the 8th bit allows an additional 128 characters. There are a number of different 8-bit character sets that are sometimes called extended ASCII character sets. There are also various encodings that slightly vary the characters in the strict ASCII set, but these are now very seldom used.
Most computer systems today have one of these extended ASCII character sets as the base of one of their character sets.
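To make the point about competing extended sets concrete, here is a minimal Python sketch. It decodes the same byte, 0xE9 (8th bit set), under four 8-bit code pages that ship with Python's standard codecs. The choice of byte and of code pages is mine for illustration and is not from the answer above.

```python
# One byte with the 8th bit set, interpreted under several 8-bit code pages.
raw = bytes([0xE9])
CODE_PAGES = ("latin-1", "cp1252", "cp437", "mac_roman")

# Map each code page name to the character it assigns to 0xE9.
decoded = {name: raw.decode(name) for name in CODE_PAGES}
for name, ch in decoded.items():
    print(f"{name:10} 0xE9 -> {ch!r} (U+{ord(ch):04X})")

# Strict 7-bit ASCII has no meaning for this byte at all.
try:
    raw.decode("ascii")
    strict_ascii_ok = True
except UnicodeDecodeError:
    strict_ascii_ok = False

# Bytes below 0x80 are plain ASCII and decode identically everywhere.
plain = b"Hello"
same_everywhere = all(plain.decode(name) == "Hello" for name in CODE_PAGES)
```

The code pages agree on everything below 128 and disagree above it, which is why text with accented letters can turn into odd symbols when it is read with the wrong character set.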
UTF-8 is one of the official encodings of the Unicode character set, along with UTF-16 and UTF-32. All three encodings equally cover every character in Unicode. UTF-8 is the normal encoding used on the web. It is also the basic encoding used on current Macintosh and Linux machines. Windows uses UTF-16. The first 128 characters of Unicode are identical to ASCII, and accordingly the first 128 characters in UTF-8 are not distinguishable from ASCII in an 8-bit environment.
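A short Python sketch of that compatibility, using sample strings I picked for illustration: ASCII text produces the same bytes in UTF-8, while characters outside ASCII take two or more bytes, each with the 8th bit set.

```python
# ASCII text encodes to exactly the same bytes in ASCII and in UTF-8.
text = "Hello"
ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")
print(ascii_bytes == utf8_bytes)

# UTF-16 and UTF-32 cover the same characters but use wider code units.
utf16_len = len("A".encode("utf-16-le"))  # 2 bytes
utf32_len = len("A".encode("utf-32-le"))  # 4 bytes

# Characters beyond ASCII become multi-byte sequences in UTF-8.
e_acute = "\u00e9".encode("utf-8")  # LATIN SMALL LETTER E WITH ACUTE
euro = "\u20ac".encode("utf-8")     # EURO SIGN
print(e_acute.hex(), euro.hex())

# Every byte of a multi-byte sequence has the 8th bit set, so such bytes
# can never be mistaken for a 7-bit ASCII character.
all_high = all(b >= 0x80 for b in e_acute + euro)
```

This is the property that lets software written for plain ASCII pass UTF-8 text through unchanged, as long as it does not strip or reinterpret the 8th bit.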
For the characters in ASCII and the first 128 characters of UTF-8 see http://www.unicode.org/charts/PDF/U0000.pdf . For a complete list of Unicode characters see http://unicode.org/charts/ and click on the various charts.
Look up ASCII and UTF-8 at http://www.unicode.org/glossary/ for more information. See also http://en.wikipedia.org/wiki/ASCII and http://en.wikipedia.org/wiki/UTF-8 .
This content was originally posted on Y! Answers, a Q&A website that shut down in 2021.