ASCII

From Citizendium
Revision as of 17:16, 12 June 2011 by imported>Pat Palmer

ASCII stands for American Standard Code for Information Interchange and is a character-encoding scheme used internally by computers; it was first standardized in 1963. It encodes 128 characters as 7-bit values, which leaves the most significant bit of an 8-bit byte unused. Values below decimal 32, along with the value 127 (DEL), are control characters, originally used for purposes such as controlling the print heads of devices or marking line ends in text files. These characters are usually not printable or visible, but they can have important consequences, depending on how particular programs handle them.
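The 7-bit range and the control-character boundary described above can be illustrated with a short Python sketch. The function names (is_ascii, is_control, describe) are made up for this illustration and do not come from any standard; the sketch also treats the value 127 (DEL) as a control character, in addition to the values below 32.

```python
def is_ascii(code: int) -> bool:
    """Return True if the value fits in 7 bits (0 to 127)."""
    return 0 <= code <= 0x7F


def is_control(code: int) -> bool:
    """Return True for values below 32, and for 127 (DEL)."""
    return 0 <= code < 32 or code == 127


def describe(text: str) -> list:
    """List (decimal value, 7-bit binary string, kind) for each character."""
    rows = []
    for ch in text:
        code = ord(ch)
        if not is_ascii(code):
            raise ValueError(f"{ch!r} is outside the 7-bit ASCII range")
        kind = "control" if is_control(code) else "printable"
        rows.append((code, format(code, "07b"), kind))
    return rows


if __name__ == "__main__":
    # "H" and "i" are printable; the newline is a control character.
    for row in describe("Hi\n"):
        print(row)
```

Every value is printed as exactly seven binary digits, which is why one bit of an 8-bit byte is left over.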

Within years of its adoption, it was found that 128 characters were not enough, and a family of 8-bit extensions, collectively called Extended ASCII, was devised that used all 8 bits and could represent 256 characters. Extended ASCII added punctuation marks and common non-English characters. But within a very few years, it became apparent that 256 characters were also not enough.

Thus, additional character encoding standards were devised, including ISO-8???, which was a remapping of the upper 128 characters of Extended ASCII to include more European language characters. Eventually, all these encodings were largely superseded by Unicode and its three encoding forms (UTF-8, UTF-16 and UTF-32), which use between 8 and 32 bits per character as needed to cover a code space of more than a million code points.
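The effect of remapping the upper 128 values can be seen by decoding the same byte under two different 8-bit encodings. The sketch below assumes Python's built-in "latin-1" and "cp437" codecs purely as illustrations; the text above does not say which encodings it has in mind, and the helper name decode_all is hypothetical.

```python
# Two 8-bit encodings chosen for illustration only (an assumption,
# not something named in the article).
CODE_PAGES = ("latin-1", "cp437")


def decode_all(data: bytes) -> dict:
    """Decode the same bytes under each encoding in CODE_PAGES."""
    return {name: data.decode(name) for name in CODE_PAGES}


if __name__ == "__main__":
    # A value in the lower 128 decodes the same way in both encodings.
    print(decode_all(b"A"))
    # A value in the upper 128 decodes to a different character in each.
    print(decode_all(b"\xe9"))
```

The byte 0x41 is "A" in both cases, while the byte 0xE9 becomes an accented Latin letter in one encoding and a Greek letter in the other, so a reader must know which encoding a file was written in.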

During the first three decades after electronic computers were invented, a variety of other encodings competed with ASCII for dominance, notably IBM's EBCDIC, but in practice ASCII became so entrenched that no one could really afford to abandon its conventions altogether. Thus, each successive character encoding standard attempted to preserve the original 128 values as mapped by ASCII, so that ASCII-encoded data would continue to work whenever possible. Many files today, although saved with newer (perhaps wider) character encodings, can still be read with at least partial success by legacy programs that understand only ASCII.
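As a rough illustration of this backward compatibility, the Python sketch below uses UTF-8 as the newer encoding and a hypothetical function, legacy_ascii_read, as a stand-in for a legacy program that understands only 7-bit values. Real legacy programs differ in how they treat bytes they do not recognize; replacing them with "?" is an assumption made for the example.

```python
def legacy_ascii_read(data: bytes) -> str:
    """Mimic an ASCII-only reader: keep 7-bit values, show "?" for the rest."""
    return "".join(chr(b) if b < 0x80 else "?" for b in data)


if __name__ == "__main__":
    # Pure ASCII text produces identical bytes in ASCII and in UTF-8.
    print("hello".encode("utf-8") == "hello".encode("ascii"))

    # Text with one non-ASCII character is still partly readable:
    # the accented letter occupies two bytes in UTF-8, both above 127.
    print(legacy_ascii_read("caf\u00e9".encode("utf-8")))
```

The first line printed is True, and the second is "caf??": the ASCII portion of the text survives intact, while the character outside ASCII is lost.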