Many languages or language families not based on the Latin alphabet, such as Greek, Russian, Arabic, or Hebrew, have historically been represented on computers with 8-bit extended ASCII encodings, including the ISO 8859 family of character sets. Written East Asian languages, specifically Chinese, Japanese, and Korean, use far more characters than can be represented in a single 8-bit byte and were first represented on computers with language-specific double-byte encodings. ISO 2022 was developed as a technique to represent characters in multiple character sets within a single character encoding. The ISO 2022 character encodings include escape sequences which indicate the character set for the characters that follow. The escape sequences are registered with ISO and are often three characters long, starting with the ASCII ESCAPE character (hexadecimal 1B, octal 33).
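The role of the escape sequences can be seen by encoding a short Japanese string. The following is a minimal sketch, assuming Python's built-in `iso2022_jp` codec as a representative ISO 2022 encoding; the sample string and the variable names are illustrative only.

```python
ESC = 0x1B  # ASCII ESCAPE character: hexadecimal 1B, octal 33

# Encode three Japanese characters with the ISO-2022-JP codec.
encoded = "日本語".encode("iso2022_jp")

# The result begins with the three-byte sequence ESC $ B, which switches
# to a double-byte Japanese character set, and ends with ESC ( B, which
# switches back to ASCII.
starts_with_switch = encoded.startswith(b"\x1b$B")
ends_with_reset = encoded.endswith(b"\x1b(B")

# Plain ASCII text needs no escape sequence at all.
ascii_only = "abc".encode("iso2022_jp")

print(encoded)
print(starts_with_switch, ends_with_reset, ascii_only)
```

Between the two escape sequences, each character occupies two bytes, all of which fall in the 7-bit ASCII range; only the preceding escape sequence tells a reader how to interpret them.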
These character encodings require data to be processed sequentially in a forward direction, since the correct interpretation of the data depends on the most recently encountered escape sequence. Although the ISO 2022 character sets, particularly ISO-2022-JP, are still in common use, most modern e-mail software is converting to the use of Unicode character encodings such as UTF-8.
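The dependence on the most recently encountered escape sequence can be illustrated as follows. This is a sketch only, again assuming Python's built-in `iso2022_jp` codec; the chunk boundaries are chosen arbitrarily for the example.

```python
import codecs

# ESC $ B, six payload bytes, ESC ( B
data = "日本語".encode("iso2022_jp")

# Cut off both escape sequences, leaving only the six payload bytes.
payload = data[3:-3]

# With no escape sequence seen, the decoder is still in its initial
# ASCII state, so the same bytes are read as six ASCII characters.
without_escape = payload.decode("iso2022_jp")

# An incremental decoder carries the current character set from one
# chunk to the next, so the data must be fed to it in forward order.
decoder = codecs.getincrementaldecoder("iso2022_jp")()
first = decoder.decode(data[:5])              # escape sequence + first character
rest = decoder.decode(data[5:], final=True)   # remaining characters + reset

print(repr(without_escape))
print(first + rest)
```

Because the state is established by bytes that may lie far earlier in the stream, a program cannot start reading at an arbitrary offset and interpret what it finds. This is one practical difference from UTF-8, where the start of each character can be identified from the bytes themselves.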