Tech Info Pages - Character Sets

This page summarises what at face value seems a remarkably simple concept: character representation. It turns out to be more like a nightmare. The column marked relationship tries to define the relationships between the various standards.

Name | Standard | Aliases | Description and relationship
ASCII | ANSI X3.4-1986, ISO 646, ITU-T T.50 | US-ASCII, IA5, IRA5, ISO 646 | ASCII is encoded as an 8 bit field but only uses the 7 bits 00 to 7F (0 to 127 decimal). Almost all other character codes contain ASCII as a base. Various national definitions exist, which typically differ in only a few printable positions. ASCII is the same as ISO 646 and as IA5, International Alphabet No. 5, now more properly called International Reference Alphabet No. 5 (IRA5) and defined in ITU-T T.50. It also forms the first 128 values in ISO 8859-1 (Latin-1), Unicode and ISO 10646 (UCS).
IA5 | ITU-T T.50 | IRA5, ASCII, ISO 646 | International Alphabet No. 5 (ISO 646), now renamed International Reference Alphabet No. 5 (IRA5).
IRA5 | ITU-T T.50 | IA5, ISO 646, ASCII | International Reference Alphabet No. 5 (IRA5), formerly International Alphabet No. 5 (IA5), is the ITU equivalent of US-ASCII and ISO 646. IRA5 is encoded as an 8 bit field but only uses the 7 bits 00 to 7F (0 to 127 decimal). Almost all other character codes contain IRA5 as a base. IRA5 is the same as ISO 646 and ASCII. It also forms the first 128 values in ISO 8859-1 (Latin-1), Unicode and ISO 10646 (UCS).
ISO 646 | ISO 646 | IA5, IRA5, ASCII | ISO 646 is encoded as an 8 bit field but only uses the 7 bits 00 to 7F (0 to 127 decimal). Almost all other character codes contain ISO 646 as a base. ISO 646 is the same as IRA5 (IA5) and ASCII. It also forms the first 128 values in ISO 8859-1 (Latin-1), Unicode and ISO 10646 (UCS).
ISO 8859-1 | ISO 8859-1 | Latin-1 | ISO 8859-1 is part of a large family (ISO 8859-1 to 8859-16). It is encoded as an 8 bit field and uses all 8 bits, 00 to FF (0 to 255 decimal). Its first 128 values are the same as those of IRA5, ISO 646, US-ASCII and ISO 8859-15 (Latin-9). All 256 values are the same as the first 256 values of Unicode and ISO 10646 (UCS).
ISO 8859-15 | ISO 8859-15 | Latin-9 | ISO 8859-15 is part of a large family (ISO 8859-1 to 8859-16). It is encoded as an 8 bit field and uses all 8 bits, 00 to FF (0 to 255 decimal). It differs from 8859-1 by 8 changes, including the euro symbol. Its first 128 values are the same as those of IRA5, ISO 646, US-ASCII, ISO 8859-1 (Latin-1), Unicode and ISO 10646 (UCS).
ISO 10646 | ISO 10646 | UCS | ISO 10646 (Universal Character Set) is designed to be the replacement for all previous character sets by providing a single family of standards for the encoding of all possible characters and symbols in all written languages. It has two implementations: UCS-2 (a 16 bit encoding) and UCS-4 (a 32 bit encoding). The first 128 values in ISO 10646 are the same as ASCII, IA5, IRA5, ISO 646, 8859-1 and 8859-15. Unicode from version 1.1 is the same as ISO 10646.
Unicode | Unicode Consortium | - | Unicode (currently version 3.0). From version 1.1 it is fully compatible with ISO 10646.
CP1252 | - | Windows-1252 | Microsoft's version of ISO 8859-1. It is a vendor code page and is not defined by an RFC (RFC 2781 describes UTF-16). There are 27 differences from 8859-1 (it includes the euro), all in the range x80 - x9F. The first 128 values are the same as those of IRA5, ISO 646, US-ASCII, ISO 8859-1 (Latin-1) and -15 (Latin-9), Unicode and ISO 10646 (UCS).
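
The relationships in the rows above can be checked mechanically. What follows is a minimal sketch, assuming the codec tables shipped with Python follow the standards named here; the helper names decode_byte and differences are made up for illustration and are not part of any standard or library.

```python
# Compare single-byte character sets using Python's built-in codecs.
# decode_byte and differences are hypothetical helper names.

def decode_byte(value, codec):
    """Return the character a byte maps to, or None if the codec leaves it undefined."""
    try:
        return bytes([value]).decode(codec)
    except UnicodeDecodeError:
        return None


def differences(codec_a, codec_b):
    """Map each byte value that both codecs define differently to its two characters."""
    result = {}
    for value in range(256):
        a = decode_byte(value, codec_a)
        b = decode_byte(value, codec_b)
        if a is not None and b is not None and a != b:
            result[value] = (a, b)
    return result


# The 7 bit range 00 to 7F should decode identically in every set listed above.
ascii_bytes = bytes(range(128))
same_base = all(
    ascii_bytes.decode("ascii") == ascii_bytes.decode(codec)
    for codec in ("iso-8859-1", "iso-8859-15", "cp1252", "utf-8")
)

# Positions where Latin-9 and CP1252 part company with Latin-1.
latin9_changes = differences("iso-8859-1", "iso-8859-15")
cp1252_changes = differences("iso-8859-1", "cp1252")

print("shared 7 bit base:", same_base)
print("Latin-9 changes:", [hex(v) for v in sorted(latin9_changes)])
print("CP1252 changes:", len(cp1252_changes))
```

Byte values that a codec leaves undefined are skipped rather than counted, which is an assumption about how "difference" should be measured.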
Transformations
These values define how the underlying codeset of Unicode/ISO 10646 is sent over the wire. They are not charsets.
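UTF-7 | RFC 2152 | - | UCS Transformation Format-7. Defines how ISO 10646 (UCS) is transformed into 7 bit safe octets for email data communications. Uses a variable number of octets for a single ISO 10646/Unicode character: one octet for characters that are sent directly as ASCII, more when the character is carried in a modified Base64 run.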
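UTF-8 | RFC 3629 | UTF-2, FSS-UTF | UCS Transformation Format-8. Defines how ISO 10646 (UCS) is transformed for MIME enabled data communications. Uses from 1 to 4 octets for a single ISO 10646/Unicode character (RFC 3629 removed the longer 5 and 6 octet forms allowed by earlier definitions). ASCII characters are sent unchanged as a single octet.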
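UTF-16 | RFC 2781 | - | UCS Transformation Format-16. Defines how ISO 10646 (UCS) is transformed for data communications. Uses 2 or 4 octets (one or two 16 bit units) for a single ISO 10646/Unicode character. Values that fit in UCS-2 are sent as one 16 bit unit, and UCS-4 values beyond the 16 bit range are sent as a pair of 16 bit units (a surrogate pair).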
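
To see how the same character takes a different number of octets in each transformation, here is a small sketch using Python's built-in utf-7, utf-8 and utf-16-be codecs. The sample characters and the octets helper are chosen for illustration only. The big-endian UTF-16 variant is used so that no byte order mark is added to the count.

```python
# Count the octets each transformation format needs for one character.
# octets is a hypothetical helper; the samples are arbitrary illustrations.

def octets(char, encoding):
    """Number of octets produced when a single character is encoded."""
    return len(char.encode(encoding))


samples = {
    "LATIN CAPITAL A": "A",            # inside the 7 bit ASCII range
    "E WITH ACUTE": "\u00e9",          # inside ISO 8859-1
    "EURO SIGN": "\u20ac",             # inside the 16 bit (UCS-2) range
    "GRINNING FACE": "\U0001F600",     # beyond the 16 bit range
}

for name, char in samples.items():
    print(
        name,
        "utf-7:", octets(char, "utf-7"),
        "utf-8:", octets(char, "utf-8"),
        "utf-16:", octets(char, "utf-16-be"),
    )

# The character beyond the 16 bit range becomes two 16 bit units in UTF-16.
surrogate_pair = samples["GRINNING FACE"].encode("utf-16-be").hex()
print("surrogate pair:", surrogate_pair)
```

The UTF-7 counts depend on how a given encoder opens and closes its Base64 runs, so treat those numbers as a property of this codec rather than of the format.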

ISO 8859 Family

ISO 8859-1   Latin alphabet No. 1     West European
ISO 8859-2   Latin alphabet No. 2     Central and East European
ISO 8859-3   Latin alphabet No. 3     South European, Maltese & Esperanto
ISO 8859-4   Latin alphabet No. 4     North European
ISO 8859-5   Latin/Cyrillic alphabet  Slavic languages
ISO 8859-6   Latin/Arabic alphabet    Arabic
ISO 8859-7   Latin/Greek alphabet     modern Greek
ISO 8859-8   Latin/Hebrew alphabet    Hebrew and Yiddish
ISO 8859-9   Latin alphabet No. 5     Turkish
ISO 8859-10  Latin alphabet No. 6     Nordic (Sámi, Inuit, Icelandic)
ISO 8859-11  Latin/Thai alphabet      Thai
ISO 8859-12  (has not been defined)
ISO 8859-13  Latin alphabet No. 7     Baltic Rim
ISO 8859-14  Latin alphabet No. 8     Celtic
ISO 8859-15  Latin alphabet No. 9     adds euro to -1 (8 changes)
ISO 8859-16  Latin alphabet No. 10    South-Eastern Europe
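
Because every member of the family shares the same lower half, only byte values from 80 upwards change meaning from one part to the next. The sketch below decodes one byte under several parts using Python's codecs; the choice of parts and of the bytes E4 and 41 is arbitrary, and decode_in_parts is a made-up helper name.

```python
# Decode the same byte under several parts of the ISO 8859 family.
# decode_in_parts is a hypothetical helper; PARTS is an arbitrary selection.

PARTS = ["iso-8859-1", "iso-8859-2", "iso-8859-5", "iso-8859-7", "iso-8859-15"]


def decode_in_parts(value, parts=PARTS):
    """Return {part name: character} for a single byte value."""
    return {part: bytes([value]).decode(part) for part in parts}


upper_half = decode_in_parts(0xE4)   # meaning depends on the part
lower_half = decode_in_parts(0x41)   # same in every part (the ASCII base)

for part in PARTS:
    print(part, repr(upper_half[part]), repr(lower_half[part]))
```

This is why a byte stream in one of these sets cannot be read correctly unless the part in use is known from somewhere else, such as a charset label.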



Copyright © 1994 - 2008 ZyTrax, Inc.
All rights reserved.
Page modified: October 07 2007.