(recode.info)Charset overview



Overview of charsets
====================

   Recoding is currently possible between many charsets, most of which
are described by RFC 1345 tables or available in the `iconv' library
(see the Tabular and libiconv nodes).  The `recode' library also
handles some charsets in specialised ways.  These are:

   * 6-bit charsets based on CDC display code: 6/12 code from NOS;
     bang-bang code from Université de Montréal;

   * 7-bit ASCII: without any diacritics, or with diacritics rendered
     by backspace overstriking, Unisys' Icon convention, TeX/LaTeX
     coding, or easy French conventions for electronic mail;

   * 8-bit extensions to ASCII: ISO Latin-1, Atari ST code, IBM's code
     for the PC, Apple's code for the Macintosh;

   * 8-bit non-ASCII codes: three flavours of EBCDIC;

   * 16-bit or 31-bit universal characters, and their transfer
     encodings.
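
   As a rough illustration of the last three categories, the sketch
below shows how one character comes out as different byte sequences.
It uses Python's built-in codecs, not `recode', and the codec names
(`latin-1', `cp037', `utf-16-be', `utf-32-be', `utf-8') are Python's
own; they are assumed here to correspond only loosely to the charsets
listed above.  The helper `encoded_forms' is hypothetical.

```python
# Illustration only: Python's built-in codecs, not the recode library.
# The codec names below are Python's, chosen as stand-ins for
# ISO Latin-1, one EBCDIC flavour, and Unicode transfer encodings.

def encoded_forms(text):
    """Return a dict mapping each codec name to the bytes for `text`."""
    codecs = ["latin-1", "cp037", "utf-16-be", "utf-32-be", "utf-8"]
    return {name: text.encode(name) for name in codecs}


# Print the hexadecimal bytes of one accented character in each codec.
for name, data in encoded_forms("é").items():
    print(f"{name:10} {data.hex()}")
```

   The same character occupies one byte in the 8-bit codes, two or four
bytes as a fixed-width universal character, and a variable number of
bytes in a transfer encoding such as UTF-8.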

   The introduction of RFC 1345 in `recode' has brought with it a few
charsets that duplicate the functionality of older ones while differing
from them in subtle ways.  The effects have not been fully investigated
yet, so for now clashes are avoided by keeping the old and new charsets
well separate.

   Conversion is possible between almost any pair of charsets, with the
following exceptions.  One may not recode _from_ the `flat',
`count-characters' or `dump-with-names' charsets, nor _from_ or _to_
the `data', `tree' or `:libiconv:' charsets.  Also, apart from the
`data' and `tree' pseudo-charsets, charsets and surfaces live in
disjoint recoding spaces: one cannot really transform a surface into a
charset or vice versa, as surfaces are only meant to be applied over
charsets, or removed from them.
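
   The exceptions above can be written down as a small check.  This is
only a sketch in Python, not code from `recode': the function
`may_recode' and the two set names are hypothetical, and the check
knows nothing about which charsets actually exist or about surfaces.

```python
# Sketch of the restrictions listed above; not part of recode itself.

# Charsets one may not recode FROM.
NO_SOURCE = {"flat", "count-characters", "dump-with-names"}

# Charsets one may recode neither FROM nor TO.
NEITHER = {"data", "tree", ":libiconv:"}


def may_recode(before, after):
    """Return True unless the pair falls under a listed exception."""
    if before in NO_SOURCE:
        return False
    if before in NEITHER or after in NEITHER:
        return False
    return True
```

   For example, `may_recode("latin1", "flat")' is true while
`may_recode("flat", "latin1")' is false; the name `latin1' is used
here purely as a placeholder for an ordinary charset.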

