(recode.info)UTF-16



Universal Transformation Format, 16 bits
========================================

   `UTF-16' is another external surface of `UCS'.  It is also variable
length, each character using either two or four bytes.  It can only
represent the subset made of the first 17 * 2^16 characters of `UCS',
that is, 1,114,112 characters, or roughly one million.

   Martin J. Dürst writes (to `comp.std.internat', on 1995-03-28):

     `UTF-16' is another method that reserves two times 1024 codepoints
     in Unicode and uses them to index around one million additional
     characters.  `UTF-16' is a little bit like former multibyte codes,
     but quite not so, as both the first and the second 16-bit code
     clearly show what they are.  The idea is that one million
     codepoints should be enough for all the rare Chinese ideograms and
     historical scripts that do not fit into the Base Multilingual
     Plane of ISO 10646 (with just about 63,000 positions available,
     now that 2,000 are gone).
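
   The indexing scheme described in the quotation can be sketched as
follows.  This is only an illustration, not code taken from `recode':
the function name `utf16_code_units' is hypothetical, and the two
reserved ranges of 1024 codepoints are assumed to be 0xD800-0xDBFF and
0xDC00-0xDFFF, as assigned by the Unicode standard (which calls them
high and low surrogates).

```python
def utf16_code_units(code_point):
    """Return the 16-bit code units encoding one UCS code point.

    Illustrative sketch only; the name is hypothetical and this is
    not how `recode' itself is implemented.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("outside the 17 * 2^16 values UTF-16 can reach")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("reserved codepoint, not a character")
    if code_point < 0x10000:
        # Characters of the first 2^16 positions use a single unit.
        return [code_point]
    # The remaining 16 * 2^16 characters are indexed by a 20-bit offset,
    # split into two halves of 10 bits (1024 values) each.
    offset = code_point - 0x10000
    first = 0xD800 + (offset >> 10)     # first reserved range
    second = 0xDC00 + (offset & 0x3FF)  # second reserved range
    return [first, second]


if __name__ == "__main__":
    for cp in (0x41, 0xFFFD, 0x10000, 0x10FFFF):
        print(hex(cp), [hex(unit) for unit in utf16_code_units(cp)])
```

   Since the two reserved ranges do not overlap each other or any
ordinary character, a reader can tell from any single 16-bit code
whether it stands alone, starts a pair, or ends one.  The pairs give
1024 * 1024 additional characters, which together with the first 2^16
positions yields the 17 * 2^16 total.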

   This charset is available in `recode' under the name `UTF-16'.
Accepted aliases are `Unicode', `TF-16' and `u6'.

