(recode.info)Texte



Easy French conventions
=======================

   This charset is available in `recode' under the name `Texte' and has
`txte' for an alias.  It is a seven-bit code, identical to `ASCII-BS',
save for French diacritics, which are noted using a slightly different
convention.

   At text entry time, these conventions provide a little speed up.  At
read time, they are slightly more readable than a few alternate ways of
coding diacritics.  Of course, it would be better to have a specialised
keyboard for direct eight-bit entry, and fonts for immediately
displaying eight-bit ISO Latin-1 characters.  But not everybody is so
fortunate.  Sadly, in a few mailing environments, it still happens that
the eighth bit is wilfully destroyed.

   Easy French has been in use in France for a while.  I only slightly
adapted it (the diaeresis option) to make it more comfortable for
several usages in Que'bec originating from Universite' de Montre'al.
In fact, the main problem for me was not necessarily to invent Easy
French, but to recognise the "best" convention to use ("best" is not
defined here), and to try to solve the main pitfalls associated with
the selected convention.  In short, we have:

`e''
     for `e' (and some other vowels) with an acute accent,

`e`'
     for `e' (and some other vowels) with a grave accent,

`e^'
     for `e' (and some other vowels) with a circumflex accent,

`e"'
     for `e' (and some other vowels) with a diaeresis,

`c,'
     for `c' with a cedilla.

There is no attempt at expressing the Latin ligatures `ae' and `oe',
although they are representable in several other charsets, so these
ligatures may be lost through Easy French conventions.  French also
uses tildes over `n' and `a', but seldom, and these are not represented
either.  In some countries, `:' is used instead of `"' to mark
diaeresis.  `recode' supports only one convention per call, depending
on the `-c' option of the `recode' command.  French quotes (sometimes
called "angle quotes") are noted the same way English quotes are noted
in TeX, _id est_ by ```' and `'''.
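
   The notation above can be illustrated in the unambiguous direction,
from accented text to Easy French.  The following is only a minimal
Python sketch, not the `recode' implementation: the function name
`to_easy_french' and its `diaeresis' parameter are hypothetical, the
restriction to the vowels `a', `e', `i', `o', `u' is an assumption, and
characters without a notation (tildes, ligatures) are simply left
unchanged here.

```python
import unicodedata

# Combining marks and the ASCII character that notes each one.
MARKS = {
    "\u0301": "'",   # acute
    "\u0300": "`",   # grave
    "\u0302": "^",   # circumflex
    "\u0308": '"',   # diaeresis (replaced by ':' under the other convention)
    "\u0327": ",",   # cedilla
}


def to_easy_french(text, diaeresis='"'):
    """Rewrite accented letters and French quotes in Easy French notation."""
    out = []
    for ch in text:
        if ch == "\u00ab":            # opening French quote
            out.append("``")
            continue
        if ch == "\u00bb":            # closing French quote
            out.append("''")
            continue
        # Split the character into a base letter and one combining mark.
        parts = unicodedata.normalize("NFD", ch)
        if len(parts) == 2 and parts[1] in MARKS:
            base, mark = parts
            is_cedilla = mark == "\u0327"
            # Assumption: cedilla only on c, other marks only on vowels.
            if (is_cedilla and base in "cC") or (
                not is_cedilla and base in "aeiouAEIOU"
            ):
                note = diaeresis if mark == "\u0308" else MARKS[mark]
                out.append(base + note)
                continue
        out.append(ch)                # no notation: keep as is
    return "".join(out)
```

   With this sketch, `to_easy_french("Qu\u00e9bec")' gives `Que'bec', and
passing `diaeresis=":"' selects the colon convention.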

   The convention is prone to losing information, because the diacritic
meaning overloads some characters that already have other uses.  To
alleviate this, some knowledge of the French language is built into
the recognition routines.  So, the following subtleties are
systematically obeyed by the various recognisers.

  1. A comma which follows a `c' is interpreted as a cedilla only if it
     is followed by one of the vowels `a', `o' or `u'.

  2. A single quote which follows an `e' does not necessarily mean an
     acute accent if it is itself followed by another single quote.
     For example:

    `e''
          will give an `e' with an acute accent.

    `e'''
          will give a simple `e', with a closing quotation mark.

    `e''''
          will give an `e' with an acute accent, followed by a closing
          quotation mark.

     There is a problem induced by this convention if there are English
     quotations within a French text.  In sentences like:

          There's a meeting at Archie's restaurant.

     the single quotes will be mistaken twice for acute accents.  So
     English contractions and suffix possessives could be mangled.

  3. A double quote or colon, depending on the `-c' option, which
     follows a vowel is interpreted as a diaeresis only if it is
     followed by another letter.  But there are in French several words
     that _end_ with a diaeresis, and the `recode' library is aware of
     them.  There are words ending in "igue", either feminine words
     without a relative masculine (besaigue" and cigue"), or feminine
     words with a relative masculine(1) (aigue", ambigue", contigue",
     exigue", subaigue" and suraigue").  There are also words not
     ending in "igue", but instead ending in "i"(2) (ai", congai",
     goi", hai"kai", inoui", sai", samurai", thai" and tokai"), in "e"
     (canoe") or in "u"(3) (Esau").

     Just to complete this topic, note that it would be wrong to make a
     rule for all words ending in "igue" as needing a diaeresis, as
     there are counter-examples (becfigue, be`sigue, bigue, bordigue,
     bourdigue, brigue, contre-digue, digue, d'intrigue, fatigue,
     figue, garrigue, gigue, igue, intrigue, ligue, prodigue, sarigue
     and zigue).
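
   The three rules above can be sketched as small Python predicates.
This is only an illustration under stated assumptions, not the code of
the `recode' recognisers: the names `is_cedilla', `split_quote_run' and
`is_diaeresis' are hypothetical, the word list is copied from rule 3,
and the handling of runs longer than three single quotes is a guess.

```python
# Words that rule 3 lists as ending with a diaeresis, stored without
# marks and in lower case.
FINAL_DIAERESIS = {
    "besaigue", "cigue", "aigue", "ambigue", "contigue", "exigue",
    "subaigue", "suraigue", "ai", "congai", "goi", "haikai", "inoui",
    "sai", "samurai", "thai", "tokai", "canoe", "esau",
}


def is_cedilla(text, i):
    """Rule 1: the comma at text[i] follows a c and precedes a, o or u."""
    return (
        text[i] == ","
        and i > 0 and text[i - 1] in "cC"
        and i + 1 < len(text) and text[i + 1] in "aouAOU"
    )


def split_quote_run(count):
    """Rule 2: interpret a run of `count` single quotes after an e.

    Returns (acute, closing_quotes).  Runs of 1, 2 and 3 follow the
    text; treating longer runs the same way is an assumption.
    """
    return count % 2 == 1, count // 2


def is_diaeresis(text, i, mark='"'):
    """Rule 3: the mark at text[i] follows a vowel and either precedes
    a letter or ends one of the listed words."""
    if text[i] != mark or i == 0 or text[i - 1] not in "aeiouAEIOU":
        return False
    if i + 1 < len(text) and text[i + 1].isalpha():
        return True
    # Walk back over the word, skipping earlier marks (as in hai"kai").
    start = i
    while start > 0 and (text[start - 1].isalpha() or text[start - 1] == mark):
        start -= 1
    word = text[start:i].replace(mark, "").lower()
    # Footnote (2): after an apostrophe, "ai" is the verb, not the animal.
    if word == "ai" and start > 0 and text[start - 1] == "'":
        return False
    return word in FINAL_DIAERESIS
```

   For instance, `is_cedilla("garc,on", 4)' holds while
`is_cedilla("avec, ou", 4)' does not, and `split_quote_run(3)' reports
an acute accent followed by one closing quotation mark.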

   ---------- Footnotes ----------

   (1) There are supposed to be seven words in this case.  So, one is
missing.

   (2) Look at one of the following sentences (the second has to be
interpreted with the `-c' option):

     "Ai"e!  Voici le proble`me que j'ai"
     Ai:e!  Voici le proble`me que j'ai:

   There is an ambiguity between an ai", the small animal, and the
indicative present of _avoir_ (first person singular), when followed by
what could be a diaeresis mark.  Fortunately, the case is solved by the
fact that an apostrophe always precedes the verb and almost never the
animal.

   (3) I did not pay attention to proper nouns, but this one showed up
as being fairly evident.

