(recode.info)Request level



Request level functions
=======================

   The request level functions are meant to cover most recoding needs
programmers may have; they should provide all usual functionality.
Their API is almost stable by now.  To get started with request level
functions, here is a full example of a program whose sole job is to
filter `ibmpc' code on its standard input into `latin1' code on its
standard output.

     #include <stdio.h>
     #include <stdlib.h>
     #include <stdbool.h>
     #include <recode.h>
     
     const char *program_name;
     
     int
     main (int argc, char *const *argv)
     {
       program_name = argv[0];
       RECODE_OUTER outer = recode_new_outer (true);
       RECODE_REQUEST request = recode_new_request (outer);
       bool success;
     
       recode_scan_request (request, "ibmpc..latin1");
     
       success = recode_file_to_file (request, stdin, stdout);
     
       recode_delete_request (request);
       recode_delete_outer (outer);
     
       exit (success ? 0 : 1);
     }

   The header file `<recode.h>' declares a `RECODE_REQUEST' structure,
which the programmer should use to allocate a variable in the
program.  This REQUEST variable is given as the first argument to all
request level functions, and in most cases, may be considered as opaque.

   * Initialisation functions

          RECODE_REQUEST recode_new_request (OUTER);
          bool recode_delete_request (REQUEST);

     No REQUEST variable may be used in other request level
     functions of the recoding library before having been initialised by
     `recode_new_request'.  There may be many such REQUEST variables,
     in which case, they are independent of one another and they all
     need to be initialised separately.  To avoid memory leaks, a
     REQUEST variable should not be initialised a second time without
     calling `recode_delete_request' to "un-initialise" it.

     As with `recode_delete_outer', the call to `recode_delete_request'
     prior to program termination, in the example above, may be left
     out.

   * Fields of `struct recode_request'

     Here are the fields of a `struct recode_request' which may be
     meaningfully changed, once a REQUEST has been initialised by
     `recode_new_request', but before it gets used.  It is not very
     frequent, in practice, that these fields need to be changed.  To
     access the fields, you need to include `recodext.h' _instead_ of
     `recode.h', in which case there also is a greater chance that you
     need to recompile your programs if a new version of the recoding
     library gets installed.

    `verbose_flag'
          This field is initially `false'.  When set to `true', the
          library will echo to stderr the sequence of elementary
          recoding steps needed to achieve the requested recoding.

    `diaeresis_char'
          This field is initially the ASCII value of a double quote `"',
          but it may also be the ASCII value of a colon `:'.  In `texte'
          charset, some countries use double quotes to mark diaeresis,
          while other countries prefer colons.  This field contains the
          diaeresis character for the `texte' charset.

    `make_header_flag'
          This field is initially `false'.  When set to `true', it
          indicates that the program is merely trying to produce a
          recoding table in source form rather than completing any
          actual recoding.  In such a case, the optimisation of step
          sequence can be attempted much more aggressively.  If the
          step sequence cannot be reduced to a single step, table
          production will fail.

    `diacritics_only'
          This field is initially `false'.  For the `HTML' and `LaTeX'
          charsets, it is often convenient to recode only the
          diacriticized characters, leaving alone other HTML code
          using ampersands or angle brackets, and LaTeX code using
          backslashes.  Set the field to `true' to get this
          behaviour.  Markup then passes through unchanged, so in the
          other charset one can edit the text as well as the HTML or
          LaTeX directives.

    `ascii_graphics'
          This field is initially `false', and relates to characters 176
          to 223 in the `ibmpc' charset, which are used to draw boxes.
          When set to `true', while recoding out of `ibmpc', ASCII
          characters are selected so as to graphically approximate
          these boxes.

   * Study of request strings

          bool recode_scan_request (REQUEST, "STRING");

     The main role of a REQUEST variable is to describe a set of
     recoding transformations.  Function `recode_scan_request' studies
     the given STRING, and stores an internal representation of it into
     REQUEST.  Note that STRING may be a full-fledged `recode' request,
     possibly including surface specifications, intermediary charsets,
     sequences, aliases or abbreviations (Note: Requests).

     The internal representation automatically receives some
     pre-conditioning and optimisation, so the REQUEST may then later
     be used many times to achieve many actual recodings.  Calling
     `recode_scan_request' many times with the same STRING would not
     be efficient; it is better to keep many REQUEST variables
     instead, each scanned only once.

   * Actual recoding jobs

     Once the REQUEST variable holds the description of a recoding
     transformation, a few functions use it for achieving an actual
     recoding.  Either input or output of a recoding may be a string, an
     in-memory buffer, or a file.

     Functions with names like `recode_INPUT-TYPE_to_OUTPUT-TYPE'
     request an actual recoding, and are described below.  It is easy
     to remember which arguments each function accepts, once a few
     simple principles are grasped for each possible TYPE.  However,
     one of the recoding functions escapes these principles and is
     discussed separately, first.

          char *recode_string (REQUEST, STRING);

     The function `recode_string' recodes STRING according to REQUEST,
     and directly returns the resulting recoded string freshly
     allocated, or `NULL' if the recoding could not succeed for some
     reason.  When this function is used, it is the responsibility of
     the programmer to ensure that the memory used by the returned
     string is later reclaimed.

          bool recode_string_to_buffer (REQUEST,
            INPUT_STRING,
            &OUTPUT_BUFFER, &OUTPUT_LENGTH, &OUTPUT_ALLOCATED);
          bool recode_string_to_file (REQUEST,
            INPUT_STRING,
            OUTPUT_FILE);
          bool recode_buffer_to_buffer (REQUEST,
            INPUT_BUFFER, INPUT_LENGTH,
            &OUTPUT_BUFFER, &OUTPUT_LENGTH, &OUTPUT_ALLOCATED);
          bool recode_buffer_to_file (REQUEST,
            INPUT_BUFFER, INPUT_LENGTH,
            OUTPUT_FILE);
          bool recode_file_to_buffer (REQUEST,
            INPUT_FILE,
            &OUTPUT_BUFFER, &OUTPUT_LENGTH, &OUTPUT_ALLOCATED);
          bool recode_file_to_file (REQUEST,
            INPUT_FILE,
            OUTPUT_FILE);

     All these functions return a `bool' result, `false' meaning that
     the recoding was not successful, often because of reversibility
     issues.  The name of each function indicates which type it reads
     and which type it produces.  Let's discuss these three types in
     turn.

    string
          A string is merely an in-memory buffer which is terminated by
          a `NUL' character (using as many bytes as needed), instead of
          being described by a byte length.  For input, a pointer to
          the buffer is given through one argument.

          It is notable that there are no `to_string' functions.  Only
          one function recodes into a string, and it is
          `recode_string', which has already been discussed separately,
          above.

    buffer
          A buffer is a sequence of bytes held in computer memory.  For
          input, two arguments provide a pointer to the start of the
          buffer and its byte size.  Note that for charsets using many
          bytes per character, the size is given in bytes, not in
          characters.

          For output, three arguments provide the address of three
          variables, which will receive the buffer pointer, the used
          buffer size in bytes, and the allocated buffer size in bytes.
          If at the time of the call, the buffer pointer is `NULL',
          then the allocated buffer size should also be zero, and the
          buffer will be allocated afresh by the recoding functions.
          However, if the buffer pointer is not `NULL', the buffer
          should already be allocated, and the allocated buffer size
          then gives its size.  If the allocated size gets exceeded
          while the recoding proceeds, the buffer will be automatically
          reallocated bigger, probably elsewhere, and the allocated
          buffer size will be adjusted accordingly.

          The second variable, giving the in-memory buffer size, will
          receive the exact byte size which was needed for the
          recoding.  A `NUL' character is guaranteed at the end of the
          produced buffer, but is not counted in the byte size of the
          recoding.  Beyond that `NUL', there might be some extra space
          after the recoded data, extending to the allocated buffer
          size.

    file
          A file is a sequence of bytes held outside computer memory,
          but buffered through it.  For input, one argument provides a
          pointer to a file already opened for read.  The file is then
          read and recoded from its current position until the end of
          the file, effectively swallowing it in memory if the
          destination of the recoding is a buffer.  For reading a file
          filtered through the recoding library, but only a little bit
          at a time, one should rather use `recode_filter_open' and
          `recode_filter_close' (these two functions are not yet
          available).

          For output, one argument provides a pointer to a file already
          opened for write.  The result of the recoding is written to
          that file starting at its current position.

   The following special function is still subject to change:

     void recode_format_table (REQUEST, LANGUAGE, "NAME");

and is not documented anymore for now.

