Request level functions
=======================
The request level functions are meant to cover most recoding needs
programmers may have; they should provide all usual functionality.
Their API is almost stable by now. To get started with request level
functions, here is a full example of a program whose sole job is to
filter `ibmpc' code on its standard input into `latin1' code on its
standard output.

#include <stdio.h>
#include <stdlib.h>   /* for exit */
#include <stdbool.h>
#include <recode.h>

const char *program_name;

int
main (int argc, char *const *argv)
{
  program_name = argv[0];
  RECODE_OUTER outer = recode_new_outer (true);
  RECODE_REQUEST request = recode_new_request (outer);
  bool success;

  /* Describe the recoding once, then apply it to the whole input.  */
  recode_scan_request (request, "ibmpc..latin1");
  success = recode_file_to_file (request, stdin, stdout);

  recode_delete_request (request);
  recode_delete_outer (outer);

  exit (success ? 0 : 1);
}

The header file `<recode.h>' declares a `RECODE_REQUEST' structure,
which the programmer uses to declare a variable in the program.
This REQUEST variable is given as the first argument to all request
level functions and, in most cases, may be considered opaque.
* Initialisation functions
RECODE_REQUEST recode_new_request (OUTER);
bool recode_delete_request (REQUEST);
No REQUEST variable may be used in other request level
functions of the recoding library before having been initialised by
`recode_new_request'. There may be many such REQUEST variables,
in which case, they are independent of one another and they all
need to be initialised separately. To avoid memory leaks, a
REQUEST variable should not be initialised a second time without
calling `recode_delete_request' to "un-initialise" it.
As with `recode_delete_outer', the call to `recode_delete_request'
prior to program termination, in the example above, may be left
out.
* Fields of `struct recode_request'
Here are the fields of a `struct recode_request' which may be
meaningfully changed, once a REQUEST has been initialised by
`recode_new_request', but before it gets used. In practice, these
fields seldom need to be changed. To
access the fields, you need to include `recodext.h' _instead_ of
`recode.h', in which case there also is a greater chance that you
need to recompile your programs if a new version of the recoding
library gets installed.
`verbose_flag'
This field is initially `false'. When set to `true', the
library will echo to stderr the sequence of elementary
recoding steps needed to achieve the requested recoding.
`diaeresis_char'
This field is initially the ASCII value of a double quote `"',
but it may also be the ASCII value of a colon `:'. In `texte'
charset, some countries use double quotes to mark diaeresis,
while other countries prefer colons. This field contains the
diaeresis character for the `texte' charset.
`make_header_flag'
This field is initially `false'. When set to `true', it
indicates that the program is merely trying to produce a
recoding table in source form rather than completing any
actual recoding. In such a case, the optimisation of step
sequence can be attempted much more aggressively. If the
step sequence cannot be reduced to a single step, table
production will fail.
`diacritics_only'
This field is initially `false'. For the `HTML' and `LaTeX'
charsets, it is often convenient to recode only the diacriticized
characters, while leaving alone other HTML code using ampersands
or angle brackets, or LaTeX code using backslashes. Set the field
to `true' to get this behaviour. In the other charset, one can
then edit the text as well as the HTML or LaTeX directives.
`ascii_graphics'
This field is initially `false', and relates to characters 176
to 223 in the `ibmpc' charset, which are used to draw boxes.
When set to `true', while getting out of `ibmpc', ASCII
characters are selected so as to graphically approximate these
boxes.
* Study of request strings
bool recode_scan_request (REQUEST, "STRING");
The main role of a REQUEST variable is to describe a set of
recoding transformations. Function `recode_scan_request' studies
the given STRING, and stores an internal representation of it into
REQUEST. Note that STRING may be a full-fledged `recode' request,
possibly including surface specifications, intermediary charsets,
sequences, aliases or abbreviations (see Requests).
The internal representation automatically receives some
pre-conditioning and optimisation, so the REQUEST may then later
be used many times to achieve many actual recodings. Calling
`recode_scan_request' many times with the same STRING would be
inefficient; it is better to keep several REQUEST variables instead.
* Actual recoding jobs
Once the REQUEST variable holds the description of a recoding
transformation, a few functions use it for achieving an actual
recoding. Either the input or the output of a recoding may be a
string, an in-memory buffer, or a file.
Functions with names like `recode_INPUT-TYPE_to_OUTPUT-TYPE'
request an actual recoding, and are described below. It is easy
to remember which arguments each function accepts, once a few
simple principles are grasped for each possible TYPE. However,
one of the recoding functions escapes these principles and is
discussed separately, first.
char *recode_string (REQUEST, STRING);
The function `recode_string' recodes STRING according to REQUEST,
and directly returns the resulting recoded string freshly
allocated, or `NULL' if the recoding could not succeed for some
reason. When this function is used, it is the responsibility of
the programmer to ensure that the memory used by the returned
string is later reclaimed.
bool recode_string_to_buffer (REQUEST,
  INPUT_STRING,
  &OUTPUT_BUFFER, &OUTPUT_LENGTH, &OUTPUT_ALLOCATED);
bool recode_string_to_file (REQUEST,
  INPUT_STRING,
  OUTPUT_FILE);
bool recode_buffer_to_buffer (REQUEST,
  INPUT_BUFFER, INPUT_LENGTH,
  &OUTPUT_BUFFER, &OUTPUT_LENGTH, &OUTPUT_ALLOCATED);
bool recode_buffer_to_file (REQUEST,
  INPUT_BUFFER, INPUT_LENGTH,
  OUTPUT_FILE);
bool recode_file_to_buffer (REQUEST,
  INPUT_FILE,
  &OUTPUT_BUFFER, &OUTPUT_LENGTH, &OUTPUT_ALLOCATED);
bool recode_file_to_file (REQUEST,
  INPUT_FILE,
  OUTPUT_FILE);
All these functions return a `bool' result, `false' meaning that
the recoding was not successful, often because of reversibility
issues. The name of each function indicates which type it reads
and which type it produces. Let's discuss these three types
in turn.
string
A string is merely an in-memory buffer which is terminated by
a `NUL' character (using as many bytes as needed), instead of
being described by a byte length. For input, a pointer to
the buffer is given through one argument.
It is notable that there are no `to_string' functions. Only
one function recodes into a string, and it is
`recode_string', which has already been discussed separately,
above.
buffer
A buffer is a sequence of bytes held in computer memory. For
input, two arguments provide a pointer to the start of the
buffer and its byte size. Note that for charsets using many
bytes per character, the size is given in bytes, not in
characters.
For output, three arguments provide the address of three
variables, which will receive the buffer pointer, the used
buffer size in bytes, and the allocated buffer size in bytes.
If at the time of the call, the buffer pointer is `NULL',
then the allocated buffer size should also be zero, and the
buffer will be allocated afresh by the recoding functions.
However, if the buffer pointer is not `NULL', it should be
already allocated, and the allocated buffer size then gives its
size. If the allocated size is exceeded while the recoding
proceeds, the buffer will be automatically reallocated bigger,
probably elsewhere, and the allocated buffer size will be
adjusted accordingly.
The second variable, giving the in-memory buffer size, will
receive the exact byte size which was needed for the
recoding. A `NUL' character is guaranteed at the end of the
produced buffer, but is not counted in the byte size of the
recoding. Beyond that `NUL', there might be some extra space
after the recoded data, extending to the allocated buffer
size.
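The output buffer convention can be sketched without the recoding
library. In the following, `copy_to_buffer' is a hypothetical
stand-in, not a library function: its "recoding" is a plain copy, and
only the handling of the three output variables is meant to mirror the
description above. How the library itself grows its buffers is not
specified here, so the growth policy below is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a recoding function: it merely copies its
   input, but follows the output buffer convention described above.  */
bool
copy_to_buffer (const char *input, size_t input_length,
                char **output_buffer, size_t *output_length,
                size_t *output_allocated)
{
  /* A NULL buffer pointer must come with a zero allocated size.  */
  assert (*output_buffer != NULL || *output_allocated == 0);

  /* Room for the data plus the trailing NUL, which is not counted
     in the used size.  */
  size_t needed = input_length + 1;
  if (needed > *output_allocated)
    {
      /* Grow the buffer; it may move elsewhere in memory.  */
      char *grown = realloc (*output_buffer, needed);
      if (grown == NULL)
        return false;
      *output_buffer = grown;
      *output_allocated = needed;
    }

  memcpy (*output_buffer, input, input_length);
  (*output_buffer)[input_length] = '\0';
  *output_length = input_length;
  return true;
}
```

A caller of this sketch starts with a `NULL' pointer and two zero
sizes, passes the addresses of all three variables, and may pass them
again on later calls so that the same allocation gets reused. The
request level functions take the same three addresses, as in
`recode_buffer_to_buffer (REQUEST, INPUT_BUFFER, INPUT_LENGTH,
&OUTPUT_BUFFER, &OUTPUT_LENGTH, &OUTPUT_ALLOCATED)'.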
file
A file is a sequence of bytes held outside computer memory,
but buffered through it. For input, one argument provides a
pointer to a file already opened for read. The file is then
read and recoded from its current position until the end of
the file, effectively swallowing it in memory if the
destination of the recoding is a buffer. To read a file
filtered through the recoding library only a little bit
at a time, one should rather use `recode_filter_open' and
`recode_filter_close' (these two functions are not yet
available).
For output, one argument provides a pointer to a file already
opened for write. The result of the recoding is written to
that file starting at its current position.
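The file handling described above may be pictured with standard C
streams alone. The function `copy_file_to_file' below is a
hypothetical stand-in, not part of the recoding library: it copies
bytes unchanged, so it only illustrates that reading starts at the
current position of the input and runs to end of file, and that
writing starts at the current position of the output. The chunk size
is an arbitrary choice.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a file-to-file recoding: the "recoding" is
   a plain copy, so only the file handling is illustrated.  */
bool
copy_file_to_file (FILE *input, FILE *output)
{
  char chunk[4096];
  size_t count;

  /* Read from the current position of INPUT until end of file.  */
  while ((count = fread (chunk, 1, sizeof chunk, input)) > 0)
    /* Write at the current position of OUTPUT.  */
    if (fwrite (chunk, 1, count, output) != count)
      return false;

  /* A short read may signal an error rather than end of file.  */
  return !ferror (input);
}
```

Bytes located before the current position of the input are left out,
and whatever was already written to the output stays in place ahead of
the copied data.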
The following special function is still subject to change:
void recode_format_table (REQUEST, LANGUAGE, "NAME");
and is no longer documented for now.