16.4 Options for Scanner Speed and Size
=======================================

'-C[aefFmr]'
     controls the degree of table compression and, more generally,
     trade-offs between small scanners and fast scanners.

     '-C'
          A lone '-C' specifies that the scanner tables should be
          compressed but neither equivalence classes nor
          meta-equivalence classes should be used.

     '-Ca, --align, '%option align''
          ("align") instructs flex to trade off larger tables in the
          generated scanner for faster performance because the elements
          of the tables are better aligned for memory access and
          computation.  On some RISC architectures, fetching and
          manipulating longwords is more efficient than with
          smaller-sized units such as shortwords.  This option can
          double the size of the tables used by your scanner.

     '-Ce, --ecs, '%option ecs''
          directs 'flex' to construct "equivalence classes", i.e., sets
          of characters which have identical lexical properties (for
          example, if the only appearance of digits in the 'flex' input
          is in the character class "[0-9]" then the digits '0', '1',
          ..., '9' will all be put in the same equivalence class).
          Equivalence classes usually give dramatic reductions in the
          final table/object file sizes (typically a factor of 2-5) and
          are pretty cheap performance-wise (one array look-up per
          character scanned).
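
          As a rough illustration of the idea (a sketch only, not the
          algorithm 'flex' itself uses), the C fragment below groups
          characters that belong to exactly the same character classes.
          The helper names 'add_range', 'build_ecs' and 'demo_ecs', and
          the second class "[a-z]", are assumptions made for the
          example.

```c
#include <string.h>

#define NCHARS 256

/* Hypothetical helper: mark every character in the inclusive range
   [lo, hi] as a member of the set. */
void add_range(unsigned char set[NCHARS], int lo, int hi)
{
    for (int c = lo; c <= hi; c++)
        set[c] = 1;
}

/* Fill ec[c] with a 1-based class number for each character c and
   return the number of classes.  Two characters share a class exactly
   when they are members of the same subset of the nsets sets.
   Assumes nsets <= 32 so that a membership signature fits in one word. */
int build_ecs(unsigned char sets[][NCHARS], int nsets, int ec[NCHARS])
{
    unsigned long sig[NCHARS];
    int nclasses = 0;

    /* Signature: bit s is set when the character is in set s. */
    for (int c = 0; c < NCHARS; c++) {
        sig[c] = 0;
        for (int s = 0; s < nsets; s++)
            if (sets[s][c])
                sig[c] |= 1UL << s;
    }

    /* Reuse the class of an earlier character with the same signature,
       otherwise open a new class. */
    for (int c = 0; c < NCHARS; c++) {
        ec[c] = 0;
        for (int d = 0; d < c; d++) {
            if (sig[d] == sig[c]) {
                ec[c] = ec[d];
                break;
            }
        }
        if (ec[c] == 0)
            ec[c] = ++nclasses;
    }
    return nclasses;
}

/* Demo input: the only character classes are [0-9] and [a-z]. */
int demo_ecs(int ec[NCHARS])
{
    unsigned char sets[2][NCHARS];

    memset(sets, 0, sizeof sets);
    add_range(sets[0], '0', '9');
    add_range(sets[1], 'a', 'z');
    return build_ecs(sets, 2, ec);
}
```

          For this input, 'demo_ecs' yields three classes (digits,
          lower-case letters, and everything else), so a transition
          table indexed by class needs 3 columns per state rather than
          256.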

     '-Cf'
          specifies that the "full" scanner tables should be generated -
          'flex' should not compress the tables by taking advantage of
          similar transition functions for different states.

     '-CF'
          specifies that the alternate fast scanner representation
          (described below under the '--fast' flag) should be used.
          This option cannot be used with '--c++'.

     '-Cm, --meta-ecs, '%option meta-ecs''
          directs 'flex' to construct "meta-equivalence classes", which
          are sets of equivalence classes (or characters, if equivalence
          classes are not being used) that are commonly used together.
          Meta-equivalence classes are often a big win when using
          compressed tables, but they have a moderate performance impact
          (one or two 'if' tests and one array look-up per character
          scanned).

     '-Cr, --read, '%option read''
          causes the generated scanner to _bypass_ use of the standard
          I/O library ('stdio') for input.  Instead of calling 'fread()'
          or 'getc()', the scanner will use the 'read()' system call,
          resulting in a performance gain which varies from system to
          system, but in general is probably negligible unless you are
          also using '-Cf' or '-CF'.  Using '-Cr' can cause strange
          behavior if, for example, you read from 'yyin' using 'stdio'
          prior to calling the scanner (because the scanner will miss
          whatever text your previous reads left in the 'stdio' input
          buffer).  '-Cr' has no effect if you define 'YY_INPUT()'
          (see Generated Scanner).
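
          The hazard can be reproduced outside 'flex' with a short
          POSIX C sketch.  It assumes a 'stdio' implementation that
          fills its buffer in large blocks, which is typical but not
          guaranteed; the function names 'bytes_left_for_read' and
          'demo_lost_input' are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Consume one character through stdio, then count how many bytes a raw
   read() on the same descriptor can still see.  Returns -1 on error. */
long bytes_left_for_read(FILE *fp)
{
    char buf[4096];
    long total = 0;
    ssize_t n;

    /* With a buffered stream, this call usually pulls far more than
       one byte from the descriptor into the stdio buffer. */
    if (fgetc(fp) == EOF)
        return -1;

    /* read() starts at the descriptor's file offset, which is past
       everything stdio has already buffered. */
    while ((n = read(fileno(fp), buf, sizeof buf)) > 0)
        total += n;
    return n < 0 ? -1 : total;
}

/* Hypothetical demo: a 12-byte temporary file, one fgetc(), then read(). */
long demo_lost_input(void)
{
    FILE *fp = tmpfile();
    long left;

    if (fp == NULL)
        return -1;
    fputs("hello, flex\n", fp);   /* 12 bytes */
    rewind(fp);                   /* flush and seek back to the start */
    left = bytes_left_for_read(fp);
    fclose(fp);
    return left;
}
```

          Only one of the twelve bytes was consumed through 'stdio', yet
          on a block-buffering implementation 'read()' finds fewer than
          the remaining eleven, because the rest already sits in the
          'stdio' buffer.  A scanner built with '-Cr' would miss that
          text in the same way.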

     The options '-Cf' and '-CF' do not make sense together with '-Cm':
     there is no opportunity for meta-equivalence classes if the table
     is not being compressed.  Otherwise the options may be freely
     mixed, and are cumulative.

     The default setting is '-Cem', which specifies that 'flex' should
     generate equivalence classes and meta-equivalence classes.  This
     setting provides the highest degree of table compression.  You can
     obtain faster-executing scanners at the cost of larger tables; the
     following ordering generally holds:

              slowest & smallest
                    -Cem
                    -Cm
                    -Ce
                    -C
                    -C{f,F}e
                    -C{f,F}
                    -C{f,F}a
              fastest & largest

     Note that scanners with the smallest tables are usually generated
     and compiled the quickest, so during development you will usually
     want to use the default, maximal compression.

     '-Cfe' is often a good compromise between speed and size for
     production scanners.

'-f, --full, '%option full''
     specifies a fast scanner that uses the full (uncompressed) tables.
     No table compression is done and 'stdio' is bypassed.  The result
     is large but fast.  This option is equivalent to '-Cfr'.

'-F, --fast, '%option fast''
     specifies that the _fast_ scanner table representation should be
     used (and 'stdio' bypassed).  This representation is about as fast
     as the full table representation ('--full'), and for some sets of
     patterns will be considerably smaller (and for others, larger).  In
     general, if the pattern set contains both _keywords_ and a
     catch-all, _identifier_ rule, such as in the set:

              "case"    return TOK_CASE;
              "switch"  return TOK_SWITCH;
              ...
              "default" return TOK_DEFAULT;
              [a-z]+    return TOK_ID;

     then you're better off using the full table representation.  If
     only the _identifier_ rule is present and you then use a hash table
     or some such to detect the keywords, you're better off using
     '--fast'.
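
     One way to do the keyword detection outside the scanner tables is
     sketched below in C.  It uses a sorted table searched with
     'bsearch()' instead of a hash table; the token values and the
     names 'struct keyword' and 'classify_identifier' are assumptions
     made for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Token codes are illustrative; only the names come from the rule set. */
enum { TOK_ID = 256, TOK_CASE, TOK_DEFAULT, TOK_SWITCH };

struct keyword {
    const char *name;
    int token;
};

/* Must stay sorted by name, because bsearch() requires sorted input. */
static const struct keyword keywords[] = {
    { "case",    TOK_CASE    },
    { "default", TOK_DEFAULT },
    { "switch",  TOK_SWITCH  },
};

/* bsearch() passes the search key first and a table element second. */
static int cmp_keyword(const void *key, const void *elem)
{
    const struct keyword *kw = elem;

    return strcmp(key, kw->name);
}

/* Return the keyword's token, or TOK_ID when text is not a keyword. */
int classify_identifier(const char *text)
{
    const struct keyword *kw = bsearch(text, keywords,
                                       sizeof keywords / sizeof keywords[0],
                                       sizeof keywords[0], cmp_keyword);

    return kw ? kw->token : TOK_ID;
}
```

     The pattern set then shrinks to the single identifier rule, for
     instance:

              [a-z]+    return classify_identifier(yytext);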

     This option is equivalent to '-CFr'.  It cannot be used with
     '--c++'.