18 Generating C++ Scanners
**************************

*IMPORTANT*: the present form of the scanning class is _experimental_
and may change considerably between major releases.

   'flex' provides two different ways to generate scanners for use with
C++.  The first way is to simply compile a scanner generated by 'flex'
using a C++ compiler instead of a C compiler.  You should not encounter
any compilation errors (see Reporting Bugs).  You can then use C++
code in your rule actions instead of C code.  Note that the default
input source for your scanner remains 'yyin', and default echoing is
still done to 'yyout'.  Both of these remain 'FILE *' variables and not
C++ _streams_.

   You can also use 'flex' to generate a C++ scanner class, using the
'-+' option (or, equivalently, '%option c++'), which is automatically
specified if the name of the 'flex' executable ends in a '+', such as
'flex++'.  When using this option, 'flex' defaults to generating the
scanner to the file 'lex.yy.cc' instead of 'lex.yy.c'.  The generated
scanner includes the header file 'FlexLexer.h', which defines the
interface to two C++ classes.

   The first class in 'FlexLexer.h', 'FlexLexer', provides an abstract
base class defining the general scanner class interface.  It provides
the following member functions:

'const char* YYText()'
     returns the text of the most recently matched token, the equivalent
     of 'yytext'.

'int YYLeng()'
     returns the length of the most recently matched token, the
     equivalent of 'yyleng'.

'int lineno() const'
     returns the current input line number (see '%option yylineno'), or
     '1' if '%option yylineno' was not used.

'void set_debug( int flag )'
     sets the debugging flag for the scanner, equivalent to assigning to
     'yy_flex_debug' (see Scanner Options).  Note that you must
     build the scanner using '%option debug' to include debugging
     information in it.

'int debug() const'
     returns the current setting of the debugging flag.

   Also provided are member functions equivalent to
'yy_switch_to_buffer()', 'yy_create_buffer()' (though the first argument
is an 'istream&' object reference and not a 'FILE*'),
'yy_flush_buffer()', 'yy_delete_buffer()', and 'yyrestart()' (again, the
first argument is an 'istream&' object reference).

   The second class defined in 'FlexLexer.h' is 'yyFlexLexer', which is
derived from 'FlexLexer'.  It defines the following additional member
functions:

'yyFlexLexer( istream* arg_yyin = 0, ostream* arg_yyout = 0 )'
'yyFlexLexer( istream& arg_yyin, ostream& arg_yyout )'
     constructs a 'yyFlexLexer' object using the given streams for input
     and output.  If not specified, the streams default to 'cin' and
     'cout', respectively.  'yyFlexLexer' does not take ownership of its
     stream arguments.  It's up to the user to ensure the streams
     pointed to remain alive at least as long as the 'yyFlexLexer'
     instance.

'virtual int yylex()'
     performs the same role as 'yylex()' does for ordinary 'flex'
     scanners: it scans the input stream, consuming tokens, until a
     rule's action returns a value.  If you derive a subclass 'S' from
     'yyFlexLexer' and want to access the member functions and variables
     of 'S' inside 'yylex()', then you need to use '%option yyclass="S"'
     to inform 'flex' that you will be using that subclass instead of
     'yyFlexLexer'.  In this case, rather than generating
     'yyFlexLexer::yylex()', 'flex' generates 'S::yylex()' (and also
     generates a dummy 'yyFlexLexer::yylex()' that calls
     'yyFlexLexer::LexerError()' if called).

'virtual void switch_streams(istream* new_in = 0, ostream* new_out = 0)'
'virtual void switch_streams(istream& new_in, ostream& new_out)'
     reassigns 'yyin' to 'new_in' (if non-null) and 'yyout' to 'new_out'
     (if non-null), deleting the previous input buffer if 'yyin' is
     reassigned.

'int yylex( istream* new_in, ostream* new_out = 0 )'
'int yylex( istream& new_in, ostream& new_out )'
     first switches the input and output streams via 'switch_streams(
     new_in, new_out )' and then returns the value of 'yylex()'.

   In addition, 'yyFlexLexer' defines the following protected virtual
functions which you can redefine in derived classes to tailor the
scanner:

'virtual int LexerInput( char* buf, int max_size )'
     reads up to 'max_size' characters into 'buf' and returns the number
     of characters read.  To indicate end-of-input, return 0 characters.
     Note that 'interactive' scanners (see the '-B' and '-I' flags in
     Scanner Options) define the macro 'YY_INTERACTIVE'.  If you
     redefine 'LexerInput()' and need to take different actions
     depending on whether or not the scanner might be scanning an
     interactive input source, you can test for the presence of this
     name via '#ifdef' statements.

'virtual void LexerOutput( const char* buf, int size )'
     writes out 'size' characters from the buffer 'buf', which, while
     'NUL'-terminated, may also contain internal 'NUL's if the scanner's
     rules can match text with 'NUL's in them.

'virtual void LexerError( const char* msg )'
     reports a fatal error message.  The default version of this
     function writes the message to the stream 'cerr' and exits.
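
   As a rough sketch of how these three hooks might be redefined, the
helpers below implement each contract on its own, so the code compiles
without a generated scanner.  'StringSource', 'append_output',
'ScanError' and 'raise_scan_error' are hypothetical names invented for
this illustration and are not part of 'flex'; the commented-out
overrides at the end assume a class derived from 'yyFlexLexer' with
hypothetical members 'src_' and 'out_'.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical helper (not part of flex): serves characters from an
// in-memory string using the same contract as LexerInput().
class StringSource {
public:
    explicit StringSource(std::string text)
        : text_(std::move(text)), pos_(0) {}

    // Copies at most max_size characters into buf and returns the count.
    // Returns 0 once the text is exhausted, which signals end-of-input.
    int read(char* buf, int max_size) {
        if (max_size <= 0)
            return 0;
        const std::size_t remaining = text_.size() - pos_;
        const std::size_t n =
            std::min(remaining, static_cast<std::size_t>(max_size));
        std::memcpy(buf, text_.data() + pos_, n);
        pos_ += n;
        return static_cast<int>(n);
    }

private:
    std::string text_;
    std::size_t pos_;
};

// Hypothetical helper: appends exactly size characters to sink.  Using
// the explicit size (rather than strlen) keeps internal NULs intact.
inline void append_output(std::string& sink, const char* buf, int size) {
    if (size > 0)
        sink.append(buf, static_cast<std::size_t>(size));
}

// Hypothetical exception type used instead of writing to cerr and exiting.
class ScanError : public std::runtime_error {
public:
    explicit ScanError(const std::string& msg) : std::runtime_error(msg) {}
};

// Never returns: always throws ScanError carrying the given message.
[[noreturn]] inline void raise_scan_error(const char* msg) {
    throw ScanError(msg ? msg : "unknown scanner error");
}

// In a class derived from yyFlexLexer, the redefined hooks could then
// simply forward to the helpers above (assumed members: src_, out_):
//
//   int  LexerInput( char* buf, int max_size )
//       { return src_.read( buf, max_size ); }
//   void LexerOutput( const char* buf, int size )
//       { append_output( out_, buf, size ); }
//   void LexerError( const char* msg )
//       { raise_scan_error( msg ); }
```

   Throwing from 'LexerError()' replaces the default behavior of
exiting.  Whether a scanner object can safely be reused after such an
exception is not addressed here, so treat that as an open assumption.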

   Note that a 'yyFlexLexer' object contains its _entire_ scanning
state.  Thus you can use such objects to create reentrant scanners, but
see also Reentrant.  You can instantiate multiple instances of
the same 'yyFlexLexer' class, and you can also combine multiple C++
scanner classes together in the same program using the '-P' option
discussed above.

   Finally, note that the '%array' feature is not available to C++
scanner classes; you must use '%pointer' (the default).

   Here is an example of a simple C++ scanner:

          // An example of using the flex C++ scanner class.
     
         %{
         #include <iostream>
         using namespace std;
         int mylineno = 0;
         %}
     
         %option noyywrap c++
     
         string  \"[^\n"]+\"
     
         ws      [ \t]+
     
         alpha   [A-Za-z]
         dig     [0-9]
         name    ({alpha}|{dig}|\$)({alpha}|{dig}|[_.\-/$])*
         num1    [-+]?{dig}+\.?([eE][-+]?{dig}+)?
         num2    [-+]?{dig}*\.{dig}+([eE][-+]?{dig}+)?
         number  {num1}|{num2}
     
         %%
     
         {ws}    /* skip blanks and tabs */
     
         "/*"    {
                 int c;
     
                 while((c = yyinput()) != 0)
                     {
                     if(c == '\n')
                         ++mylineno;
     
                     else if(c == '*')
                         {
                         if((c = yyinput()) == '/')
                             break;
                         else
                             unput(c);
                         }
                     }
                 }
     
         {number}  cout << "number " << YYText() << '\n';
     
         \n        mylineno++;
     
         {name}    cout << "name " << YYText() << '\n';
     
         {string}  cout << "string " << YYText() << '\n';
     
         %%
     
         // This include is required if main() is in another source file.
         //#include <FlexLexer.h>
     
         int main( int /* argc */, char** /* argv */ )
         {
             FlexLexer* lexer = new yyFlexLexer;
             while(lexer->yylex() != 0)
                 ;
             delete lexer;  /* release the scanner allocated above */
             return 0;
         }

   If you want to create multiple (different) lexer classes, you use the
'-P' flag (or the 'prefix=' option) to rename each 'yyFlexLexer' to some
other 'xxFlexLexer'.  You can then include '<FlexLexer.h>' in your other
sources once per lexer class, first renaming 'yyFlexLexer' as follows:

         #undef yyFlexLexer
         #define yyFlexLexer xxFlexLexer
         #include <FlexLexer.h>
     
         #undef yyFlexLexer
         #define yyFlexLexer zzFlexLexer
         #include <FlexLexer.h>

   if, for example, you used '%option prefix="xx"' for one of your
scanners and '%option prefix="zz"' for the other.

