ANTLR

jGuru

C++ Notes


The C++ runtime and generated grammars look very much the same as the Java ones. There are some subtle differences, which are covered in the sections below.

Building the runtime

The runtime files are located in the lib/cpp subdirectory of the ANTLR distribution. This release is the first to include preliminary automake/autoconf support. In general, the runtime is built as follows:

./configure --prefix=/usr/local
make

Installing the runtime is done by typing

make install
This installs the runtime library libantlr.a in /usr/local/lib and the header files in /usr/local/include/antlr.

Using the runtime

Generally you will compile the ANTLR generated files with something similar to:
c++ -c MyParser.cpp -I/usr/local/include
Linking is done with something similar to:
c++ -o MyExec <your .o files> -L/usr/local/lib -lantlr

Getting ANTLR to generate C++

To get ANTLR to generate C++ code you have to add

language="Cpp";
to the global options section. After that, things are pretty much the same as in Java mode, except that all token and AST classes are wrapped by a reference-counting class (this is to make life easier). The reference-counting class uses
operator->
to reference the object it is wrapping. As a result, you use -> in C++ mode instead of the '.' of Java. See the examples in examples/c++ for some illustrations.
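
To illustrate why -> replaces '.', here is a minimal sketch of a reference-counting handle. It is not the ANTLR runtime's implementation: DemoToken, DemoRef and useCount are hypothetical names invented for this example, and the real wrapper classes may differ in detail.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a token class; NOT ANTLR's Token.
class DemoToken {
public:
    explicit DemoToken(const std::string& text) : text_(text) {}
    const std::string& getText() const { return text_; }
private:
    std::string text_;
};

// Hypothetical reference-counting handle. All copies share one object
// and one counter; the object is deleted when the last copy goes away.
template <class T>
class DemoRef {
public:
    explicit DemoRef(T* p = 0) : ptr_(p), count_(p ? new std::size_t(1) : 0) {}
    DemoRef(const DemoRef& other) : ptr_(other.ptr_), count_(other.count_) {
        if (count_) ++*count_;
    }
    DemoRef& operator=(const DemoRef& other) {
        if (this != &other) {
            release();
            ptr_ = other.ptr_;
            count_ = other.count_;
            if (count_) ++*count_;
        }
        return *this;
    }
    ~DemoRef() { release(); }

    // Forwards member access to the wrapped object, which is why
    // client code writes handle->member rather than handle.member.
    T* operator->() const { return ptr_; }

    // Number of handles currently sharing the object (0 if empty).
    std::size_t useCount() const { return count_ ? *count_ : 0; }
private:
    void release() {
        if (count_ && --*count_ == 0) {
            delete ptr_;
            delete count_;
        }
        ptr_ = 0;
        count_ = 0;
    }
    T* ptr_;
    std::size_t* count_;
};
```

With such a handle, DemoRef<DemoToken> t(new DemoToken("foo")) is used as t->getText(), and no explicit delete is needed because the last handle to go out of scope frees the object.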

Using Custom AST types

In C++ mode it is also possible to override the AST type used by the code generated by ANTLR. To do this you have to do the following:
  • Define a custom AST class like the following:
    #include <antlr/CommonAST.hpp>
    
    class MyAST;    // forward declaration, needed by the typedef below

    typedef antlr::ASTRefCount<MyAST> RefMyAST;
    
    class MyAST : public antlr::CommonAST {
    public:
        MyAST( void ) : down(), right()
        {
        }
        ~MyAST( void )
        {
        }
        void initialize( antlr::RefToken t )
        {
            antlr::CommonAST::initialize(t);
            // more stuff....
            // ...
        }
        void initialize( int t, const ANTLR_USE_NAMESPACE(std)string& txt )
        {
            setType(t);
            setText(txt);
        }
        void addChild( RefMyAST c )
        {
            antlr::BaseAST::addChild( static_cast<antlr::RefAST>(c) );
        }
        static antlr::RefAST factory( void )
        {
            antlr::RefAST ret = static_cast<antlr::RefAST>(RefMyAST(new MyAST));
            return ret;
        }
    private:
        RefMyAST down;      // are these really necessary...
        RefMyAST right;
    };
    
  • Tell ANTLR's C++ code generator to use your RefMyAST by including the following in the options section:
    ASTLabelType = "RefMyAST";
    
    After that, you only need to tell each new parser instance, before invoking it, to use the AST factory defined in your class. This is done like this:
    MyParser parser(lexer);
    parser.setASTNodeFactory( MyAST::factory );
    
    If you do not do this, only CommonAST objects get created, and they are then used as if they were MyAST objects. (In future versions this might be done automatically.) Now all ANTLR-generated code uses RefMyAST/MyAST as its type. As a result, you can access extra members and methods without typecasting.
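
The reason the factory has to be set on every new parser instance can be sketched with a small self-contained model. Everything here is a hypothetical stand-in (DemoNode, DemoCustomNode, DemoParser, setNodeFactory, createdKind); it only mirrors the idea of a per-instance factory function pointer and is not the ANTLR runtime code.

```cpp
#include <string>

// Hypothetical stand-in for the default node type (think CommonAST).
struct DemoNode {
    virtual ~DemoNode() {}
    virtual std::string kind() const { return "common"; }
};

// Hypothetical stand-in for a custom node type (think MyAST).
struct DemoCustomNode : DemoNode {
    virtual std::string kind() const { return "custom"; }
    // Static factory, playing the role of MyAST::factory.
    static DemoNode* factory() { return new DemoCustomNode; }
};

// Hypothetical parser model: each instance stores its own factory
// pointer, which starts out pointing at the default factory.
class DemoParser {
public:
    typedef DemoNode* (*NodeFactory)();
    DemoParser() : factory_(&DemoParser::defaultFactory) {}
    void setNodeFactory(NodeFactory f) { factory_ = f; }
    // Every node the parser builds goes through the installed factory.
    DemoNode* create() const { return factory_(); }
private:
    static DemoNode* defaultFactory() { return new DemoNode; }
    NodeFactory factory_;
};

// Helper for the example: build one node and report which kind it is.
inline std::string createdKind(const DemoParser& p) {
    DemoNode* n = p.create();
    std::string k = n->kind();
    delete n;
    return k;
}
```

In this model a freshly constructed DemoParser creates "common" nodes until setNodeFactory( &DemoCustomNode::factory ) is called on that particular instance; a second parser constructed afterwards starts with the default factory again.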

Using Heterogeneous AST types

This is largely untested. Small examples seem to work. Functionality from duptree and the like will not work; this may be fixed in the next release. In general, inspecting the trees will work, but transformations almost certainly will not. Basically, follow the Java instructions and look at the generated code. Reports from anyone willing to share their experiences are welcome.

A template grammar file for C++

header "pre_include_hpp" {
    // gets inserted before antlr generated includes in the header file
}
header "post_include_hpp" {
    // gets inserted after antlr generated includes in the header file
    // outside any generated namespace specifications
}

header "pre_include_cpp" {
    // gets inserted before the antlr generated includes in the cpp file
}

header "post_include_cpp" {
    // gets inserted after the antlr generated includes in the cpp file
}

header {
    // gets inserted after generated namespace specifications in the header
    // file, but outside the generated class.
}

options {
    language="Cpp";
    namespace="something";      // encapsulate code in this namespace
//  namespaceStd="std";         // cosmetic option to get rid of long defines
                                // in generated code
//  namespaceAntlr="antlr";     // cosmetic option to get rid of long defines
                                // in generated code
    genHashLines = true;        // generate #line's (set to false to turn it off)
}

{
   // global stuff in the cpp file
   ...
}
class MyParser extends Parser;
options {
   exportVocab=My;
}
{
   // additional methods and members
   ...
}
... rules ...

{
   // global stuff in the cpp file
   ...
}
class MyLexer extends Lexer;
options {
   exportVocab=My;
}
{
   // additional methods and members
   ...
}
... rules ...

{
   // global stuff in the cpp file
   ...
}
class MyTreeParser extends TreeParser;
options {
   exportVocab=My;
}
{
   // additional methods and members
   ...
}
... rules ...

Version: $Id: //depot/code/org.antlr/release/antlr-2.7.1/doc/cpp-runtime.html#3 $