automatic tweaking lexer for semantic predicates [message #1073856] Thu, 25 July 2013 16:24
Jens Kuenzer

I faced the problem that a VHDL lexer requires some semantic predicates.
(for details see:

But I did not want to switch to a custom lexer, so I added some code to the generator fragments that tweaks the ANTLR grammar with code taken from comments in the Xtext grammar.

I add this to both fragments:


(I wish I could add that by overriding rather than with a full copy of these classes.) The TweakLexer file is attached.

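So my Xtext grammar contains:
/*@lexer::members {

    private int lastSignificantTokenType;

    // Remember the type of the last token that is neither whitespace nor a comment.
    public void emit(Token token) {
        if(  token.getChannel() == Token.DEFAULT_CHANNEL
          && token.getType() != RULE_SL_COMMENT
          && token.getType() != RULE_ML_COMMENT
          && token.getType() != RULE_WS
          ) {
            lastSignificantTokenType = token.getType();
        }
        // Assumed completion: hand the token on to the default implementation.
        super.emit(token);
    }
}*/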

terminal CHARACTER_LITERAL : /*{
         lastSignificantTokenType!=KEYWORD_")"
      && lastSignificantTokenType!=KEYWORD_"]"
      && lastSignificantTokenType!=KEYWORD_"all"
      && lastSignificantTokenType!=RULE_BASIC_IDENTIFIER
      && lastSignificantTokenType!=RULE_EXTENDED_IDENTIFIER
      && lastSignificantTokenType!=RULE_INTERNAL_IDENTIFIER
      && lastSignificantTokenType!=RULE_STRING
      && lastSignificantTokenType!=RULE_CHARACTER_LITERAL
      && lastSignificantTokenType!=RULE_BIT_STRING_LITERAL
      }?=>*/ "'" (GRAPHIC_CHARACTER | '"' ) "'";

terminal TICK : /*{
         lastSignificantTokenType==KEYWORD_")"
      || lastSignificantTokenType==KEYWORD_"]"
      || lastSignificantTokenType==KEYWORD_"all"
      || lastSignificantTokenType==RULE_BASIC_IDENTIFIER
      || lastSignificantTokenType==RULE_EXTENDED_IDENTIFIER
      || lastSignificantTokenType==RULE_INTERNAL_IDENTIFIER
      || lastSignificantTokenType==RULE_STRING
      || lastSignificantTokenType==RULE_CHARACTER_LITERAL
      || lastSignificantTokenType==RULE_BIT_STRING_LITERAL
      }?=>*/ "'";
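
To illustrate what the two predicates decide, here is a minimal stand-alone Java sketch. It is not part of the grammar or of the attached TweakLexer: the class, the enum and the method names are made up for the example, the enum stands in for the int token types of the generated lexer, and the sketch assumes that the TICK predicate is meant to be the exact complement of the CHARACTER_LITERAL one.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical illustration of the "last significant token" check. */
final class TickDisambiguation {

    /** Simplified stand-ins for the token types named in the grammar. */
    enum Kind {
        RPAREN, RBRACKET, ALL, BASIC_IDENTIFIER, EXTENDED_IDENTIFIER,
        INTERNAL_IDENTIFIER, STRING, CHARACTER_LITERAL, BIT_STRING_LITERAL,
        WS, SL_COMMENT, ML_COMMENT, OTHER
    }

    /** Token kinds after which CHARACTER_LITERAL is excluded by its predicate. */
    private static final Set<Kind> NO_LITERAL_AFTER = EnumSet.of(
            Kind.RPAREN, Kind.RBRACKET, Kind.ALL, Kind.BASIC_IDENTIFIER,
            Kind.EXTENDED_IDENTIFIER, Kind.INTERNAL_IDENTIFIER, Kind.STRING,
            Kind.CHARACTER_LITERAL, Kind.BIT_STRING_LITERAL);

    /** Starts as OTHER, like the int field that starts at 0 and matches nothing. */
    private Kind lastSignificant = Kind.OTHER;

    /** Mirrors emit(Token): whitespace and comments do not change the state. */
    void emit(Kind kind) {
        if (kind != Kind.WS && kind != Kind.SL_COMMENT && kind != Kind.ML_COMMENT) {
            lastSignificant = kind;
        }
    }

    /** True where the CHARACTER_LITERAL predicate would hold. */
    boolean characterLiteralAllowed() {
        return !NO_LITERAL_AFTER.contains(lastSignificant);
    }

    /** True where the TICK predicate would hold (assumed to be the complement). */
    boolean tickAllowed() {
        return NO_LITERAL_AFTER.contains(lastSignificant);
    }
}
```

For example, after an identifier followed by whitespace a quote would be lexed as a TICK (as in clk'event), while after any token that is not in the list it could start a CHARACTER_LITERAL.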

This approach requires that class splitting is not used for the lexer, but it works pretty well.
I know that parsing comments is a pretty ugly approach. Maybe someone can suggest a better way of doing this.
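
For readers who want a feel for the comment parsing, the following is a rough Java sketch of one possible way to pull the predicates out of the Xtext grammar text. It is not the attached TweakLexer, whose code is not shown here: the class name, the method name and the regular expression are assumptions made for the example, and it covers only the per-terminal predicate comments, not the @lexer::members block or the KEYWORD_"..." names.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: collects predicates written as comments in terminal rules. */
final class PredicateCommentExtractor {

    // Matches: terminal NAME : /*{ ... }?=>*/
    // Group 1 is the rule name, group 2 the predicate including "{" and "}?=>".
    private static final Pattern TERMINAL_PREDICATE = Pattern.compile(
            "terminal\\s+(\\w+)\\s*:\\s*/\\*\\s*(\\{.*?\\}\\?=>)\\s*\\*/",
            Pattern.DOTALL);

    /** Returns a map from terminal rule name to its predicate text, in grammar order. */
    static Map<String, String> extract(String xtextGrammar) {
        Map<String, String> predicates = new LinkedHashMap<>();
        Matcher matcher = TERMINAL_PREDICATE.matcher(xtextGrammar);
        while (matcher.find()) {
            predicates.put(matcher.group(1), matcher.group(2));
        }
        return predicates;
    }
}
```

A generator step could then look up each generated lexer rule by name and place the predicate text in front of its body. How the attached file actually does this may differ.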

I hope this helps someone with a similar problem.