Eclipse Community Forums
Special signs in parser rules [message #878530] Tue, 29 May 2012 13:40
Oleg Bolshakov
Why isn't it possible to use special signs in parser rules, such as "!", ".", and "..", which I can use in terminal rules?

In a parser rule, I would like to write something like this:

somerule:
'{' someruletext = !'}'* '}'
;

Of course, I would like to use this rule only in some contexts, i.e. not every block like this one:

'{' !'}'* '}'

can be treated as the "someruletext" field of a "somerule" object in the syntax tree. That's why I can't use a terminal rule for it (as far as I know, terminal rules are applied to the text before any parser rules, right?).
Re: Special signs in parser rules [message #878584 is a reply to message #878530] Tue, 29 May 2012 15:04
Henrik Lindberg
Grammar rules operate on tokens, not on the text in the tokens. The
terminals define the tokens. A "not" in a grammar rule would mean any
rule other than the specified rule, which is not very useful and is
probably not what you wanted anyway.

You need to solve lexical issues such as "globbing" in your lexer.
The easiest approach is to write an external lexer using ANTLR (not as
difficult as it sounds). I have such an implementation in cloudsmith /
geppetto @ github. An external lexer allows you to use more advanced
ANTLR statements as well as execute arbitrary Java logic during lexing.

Trying to do the same in the grammar will most likely just cause you
grief. (I wasted quite a bit of time trying to do just that...)

- henrik
Re: Special signs in parser rules [message #878616 is a reply to message #878584] Tue, 29 May 2012 15:57
Oleg Bolshakov
Did you integrate it with an Eclipse editor and its features like folding, autocompletion, etc.? (I want to use Xtext to simplify and speed up the development of an Eclipse editor.)
Re: Special signs in parser rules [message #878635 is a reply to message #878616] Tue, 29 May 2012 16:16
Henrik Lindberg
Yes.
The use of an external lexer is supported in the Xtext workflow. The
only chore is to keep tokens in sync when/if you are experimenting with
the grammar.

Look at
/org.cloudsmith.geppetto.pp.dsl/src/org/cloudsmith/geppetto/pp/dsl/lexer/PPLexer.g
(cloudsmith/geppetto @ github) for an example of an external lexer.

In order to fully understand what is going on, and how to configure it,
you also need to look at the MWE workflow and the Guice bindings. (There
are several different lexers in use, and the same external lexer needs
to be used for all of them. In geppetto the same external lexer is used
for all lexing tasks - you will find the implementations by starting in
the modules that bind "...lex..." stuff.)

Here is a little more information: in your grammar, the terminal and
keyword definitions act as the declarations of the tokens, i.e. they
define the set of tokens the grammar can deal with. The external lexer,
however, completely overrides their actual definitions - you are free to
return any token for any input. In geppetto, I used terminal definitions
in the grammar that are close to what is in the lexer, but some
semantics are missing (I did this for documentation purposes).

Hence, you can define a terminal, say:

terminal NOT_RIGHT_BRACE : !'}'* ;

and use that in your grammar.

You solve any lexical issues in the external lexer, and produce the
NOT_RIGHT_BRACE token where appropriate.
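
To make "where appropriate" concrete, here is a minimal hand-written Java sketch of the idea. It is not the geppetto lexer and not ANTLR-generated code; the `raw` keyword, the class name, and every token name except NOT_RIGHT_BRACE are assumptions made up for illustration. The lexer only produces NOT_RIGHT_BRACE for a brace block that directly follows the (hypothetical) keyword; every other block is tokenized normally.

```java
import java.util.ArrayList;
import java.util.List;

class BraceBodyLexer {
    enum Type { KEYWORD, LBRACE, NOT_RIGHT_BRACE, RBRACE, ID }

    static final class Token {
        final Type type;
        final String text;
        Token(Type type, String text) { this.type = type; this.text = text; }
        @Override public String toString() { return type + ":" + text; }
    }

    // Hypothetical keyword that marks the following block as free text.
    static final String RAW_KEYWORD = "raw";

    static List<Token> lex(String input) {
        List<Token> out = new ArrayList<>();
        boolean rawPending = false; // true only directly after RAW_KEYWORD
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            if (c == '{') {
                out.add(new Token(Type.LBRACE, "{"));
                i++;
                if (rawPending) {
                    // Context says "free text": swallow everything up to '}'.
                    int end = input.indexOf('}', i);
                    if (end < 0) end = input.length();
                    if (end > i) {
                        out.add(new Token(Type.NOT_RIGHT_BRACE, input.substring(i, end)));
                    }
                    i = end;
                }
                rawPending = false;
            } else if (c == '}') {
                out.add(new Token(Type.RBRACE, "}"));
                i++;
                rawPending = false;
            } else if (Character.isLetterOrDigit(c)) {
                int start = i;
                while (i < input.length() && Character.isLetterOrDigit(input.charAt(i))) i++;
                String word = input.substring(start, i);
                boolean isRaw = word.equals(RAW_KEYWORD);
                out.add(new Token(isRaw ? Type.KEYWORD : Type.ID, word));
                rawPending = isRaw;
            } else {
                throw new IllegalArgumentException("Unexpected character: " + c);
            }
        }
        return out;
    }
}
```

For the input `raw { a b } x { y }` only the first block becomes a single NOT_RIGHT_BRACE token; the second block is lexed as ordinary ID tokens. The parser rule then just refers to the token, and the "is this the right context?" question stays in the lexer.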

As an example of what you can do - this rule:
KEYWORD_24 : '${' {
    if (doubleQuotedString) {
        // in string expression interpolation mode
        pushDq();
        enterBrace();
    } else if (singleQuotedString) {
        // inside a single quoted string: re-type the token as ANY_OTHER
        _type = RULE_ANY_OTHER;
    }
};

will produce KEYWORD_24 (i.e. '${' keyword) unless it is found in a
single quoted string, in which case it produces RULE_ANY_OTHER (i.e. the
terminal ANY_OTHER).
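
The same decision can be sketched in plain Java, outside of ANTLR. This is a simplified illustration only: the class and method names are hypothetical, escapes are ignored, and the pushDq()/enterBrace() bookkeeping from the rule above is left out, so quotes nested inside an interpolated expression are not handled.

```java
import java.util.ArrayList;
import java.util.List;

class InterpolationLexerSketch {
    /** Returns the token type chosen for each "${" in the input, in order. */
    static List<String> classifyDollarBrace(String input) {
        List<String> types = new ArrayList<>();
        boolean singleQuoted = false; // currently inside '...'
        boolean doubleQuoted = false; // currently inside "..."
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\'' && !doubleQuoted) {
                singleQuoted = !singleQuoted;
            } else if (c == '"' && !singleQuoted) {
                doubleQuoted = !doubleQuoted;
            } else if (c == '$' && i + 1 < input.length() && input.charAt(i + 1) == '{') {
                // Same text, different token type depending on lexer state.
                types.add(singleQuoted ? "RULE_ANY_OTHER" : "KEYWORD_24");
                i++; // also consume the '{'
            }
        }
        return types;
    }
}
```

The point is only that the lexer state, not the matched text, decides which token type is handed to the parser.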

You need to learn a bit of ANTLR (but not that much), and be prepared to
experiment. One thing to remember is that the lexer can be invoked
starting at any position in the text (when editing, and partial parsing
is invoked) - there is no way of knowing exactly "where in the grammar"
you are; you can only know what tokens you have already seen. (As an
example - look at org.cloudsmith.geppetto.pp.dsl.lexer.PPOverridingLexer
to see how "last significant token" can be remembered - I need that
since the Puppet language uses '/' to denote both division and start of
a regular expression).
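
A rough Java sketch of the "last significant token" idea follows. It is not taken from PPOverridingLexer; the class name, the token names, and the rule "a '/' after an operand or ')' is division, otherwise it starts a regular expression" are assumptions chosen to keep the example small.

```java
import java.util.ArrayList;
import java.util.List;

class SlashLexerSketch {
    static List<String> lex(String input) {
        List<String> out = new ArrayList<>();
        String lastSignificant = null; // type of the last non-whitespace token
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; } // not significant
            int start = i;
            String type;
            if (Character.isLetterOrDigit(c)) {
                while (i < input.length() && Character.isLetterOrDigit(input.charAt(i))) i++;
                type = "OPERAND";
            } else if (c == ')') {
                i++;
                type = "RPAREN";
            } else if (c == '/') {
                // Assumed heuristic: after a value, '/' can only be division.
                boolean afterValue = "OPERAND".equals(lastSignificant)
                        || "RPAREN".equals(lastSignificant);
                int close = input.indexOf('/', i + 1);
                if (!afterValue && close > 0) {
                    i = close + 1; // consume through the closing '/'
                    type = "REGEX";
                } else {
                    i++;
                    type = "DIVIDE";
                }
            } else {
                i++;
                type = "OTHER";
            }
            out.add(type + ":" + input.substring(start, i));
            lastSignificant = type;
        }
        return out;
    }
}
```

With this sketch, `a / b` yields a DIVIDE token while `x = /ab/` yields a single REGEX token, using nothing but the tokens already seen.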

Hope this helps you.

Regards
- henrik