However, it cannot parse the input "compatible_directions mod variable2". The parse error that I get is "mismatched character 'c' expecting 'm' at offset: 13". As far as I can tell it's trying to parse "rem" from the "re" in the middle of "directions".
In the grammar above, I can make VARIABLE terminal and it works. But, in the real grammar, I cannot make the VARIABLE rule terminal.
Any idea why xtext is trying to parse "rem" rather than "mod" in the input? Is there a way to fix that?
The lexer is greedy and tries to make tokens as long as possible. This is why the "re" in "directions" is not tokenized as two individual characters, as you expect. The lexer knows a token ('rem') that starts with "re" and is longer than the single-character tokens, and it does not backtrack once it has committed to it. Hence the error.
Why can't you have a terminal that accepts something like ID from the default grammar? You could still make Variable a datatype rule with a value converter that enforces the correct format. Not every language detail has to be dealt with in the grammar. Often it is better to make the grammar more forgiving and use validation to produce user-friendly error messages.
P.S.: Your digit, lower case and upper case character terminal rules look more like terminal fragments.
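To make the suggestion concrete, here is a minimal sketch of what I mean. The terminal bodies were not visible to me, so the character ranges below are assumptions, as is allowing '_' in names (the failing input contains one). The rule names ID and Variable are placeholders, not taken from your real grammar.

```xtext
// Sketch only: the character ranges and the '_' are assumptions.
BottomExpression returns Expression:
    value=Variable | '(' exps+=TopExpression ')';

// Datatype rule: still a parser rule, but it yields a plain string.
// A value converter or a validator can enforce the exact variable format.
Variable returns ecore::EString:
    ID;

// One token per identifier, so "compatible_directions" is lexed as a whole
// and the "re" inside it never competes with the keyword 'rem'.
terminal ID:
    (LOWER_CASE_LETTER | UPPER_CASE_LETTER | '_')
    (LOWER_CASE_LETTER | UPPER_CASE_LETTER | DIGIT | '_')*;

// Fragments are only building blocks for other terminal rules;
// they never become tokens themselves.
terminal fragment DIGIT: '0'..'9';
terminal fragment LOWER_CASE_LETTER: 'a'..'z';
terminal fragment UPPER_CASE_LETTER: 'A'..'Z';
```

As far as I know, words such as 'mod' and 'rem' stay keywords with this shape, so they cannot double as variable names unless the Variable rule lists them explicitly.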
It is not a good idea to have tokens that are individual characters when
there are going to be many of them; your LOWER_CASE_LETTER and
UPPER_CASE_LETTER will cause serious bloat to the resulting parse tree
(each node will have quite a lot of extra information).
I recommend using a longer token, like the ID in the standard terminals.
Do the same for DIGIT.
A small note: if you use WS as the name of your whitespace rule, I think you
will need less customization (otherwise, IIRC, you have to tell the
framework which rule is your whitespace rule).
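A sketch of the rename, assuming nothing else in the grammar changes. It declares its own WS rule rather than inheriting org.eclipse.xtext.common.Terminals, because the SL_COMMENT terminal defined there starts with '//' and would probably clash with the '//' operator in this grammar.

```xtext
// Sketch: only the whitespace rule is renamed from WHITESPACE to WS.
grammar org.archstudio.prolog.xtext.Prolog hidden(WS)

import "http://www.eclipse.org/emf/2002/Ecore" as ecore
generate prolog "http://www.archstudio.org/prolog/xtext/Prolog"

// ... parser rules and the other terminals stay as they are ...

terminal WS:
    (' ' | '\t' | '\r' | '\n')+;
```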
On 2012-19-12 5:23, Scott Hendrickson wrote:
> I have the following grammar:
> grammar org.archstudio.prolog.xtext.Prolog hidden(WHITESPACE)
> import "http://www.eclipse.org/emf/2002/Ecore" as ecore
> generate prolog "http://www.archstudio.org/prolog/xtext/Prolog"
> TopExpression returns Expression:
> exps+=BottomExpression (ops+=Operations exps+=BottomExpression)*;
> '*' | '/' | '//' | 'rdiv' | '<<' | '>>' | 'mod' | 'rem' ;
> BottomExpression returns Expression:
> (value=VARIABLE) | '(' exps+=TopExpression ')';
> terminal DIGIT:
> terminal LOWER_CASE_LETTER:
> terminal UPPER_CASE_LETTER:
> terminal WHITESPACE:
> (' ' | '\t' | '\r' | '\n')+;
> LOWER_CASE_LETTER (DIGIT | LOWER_CASE_LETTER | UPPER_CASE_LETTER |
> However, it cannot parse the input "compatible_directions mod
> variable2". The parse error that I get is "mismatched character 'c'
> expecting 'm' at offset: 13". As far as I can tell it's trying to parse
> "rem" from the "re" in the middle of "directions".
> In the grammar above, I can make VARIABLE terminal and it works. But, in
> the real grammar, I cannot make the VARIABLE rule terminal.
> Any idea why xtext is trying to parse "rem" rather than "mod" in the
> input? Is there a way to fix that?
> Any help is greatly appreciated.
> Thank you,
> -- Scott