Re: Xtext picking unexpected rules [message #871240 is a reply to message #871225]
Fri, 11 May 2012 14:53
Henrik Lindberg Messages: 2509 Registered: July 2009
Senior Member
Do you have backtracking turned on?
Do you have terminal backtracking turned on?
Is what you posted the complete grammar, including any additional terminals?
Which version of Xtext are you using?
If there is no backtracking and there are no additional terminals, then I am
guessing here:
- guess 1: since the parser has not yet completed the TestRule, it still
considers other possibilities and hence gets it wrong. It may help to put
a syntactic predicate on 'Test' like this:
TestRule:
=>'Test' testName=ID '{' testExpr+=ExistExpression* '}' ;
i.e. "if you see 'Test', take this path - do not consider any other
possibility".
- guess 2: this may have to do with overlapping terminals/keywords. The
INT rule and the keywords starting with numbers may be in conflict.
It is illuminating to fire up the debugger and break in the lexer and
look at the sequence of tokens being returned. It is important to know
you get the expected tokens from the lexer before trying to figure out
if something is wrong with the grammar.
A general observation is that the grammar is very sensitive to
whitespace - you bake keywords and parentheses into complex keywords
like this: '.CType(' - it may be better to write this as:
'.' "CType" '('
and if whitespace etc. is not allowed, you can control that with the
modifier hidden(), like this perhaps:
MethodTypeC:
CTYPE '(' prop = STRING ')'
;
CTYPE hidden():
'.' "CType"
;
which would accept input like:
.CType ( "..." )
.CType("...")
If the problem is lexical, it may work better this way.
Also, in general, it is far better to be forgiving in the grammar, and
instead validate what the user entered. It is better to get a helpful
error message (with a possible quickfix) than to be slapped with a
"syntax error - got x, expected y". You can also use formatting and save
actions to clean up things like "extra spaces that are not supposed to be
there".
Hope that helps...
(I would start by looking at what the lexer returns by using the debugger).
- henrik
On 2012-11-05 15:55, Olaf noname wrote:
> Hi,
>
> the following grammar seems to pick the Method Rules instead of the
> ExistExpression rule. This is my complete grammar.
>
> grammar org.xtext.example.mydsl.MyDsl with
> org.eclipse.xtext.common.Terminals
> generate myDsl ".xtext.org/example/mydsl/MyDsl"
>
> Model:
> methods+=Method*
> tests+=TestRule*
> ;
>
> Method:
> MethodTypeA|MethodTypeB|MethodTypeC
> ;
>
> MethodTypeA:
> name=ID '.AType(' prop=STRING ')'
> ;
>
> MethodTypeB:
> '2Param:' name=ID '.BType(' prop=STRING ',' prop2=STRING ')'
> ;
>
> MethodTypeC:
> '.CType(' prop=STRING ')'
> ;
>
> TestRule:
> 'Test' testName=ID '{' testExpr+=ExistExpression* '}' ;
>
> ExistExpression:
> obj=ID ':' 'has('object=INT '.' value=ID ')' ;
>
>
> And here is the Model.
>
> test.AType("p1")
> 2Param: test2.BType("p1","p2")
> CType("p")
> Test testName
> {
> a:has(1.Ax)
> // ^^^^ Errors (2 Items): mismatched character 'x' expecting 'T' ;
> mismatched input ')' expecting '.'
> b:has(2.Bx)
> // ^^^^ Errors (2 Items): mismatched character 'x' expecting 'T' ;
> mismatched input ')' expecting '.'
> b:has(3.Cx)
> // ^^^^ Errors (2 Items): mismatched character 'x' expecting 'T' ;
> mismatched input ')' expecting '.'
> b:has(4.Dx)
> // Works!
> b:has(5.abcd)
> // Works!
>
> }
>
> It looks like 'a:has(1.A' for 'value' in ExistExpression is
> expanded to '.AType' following the MethodTypeA rule. The same goes for '.B'
> or '.C'.
> Any other characters for value work.
> Why? My understanding is that the parser should follow the rule
> TestRule->ExistExpression.
> Where am I wrong?
>
> Thanks for any help.
>
> Olaf
>
Re: Xtext picking unexpected rules [message #871317 is a reply to message #871247]
Fri, 11 May 2012 21:43
Henrik Lindberg Messages: 2509 Registered: July 2009
Senior Member
On 2012-11-05 17:52, Olaf noname wrote:
> Hi Henrik,
>> A general observation is that the grammar is very sensitive to
>> whitespace - you bake keywords and parentheses into complex keywords
>> like this: '.CType(' - it may be better to write this as:
>>
>> '.' "CType" '('
>
>
> That was the solution!!
> My grammar now looks like this and the model is working.
>
> MethodTypeA:
> name=ID '.' 'AType(' prop=STRING ')'
> ;
>
> MethodTypeB:
> '2Param:' name=ID '.' 'BType(' prop=STRING ',' prop2=STRING ')'
> ;
>
> MethodTypeC:
> '.' 'CType(' prop=STRING ')'
> ;
>
> TestRule:
> 'Test' testName=ID '{' testExpr+=ExistExpression* '}' ;
>
> ExistExpression:
> obj=ID ':' 'has' '('object=INT '.' value=ID ')' ;
>
>
>
> Thanks a lot for the very good hints.
>
> Do you have an explanation why methodTypeB was used for 2.Bx? The
> definitive keyword 2Param (even changed to TwoParam) should have stopped the
> rule. Shouldn't it?
>
The problem was lexical - the lexer does not backtrack. Once it has found
'.A' or '.B' it has determined that this is the token to return, and then
it does not find the correct continuation of the selected token. By
default the lexer does not backtrack (nor do you typically want it to).
So there was never a question of whether "methodTypeB was used or not" - it
did not get that far. The lexer found the beginning of a token and could not
continue.
Remember - lexing takes place long before the grammar sees the tokens.
The lexer performs cookie cutting on the input text and lines up a
series of tokens that the grammar acts on. Hence, use as few and as
simple terminals and keywords as possible.
With the change to use '.' as a separate token, the lexer finds and
produces it. It will then produce an ID or one of the tokens 'AType(',
'BType(', or 'CType('. If you had followed my advice to separate the '('
from the keyword, you would also have needed to use a data rule
instead of ID (MyID : ID | 'AType' | 'BType' | 'CType';) and use that
instead of ID wherever you would like to be able to use those keywords as
an ID. (You are protected against this since the '(' is part of the
keyword, but it also means users can't put a space before the '(').
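For illustration, a rough and untested sketch of what that data-rule variant might look like. It is adapted from the grammar posted in this thread; MyID is only the placeholder name used above, and where exactly MyID replaces ID is an assumption about where you want the keywords to be usable as names:

```xtext
// Sketch only: '(' is now its own keyword, so 'AType', 'BType' and 'CType'
// become plain keywords that the lexer no longer returns as ID.
MethodTypeA:
    name=MyID '.' 'AType' '(' prop=STRING ')'
;

MethodTypeB:
    '2Param:' name=MyID '.' 'BType' '(' prop=STRING ',' prop2=STRING ')'
;

MethodTypeC:
    '.' 'CType' '(' prop=STRING ')'
;

ExistExpression:
    obj=MyID ':' 'has' '(' object=INT '.' value=MyID ')'
;

// Data rule: accepts a regular ID or one of the keywords listed here.
MyID:
    ID | 'AType' | 'BType' | 'CType'
;
```

With this shape a space before the '(' is accepted. Note that 'has' and 'Test' are keywords as well, so they could only be used as names if they were added to MyID too, and adding 'Test' may make the choice between Method and TestRule ambiguous.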
It is kind of difficult to think in terms of the terminals when looking
at the grammar - I have worked with Xtext for quite some time now, and I
fell into the trap too - suggesting fixes to the grammar when the
problem was lexical.
Remember - the lexer tells the parser "Duck, Duck, Tiger, Cat, Duck", and
the parser tries to make sense of it. It is not the parser asking the lexer
"give me the best of a Duck or a Tiger".
Hope that makes sense.
Regards
- henrik