Home » Modeling » TMF (Xtext) » [Xtext] Does Xtext really support LL(*) parsing?
[Xtext] Does Xtext really support LL(*) parsing? [message #640551] |
Mon, 22 November 2010 13:07 |
Eclipse User |
|
|
|
Originally posted by: koester.matthias.gmx.de
Hi,
The following grammar is consumed by the Xtext generator without an error:
grammar org.xtext.example.mydsl.MyDsl
generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"
import "http://www.eclipse.org/emf/2002/Ecore" as ecore
Program:
literals+=Literal;
Literal:
keyword=Keyword|
name=QualifiedName|
number=Number;
QualifiedName:
Name (DELIM Name)+;
Name:
('-'|'+') (SYMBOL_START SYMBOL_TAIL*)?|
SYMBOL_START SYMBOL_TAIL*;
Number:
('-'|'+')? DIGIT+(('.' DIGIT*)|('r' DIGIT+))?'M'?;
Keyword:
':' Name;
terminal DELIM:
'$'|'.'|'/';
terminal SYMBOL_TAIL:
SYMBOL_START|DIGIT|'-'|'+';
terminal SYMBOL_START:
'a'..'z'|'A'..'Z'|'_'|'*'|'!'|'?'|'='|'#'|'<'|'>'|'%'|'&';
terminal DIGIT:
('0'..'9');
But when I now try to parse a number like "+12", the parser is
complaining about "no viable alternative at input '+'. I would assume
that an LL(*) could handle this situation, at least when it can
backtrack. Since it has an "unlimited" lookahead, it should be able to
detect that a prefix "+1" can only be matched by rule Number. But when I
debug the parser, it tries to match a Name, fails and doesn't backtrack.
Even when I enable backtracking it still doesn't recognize "+12". And it
seems to me that you avoided this problem in your arithmetics example!
There it doesn't seem to be possible to have a INT with signs. Altough I
think I have a good understanding of parser theory, I'm a little bit
confused...
Best regards,
Matthias
|
|
| |
Re: [Xtext] Does Xtext really support LL(*) parsing? [message #640574 is a reply to message #640551] |
Mon, 22 November 2010 14:19 |
Henrik Lindberg Messages: 2509 Registered: July 2009 |
Senior Member |
|
|
With just a quick glance...
seems like your Name: rule allows a single '-' or '+' as a name.
Since Name is used before Number in Literal it will be used when
backtracking is on.
Try changing the order of name and number in Literal and see if it makes
a difference.
Regards
- henrik
On 11/22/10 2:07 PM, Matthias Köster wrote:
> Hi,
>
> The following grammar is consumed by the Xtext generator without an error:
>
> grammar org.xtext.example.mydsl.MyDsl
>
> generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"
>
> import "http://www.eclipse.org/emf/2002/Ecore" as ecore
>
> Program:
> literals+=Literal;
>
> Literal:
> keyword=Keyword|
> name=QualifiedName|
> number=Number;
>
> QualifiedName:
> Name (DELIM Name)+;
>
> Name:
> ('-'|'+') (SYMBOL_START SYMBOL_TAIL*)?|
> SYMBOL_START SYMBOL_TAIL*;
>
> Number:
> ('-'|'+')? DIGIT+(('.' DIGIT*)|('r' DIGIT+))?'M'?;
>
> Keyword:
> ':' Name;
>
> terminal DELIM:
> '$'|'.'|'/';
>
> terminal SYMBOL_TAIL:
> SYMBOL_START|DIGIT|'-'|'+';
>
> terminal SYMBOL_START:
> 'a'..'z'|'A'..'Z'|'_'|'*'|'!'|'?'|'='|'#'|'<'|'>'|'%'|'&';
>
> terminal DIGIT:
> ('0'..'9');
>
>
>
> But when I now try to parse a number like "+12", the parser is
> complaining about "no viable alternative at input '+'. I would assume
> that an LL(*) could handle this situation, at least when it can
> backtrack. Since it has an "unlimited" lookahead, it should be able to
> detect that a prefix "+1" can only be matched by rule Number. But when I
> debug the parser, it tries to match a Name, fails and doesn't backtrack.
>
> Even when I enable backtracking it still doesn't recognize "+12". And it
> seems to me that you avoided this problem in your arithmetics example!
> There it doesn't seem to be possible to have a INT with signs. Altough I
> think I have a good understanding of parser theory, I'm a little bit
> confused...
>
> Best regards,
> Matthias
|
|
|
Re: [Xtext] Does Xtext really support LL(*) parsing? [message #640676 is a reply to message #640572] |
Mon, 22 November 2010 17:58 |
Eclipse User |
|
|
|
Originally posted by: koester.matthias.gmx.de
Hi Alex,
I enabled backtracking for both ANTLR generator fragments.
Regards,
Matthias
Alexander Nittka schrieb:
> Hi,
>
> haven't tested your grammar, but did you enable the backtracking options
> in the generator workflow? For performance reasons (?) they are not
> activated by default.
>
> Also I think SYMBOL_TAIL shoul not be made a terminal rule as it
> overlaps with SYMBOL_START and DIGIT
>
> Alex
|
|
| |
Re: [Xtext] Does Xtext really support LL(*) parsing? [message #640782 is a reply to message #640551] |
Tue, 23 November 2010 07:23 |
Sebastian Zarnekow Messages: 3118 Registered: July 2009 |
Senior Member |
|
|
Hi Matthias,
it seems that you have a conflict between the '+' sign as a keyword in
Number / Name and the '+' sign as symbol tail. I'd try to inline the
'+'/'-' from symbol tail into the rule that uses symbol tail. However
it's only a shot into the dark.
Regards,
Sebastian
--
Need professional support for Eclipse Modeling?
Go visit: http://xtext.itemis.com
Am 22.11.10 14:07, schrieb Matthias Köster:
> Hi,
>
> The following grammar is consumed by the Xtext generator without an error:
>
> grammar org.xtext.example.mydsl.MyDsl
>
> generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"
>
> import "http://www.eclipse.org/emf/2002/Ecore" as ecore
>
> Program:
> literals+=Literal;
>
> Literal:
> keyword=Keyword|
> name=QualifiedName|
> number=Number;
>
> QualifiedName:
> Name (DELIM Name)+;
>
> Name:
> ('-'|'+') (SYMBOL_START SYMBOL_TAIL*)?|
> SYMBOL_START SYMBOL_TAIL*;
>
> Number:
> ('-'|'+')? DIGIT+(('.' DIGIT*)|('r' DIGIT+))?'M'?;
>
> Keyword:
> ':' Name;
>
> terminal DELIM:
> '$'|'.'|'/';
>
> terminal SYMBOL_TAIL:
> SYMBOL_START|DIGIT|'-'|'+';
>
> terminal SYMBOL_START:
> 'a'..'z'|'A'..'Z'|'_'|'*'|'!'|'?'|'='|'#'|'<'|'>'|'%'|'&';
>
> terminal DIGIT:
> ('0'..'9');
>
>
>
> But when I now try to parse a number like "+12", the parser is
> complaining about "no viable alternative at input '+'. I would assume
> that an LL(*) could handle this situation, at least when it can
> backtrack. Since it has an "unlimited" lookahead, it should be able to
> detect that a prefix "+1" can only be matched by rule Number. But when I
> debug the parser, it tries to match a Name, fails and doesn't backtrack.
>
> Even when I enable backtracking it still doesn't recognize "+12". And it
> seems to me that you avoided this problem in your arithmetics example!
> There it doesn't seem to be possible to have a INT with signs. Altough I
> think I have a good understanding of parser theory, I'm a little bit
> confused...
>
> Best regards,
> Matthias
|
|
|
Re: [Xtext] Does Xtext really support LL(*) parsing? [message #640799 is a reply to message #640782] |
Tue, 23 November 2010 08:40 |
Eclipse User |
|
|
|
Originally posted by: koester.matthias.gmx.de
Hi Sebastian,
Inlining did the trick, it now works correctly. But I had to enable
backtracking, which I still find confusing. I'm quite sure that I can
build an ANTLR grammar without backtracking. Backtracking is -
hopefullly - no problem for my project, but I'm feeling a little bit
uncomfortable with having to use it :-)
Sometimes it's hard for me to understand what Xtext does with my
grammar. As an old ANTLR user I sometimes feel a little bit helpless
with regards to the transformation Xtext applies... But my understanding
is getting better, especially thanks to the support of this newsgroup ;-)
Best regards,
Matthias
Sebastian Zarnekow schrieb:
> Hi Matthias,
>
> it seems that you have a conflict between the '+' sign as a keyword in
> Number / Name and the '+' sign as symbol tail. I'd try to inline the
> '+'/'-' from symbol tail into the rule that uses symbol tail. However
> it's only a shot into the dark.
>
> Regards,
> Sebastian
|
|
|
Re: [Xtext] Does Xtext really support LL(*) parsing? [message #640880 is a reply to message #640799] |
Tue, 23 November 2010 12:20 |
Sebastian Zarnekow Messages: 3118 Registered: July 2009 |
Senior Member |
|
|
Hi Matthias,
if you can write an Antlr grammar without backtracking or semantic
predicates, it should be possible to write an Xtext grammar for the very
same language, too. How does your final grammar look like?
Regards,
Sebastian
--
Need professional support for Eclipse Modeling?
Go visit: http://xtext.itemis.com
Am 23.11.10 09:40, schrieb Matthias Köster:
> Hi Sebastian,
>
> Inlining did the trick, it now works correctly. But I had to enable
> backtracking, which I still find confusing. I'm quite sure that I can
> build an ANTLR grammar without backtracking. Backtracking is -
> hopefullly - no problem for my project, but I'm feeling a little bit
> uncomfortable with having to use it :-)
> Sometimes it's hard for me to understand what Xtext does with my
> grammar. As an old ANTLR user I sometimes feel a little bit helpless
> with regards to the transformation Xtext applies... But my understanding
> is getting better, especially thanks to the support of this newsgroup ;-)
>
> Best regards,
> Matthias
>
> Sebastian Zarnekow schrieb:
>> Hi Matthias,
>>
>> it seems that you have a conflict between the '+' sign as a keyword in
>> Number / Name and the '+' sign as symbol tail. I'd try to inline the
>> '+'/'-' from symbol tail into the rule that uses symbol tail. However
>> it's only a shot into the dark.
>>
>> Regards,
>> Sebastian
|
|
|
Re: [Xtext] Does Xtext really support LL(*) parsing? [message #640897 is a reply to message #640880] |
Tue, 23 November 2010 13:35 |
Eclipse User |
|
|
|
Originally posted by: koester.matthias.gmx.de
Hi Sebastian,
The grammar I showed was just a way to solve a problem I had with my
clojure grammar. And I haven't commited or otherwise kept a copy of it.
But I think the correctly working grammar looked like:
grammar org.xtext.example.mydsl.MyDsl
hidden (WSS,SL_COMMENT)
generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"
import "http://www.eclipse.org/emf/2002/Ecore" as ecore
Program:
literals+=Literal*;
Literal:
keyword=Keyword|
name=QualifiedName|
number=Number;
QualifiedName:
NAME (('$'|'.'|'/') NAME)*;
Number:
SIGN? DIGIT+(('.' DIGIT*)|('r' DIGIT+))?'M'?;
Keyword:
':' NAME;
// terminal rules
terminal NAME:
SIGN (SYMBOL_START (SYMBOL_START|DIGIT|SIGN)*)?|
SYMBOL_START (SYMBOL_START|DIGIT|SIGN)*;
terminal SIGN:
('-'|'+');
terminal SYMBOL_START:
'a'..'z'|'A'..'Z'|'_'|'*'|'!'|'?'|'='|'#'|'<'|'>'|'&';
terminal DIGIT:
('0'..'9');
terminal WSS:
(' '|'\t'|'\r'|'\n'|',')+;
terminal SL_COMMENT:
';' !('\n'|'\r')* ('\r'? '\n')?;
I switched the NAME rule back to a terminal rule because otherwise
"name1 name2" would have been parsed as one Name. I still have this
problem with QualifiedName: clojure - as well as Java ;-) - doesn't
allow to put comments or whitespaces between the parts of a qualified
name, but I understand how useful it is to have data type rules.
Hopefully ANTLR 4 will support lexers that support LL(*) lexer rules so
that Xtext can support this as well.
Best regards,
Matthias
Sebastian Zarnekow schrieb:
> Hi Matthias,
>
> if you can write an Antlr grammar without backtracking or semantic
> predicates, it should be possible to write an Xtext grammar for the very
> same language, too. How does your final grammar look like?
>
> Regards,
> Sebastian
|
|
|
Re: [Xtext] Does Xtext really support LL(*) parsing? [message #641456 is a reply to message #640897] |
Thu, 25 November 2010 14:50 |
Eclipse User |
|
|
|
Originally posted by: koester.matthias.gmx.de
Hi,
Today I finally realized how to solve the problem with hidden tokens in
my QualifiedName rule! I realized that hiden() can be used per rule and
not just per grammar! Great, my grammar works much better now and can
handle almost every weird Clojure construct. Xtext really provides more
clever solutions then I imagined! Xtext really rocks!
Now I just need to understand why my parser throws a
StackOverflowException while parsing big files with memoize = true ;-)
Best regards,
Matthias
Matthias Köster schrieb:
> Hi Sebastian,
>
> The grammar I showed was just a way to solve a problem I had with my
> clojure grammar. And I haven't commited or otherwise kept a copy of it.
> But I think the correctly working grammar looked like:
>
> grammar org.xtext.example.mydsl.MyDsl
> hidden (WSS,SL_COMMENT)
>
> generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"
>
> import "http://www.eclipse.org/emf/2002/Ecore" as ecore
>
> Program:
> literals+=Literal*;
>
> Literal:
> keyword=Keyword|
> name=QualifiedName|
> number=Number;
>
> QualifiedName:
> NAME (('$'|'.'|'/') NAME)*;
>
> Number:
> SIGN? DIGIT+(('.' DIGIT*)|('r' DIGIT+))?'M'?;
>
> Keyword:
> ':' NAME;
>
> // terminal rules
>
> terminal NAME:
> SIGN (SYMBOL_START (SYMBOL_START|DIGIT|SIGN)*)?|
> SYMBOL_START (SYMBOL_START|DIGIT|SIGN)*;
>
> terminal SIGN:
> ('-'|'+');
>
> terminal SYMBOL_START:
> 'a'..'z'|'A'..'Z'|'_'|'*'|'!'|'?'|'='|'#'|'<'|'>'|'&';
>
> terminal DIGIT:
> ('0'..'9');
>
> terminal WSS:
> (' '|'\t'|'\r'|'\n'|',')+;
>
> terminal SL_COMMENT:
> ';' !('\n'|'\r')* ('\r'? '\n')?;
>
> I switched the NAME rule back to a terminal rule because otherwise
> "name1 name2" would have been parsed as one Name. I still have this
> problem with QualifiedName: clojure - as well as Java ;-) - doesn't
> allow to put comments or whitespaces between the parts of a qualified
> name, but I understand how useful it is to have data type rules.
> Hopefully ANTLR 4 will support lexers that support LL(*) lexer rules so
> that Xtext can support this as well.
>
> Best regards,
> Matthias
>
> Sebastian Zarnekow schrieb:
>> Hi Matthias,
>>
>> if you can write an Antlr grammar without backtracking or semantic
>> predicates, it should be possible to write an Xtext grammar for the
>> very same language, too. How does your final grammar look like?
>>
>> Regards,
>> Sebastian
|
|
|
Goto Forum:
Current Time: Mon May 13 03:55:46 GMT 2024
Powered by FUDForum. Page generated in 0.03608 seconds
|