Skip to main content


Eclipse Community Forums
Forum Search:

Search      Help    Register    Login    Home
Home » Modeling » TMF (Xtext) » [Xtext] Does Xtext really support LL(*) parsing?
[Xtext] Does Xtext really support LL(*) parsing? [message #640551] Mon, 22 November 2010 13:07 Go to next message
Eclipse UserFriend
Originally posted by: koester.matthias.gmx.de

Hi,

The following grammar is consumed by the Xtext generator without an error:

grammar org.xtext.example.mydsl.MyDsl

generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"

import "http://www.eclipse.org/emf/2002/Ecore" as ecore

Program:
literals+=Literal;

Literal:
keyword=Keyword|
name=QualifiedName|
number=Number;

QualifiedName:
Name (DELIM Name)+;

Name:
('-'|'+') (SYMBOL_START SYMBOL_TAIL*)?|
SYMBOL_START SYMBOL_TAIL*;

Number:
('-'|'+')? DIGIT+(('.' DIGIT*)|('r' DIGIT+))?'M'?;

Keyword:
':' Name;

terminal DELIM:
'$'|'.'|'/';

terminal SYMBOL_TAIL:
SYMBOL_START|DIGIT|'-'|'+';

terminal SYMBOL_START:
'a'..'z'|'A'..'Z'|'_'|'*'|'!'|'?'|'='|'#'|'<'|'>'|'%'|'&';

terminal DIGIT:
('0'..'9');



But when I now try to parse a number like "+12", the parser is
complaining about "no viable alternative at input '+'. I would assume
that an LL(*) could handle this situation, at least when it can
backtrack. Since it has an "unlimited" lookahead, it should be able to
detect that a prefix "+1" can only be matched by rule Number. But when I
debug the parser, it tries to match a Name, fails and doesn't backtrack.

Even when I enable backtracking it still doesn't recognize "+12". And it
seems to me that you avoided this problem in your arithmetics example!
There it doesn't seem to be possible to have a INT with signs. Altough I
think I have a good understanding of parser theory, I'm a little bit
confused...

Best regards,
Matthias
Re: [Xtext] Does Xtext really support LL(*) parsing? [message #640572 is a reply to message #640551] Mon, 22 November 2010 14:19 Go to previous messageGo to next message
Alexander Nittka is currently offline Alexander NittkaFriend
Messages: 1193
Registered: July 2009
Senior Member
Hi,

haven't tested your grammar, but did you enable the backtracking options in the generator workflow? For performance reasons (?) they are not activated by default.

Also I think SYMBOL_TAIL shoul not be made a terminal rule as it overlaps with SYMBOL_START and DIGIT

Alex
Re: [Xtext] Does Xtext really support LL(*) parsing? [message #640574 is a reply to message #640551] Mon, 22 November 2010 14:19 Go to previous messageGo to next message
Henrik Lindberg is currently offline Henrik LindbergFriend
Messages: 2509
Registered: July 2009
Senior Member
With just a quick glance...
seems like your Name: rule allows a single '-' or '+' as a name.
Since Name is used before Number in Literal it will be used when
backtracking is on.

Try changing the order of name and number in Literal and see if it makes
a difference.

Regards
- henrik

On 11/22/10 2:07 PM, Matthias Köster wrote:
> Hi,
>
> The following grammar is consumed by the Xtext generator without an error:
>
> grammar org.xtext.example.mydsl.MyDsl
>
> generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"
>
> import "http://www.eclipse.org/emf/2002/Ecore" as ecore
>
> Program:
> literals+=Literal;
>
> Literal:
> keyword=Keyword|
> name=QualifiedName|
> number=Number;
>
> QualifiedName:
> Name (DELIM Name)+;
>
> Name:
> ('-'|'+') (SYMBOL_START SYMBOL_TAIL*)?|
> SYMBOL_START SYMBOL_TAIL*;
>
> Number:
> ('-'|'+')? DIGIT+(('.' DIGIT*)|('r' DIGIT+))?'M'?;
>
> Keyword:
> ':' Name;
>
> terminal DELIM:
> '$'|'.'|'/';
>
> terminal SYMBOL_TAIL:
> SYMBOL_START|DIGIT|'-'|'+';
>
> terminal SYMBOL_START:
> 'a'..'z'|'A'..'Z'|'_'|'*'|'!'|'?'|'='|'#'|'<'|'>'|'%'|'&';
>
> terminal DIGIT:
> ('0'..'9');
>
>
>
> But when I now try to parse a number like "+12", the parser is
> complaining about "no viable alternative at input '+'. I would assume
> that an LL(*) could handle this situation, at least when it can
> backtrack. Since it has an "unlimited" lookahead, it should be able to
> detect that a prefix "+1" can only be matched by rule Number. But when I
> debug the parser, it tries to match a Name, fails and doesn't backtrack.
>
> Even when I enable backtracking it still doesn't recognize "+12". And it
> seems to me that you avoided this problem in your arithmetics example!
> There it doesn't seem to be possible to have a INT with signs. Altough I
> think I have a good understanding of parser theory, I'm a little bit
> confused...
>
> Best regards,
> Matthias
Re: [Xtext] Does Xtext really support LL(*) parsing? [message #640676 is a reply to message #640572] Mon, 22 November 2010 17:58 Go to previous messageGo to next message
Eclipse UserFriend
Originally posted by: koester.matthias.gmx.de

Hi Alex,

I enabled backtracking for both ANTLR generator fragments.

Regards,
Matthias

Alexander Nittka schrieb:
> Hi,
>
> haven't tested your grammar, but did you enable the backtracking options
> in the generator workflow? For performance reasons (?) they are not
> activated by default.
>
> Also I think SYMBOL_TAIL shoul not be made a terminal rule as it
> overlaps with SYMBOL_START and DIGIT
>
> Alex
Re: [Xtext] Does Xtext really support LL(*) parsing? [message #640778 is a reply to message #640572] Tue, 23 November 2010 07:19 Go to previous messageGo to next message
Sebastian Zarnekow is currently offline Sebastian ZarnekowFriend
Messages: 3118
Registered: July 2009
Senior Member
Hi Alex,

backtracking is disabled by default to reduce the chance for unwanted
surprises.

Regards,
Sebastian
--
Need professional support for Eclipse Modeling?
Go visit: http://xtext.itemis.com

Am 22.11.10 15:19, schrieb Alexander Nittka:
> Hi,
>
> haven't tested your grammar, but did you enable the backtracking options
> in the generator workflow? For performance reasons (?) they are not
> activated by default.
>
> Also I think SYMBOL_TAIL shoul not be made a terminal rule as it
> overlaps with SYMBOL_START and DIGIT
>
> Alex
Re: [Xtext] Does Xtext really support LL(*) parsing? [message #640782 is a reply to message #640551] Tue, 23 November 2010 07:23 Go to previous messageGo to next message
Sebastian Zarnekow is currently offline Sebastian ZarnekowFriend
Messages: 3118
Registered: July 2009
Senior Member
Hi Matthias,

it seems that you have a conflict between the '+' sign as a keyword in
Number / Name and the '+' sign as symbol tail. I'd try to inline the
'+'/'-' from symbol tail into the rule that uses symbol tail. However
it's only a shot into the dark.

Regards,
Sebastian
--
Need professional support for Eclipse Modeling?
Go visit: http://xtext.itemis.com

Am 22.11.10 14:07, schrieb Matthias Köster:
> Hi,
>
> The following grammar is consumed by the Xtext generator without an error:
>
> grammar org.xtext.example.mydsl.MyDsl
>
> generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"
>
> import "http://www.eclipse.org/emf/2002/Ecore" as ecore
>
> Program:
> literals+=Literal;
>
> Literal:
> keyword=Keyword|
> name=QualifiedName|
> number=Number;
>
> QualifiedName:
> Name (DELIM Name)+;
>
> Name:
> ('-'|'+') (SYMBOL_START SYMBOL_TAIL*)?|
> SYMBOL_START SYMBOL_TAIL*;
>
> Number:
> ('-'|'+')? DIGIT+(('.' DIGIT*)|('r' DIGIT+))?'M'?;
>
> Keyword:
> ':' Name;
>
> terminal DELIM:
> '$'|'.'|'/';
>
> terminal SYMBOL_TAIL:
> SYMBOL_START|DIGIT|'-'|'+';
>
> terminal SYMBOL_START:
> 'a'..'z'|'A'..'Z'|'_'|'*'|'!'|'?'|'='|'#'|'<'|'>'|'%'|'&';
>
> terminal DIGIT:
> ('0'..'9');
>
>
>
> But when I now try to parse a number like "+12", the parser is
> complaining about "no viable alternative at input '+'. I would assume
> that an LL(*) could handle this situation, at least when it can
> backtrack. Since it has an "unlimited" lookahead, it should be able to
> detect that a prefix "+1" can only be matched by rule Number. But when I
> debug the parser, it tries to match a Name, fails and doesn't backtrack.
>
> Even when I enable backtracking it still doesn't recognize "+12". And it
> seems to me that you avoided this problem in your arithmetics example!
> There it doesn't seem to be possible to have a INT with signs. Altough I
> think I have a good understanding of parser theory, I'm a little bit
> confused...
>
> Best regards,
> Matthias
Re: [Xtext] Does Xtext really support LL(*) parsing? [message #640799 is a reply to message #640782] Tue, 23 November 2010 08:40 Go to previous messageGo to next message
Eclipse UserFriend
Originally posted by: koester.matthias.gmx.de

Hi Sebastian,

Inlining did the trick, it now works correctly. But I had to enable
backtracking, which I still find confusing. I'm quite sure that I can
build an ANTLR grammar without backtracking. Backtracking is -
hopefullly - no problem for my project, but I'm feeling a little bit
uncomfortable with having to use it :-)
Sometimes it's hard for me to understand what Xtext does with my
grammar. As an old ANTLR user I sometimes feel a little bit helpless
with regards to the transformation Xtext applies... But my understanding
is getting better, especially thanks to the support of this newsgroup ;-)

Best regards,
Matthias

Sebastian Zarnekow schrieb:
> Hi Matthias,
>
> it seems that you have a conflict between the '+' sign as a keyword in
> Number / Name and the '+' sign as symbol tail. I'd try to inline the
> '+'/'-' from symbol tail into the rule that uses symbol tail. However
> it's only a shot into the dark.
>
> Regards,
> Sebastian
Re: [Xtext] Does Xtext really support LL(*) parsing? [message #640880 is a reply to message #640799] Tue, 23 November 2010 12:20 Go to previous messageGo to next message
Sebastian Zarnekow is currently offline Sebastian ZarnekowFriend
Messages: 3118
Registered: July 2009
Senior Member
Hi Matthias,

if you can write an Antlr grammar without backtracking or semantic
predicates, it should be possible to write an Xtext grammar for the very
same language, too. How does your final grammar look like?

Regards,
Sebastian
--
Need professional support for Eclipse Modeling?
Go visit: http://xtext.itemis.com

Am 23.11.10 09:40, schrieb Matthias Köster:
> Hi Sebastian,
>
> Inlining did the trick, it now works correctly. But I had to enable
> backtracking, which I still find confusing. I'm quite sure that I can
> build an ANTLR grammar without backtracking. Backtracking is -
> hopefullly - no problem for my project, but I'm feeling a little bit
> uncomfortable with having to use it :-)
> Sometimes it's hard for me to understand what Xtext does with my
> grammar. As an old ANTLR user I sometimes feel a little bit helpless
> with regards to the transformation Xtext applies... But my understanding
> is getting better, especially thanks to the support of this newsgroup ;-)
>
> Best regards,
> Matthias
>
> Sebastian Zarnekow schrieb:
>> Hi Matthias,
>>
>> it seems that you have a conflict between the '+' sign as a keyword in
>> Number / Name and the '+' sign as symbol tail. I'd try to inline the
>> '+'/'-' from symbol tail into the rule that uses symbol tail. However
>> it's only a shot into the dark.
>>
>> Regards,
>> Sebastian
Re: [Xtext] Does Xtext really support LL(*) parsing? [message #640897 is a reply to message #640880] Tue, 23 November 2010 13:35 Go to previous messageGo to next message
Eclipse UserFriend
Originally posted by: koester.matthias.gmx.de

Hi Sebastian,

The grammar I showed was just a way to solve a problem I had with my
clojure grammar. And I haven't commited or otherwise kept a copy of it.
But I think the correctly working grammar looked like:

grammar org.xtext.example.mydsl.MyDsl
hidden (WSS,SL_COMMENT)

generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"

import "http://www.eclipse.org/emf/2002/Ecore" as ecore

Program:
literals+=Literal*;

Literal:
keyword=Keyword|
name=QualifiedName|
number=Number;

QualifiedName:
NAME (('$'|'.'|'/') NAME)*;

Number:
SIGN? DIGIT+(('.' DIGIT*)|('r' DIGIT+))?'M'?;

Keyword:
':' NAME;

// terminal rules

terminal NAME:
SIGN (SYMBOL_START (SYMBOL_START|DIGIT|SIGN)*)?|
SYMBOL_START (SYMBOL_START|DIGIT|SIGN)*;

terminal SIGN:
('-'|'+');

terminal SYMBOL_START:
'a'..'z'|'A'..'Z'|'_'|'*'|'!'|'?'|'='|'#'|'<'|'>'|'&';

terminal DIGIT:
('0'..'9');

terminal WSS:
(' '|'\t'|'\r'|'\n'|',')+;

terminal SL_COMMENT:
';' !('\n'|'\r')* ('\r'? '\n')?;

I switched the NAME rule back to a terminal rule because otherwise
"name1 name2" would have been parsed as one Name. I still have this
problem with QualifiedName: clojure - as well as Java ;-) - doesn't
allow to put comments or whitespaces between the parts of a qualified
name, but I understand how useful it is to have data type rules.
Hopefully ANTLR 4 will support lexers that support LL(*) lexer rules so
that Xtext can support this as well.

Best regards,
Matthias

Sebastian Zarnekow schrieb:
> Hi Matthias,
>
> if you can write an Antlr grammar without backtracking or semantic
> predicates, it should be possible to write an Xtext grammar for the very
> same language, too. How does your final grammar look like?
>
> Regards,
> Sebastian
Re: [Xtext] Does Xtext really support LL(*) parsing? [message #641456 is a reply to message #640897] Thu, 25 November 2010 14:50 Go to previous message
Eclipse UserFriend
Originally posted by: koester.matthias.gmx.de

Hi,

Today I finally realized how to solve the problem with hidden tokens in
my QualifiedName rule! I realized that hiden() can be used per rule and
not just per grammar! Great, my grammar works much better now and can
handle almost every weird Clojure construct. Xtext really provides more
clever solutions then I imagined! Xtext really rocks!

Now I just need to understand why my parser throws a
StackOverflowException while parsing big files with memoize = true ;-)

Best regards,
Matthias

Matthias Köster schrieb:
> Hi Sebastian,
>
> The grammar I showed was just a way to solve a problem I had with my
> clojure grammar. And I haven't commited or otherwise kept a copy of it.
> But I think the correctly working grammar looked like:
>
> grammar org.xtext.example.mydsl.MyDsl
> hidden (WSS,SL_COMMENT)
>
> generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"
>
> import "http://www.eclipse.org/emf/2002/Ecore" as ecore
>
> Program:
> literals+=Literal*;
>
> Literal:
> keyword=Keyword|
> name=QualifiedName|
> number=Number;
>
> QualifiedName:
> NAME (('$'|'.'|'/') NAME)*;
>
> Number:
> SIGN? DIGIT+(('.' DIGIT*)|('r' DIGIT+))?'M'?;
>
> Keyword:
> ':' NAME;
>
> // terminal rules
>
> terminal NAME:
> SIGN (SYMBOL_START (SYMBOL_START|DIGIT|SIGN)*)?|
> SYMBOL_START (SYMBOL_START|DIGIT|SIGN)*;
>
> terminal SIGN:
> ('-'|'+');
>
> terminal SYMBOL_START:
> 'a'..'z'|'A'..'Z'|'_'|'*'|'!'|'?'|'='|'#'|'<'|'>'|'&';
>
> terminal DIGIT:
> ('0'..'9');
>
> terminal WSS:
> (' '|'\t'|'\r'|'\n'|',')+;
>
> terminal SL_COMMENT:
> ';' !('\n'|'\r')* ('\r'? '\n')?;
>
> I switched the NAME rule back to a terminal rule because otherwise
> "name1 name2" would have been parsed as one Name. I still have this
> problem with QualifiedName: clojure - as well as Java ;-) - doesn't
> allow to put comments or whitespaces between the parts of a qualified
> name, but I understand how useful it is to have data type rules.
> Hopefully ANTLR 4 will support lexers that support LL(*) lexer rules so
> that Xtext can support this as well.
>
> Best regards,
> Matthias
>
> Sebastian Zarnekow schrieb:
>> Hi Matthias,
>>
>> if you can write an Antlr grammar without backtracking or semantic
>> predicates, it should be possible to write an Xtext grammar for the
>> very same language, too. How does your final grammar look like?
>>
>> Regards,
>> Sebastian
Previous Topic:[SOLVED] Content Assistent not working as expected
Next Topic:Conditional template
Goto Forum:
  


Current Time: Mon May 13 03:55:46 GMT 2024

Powered by FUDForum. Page generated in 0.03608 seconds
.:: Contact :: Home ::.

Powered by: FUDforum 3.0.2.
Copyright ©2001-2010 FUDforum Bulletin Board Software

Back to the top