Lookahead value of xtext grammar [message #52116] Thu, 18 June 2009 02:06
Erol Akarsu
Messages: 2
Registered: July 2009
Junior Member
I am stress testing TMF Xtext with a fairly large grammar.
I have to parse it with a lookahead value of 4 because a lookahead of 1 is
not enough for the nature of the grammar.

How can I specify the lookahead value?

Regards

Erol Akarsu
Re: Lookahead value of xtext grammar [message #52382 is a reply to message #52116] Fri, 19 June 2009 06:09
Knut Wannheden
Messages: 298
Registered: July 2009
Senior Member
Hi Erol,

If you use the ANTLR parser with Xtext you get an ANTLR v3 parser which
uses an LL(*) parsing algorithm -- i.e. arbitrary lookahead -- so you
shouldn't have to restrict the lookahead. But if you want you can do this
by specifying a nested <options> element to the
XtextAntlrGeneratorFragment in your MWE workflow (in this example I've
also enabled backtracking with memoization):

<fragment
class="de.itemis.xtext.antlr.XtextAntlrGeneratorFragment">
<options k="4" backtrack="true" memoize="true"/>
</fragment>
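
To make the notion of a fixed lookahead depth k concrete, here is a minimal sketch that is not part of Xtext or ANTLR. It only computes how many tokens must be inspected before two fixed token sequences can be told apart. The class name, the method name and the example tokens are assumptions made up for illustration; a real parser generator analyses whole sets of sequences, not just two.

```java
import java.util.Arrays;
import java.util.List;

/** Illustration only: lookahead depth needed to separate two fixed token sequences. */
final class LookaheadDepth {

    /**
     * Returns the smallest k such that the first k tokens of the two
     * alternatives differ, or -1 if one alternative is a prefix of the
     * other (or both are equal), in which case no k separates them.
     */
    static int requiredLookahead(List<String> altA, List<String> altB) {
        int common = 0;
        int limit = Math.min(altA.size(), altB.size());
        // Count the tokens the two alternatives share at the start.
        while (common < limit && altA.get(common).equals(altB.get(common))) {
            common++;
        }
        if (common == limit) {
            return -1;
        }
        // The first differing token sits at index 'common', so k = common + 1.
        return common + 1;
    }

    public static void main(String[] args) {
        // Hypothetical alternatives that share a three-token prefix.
        List<String> altA = Arrays.asList("unsigned", "long", "int", "ID");
        List<String> altB = Arrays.asList("unsigned", "long", "int", "*");
        System.out.println(requiredLookahead(altA, altB)); // prints 4
    }
}
```

With k=1 the two alternatives above look identical, which is the situation described in the question. An LL(*) parser is not bound to such a fixed depth.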

Hope that helps,

--knut

Re: Lookahead value of xtext grammar [message #871495 is a reply to message #52382] Mon, 14 May 2012 08:47
Renaud Helias
Messages: 4
Registered: May 2012
Junior Member
I am stress testing TMF Xtext with a fairly large grammar.
I have to parse it with a lookahead value of 4 because a lookahead of 1 is not
enough for the nature of the grammar.

How can I specify the lookahead values (plural)?

Personally, I need a lookahead value (options {k=1;}) for each branch of the grammar, not just for the root. Changing it in the mwe2 file applies only from the root of the grammar and does not cover all of it.
I can't find a way to do that.

=========================================
Grammar used for the stress test: C.g by Terence Parr (ANTLR => grammar => C Preprocessor)

A C Hello World fails on Xtext.

Failing example:
int toto = 45; // fails
int * toto = 45; // parses correctly


Rules that fail here:
declarator
	: pointer? direct_declarator
	| pointer
	;
pointer
	: '*' type_qualifier+ pointer?
	| '*' pointer
	| '*'
	;
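
As a cross-check of what these two rules describe, here is a small hand-written recognizer. It is a sketch and not code generated by ANTLR or Xtext. Two simplifications are assumptions of mine, since the thread does not show those rules: direct_declarator is reduced to a single identifier, and type_qualifier to `const` or `volatile`. The class and method names are invented.

```java
import java.util.Arrays;
import java.util.List;

/** Hand-written recognizer for the declarator and pointer rules (simplified). */
final class DeclaratorSketch {
    private final List<String> tokens;
    private int pos = 0;

    DeclaratorSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "<EOF>";
    }

    // Assumption: only these two qualifiers exist in this sketch.
    private static boolean isQualifier(String t) {
        return t.equals("const") || t.equals("volatile");
    }

    // Assumption: direct_declarator is just one identifier.
    private static boolean isIdentifier(String t) {
        return t.matches("[A-Za-z_][A-Za-z_0-9]*") && !isQualifier(t);
    }

    // pointer : '*' type_qualifier+ pointer? | '*' pointer | '*' ;
    // The three alternatives together amount to: '*' type_qualifier* pointer?
    private void pointer() {
        if (!peek().equals("*")) {
            throw new IllegalStateException("expected '*' but found " + peek());
        }
        pos++;
        while (isQualifier(peek())) {
            pos++;
        }
        if (peek().equals("*")) {
            pointer();
        }
    }

    // declarator : pointer? direct_declarator | pointer ;
    private void declarator() {
        boolean sawPointer = false;
        if (peek().equals("*")) {
            pointer();
            sawPointer = true;
        }
        if (isIdentifier(peek())) {
            pos++; // direct_declarator
        } else if (!sawPointer) {
            throw new IllegalStateException("expected identifier or '*' but found " + peek());
        }
    }

    /** True if the tokens form exactly one declarator. */
    static boolean accepts(String... toks) {
        DeclaratorSketch p = new DeclaratorSketch(Arrays.asList(toks));
        try {
            p.declarator();
        } catch (IllegalStateException e) {
            return false;
        }
        return p.pos == p.tokens.size();
    }

    public static void main(String[] args) {
        System.out.println(accepts("toto"));      // declarator of "int toto = 45;"
        System.out.println(accepts("*", "toto")); // declarator of "int * toto = 45;"
    }
}
```

Under these assumptions both declarators are accepted, so the rules themselves cover the case without `*`.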



Note on using MyDsl.xtext (my file is attached): it contains one occurrence of ")=>" in the rule "external_declaration".

This project is based on the Xtext Hello World project.

Note on forcing compilation: DO NOT PUT THE MyDsl.xtext FILE INTO THE PROJECT WORKSPACE (the Xtext editor is too slow for this large grammar). Just modify GenerateMyDsl.mwe2 as follows:

//var grammarURI = "classpath:/org/xtext/example/mydsl/MyDsl.xtext"
var grammarURI = "file:///C:\\Users\\freemac\\workspaceSDK\\MyDsl.xtext"
// this makes it possible to edit the grammar outside of Eclipse without triggering the "Xtext Validation" bug (loss of performance, and crashes on my poor computer)

fragment = parser.antlr.XtextAntlrGeneratorFragment {
    options = AntLrOptionsWithKAsString {
        backtrack = true
        memoize = true
        kAsString = "3"
    }
}

fragment = parser.antlr.XtextAntlrUiGeneratorFragment {
    options = AntLrOptionsWithKAsString {
        backtrack = true
        memoize = true
        kAsString = "3"
    }
}

Compilation time: a little over 3 hours.
A fix is needed afterwards for the error "The code of method specialStateTransition(int, IntStream) is exceeding the 65535 bytes limit" in InternalMyDslParser.java, which is a big file. It can be corrected manually by moving the switch cases from 0 up to 400 into a helper method that the original method calls. This gives something like:
// method to correct
public int specialStateTransition(int s, IntStream _input) throws NoViableAltException {
    TokenStream input = (TokenStream)_input;
    int _s = s;
    if (s < 400) {
        try {
            // cases 0 to 399 now live in the helper method
            return toto(s);
        } catch (TotoException e) {
            // the helper found no transition: continue with the original code
        }
    }
    switch ( s ) {
    case 400 :
    //... the rest of the original method continues from here.

// helper method inserted
private int toto(int s) throws TotoException {
    switch (s) {
    case 0 :
    //.. cases 0 to 399
    break;
    }
    throw new TotoException();
}
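
The same splitting idea can be shown in a self-contained form. This is only a sketch of the technique, not generated parser code: the state numbers, the returned values and all names are invented, and a sentinel return value is swapped in for the custom exception used above.

```java
/** Sketch: one oversized switch split into range-specific helper methods. */
final class SplitSwitchSketch {
    // Sentinel meaning "this helper has no case for the state" (assumption of this sketch).
    private static final int NOT_HANDLED = Integer.MIN_VALUE;

    /** Dispatches to the helper responsible for the range that contains s. */
    static int transition(int s) {
        int result = (s < 2) ? lowStates(s) : highStates(s);
        if (result == NOT_HANDLED) {
            throw new IllegalArgumentException("no transition for state " + s);
        }
        return result;
    }

    // First half of the original switch.
    private static int lowStates(int s) {
        switch (s) {
            case 0: return 10;
            case 1: return 11;
            default: return NOT_HANDLED;
        }
    }

    // Second half of the original switch.
    private static int highStates(int s) {
        switch (s) {
            case 2: return 12;
            case 3: return 13;
            default: return NOT_HANDLED;
        }
    }

    public static void main(String[] args) {
        System.out.println(transition(0)); // prints 10
        System.out.println(transition(3)); // prints 13
    }
}
```

Each helper compiles to its own method, so each one counts separately against the per-method bytecode limit.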


InternalMyDslParser is present in both the org.xtext.example.mydsl and org.xtext.example.mydsl.ui parts of the Xtext Hello World project.

My impression of this bug:
The k parameter (lookahead depth) set in the mwe2 file affects only the root of the grammar.
Note that I used k=3 in the mwe2 file, which is the maximum k value that occurs in C.g.
But the fact is that the deeply nested sub-rules of C.g don't seem to reach the pointer rule.

My project: mamevhdl
The main goal is to generate the main part of the VHDL code from the C++ code of MAME. The result is arcade schematics with wires and low-level components (each component is empty but carries its MAME name, which helps to normalize arcade VHDL projects).
I am really new to Xtext and ANTLR.

A frenchy boy,

Renaud Helias
  • Attachment: C.g
    (Size: 10.91KB, Downloaded 119 times)
  • Attachment: MyDsl.xtext
    (Size: 11.63KB, Downloaded 161 times)
  • Attachment: hello_c.jpg
    (Size: 20.06KB, Downloaded 1202 times)
Re: Lookahead value of xtext grammar [message #871968 is a reply to message #871495] Tue, 15 May 2012 08:22
Sebastian Zarnekow
Messages: 3108
Registered: July 2009
Senior Member
Hi Renaud,

please double check whether it's really necessary to pass the
lookahead explicitly. Antlr will usually figure that out automagically. Regarding
the bytecode limit, you may want to explore the classSplitting option
in the Antlr options.

Regards,
Sebastian
--
Need professional support for Eclipse Modeling?
Go visit: http://xtext.itemis.com

Re: Lookahead value of xtext grammar [message #873262 is a reply to message #871968] Thu, 17 May 2012 20:33
Renaud Helias
Messages: 4
Registered: May 2012
Junior Member
Hi Sebastian,

It seems that no backtrack/LL(*) magic occurs in this wicked backtrack/LL(k) case.

k and backtrack are set directly on branches when prototyping a grammar (for example one derived from a BNF). I think C.g is a nasty test case for the ANTLR algorithm.
I thought that k was only used for performance... damn.

I have run another test: I compared how C.g and my MyDsl.g behave (the same grammar, but with the backtrack/memoize options set only at the top, and without using k).

If my input file hello.c is :
    /* Hello World program */

    #include<stdio.h>
    int * r=4;
    * main()
    {
    printf("Hello World");
    }

My log output is :
Test de C.g
Test de MyDsl.g backtrack=true memoize=false
Test de MyDsl.g backtrack=true memoize=true
Test de MyDsl.g backtrack=true memoize=false et syntactic predicates


And then if my input file hello.c is :
    /* Hello World program */

    #include<stdio.h>
    int r=4;
    main()
    {
    printf("Hello World");
    }

My log output is :
Test de C.g
Test de MyDsl.g backtrack=true memoize=false
hello.c line 4:9 mismatched input '=' expecting ';'
hello.c line 5:9 no viable alternative at input ')'
hello.c line 6:4 mismatched input '{' expecting ';'
hello.c line 7:11 no viable alternative at input '"Hello World"'
Test de MyDsl.g backtrack=true memoize=true
hello.c line 4:9 mismatched input '=' expecting ';'
hello.c line 5:9 no viable alternative at input ')'
hello.c line 6:4 mismatched input '{' expecting ';'
hello.c line 7:11 no viable alternative at input '"Hello World"'
Test de MyDsl.g backtrack=true memoize=false et syntactic predicates
hello.c line 4:9 mismatched input '=' expecting ';'
hello.c line 5:9 no viable alternative at input ')'
hello.c line 6:4 mismatched input '{' expecting ';'
hello.c line 7:11 no viable alternative at input '"Hello World"'


The source code of this basic Java project (not Xtext) is attached.

A frenchy boy,

Renaud

[Updated on: Thu, 17 May 2012 20:45]


Re: Lookahead value of xtext grammar [message #873962 is a reply to message #873262] Sat, 19 May 2012 15:54
Renaud Helias
Messages: 4
Registered: May 2012
Junior Member
Hi Sebastian,

I found a semantic predicate in C.g; the k parameter has no effect there (k makes no sense in that case). I read somewhere that semantic predicates are not implemented in Xtext.
That is OK for me: comparing Xtext/model and ANTLR/semantic predicates is like comparing DOM and SAX.

I think that LL(*) simply covers LL(k), with or without backtracking.

The memoize parameter is just an optimisation parameter that has no side effects.

The problem here came from one special case where ANTLR trips over itself:
a : b c?;
b : ('toto'|IDENTIFIER)+;
c : IDENTIFIER | IDENTIFIER '=';

toto points p=


Thanks a lot for your help,

++
Renaud
  • Attachment: toto_points_p_eq.png
    (Size: 16.13KB, Downloaded 1187 times)
  • Attachment: hello_c_ok.png
    (Size: 16.55KB, Downloaded 1206 times)
  • Attachment: MyDsl.g
    (Size: 11.40KB, Downloaded 219 times)
  • Attachment: C.g
    (Size: 10.91KB, Downloaded 121 times)
  • Attachment: MyDsl.xtext
    (Size: 11.77KB, Downloaded 245 times)

[Updated on: Sat, 19 May 2012 16:54]


Re: Lookahead value of xtext grammar [message #873983 is a reply to message #873962] Sat, 19 May 2012 17:16
Sebastian Zarnekow
Messages: 3108
Registered: July 2009
Senior Member
Renaud,

please refer to the docs on Xtext's predicates.
I'm not sure exactly what you expect right now. What I got is that you
face some issues with the migration of a rather complex, pure Antlr
grammar to Xtext.

The given snippet is obviously ambiguous since the input

toto points

may either be
a [ b [ 'toto' points ] ]
or
a [ b [ 'toto' ] c [ points ] ]

thus Antlr will give you some warnings if you disable backtracking, and it
will remove some alternatives from your grammar. This leads to the parse
errors for toto points p=.
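
The two readings can be counted mechanically. The following is a minimal sketch, not taken from the attachments, that enumerates every way to split a token list between b and the optional c of the three-rule snippet. The class and method names are invented, and treating 'toto' as a keyword distinct from IDENTIFIER is an assumption about the lexer.

```java
import java.util.Arrays;
import java.util.List;

/**
 * Counts the derivations of:
 *   a : b c? ;
 *   b : ('toto' | IDENTIFIER)+ ;
 *   c : IDENTIFIER | IDENTIFIER '=' ;
 */
final class AmbiguitySketch {

    // Assumption: 'toto' is a keyword token and never an IDENTIFIER.
    private static boolean isIdentifier(String t) {
        return t.matches("[A-Za-z_][A-Za-z_0-9]*") && !t.equals("toto");
    }

    // b : one or more tokens, each 'toto' or IDENTIFIER
    private static boolean matchesB(List<String> ts) {
        if (ts.isEmpty()) {
            return false;
        }
        for (String t : ts) {
            if (!t.equals("toto") && !isIdentifier(t)) {
                return false;
            }
        }
        return true;
    }

    // c : IDENTIFIER | IDENTIFIER '='
    private static boolean matchesC(List<String> ts) {
        if (ts.size() == 1) {
            return isIdentifier(ts.get(0));
        }
        if (ts.size() == 2) {
            return isIdentifier(ts.get(0)) && ts.get(1).equals("=");
        }
        return false;
    }

    /** Number of ways the whole input can be derived from rule a. */
    static int countParses(String... input) {
        List<String> tokens = Arrays.asList(input);
        int count = 0;
        // Try every position where b could end; c (if present) takes the rest.
        for (int split = 1; split <= tokens.size(); split++) {
            List<String> left = tokens.subList(0, split);
            List<String> right = tokens.subList(split, tokens.size());
            if (matchesB(left) && (right.isEmpty() || matchesC(right))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countParses("toto", "points"));           // prints 2
        System.out.println(countParses("toto", "points", "p", "=")); // prints 1
    }
}
```

Only the split that ends b after `points` accepts the longer input. A parser that keeps the loop in b greedy would consume `p` as well and be left with `=` alone for c.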

Regards,
Sebastian