Generic grammar makes Eclipse throw OutOfMemoryError when running editor [message #814059] Tue, 06 March 2012 01:26
Alex Ruiz
Greetings,

I'm trying to create an Xtext grammar for our BUILD language. The language seems to be pretty simple, but it is giving me a hard time :)

The grammar I have is:
grammar com.google.eclipse.buildeditor2.BuildLang hidden(WS, SL_COMMENT)

import "http://www.eclipse.org/emf/2002/Ecore" as ecore
generate buildLang "http://www.google.com/eclipse/buildeditor2/BuildLang"

FileInput:
  statements+=SimpleStatement*;

SimpleStatement:
  statements+=UnknownRule (';' statements+=UnknownRule)* ';'?;

UnknownRule:
  id=ID '(' fields+=UnknownField (',' fields+=UnknownField)* (',')? ')';

UnknownField:
 name=FieldName '=' value=Value;

Value:
  StringLink | IntLink;

IntLink:
  target=INT;
  
StringLink:
  target=STRING;
  
// Terminals
terminal ID: 
  (('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*);

terminal FieldName: ('a'..'z')*;

terminal INT returns ecore::EInt:
  ('-')?('0'..'9')+;

terminal SL_COMMENT:
  '#' !('\n' | '\r')* ('\r'? '\n')?;

terminal STRING:
  '"' ('\\' ('b' | 't' | 'n' | 'f' | 'r' | 'u' | '"' | "'" | '\\') | !('\\' | '"'))* '"' |
  "'" ('\\' ('b' | 't' | 'n' | 'f' | 'r' | 'u' | '"' | "'" | '\\') | !('\\' | "'"))* "'";

terminal WS:
  (' ' | '\t' | '\r' | '\n')+;


A typical file using this grammar will be like this:

cc_library(name = 'someCCLibrary',
src = "someDeps",
local = 1,);

The problem is that we have a finite set of defined rules (e.g. cc_library) but users are allowed to extend them and create whatever they want! So I'm trying to create something very generic: a grammar that would accept any rule, with any fields.

This is what I get when I run the .mwe2 file:

warning(200): ../com.google.eclipse.buildeditor2/src-gen/com/google/eclipse/buildeditor2/parser/antlr/internal/InternalBuildLang.g:159:2: Decision can match input such as "';' RULE_ID '(' RULE_FIELDNAME '=' RULE_STRING ',' RULE_FIELDNAME '=' RULE_STRING ')'" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
3128 [main] INFO  or.validation.JavaValidatorFragment  - generating Java-based EValidator API
warning(200): ../com.google.eclipse.buildeditor2.ui/src-gen/com/google/eclipse/buildeditor2/ui/contentassist/antlr/internal/InternalBuildLang.g:328:35: Decision can match input such as "';' RULE_ID '(' RULE_FIELDNAME '=' RULE_STRING ',' RULE_FIELDNAME '=' RULE_STRING ')'" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input


And when I run the editor, it takes forever to load and dies with an OutOfMemoryError. It used to work when I had specific build rules in the grammar, instead of these "generic" ones.

Any hints/help will be greatly appreciated :)

Many thanks in advance,
-Alex
Re: Generic grammar makes Eclipse throw OutOfMemoryError when running editor [message #814122 is a reply to message #814059] Tue, 06 March 2012 03:47
Henrik Lindberg
Basically, you have an "overlapping terminals" problem.

Your lexer will never produce a FieldName token because ID will always
be returned.

Try this:
UnknownField : name=ID '=' value=Value;

and remove the terminal FieldName.

And then validate that UnknownField's name complies with the more
restrictive rule ('a'..'z')*

I think that may cure your problem (but I did not read the rest of your
rules in great detail). At one point I did run into an issue with
trailing optional delimiters like ','? and I had to use a separate rule
- e.g.

endComma : ',' ;

and ... endComma? ; in the rule

But I don't think that is the case here. Try fixing the overlapping
terminal first and see if the problem goes away.
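
A minimal sketch of how the two suggestions above could look in the grammar, assuming the rest of Alex's rules stay unchanged. The rule name EndComma is only illustrative, and this has not been run through the generator:

```xtext
// FieldName terminal removed; the lexer only ever hands out ID for names,
// so the field name is typed as ID and restricted later by validation.
UnknownField:
  name=ID '=' value=Value;

// Optional trailing comma moved into its own rule (hypothetical name).
UnknownRule:
  id=ID '(' fields+=UnknownField (',' fields+=UnknownField)* EndComma? ')';

EndComma:
  ',';
```

The lowercase-only restriction from ('a'..'z')* is no longer expressed in the grammar here; it would have to be checked in a validator instead.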

Regards
- henrik

Re: Generic grammar makes Eclipse throw OutOfMemoryError when running editor [message #814314 is a reply to message #814059] Tue, 06 March 2012 09:51
Sebastian Zarnekow
Hi Alex,

Antlr tells you that it removed some alternatives from your otherwise
ambiguous grammar. That may lead to infinite error recovery and thereby
to an OOM.

The problem is that the sequence

UnknownRule ; UnknownRule ; UnknownRule

may either be parsed as a FileInput with a single SimpleStatement which
has three statements or it can be parsed as a FileInput with two
instances of SimpleStatement:

UnknownRule ; UnknownRule ; (optional semi at end)
UnknownRule

I don't know about the processing that you want to do with your language
but it seems to me that the optional trailing semicolon is redundant.

Please note that FieldName matches an empty input, which may not be what
you'd expect. I recommend removing it and using ID instead:

UnknownField:
name=ID '=' value=Value;

A validation rule could enforce that only lowercase letters are used as
FieldName.
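
One possible way to remove the ambiguity, sketched under the assumption that the SimpleStatement grouping is not needed in the resulting model (that is a guess about the requirements, and the sketch is untested): fold the statement list into FileInput and let every rule be followed by an optional semicolon.

```xtext
// Each UnknownRule may be followed by one semicolon. After a rule the
// parser sees ';', another ID, or the end of the file, so there is only
// one way to continue and no alternative has to be disabled.
FileInput:
  (statements+=UnknownRule ';'?)*;
```

This shape should accept the same input as the original FileInput/SimpleStatement pair (a sequence of rules, each optionally followed by ';'), but "UnknownRule ; UnknownRule ; UnknownRule" now has a single parse.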

Regards,
Sebastian


Re: Generic grammar makes Eclipse throw OutOfMemoryError when running editor [message #814409 is a reply to message #814314] Tue, 06 March 2012 12:11
Alex Ruiz
Henrik & Sebastian,

Thank you so much for your help, guys! Removing the terminal FieldName did the trick! :)

Sebastian, you are right, the optional trailing semicolon is redundant.

Cheers!
-Alex