Generic grammar makes Eclipse throw OutOfMemoryError when running editor [message #814059]
Tue, 06 March 2012 01:26
Alex Ruiz Messages: 103 Registered: March 2011
Senior Member

Greetings,
I'm trying to create an Xtext grammar for our BUILD language. The language seems to be pretty simple, but it is giving me a hard time :)
The grammar I have is:

grammar com.google.eclipse.buildeditor2.BuildLang hidden(WS, SL_COMMENT)

import "http://www.eclipse.org/emf/2002/Ecore" as ecore
generate buildLang "http://www.google.com/eclipse/buildeditor2/BuildLang"

FileInput:
  statements+=SimpleStatement*;

SimpleStatement:
  statements+=UnknownRule (';' statements+=UnknownRule)* ';'?;

UnknownRule:
  id=ID '(' fields+=UnknownField (',' fields+=UnknownField)* (',')? ')';

UnknownField:
  name=FieldName '=' value=Value;

Value:
  StringLink | IntLink;

IntLink:
  target=INT;

StringLink:
  target=STRING;

// Terminals
terminal ID:
  (('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*);

terminal FieldName: ('a'..'z')*;

terminal INT returns ecore::EInt:
  ('-')?('0'..'9')+;

terminal SL_COMMENT:
  '#' !('\n' | '\r')* ('\r'? '\n')?;

terminal STRING:
  '"' ('\\' ('b' | 't' | 'n' | 'f' | 'r' | 'u' | '"' | "'" | '\\') | !('\\' | '"'))* '"' |
  "'" ('\\' ('b' | 't' | 'n' | 'f' | 'r' | 'u' | '"' | "'" | '\\') | !('\\' | "'"))* "'";

terminal WS:
  (' ' | '\t' | '\r' | '\n')+;

A typical file using this grammar will be like this:

cc_library(name = 'someCCLibrary',
           src = "someDeps",
           local = 1,);

The problem is that we have a finite set of defined rules (e.g. cc_library), but users are allowed to extend them and create whatever they want! So I'm trying to create something very generic: a grammar that would accept any rule, with any fields.
This is what I get when I run the .mwe2 file:
warning(200): ../com.google.eclipse.buildeditor2/src-gen/com/google/eclipse/buildeditor2/parser/antlr/internal/InternalBuildLang.g:159:2: Decision can match input such as "';' RULE_ID '(' RULE_FIELDNAME '=' RULE_STRING ',' RULE_FIELDNAME '=' RULE_STRING ')'" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
3128 [main] INFO or.validation.JavaValidatorFragment - generating Java-based EValidator API
warning(200): ../com.google.eclipse.buildeditor2.ui/src-gen/com/google/eclipse/buildeditor2/ui/contentassist/antlr/internal/InternalBuildLang.g:328:35: Decision can match input such as "';' RULE_ID '(' RULE_FIELDNAME '=' RULE_STRING ',' RULE_FIELDNAME '=' RULE_STRING ')'" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
And when I run the editor, it takes forever to load and dies with an OutOfMemoryError. It used to work when I had specific build rules in the grammar, instead of these "generic" ones.
Any hints/help will be greatly appreciated :)
Many thanks in advance,
-Alex

Re: Generic grammar makes Eclipse throw OutOfMemoryError when running editor [message #814122 is a reply to message #814059]
Tue, 06 March 2012 03:47
Henrik Lindberg Messages: 2509 Registered: July 2009
Senior Member

Basically, you have an "overlapping terminals" problem.
Your lexer will never produce a FieldName token because ID will always
be returned.
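
To make the overlap concrete, here is a rough Java sketch of how a lexer might choose between the two terminals. It assumes a simplified model (longest match wins, ties go to the rule declared first), which is not ANTLR's actual algorithm, and the class and method names are made up for illustration. Under that model, every non-empty input that FieldName could match is matched at least as far by ID, so FieldName is never selected.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Simplified model of terminal selection: the longest match wins and ties go to
// the rule declared first. This is an illustration, not ANTLR's real algorithm.
final class OverlappingTerminalsSketch {

    // The two overlapping terminals, in the order they appear in the grammar.
    private static final Map<String, Pattern> TERMINALS = new LinkedHashMap<>();
    static {
        TERMINALS.put("ID", Pattern.compile("[a-zA-Z_][a-zA-Z_0-9]*"));
        TERMINALS.put("FieldName", Pattern.compile("[a-z]*"));
    }

    // Returns the name of the terminal chosen at the start of the input,
    // or null if neither terminal matches at least one character.
    static String firstToken(String input) {
        String best = null;
        int bestLength = 0; // empty matches are ignored
        for (Map.Entry<String, Pattern> rule : TERMINALS.entrySet()) {
            Matcher m = rule.getValue().matcher(input);
            // Strictly longer only, so an earlier rule keeps a tie.
            if (m.lookingAt() && m.end() > bestLength) {
                best = rule.getKey();
                bestLength = m.end();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        for (String word : new String[] {"name", "src", "cc_library"}) {
            System.out.println(word + " -> " + firstToken(word));
        }
    }
}
```

In this model all three sample words come out as ID, including the all-lowercase ones that were meant to be field names.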
Try this:
UnknownField : name=ID '=' value=Value;
and remove the terminal FieldName.
And then validate that UnknownField's name complies with the more
restrictive rule ('a'..'z')*
I think that may cure your problem (but I did not read the rest of your
rules in great detail). At one point I did run into an issue with
trailing optional delimiters like ','? and I had to use a separate rule
- e.g.
endComma : ',' ;
and then write ... endComma? ; in the rule that needs it.
But I don't think that is the case here. Try fixing the overlapping
terminal first and see if the problem goes away.
Regards
- henrik

Re: Generic grammar makes Eclipse throw OutOfMemoryError when running editor [message #814314 is a reply to message #814059]
Tue, 06 March 2012 09:51
Sebastian Zarnekow Messages: 3118 Registered: July 2009
Senior Member

Hi Alex,
Antlr tells you that it removed some alternatives from your otherwise
ambiguous grammar. That may lead to infinite error recovery and thereby
to an OOM.
The problem is that the sequence
UnknownRule ; UnknownRule ; UnknownRule
may either be parsed as a FileInput with a single SimpleStatement which
has three statements or it can be parsed as a FileInput with two
instances of SimpleStatement:
UnknownRule ; UnknownRule ; (optional semi at end)
UnknownRule
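
The ambiguity can be checked by brute force. The following Java sketch counts how many distinct ways a token sequence can be split into SimpleStatement instances under the FileInput and SimpleStatement rules from the first post. It is only an illustration: 'R' stands for one complete UnknownRule, ';' for the semicolon keyword, and the class, method and parameter names are invented here. The `trailingSemi` flag switches the optional trailing ';' on or off.

```java
import java.util.ArrayList;
import java.util.List;

// Counts parses of FileInput: SimpleStatement*, where
// SimpleStatement: R (';' R)* ';'?   ('R' abbreviates one UnknownRule).
final class SemicolonAmbiguitySketch {

    // Positions at which one SimpleStatement starting at pos may end.
    static List<Integer> statementEnds(String tokens, int pos, boolean trailingSemi) {
        List<Integer> ends = new ArrayList<>();
        if (pos >= tokens.length() || tokens.charAt(pos) != 'R') {
            return ends; // a statement must start with an UnknownRule
        }
        int p = pos + 1;
        while (true) {
            ends.add(p); // the statement ends right after an UnknownRule
            boolean semi = p < tokens.length() && tokens.charAt(p) == ';';
            if (semi && trailingSemi) {
                ends.add(p + 1); // the ';' is read as the optional trailing one
            }
            boolean more = semi && p + 1 < tokens.length() && tokens.charAt(p + 1) == 'R';
            if (!more) {
                return ends;
            }
            p += 2; // the ';' is read as a separator before the next UnknownRule
        }
    }

    // Number of distinct ways to parse tokens[pos..] as a list of statements.
    static int countParses(String tokens, int pos, boolean trailingSemi) {
        if (pos == tokens.length()) {
            return 1; // nothing left: exactly one way (no further statements)
        }
        int total = 0;
        for (int end : statementEnds(tokens, pos, trailingSemi)) {
            total += countParses(tokens, end, trailingSemi);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(countParses("R;R;R", 0, true));  // with optional trailing ';'
        System.out.println(countParses("R;R;R", 0, false)); // without it
    }
}
```

With the optional trailing semicolon, "R;R;R" has 4 parses in this model, because each of the two semicolons can be read either as a separator or as a trailing one; the two readings described above are among them. Without the trailing semicolon there is exactly 1. Note that dropping it would also reject input ending in ';', such as the sample file in the first post, so that case would need to be handled some other way.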
I don't know about the processing that you want to do with your language
but it seems to me that the optional trailing semicolon is redundant.
Please note that FieldName matches an empty input, which may not be what
you'd expect. I recommend removing it and using ID instead:
UnknownField:
name=ID '=' value=Value;
A validation rule could enforce that only lowercase letters are used as
FieldName.
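
As a rough idea of what such a validation could check, here is a self-contained Java sketch. The class and method names are hypothetical. In an actual Xtext project the check would presumably live in a `@Check` method of the generated Java validator and report the problem through the validator's `error(...)` method rather than return a string; that wiring is left out here so the example runs on its own.

```java
import java.util.regex.Pattern;

// Stand-alone version of the check a validator could perform on
// UnknownField.name once the grammar uses ID instead of FieldName.
final class FieldNameCheckSketch {

    // Same character set as the removed terminal, but at least one letter.
    private static final Pattern LOWERCASE = Pattern.compile("[a-z]+");

    // Returns an error message for an invalid field name, or null if it is fine.
    static String checkFieldName(String name) {
        if (name != null && LOWERCASE.matcher(name).matches()) {
            return null;
        }
        return "Field name '" + name + "' may only contain lowercase letters a-z";
    }

    public static void main(String[] args) {
        for (String name : new String[] {"name", "src", "Local", "src_1"}) {
            String problem = checkFieldName(name);
            System.out.println(name + ": " + (problem == null ? "ok" : problem));
        }
    }
}
```

Moving the restriction out of the lexer this way also allows a targeted error message on the offending name instead of a generic syntax error.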
Regards,
Sebastian