Eclipse Community Forums: TMF (Xtext) » Exponent E cannot also be identifier

Exponent E cannot also be identifier [message #527147]

Wed, 14 April 2010 09:00

Ed Willink

Hi

I have the following grammar 'terminals' (for OCL):

terminal STRING_LITERAL:
    "'" ( '\\' ('b'|'t'|'n'|'f'|'r'|'"'|"'"|'\\') | !('\\'|"'") )* "'"
;

terminal ID:
    (('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*)
    | ("_" STRING_LITERAL)
;

terminal INT: // String to allow diverse re-use
    ('0'..'9')+
;

REAL_LITERAL returns ecore::EBigDecimal:
    INT (('.' INT)| (('.' INT)? ('e'|'E') ('+'|'-')? INT))
;

(REAL_LITERAL is non-terminal to allow parser backtracking to sort out
"5..7" as "5" ".." "7" rather than "5." <stuck>)

I find that I can use "d" or "f" as ID but not "e" or "E". Presumably
these are being captured as the REAL_LITERAL exponent indicator, but
since REAL_LITERAL has a mandatory INT prefix, this seems wrong?

Is there an easy solution?

Regards

Ed Willink

Re: Exponent E cannot also be identifier [message #527157 is a reply to message #527147]

Wed, 14 April 2010 09:39

Sven Efftinge

The REAL_LITERAL rule is a so-called datatype rule and is therefore part
of the parser. The 'e' and 'E' introduce new keywords which shadow the
ID lexer rule. You can either try to define REAL_LITERAL as a lexer rule
(which won't be easy, as you will have conflicts with INT), or you could
introduce an Identifier parser rule like so:

Identifier :
ID | 'e' | 'E' | '_' STRING;

And call it where you have called ID before.
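
A minimal sketch of that suggestion applied to the grammar in the question. The Identifier rule follows the one above, except that the '_' STRING alternative is left out on the assumption that the ID terminal in the question already covers the "_" STRING_LITERAL form. VariableDeclaration is a hypothetical rule, added only to show a call site.

```
// Datatype rule: accepts an ordinary ID plus the tokens that
// REAL_LITERAL turned into keywords ('e' and 'E').
Identifier:
    ID | 'e' | 'E'
;

// Hypothetical call site: wherever the grammar previously used ID.
VariableDeclaration:
    name=Identifier   // was: name=ID
;
```

The lexer still emits 'e' and 'E' as keyword tokens, but Identifier accepts them, so they can be used as names again. Any further keyword that should remain usable as a name would need its own alternative in Identifier.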

Sven

Re: Exponent E cannot also be identifier [message #527159 is a reply to message #527147]

Wed, 14 April 2010 09:40

Henrik Lindberg

I use the following construct:

REAL hidden(): INT ? '.' (EXT_INT | INT);
terminal EXT_INT: INT ('e'|'E')('-'|'+') INT;

Hope that helps.
- henrik
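
Read against the original question, this construct appears to avoid the problem because the exponent marker lives inside a terminal rule, so no standalone 'e' or 'E' keyword is created. An annotated copy follows; the comments are an interpretation, not part of the post above.

```
// hidden() switches off hidden tokens (whitespace, comments) inside
// this rule, so "1 . 5" is not accepted as a real number.
REAL hidden():
    INT? '.' (EXT_INT | INT)
;

// Lexer rule: fraction digits, exponent marker, sign and exponent
// digits form one token, e.g. "5e+3" in "1.5e+3".
// 'e' and 'E' therefore never become parser keywords.
terminal EXT_INT:
    INT ('e'|'E') ('-'|'+') INT
;
```

As written, the exponent sign and the '.' are both mandatory, so inputs such as "1.5e3" or "5e+3" on their own would presumably not be recognised as REAL.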

Re: Exponent E cannot also be identifier [message #527963 is a reply to message #527157]

Sat, 17 April 2010 13:10

Ed Willink

Hi Sven

> Identifier :
> ID | 'e' | 'E' | '_' STRING;
>
> And call it where you have called ID before.

Thanks, that works beautifully. But when I extend this policy to add more
than about 10 OCLinEcore keywords that are not reserved words in
EssentialOCL, I get three methods that exceed the 65536-byte limit.
Overall, my grammars are not that large, perhaps 100 rules, but the
generated grammars are huge; one is 3.4MB of Java. Is this a known
issue? Is there a workaround?

Regards

Ed Willink

Re: Exponent E cannot also be identifier [message #527991 is a reply to message #527963]

Sun, 18 April 2010 12:18

Sebastian Zarnekow

Hi Ed,

Did you enable class splitting in your ANTLR generator fragment? This
feature should help to circumvent this bytecode limitation.

fragment = de.itemis.xtext.antlr.XtextAntlrGeneratorFragment {
    options = {
        classSplitting = true
    }
}

This has to be configured for the UI parser as well.

Regards,
Sebastian

Re: Exponent E cannot also be identifier [message #528001 is a reply to message #527991]

Sun, 18 April 2010 13:58

Ed Willink

Hi Sebastian

> Did you enable class splitting in your ANTLR generator fragment? This
> feature should help to circumvent this bytecode limitation.

Yes. In a variety of permutations with backtrack, to no beneficial effect.

I'm hoping to reorganise the particular rule causing the trouble and
find that the problem goes away.

> This has to be configured for the ui parser as well.

Yes, I'd done that.

Thanks for the help.

It seems that ANTLR needs some major improvement/help. The entire
equivalent grammar was more precise and much, much smaller with LPG
(104KB rather than 2106KB Java-wise, 36KB rather than 429KB class-wise).

Regards

Ed Willink

Re: Exponent E cannot also be identifier [message #528078 is a reply to message #528001]

Mon, 19 April 2010 09:08

Ed Willink

Hi Sebastian

>> Did you enable class splitting in your ANTLR generator fragment? This
>> feature should help to circumvent this bytecode limitation.
>
> Yes. In a variety of permutations with backtrack, to no beneficial effect.
>
> I'm hoping to reorganise the particular rule causing the trouble and
> find that the problem goes away.

No luck. Partitioning the problem rule to make it much simpler made no
difference. Both non-UI and UI grammars hit the limit.

I'm not convinced that class splitting is occurring (on M6).

I see hundreds of class files in the parseTreeConstruction bin folder,
and these exist whether classSplitting = true is specified or not.
I see no corresponding hundreds of files anywhere in the UI bin folder.
The entire UI bin comprises only 22 classes, compared to more like 700
for the non-UI (excluding model class files).

My MWE2 lines are:

fragment = de.itemis.xtext.antlr.XtextAntlrGeneratorFragment {
    options = { backtrack = true classSplitting = true }
}

fragment = de.itemis.xtext.antlr.XtextAntlrUiGeneratorFragment {
    options = { backtrack = true classSplitting = true }
}

Regards

Ed Willink

Re: Exponent E cannot also be identifier [message #528168 is a reply to message #528078]

Mon, 19 April 2010 15:01

Sebastian Zarnekow

Hi Ed,

Please create a ticket with the problematic grammar attached.
Just for clarification: the ParseTreeConstructor actually has nothing to
do with the parser; it belongs to the serializer. It is the component that
creates an intermediate format from your model, which is then serialized
to a string.

Regards,
Sebastian