Eclipse Community Forums
[Xtext] A very generic terminal definition [message #57051] Fri, 10 July 2009 15:47
Helko Glathe
Hi everyone.

I have the following problem:

In my DSL, I have a Token that can be very generic.

Possible example values are:
- "abc" -> Single Line String with quotation marks

- "abc
defg
xyz" -> multi line String with quotation marks

- "abc
defg
xyz" "hello" "wor
ld" -> Multiple Quotation Mark Strings, possibly over multiple Lines

- [188, 365, 512, 604] -> numbers, commas, brackets and whitespaces

- 10 -> INT

- 20.02 -> FLOAT

- helloWorld -> Single line String without Quotation Marks and without
whitespace

- hello100World -> same as before, but with digits

- 100World

- hello100


I've tried to define a terminal for that:

For all Quotation Mark Strings, I have defined:
terminal ML_STRINGVALUE : ('"' -> '"')+;

All values, with and without Quotation Marks shall be possible for my
generic token.

All of my attempts resulted in warnings like:
warning(208):
../org.de.***.parse.mdl/src-gen/org/de/***/parse/parser/antlr/internal/InternalMDL.g:619:1:
The following token definitions are unreachable: RULE_INT

That's currently the important part of my definition:
grammar org.de.***.parse.MDL with org.eclipse.xtext.common.Terminals

generate mDL "http://www.de.org/***/parse/MDL"

MDLModel :
model=Model;

terminal NOSTRINGVALUE : ('a'..'z'|'A'..'Z'|'.'|'-'|','|INT)+;
terminal VECTOR : '[' -> ']';
terminal ML_STRINGVALUE : ('"' -> '"')+;
//terminal VALUE : (NOSTRINGVALUE|ML_STRINGVALUE);

ParamValuePair :
parameter=NOSTRINGVALUE value=(VECTOR|NOSTRINGVALUE|ML_STRINGVALUE)
;
...

Does anyone have an idea how to also accept the values without quotation
marks, without getting warnings like "The following token definitions are
unreachable: RULE_INT"?

It is also not yet possible to have whitespace between the closing and
opening quotation marks of multiple quoted strings, possibly spanning
multiple lines, as mentioned above.



Kind regards, Helko
Re: [Xtext] A very generic terminal definition [message #57117 is a reply to message #57051] Sun, 12 July 2009 20:48
Dénes Harmath
I hit the same problem recently. From the Xtext doc: "Note, that the order
of terminal rules is crucial for your grammar, as they may hide each
other. This is especially important for newly introduced rules in
connection with mixed rules from used grammars." Your FLOAT rule may
shadow the INT rule from org.eclipse.xtext.common.Terminals. What are the
number rules from your grammar?
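
A minimal sketch of how such shadowing can arise, using two hypothetical
terminals (NAME and NUMBER are illustrative and not taken from the grammar
in this thread). Every digit sequence that NUMBER accepts is also accepted
by NAME, and NAME is declared first, so the generated lexer would never
emit NUMBER:

```xtext
// Hypothetical terminals, for illustration only.
terminal NAME   : ('a'..'z'|'0'..'9')+;  // also matches "123"
terminal NUMBER : ('0'..'9')+;           // expected to be reported as unreachable
```

Narrowing the broader rule so that it no longer accepts digits-only input
is one way out. Whether that fits depends on the rest of the grammar.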
Re: [Xtext] A very generic terminal definition [message #57311 is a reply to message #57117] Mon, 13 July 2009 07:15
Helko Glathe
Hi Dénes.

Currently, these are my grammar rules.

terminal NOSTRINGVALUE : ('a'..'z'|'A'..'Z'|'.'|'-'|','|INT)+;
terminal VECTOR : '[' -> ']';
terminal ML_STRINGVALUE : ('"' -> '"')+;


I think NOSTRINGVALUE is the reason for the warning message, because this
rule covers the INT rule.


ParamValuePair :
parameter=NOSTRINGVALUE value=(VECTOR|NOSTRINGVALUE|ML_STRINGVALUE)
;

Removing INT from NOSTRINGVALUE and adding it to the alternatives of
"value"


ParamValuePair :
parameter=NOSTRINGVALUE value=(VECTOR|NOSTRINGVALUE|ML_STRINGVALUE|INT)
;

will cause "Cannot find type for
'VECTOR|NOSTRINGVALUE|INT|ML_STRINGVALUE'."...
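
One possible explanation, stated here as an assumption: INT from
org.eclipse.xtext.common.Terminals returns an integer (ecore::EInt), while
the other three terminals return strings, so no single type fits the
"value" feature. A sketch of a workaround is to wrap the alternatives in a
datatype rule, which yields a string by default. The rule name Value is
hypothetical:

```xtext
// Hypothetical datatype rule: it has no assignments, so it returns a string.
Value :
    VECTOR | NOSTRINGVALUE | ML_STRINGVALUE | INT;

ParamValuePair :
    parameter=NOSTRINGVALUE value=Value;
```

This only addresses the type error, and it assumes INT has been removed
from NOSTRINGVALUE as described above. Otherwise INT stays unreachable.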


But back to the main problem. To handle my generic token, I'm thinking that
using the predefined rules from org.eclipse.xtext.common.Terminals is not
helpful here, because I will need a fine-grained set of rule definitions.

Do you agree with me?

regards, helko
Re: [Xtext] A very generic terminal definition [message #57336 is a reply to message #57051] Mon, 13 July 2009 07:17
Sven Efftinge
Hi Helko,

Please find my comments inline:

Helko Glathe wrote:
> Hi everyone.
>
> I have the following problem:
>
> In my DSL, I have a Token that can be very generic.
>
> Possible example values are:
> - "abc" -> Single Line String with quotation marks
>
> - "abc defg
> xyz" -> multi line String with quotation marks
>
> - "abc defg
> xyz" "hello" "wor
> ld" -> Multiple Quotation Mark Strings, possibly over multiple Lines
>
> - [188, 365, 512, 604] -> numbers, commas, brackets and whitespaces
>
> - 10 -> INT
>
> - 20.02 -> FLOAT
>
> - helloWorld -> Single line String without Quotation Marks and without
> whitespace
>
> - hello100World -> same as before, but with digits
>
> - 100World
>
> - hello100
>
>
> I've tried to define a terminal for that:

Why do you want to define all this in one terminal rule?
It should be separated into multiple rules, one for each of the outlined
syntaxes.

>
> For all Quotation Mark Strings, I have defined:
> terminal ML_STRINGVALUE : ('"' -> '"')+;
>

This shouldn't be a terminal rule, but a parser rule:

QuotationMarkStrings : values+=STRING+;


> All values, with and without Quotation Marks shall be possible for my
> generic token.
>
> All of my attempts resulted in warnings like:
> warning(208):
> ../org.de.***.parse.mdl/src-gen/org/de/***/parse/parser/antlr/internal/InternalMDL.g:619:1:
> The following token definitions are unreachable: RULE_INT

This warning occurs if you have a terminal rule that shadows rules declared
after it. See the documentation or the other comment by Dénes Harmath.

>
> That's currently the important part of my definition:
> grammar org.de.***.parse.MDL with org.eclipse.xtext.common.Terminals
>
> generate mDL "http://www.de.org/***/parse/MDL"
>
> MDLModel :
> model=Model;
>
> terminal NOSTRINGVALUE : ('a'..'z'|'A'..'Z'|'.'|'-'|','|INT)+;
> terminal VECTOR : '[' -> ']';
> terminal ML_STRINGVALUE : ('"' -> '"')+;
> //terminal VALUE : (NOSTRINGVALUE|ML_STRINGVALUE);
>
> ParamValuePair :
> parameter=NOSTRINGVALUE value=(VECTOR|NOSTRINGVALUE|ML_STRINGVALUE)
> ;
> ..

You should do this with datatype rules, or even real parser rules.

For instance:

Vector : '[' values+=INT (',' values+=INT)* ']';

TextValue : STRING | ID;
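
Putting these suggestions together, a grammar along the following lines
might cover most of the example values. This is a sketch under
assumptions, not a tested grammar: the grammar name, the namespace URI and
the rule names Model, Value, VectorValue, TextLiteral, NumberLiteral and
Decimal are all made up for illustration.

```xtext
// Hypothetical grammar name and namespace URI.
grammar org.example.Params with org.eclipse.xtext.common.Terminals

generate params "http://www.example.org/params"

Model :
    pairs+=ParamValuePair*;

ParamValuePair :
    parameter=ID value=Value;

// Parser rules: one alternative per kind of value.
Value :
    VectorValue | TextLiteral | NumberLiteral;

VectorValue :
    {VectorValue} '[' (values+=INT (',' values+=INT)*)? ']';

TextLiteral :
    parts+=TextValue+;      // one or more adjacent strings or identifiers

NumberLiteral :
    value=Decimal;

// Datatype rules: no assignments, so each yields a plain string.
TextValue :
    STRING | ID;

Decimal :
    INT ('.' INT)?;         // covers 10 and 20.02
```

Values such as 100World are not covered here, because the ID terminal from
org.eclipse.xtext.common.Terminals cannot start with a digit. The lexer
would split that input into INT followed by ID.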

>
> Does anyone have an idea how to also accept the values without quotation
> marks, without getting warnings like "The following token definitions are
> unreachable: RULE_INT"?
>
> It is also not yet possible to have whitespace between the closing and
> opening quotation marks of multiple quoted strings, possibly spanning
> multiple lines, as mentioned above.

It is possible, and automatically handled if you use datatype rules or
ordinary parser rules. If you use terminal rules, you have to handle
whitespace explicitly. But you shouldn't use terminal rules for what you
want to do.
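
To illustrate the difference, here is a rough sketch. In a grammar based on
org.eclipse.xtext.common.Terminals, WS is hidden by default, so a parser
rule accepts whitespace and line breaks between tokens without mentioning
them. The hidden(...) clause below only makes that explicit. The terminal
variant has to spell the whitespace out. The name QUOTED_SEQUENCE is
hypothetical:

```xtext
// Parser rule: whitespace between the strings is skipped as a hidden token.
QuotationMarkStrings hidden(WS) :
    values+=STRING+;

// Terminal rule: whitespace has to be part of the token itself.
terminal QUOTED_SEQUENCE :
    STRING (WS? STRING)*;
```

The two are alternatives and are not meant to sit in the same grammar,
since the terminal would match every input that STRING matches and could
shadow it.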

Regards,
Sven


--
Need professional support on Xtext or Xtend?
Mail to: xtext (at) itemis.com
Twitter : @svenefftinge
Blog : blog.efftinge.de