Eclipse Community Forums
Problem with data type rules and terminal rules Xtext 2.0M6 [message #663419] Mon, 04 April 2011 23:32
Simon Stratmann
Hi,

I am at a total loss. I cannot post my real grammar, but luckily I was able to reproduce the problem in a mock grammar, so please don't wonder why some of it looks weird.

My grammar is

grammar org.xtext.example.mydsl.MyDsl hidden (WS)

generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"
import "http://www.eclipse.org/emf/2002/Ecore" as ecore

Model:
	test+=Test*;
	
Test:
	new_attribute | new_item | referral_test| chartest;

chartest :
	"char" c=CHARACTER
;
	
Referable:
	item_name | attribute_name
;
	
new_item :
	"item" name=item_name;

item_name :
	name=IDENTIFIER;

new_attribute :
	"attribute" name=attribute_name;

attribute_name :
	name=IDENTIFIER;
	
referral_test :
	"use" first=[Referable|IDENTIFIER] "'" second=[Referable|IDENTIFIER]
;

CHARACTER:
//Not a terminal rule so it does not consume the ' for attribute access 
	"'" ("a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p")
	"'";

terminal IDENTIFIER:
	('a'..'z' | 'A'..'Z')*;


terminal WS:
	' ' | '\t' | '\n' | '\r';


I need to be able to use constructs like "item'attribute" on the one hand and on the other hand characters like 'x'. For the character rule not to consume the "a" in "item'attribute" I tried changing it to a data type rule.

With this grammar a line like "item a" gives the error "mismatched input 'a' expecting RULE_IDENTIFIER", while "a" is colored like a keyword.
But the line "item q" works, because the q is not part of CHARACTER. "item ab" works, too.
"char 'a'" works (no errors are shown), but the "a" is marked as a keyword, too.

Perhaps I misunderstand data type rules and terminal rules and the difference between them, but why are the letters listed in CHARACTER recognized as keywords while the others are not? How can I make the chartest rule recognize character constructs "properly"?

Thanks so much, this has been bugging me for a while now.

Simon
Re: Problem with data type rules and terminal rules Xtext 2.0M6 [message #663427 is a reply to message #663419] Tue, 05 April 2011 01:50
Henrik Lindberg
It is very easy to be confused about how terminal rules work.
It is important to remember that terminal rules are transformed to rules
executed by the lexer, and it is the lexer that hacks up the input
stream (the text) into tokens on which the grammar operates.

Secondly, all keywords have special handling - if the text matched by
the lexer is a keyword, it will be delivered as a keyword token.

Hence, your IDENTIFIER rule will consume all letters, but since the
IDENTIFIER 'a' is a keyword the grammar will get a KEYWORD_a token.
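
To make the effect concrete, here is a rough sketch in plain Java. It is not the ANTLR lexer that Xtext generates, only a simplified stand-in that assumes "read the longest run of letters, then report it as a keyword if it exactly equals one". The class and method names (KeywordLexerSketch, tokenize) are made up for illustration; the keyword set is taken from the quoted strings in the mock grammar.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; the real lexer is generated by ANTLR.
final class KeywordLexerSketch {
    // Every quoted string in the mock grammar becomes a keyword, including the
    // single letters "a".."p" used inside the CHARACTER data type rule.
    static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
            "item", "attribute", "use", "char",
            "a", "b", "c", "d", "e", "f", "g", "h",
            "i", "j", "k", "l", "m", "n", "o", "p"));

    static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // Splits the input into token descriptions before any grammar rule is consulted.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
                i++; // WS is hidden, so it produces no token here
            } else if (c == '\'') {
                tokens.add("KEYWORD(')");
                i++;
            } else if (isLetter(c)) {
                int start = i;
                while (i < input.length() && isLetter(input.charAt(i))) {
                    i++;
                }
                String word = input.substring(start, i);
                // A word that equals a keyword is reported as that keyword,
                // never as IDENTIFIER, regardless of which rule is being parsed.
                tokens.add(KEYWORDS.contains(word)
                        ? "KEYWORD(" + word + ")"
                        : "IDENTIFIER(" + word + ")");
            } else {
                throw new IllegalArgumentException("unexpected character: " + c);
            }
        }
        return tokens;
    }
}
```

In this sketch, tokenize("item a") produces KEYWORD(item) followed by KEYWORD(a), so a rule expecting IDENTIFIER after "item" has nothing to match, while "item q" and "item ab" both end in an IDENTIFIER token.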

You can work around this in a couple of different ways:

- use the built in keyword escape of ^, that when placed before a
keyword will turn it into an ID (without the ^). But the user will have
to enter '^a' instead of just 'a'.

- (easiest) just use ID instead of CHARACTER, and add a validation rule
that the string is a single character and acceptable.

@Check
public void checkCharacter(chartest o) {
	// accept only a single character from the allowed set
	String c = o.getC();
	if (c.length() != 1 || !"abcdef".contains(c)) {
		error(....);
	}
}

If you want code completion you have to add that.

- create a rule that represents both ID + all keywords that are
accepted. i.e. something like

terminal ID : ('a'..'z'|'A'..'Z')+ ;
IDENTIFIER: ID | "a" | "b" | ... ;

You will also need to handle the fact that the characters "a"... will be
styled as keywords (can be turned off with semantic highlighting rules).

- create a datatype that is a CHARACTER and has datatype conversion that
checks that the string it gets is a single character and is acceptable.

CHARACTER returns myModel::Character : ID ;

You will probably want to handle code completion proposals since data
type instances are not proposed automatically (the framework can not
guess what values to propose).

- (advanced) use an external lexer - total overkill for your mock
example, but may be the best solution if your real grammar is
complex/has many otherwise overlapping terminals. With an external lexer
you can have predicates that make the lexer deliver certain tokens only
under certain (lexical) circumstances. "Lexical" means you cannot deliver
different tokens for different grammar rules, because the grammar rules
do not get to see the tokens until after the lexer has done its work.

Some general rules of advice:
- use as few terminals as possible
- make the terminals as long as possible (contrast: if every character
is a separate token you will consume a lot more resources).
- be "lenient" in your grammar and catch errors in validation as you can
provide better feedback and issue codes for quickfixes

I would use the "easiest" solution above if that works in your real
grammar/model.

Regards
- henrik

Re: Problem with data type rules and terminal rules Xtext 2.0M6 [message #663479 is a reply to message #663427] Tue, 05 April 2011 09:27
Simon Stratmann
Henrik,

thanks for your help! The forum software has some kind of bug that made me lose everything I had written when I tried to preview the post, and I am too sleep deprived to repeat it all.

Anyway, I found a solution to my problem looking at your advice:

CHARACTER:
	"'" IDENTIFIER "'";


I'll have to add a check for characters that are allowed in IDENTIFIER but not in a character literal, but that won't be a problem.
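
Such a check could look roughly like the plain-Java sketch below. It assumes that the CHARACTER value reaches the check as the raw text including both quotes, and that the allowed letters are a..p as in the mock grammar; the class and method names are made up for illustration and are not part of Xtext.

```java
// Hypothetical helper; the name and the allowed set are assumptions for illustration.
final class CharacterLiteralCheck {
    // Letters permitted inside a character literal (a..p, mirroring the mock grammar).
    static final String ALLOWED = "abcdefghijklmnop";

    // Returns true only for exactly: quote, one allowed letter, quote.
    // Longer identifiers, disallowed letters and text with whitespace
    // between the quotes are all rejected.
    static boolean isValid(String text) {
        return text != null
                && text.length() == 3
                && text.charAt(0) == '\''
                && text.charAt(2) == '\''
                && ALLOWED.indexOf(text.charAt(1)) >= 0;
    }
}
```

A validator or value converter could call isValid on the parsed value and report an error when it returns false.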

Thanks so much.

Simon

[Updated on: Tue, 05 April 2011 09:28]
