Eclipse Community Forums
Confusion about terminals (In what case do terminals become unreachable)
Confusion about terminals [message #1742730] Wed, 07 September 2016 00:46
Waqas Ilyas
Hi,

Have a look at the grammar below:
SysDefTree:
	(roots+=RootNode)+
;

RootNode:
	'{'
		(properties+=Property)+
	'};'
;

Property:
	name=ID '=' values+=PropertyValue ';'
;


PropertyValue:
	NumberValue | ByteValue | StringValue
;


NumberValue:
	'<' value=Number '>'
;

ByteValue hidden():
	'[' WS* ((bytes+=HexDigit) (bytes+=HexDigit) WS*)+ ']'
;

StringValue:
	value=STRING
;

HexDigit:
	('0'|'1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9'|'a'|'A'|'b'|'B'|'c'|'C'|'d'|'D'|'e'|'E'|'f'|'F')
;

Number:
	SINGLE_DIGIT | INT | HEX | HEX_INT_PREFIX | HEX_NO_PREFIX
;

// ======================== Terminals =========================================

terminal SINGLE_DIGIT:
	('0'..'9')
;

terminal HEX_INT_PREFIX:
	SINGLE_DIGIT+ ('a'..'f' | 'A'..'F') (SINGLE_DIGIT | 'a'..'f' | 'A'..'F' | '_')*
;

terminal HEX_NO_PREFIX:
	('a'..'f' | 'A'..'F') (SINGLE_DIGIT | 'a'..'f' | 'A'..'F' | '_')*
;

terminal HEX:
	('0x' | '0X') (SINGLE_DIGIT | 'a'..'f' | 'A'..'F' | '_')+
;

terminal INT returns ecore::EInt:
	SINGLE_DIGIT+
;

terminal ID:
	 (SINGLE_DIGIT | 'a'..'z' | 'A'..'Z' | '_' | '#') ('a'..'z' | 'A'..'Z' | '_' | '-' | SINGLE_DIGIT | '#' | ',')*
;

terminal STRING	: 
			'"' ( '\\' . /* 'b'|'t'|'n'|'f'|'r'|'u'|'"'|"'"|'\\' */ | !('\\'|'"') )* '"' |
			"'" ( '\\' . /* 'b'|'t'|'n'|'f'|'r'|'u'|'"'|"'"|'\\' */ | !('\\'|"'") )* "'"
		; 
terminal ML_COMMENT	: '/*' -> '*/';

terminal WS			: (' '|'\t'|'\r'|'\n')+;


When I run the Xtext generator I get the following error:
The following token definitions can never be matched because prior tokens match the same input: RULE_SINGLE_DIGIT

If I remove the rule ByteValue, the error goes away. Why is that?

I understand that I could rework the terminals and the Number rule to get rid of the error, but that results in an undesirable grammar. The terminals are somewhat complex, but they serve the intended purpose (I won't mind suggestions for improvements there).
Re: Confusion about terminals [message #1742738 is a reply to message #1742730] Wed, 07 September 2016 06:07
Christian Dietrich
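Keywords become terminal rules (tokens) in the generated lexer as well. Thus '1' will be matched as a keyword, because it appears as a literal in the HexDigit rule, which is called by the ByteValue rule only. The keywords '0' to '9' cover every input SINGLE_DIGIT could match and take precedence over it, so SINGLE_DIGIT becomes unreachable. Remove ByteValue and HexDigit is no longer used, so those keywords disappear and with them the error.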
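Reduced to its core, the clash can be sketched as follows. This is a hypothetical minimal grammar: the Model rule is made up for illustration, the grammar header is omitted, and it has not been run through the generator, so treat it as an illustration of the mechanism rather than a verified reproduction.

```xtext
// Hypothetical minimal grammar, reduced from the one in the question
Model:
	'<' digit=SINGLE_DIGIT '>' | '[' byte=HexDigit ']'
;

// Datatype (parser) rule: each quoted character becomes its own keyword token
HexDigit:
	'0'|'1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9'
;

// Matches exactly the inputs that the ten keywords above already claim,
// so the lexer would never get to emit this token
terminal SINGLE_DIGIT:
	'0'..'9'
;
```

Dropping the HexDigit alternative from Model would leave SINGLE_DIGIT as the only token for a single digit, which matches the observation that the error disappears once ByteValue is removed.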
To analyze this, you may tell the workflow to create a pure ANTLR debug grammar:

parserGenerator = { debugGrammar = true } or something like that

Since the lexer is context-free (it tokenizes without knowing which parser rule is active), terminal rules are not allowed to overlap each other.

So you might have problems with the ID as well.
Maybe you should loosen your grammar (fewer terminal rules, perhaps datatype rules, which are parser rules) and use value converters to do the range checking.
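A loosened grammar along those lines might look roughly like the sketch below. The names WORD and HexByte are hypothetical, the unchanged rules (SysDefTree, RootNode, PropertyValue, StringValue, STRING, ML_COMMENT, WS) are omitted, and the sketch assumes value converters are registered for Number and HexByte to reject anything that is not a valid number or hex byte. It has not been run through the generator.

```xtext
// Sketch only: WORD and HexByte are hypothetical names
Property:
	name=WORD '=' values+=PropertyValue ';'
;

NumberValue:
	'<' value=Number '>'
;

ByteValue:
	'[' (bytes+=HexByte)+ ']'
;

// Datatype rules: both reuse the single WORD token. A value converter
// per rule would do the actual checking (decimal or hex for Number,
// hex digits for HexByte).
Number:
	WORD
;

HexByte:
	WORD
;

// One broad terminal replaces SINGLE_DIGIT, INT, HEX, HEX_INT_PREFIX,
// HEX_NO_PREFIX and ID, so no two terminals compete for the same input
terminal WORD:
	('0'..'9' | 'a'..'z' | 'A'..'Z' | '_' | '#')
	('0'..'9' | 'a'..'z' | 'A'..'Z' | '_' | '-' | '#' | ',')*
;
```

The trade-off is that the lexer no longer distinguishes numbers from names, so input such as 0A0B inside the brackets arrives as one WORD and the converter or a validator has to split and check it. In exchange, whole words stay single tokens instead of being broken into one-character keywords.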


Re: Confusion about terminals [message #1742813 is a reply to message #1742738] Wed, 07 September 2016 19:40
Waqas Ilyas
Ok that makes a lot of sense, thanks for the explanation. It means I still understand to some degree how this works.

By the way, I agree that the terminals are overly complicated and overlapping. In fact, I know that HEX_NO_PREFIX and HEX_INT_PREFIX are actually shadowing ID in many places. I have a version of the grammar where I removed all but the comment, white-space and string terminals. Such parser-rule-based definitions allow context to be defined better and represent the specs more accurately, but they break the tokens down to almost always a single character. As a result, everything appears in the editor as a keyword, and double-clicking any word selects only one character, which is annoying.