Eclipse Community Forums
Terminals with whitespace [message #1062987] Tue, 11 June 2013 15:47
Neal Kruis
Messages: 15
Registered: October 2012
Junior Member
I am trying to create an Xtext project for a language whose general form is a set of comma separated lists. For example:

ObjectName,
Field 1, !- Comment describing field 1
Field 2, !- Comment describing field 2
Field N; !- Comment describing field N


Here is my problem: The fields can be any alphanumeric string and may contain spaces. So I originally created the terminal this way:

terminal ALPHANUMERIC :
	( 
	 ('0'..'9' | 'A'..'Z' | 'a'..'z')	|
	 ('0'..'9' | 'A'..'Z' | 'a'..'z') ('0'..'9' | 'A'..'Z' | 'a'..'z' | ' ')* ('0'..'9' | 'A'..'Z' | 'a'..'z')
	)
;


Whitespace is allowed on either side of the field (the actual string will trim off whitespace between the fields and commas).

The problem is that when I use this, and have a space between the field and the comma, e.g.:

Field X ,


The parser interprets the space as part of the alphanumeric string and complains because it thinks the field is not complete (i.e. it does not end in a non-space character).

Anyone have a suggestion of how to get around this problem?
Re: Terminals with whitespace [message #1063002 is a reply to message #1062987] Tue, 11 June 2013 17:22
Henrik Lindberg
Messages: 2509
Registered: July 2009
Senior Member
If you are using the default terminals, you will end up with
overlapping terminals, and lots of problems.

Why not simply use a data rule - e.g. something like:

AlphaNumeric : (ID | Number | NonCommaPunctuation)+ ;
NonCommaPunctuation : ':' | ';' | ... ;

....

Regards
- henrik

Re: Terminals with whitespace [message #1063016 is a reply to message #1063002] Tue, 11 June 2013 19:15
Neal Kruis
Messages: 15
Registered: October 2012
Junior Member
How does your example allow for spaces in the Alphanumeric type?
Re: Terminals with whitespace [message #1063027 is a reply to message #1063016] Tue, 11 June 2013 20:17
Christian Dietrich
Messages: 14665
Registered: July 2009
Senior Member
Since WS is hidden by default, it can also appear between the tokens
of the data type rule.
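
A minimal sketch of what that could look like, assuming the grammar inherits ID, INT and WS from org.eclipse.xtext.common.Terminals (the rule body is illustrative, not taken from this thread):

```xtext
// Assumption: the grammar is declared "with org.eclipse.xtext.common.Terminals",
// so ID, INT and WS exist and WS is hidden by default.
// Field is a data type rule: it has no assignments and yields a string.
Field:
	(ID | INT)+
;
```

With this, an input such as "Field 1 ," is lexed as ID, WS, INT, WS, ','. The WS tokens are hidden, so the parser only sees ID INT ',' and the space before the comma is not an error.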

--
Need training, onsite consulting or any other kind of help for Xtext?
Go visit http://xtext.itemis.com or send a mail to xtext at itemis dot de


Twitter : @chrdietrich
Blog : https://www.dietrich-it.de
Re: Terminals with whitespace [message #1063044 is a reply to message #1063027] Tue, 11 June 2013 21:20
Neal Kruis
Messages: 15
Registered: October 2012
Junior Member
The problem is that I don't want spaces hidden from the tokens.

I want the parser to be able to recognize everything between the first non-whitespace character and the last non-whitespace character (including spaces) as a field. All the other whitespace between the commas (or semicolon if it's the last field) should be hidden.

[Updated on: Tue, 11 June 2013 21:27]


Re: Terminals with whitespace [message #1063060 is a reply to message #1063044] Wed, 12 June 2013 01:37
Henrik Lindberg
Messages: 2509
Registered: July 2009
Senior Member
On 2013-11-06 23:20, Neal Kruis wrote:
> The problem is that I don't want spaces hidden from the tokens. I want
> the parser to be able to recognize everything between the first
> non-whitespace character and the last non-whitespace character
> (including spaces) as a field. All the other whitespace between the
> commas (or semicolon if it's the last field) can be hidden.

You can specify which tokens should be hidden per rule. By default WS is
hidden. If you want it visible in a rule, give that rule a hidden(...)
clause that lists only the other terminals.

my_rule hidden(SL_COMMENT, ML_COMMENT) :
// WS is visible here, and must be specified where it may appear
ID WS ID WS ID
;

The rule above would parse any three IDs separated by WS. The whitespace
matched by WS is included in the resulting string.

Input of 'A B C' thus produces the token text "A B C" for my_rule.

Hope this helps.

- henrik
Re: Terminals with whitespace [message #1063223 is a reply to message #1063060] Wed, 12 June 2013 17:22
Neal Kruis
Messages: 15
Registered: October 2012
Junior Member
Wouldn't this limit me to two whitespace characters though? I need to allow any number of white space characters, e.g.:

"A", "A B", "A B C", or "A B C D", etc.
Re: Terminals with whitespace [message #1063225 is a reply to message #1063223] Wed, 12 June 2013 17:31
Christian Dietrich
Messages: 14665
Registered: July 2009
Senior Member
Nope. A single WS token already matches any run of whitespace characters:

terminal WS : (' '|'\t'|'\r'|'\n')+;

Re: Terminals with whitespace [message #1063226 is a reply to message #1063225] Wed, 12 June 2013 17:38
Christian Dietrich
Messages: 14665
Registered: July 2009
Senior Member
P.S.

The rule would then look like:

ID (WS ID)*


Re: Terminals with whitespace [message #1063233 is a reply to message #1063226] Wed, 12 June 2013 18:33
Neal Kruis
Messages: 15
Registered: October 2012
Junior Member
Ok. Thanks for all of your patience with me as I'm still climbing the Xtext learning curve.

I'm still having the same issue I originally had: because whitespace (specifically tabs and spaces) is allowed in the alphanumeric field, the parser treats any whitespace between the Field token and the following comma as a continuation of the Field token, and I get an error because it doesn't find the closing alphanumeric character.

Here is what I have:
InputDefinition hidden(SL_COMMENT, WS):
	objects += Object*
;


Object:
	
	
	fields+=Field',' 
	(fields+=Field',')*
	fields+=Field';'
;

Field:
	
	ALPHANUMERIC
;
			
terminal ALPHANUMERIC :
	('0'..'9' | 'A'..'Z' | 'a'..'z')* (('\t' | ' ')+ ('0'..'9' | 'A'..'Z' | 'a'..'z')+)*
;
			
terminal SL_COMMENT 	: '!' !('\n'|'\r')* ('\r'? '\n')?;

terminal WS			: (' '|'\t'|'\r'|'\n')+;

terminal OTHER : .;


So in this example:

Object,
Field 1 ,
Field 2,
Field N;



I get an error because of the space after Field 1, which the parser interprets as an incomplete continuation of the field token instead of hidden whitespace before the comma.

I hope that makes sense. Please let me know if I can clarify anything in my example.

[Updated on: Wed, 12 June 2013 18:33]


Re: Terminals with whitespace [message #1063237 is a reply to message #1063233] Wed, 12 June 2013 19:05
Christian Dietrich
Messages: 14665
Registered: July 2009
Senior Member
Hi,

I still don't know why you care about WS at all. The following seems to work for me:

grammar org.xtext.example.mydsl1.MyDsl hidden(SL_COMMENT, WS)

import "http://www.eclipse.org/emf/2002/Ecore"

generate myDsl "http://www.xtext.org/example/mydsl1/MyDsl"

InputDefinition:
	objects += Object*
;


Object:
	
	
	fields+=Field',' 
	(fields+=Field',')*
	fields+=Field';'
;

Field:
	
	ALPHANUMERIC+
;
			
terminal ALPHANUMERIC :
	('0'..'9' | 'A'..'Z' | 'a'..'z')+
;
			
terminal SL_COMMENT 	: '!' !('\n'|'\r')* ('\r'? '\n')?;

terminal WS			: (' '|'\t'|'\r'|'\n')+;

terminal OTHER : .;
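
To make the effect concrete, here is how one line of the earlier example should be tokenized by the terminals above, written as comments next to the Field rule. The trace is worked out by hand from the terminal definitions, not produced by a tool:

```xtext
// Input:    Field 1 ,  !- Comment describing field 1
//
// Visible:  ALPHANUMERIC "Field"   ALPHANUMERIC "1"   ','
// Hidden:   WS between "Field" and "1", WS before ',', WS after ',',
//           SL_COMMENT "!- Comment describing field 1" (up to the line end)
Field:
	ALPHANUMERIC+
;
```

One point to verify: the default string value of a data type rule is probably built from its visible tokens only, so this field may come back as "Field1" rather than "Field 1". That is an assumption, not something confirmed in this thread. If the inner spacing matters, a custom value converter for Field would likely be needed.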

