Eclipse Community Forums: TMF (Xtext) » Catch Unquoted String


Catch Unquoted String [message #798574]

Tue, 14 February 2012 21:21

stefan bosshard

Messages: 2
Registered: February 2012

Junior Member

I am trying to write a DSL for an existing 'language'. The primary aim is to get a parser. I got that going, with one exception: the 'language' allows strings to be unquoted as long as they do not contain spaces. Both string varieties coexist. I cannot use the ID terminal because some strings contain non-ASCII characters from French, Spanish, etc.

Having tried a great many approaches I find myself at a loss. Can you suggest how I should go about this problem?

Thanks a lot in advance

/stefan


Re: Catch Unquoted String [message #798618 is a reply to message #798574]

Tue, 14 February 2012 22:42

Henrik Lindberg

Messages: 2509
Registered: July 2009

Senior Member

You can solve this by redefining the ID terminal to contain a wider
range of characters. If you need to restrict the set of characters in
some places, you can add validation rules.
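The "accept widely, validate narrowly" idea can be sketched in plain Java. This is an illustration only, not Xtext API: the class name, the method names, and the choice of Latin-1 letters as the "wider range" are all assumptions made for the example. In a real Xtext project the widened set would go into the ID terminal and the strict check into a validation rule.

```java
import java.util.regex.Pattern;

// Plain-Java illustration only; none of these names come from Xtext.
final class WideIdValidationSketch {

    // Assumed "wider" character set: ASCII letters, '_', and the Latin-1
    // letters U+00C0..U+00FF (leaving out the two math signs U+00D7, U+00F7).
    private static final String WIDE_START =
            "A-Za-z_\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u00FF";

    // Widened ID: a wide start character, then wide characters or digits.
    private static final Pattern WIDE_ID =
            Pattern.compile("[" + WIDE_START + "][" + WIDE_START + "0-9]*");

    // Strict ID: ASCII letters, digits, and '_' only.
    private static final Pattern STRICT_ID =
            Pattern.compile("[A-Za-z_][A-Za-z_0-9]*");

    // What the widened terminal would accept.
    static boolean isWideId(String text) {
        return WIDE_ID.matcher(text).matches();
    }

    // What a validation rule would report where only the strict form is
    // allowed: null means "no problem", otherwise an error message.
    static String validateStrict(String text) {
        if (STRICT_ID.matcher(text).matches()) {
            return null;
        }
        return "'" + text + "' may only contain ASCII letters, digits and '_' here";
    }
}
```

With this split, the parser accepts both `name` and `niño` as identifiers, and only the places that call the strict check reject `niño`.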

A second way to do this is to define a datatype rule that consists of a
combination of ID and EXT_ID. EXT_ID would be a terminal similar to ID,
but only for the additional characters.

UnquotedString : (ID | EXT_ID)+ ;

A more elegant solution would be to use an external lexer with a
terminal that allows all characters that are valid in the unquoted
string. Before returning a token, the lexer checks whether the text is
also a valid, more restrictive ID, and if so returns the ID token
instead. (You cannot do this with the Xtext grammar alone.)

terminal UNQUOTEDSTRING : [ ...long list of char ranges] ;
terminal ID : ... standard ID ...

(If an external lexer is not used, the ID token would never be found, as
UNQUOTEDSTRING has higher precedence, but you need to declare both
terminals in the grammar to make it possible to return these tokens.)
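The reclassification step described above can be sketched in plain Java. This is not an Xtext or ANTLR lexer and uses no Xtext API; the class name, the regular expressions, and the simplified quoted-string shape (no escape handling) are assumptions made for the example. It only shows the decision the external lexer would make: match the broad token first, then downgrade it to ID when the text also fits the narrower shape.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Plain-Java illustration only; none of these names come from Xtext.
final class UnquotedStringLexerSketch {

    // Assumed token shapes: a double-quoted string without escape handling,
    // or a run of characters containing neither whitespace nor a quote.
    private static final Pattern TOKEN =
            Pattern.compile("\"[^\"]*\"|[^\\s\"]+");

    // The narrower ID shape: optional '^', then an ASCII letter or '_',
    // then ASCII letters, digits, or '_'.
    private static final Pattern ID =
            Pattern.compile("\\^?[A-Za-z_][A-Za-z_0-9]*");

    // Returns entries of the form "KIND:text" in input order.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        // find() skips anything that fits neither shape, such as
        // whitespace or a stray, unterminated quote character.
        while (m.find()) {
            String text = m.group();
            String kind;
            if (text.startsWith("\"")) {
                kind = "STRING";
            } else if (ID.matcher(text).matches()) {
                // The broad token also fits the narrow shape: return ID.
                kind = "ID";
            } else {
                kind = "UNQUOTEDSTRING";
            }
            tokens.add(kind + ":" + text);
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : tokenize("name caf\u00e9 \"two words\" x1")) {
            System.out.println(token);
        }
    }
}
```

Because each unquoted word becomes exactly one token, the parser sees a single leaf per word, whichever kind it turns out to be.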

The external lexer approach will be slightly more efficient, as the parse
tree will be smaller (the best case in favor of the external lexer would
be if every other character were a 'non-ascii' char).

Hope that helps.

Regards
- henrik



Re: Catch Unquoted String [message #802743 is a reply to message #798618]

Mon, 20 February 2012 11:39

stefan bosshard

Messages: 2
Registered: February 2012

Junior Member

Thanks Henrik

Your advice, no surprise, worked well, even if I had to change a couple of other things as well. But I now have what I wanted.

regards
/stefan

