Eclipse Community Forums
Catch Unquoted String (how to catch unquoted and quoted strings) [message #798574] Tue, 14 February 2012 21:21
stefan bosshard
I am trying to write a DSL for an existing 'language'. The primary aim is to get a parser, and I have that working with one exception: the 'language' allows strings to be unquoted as long as they do not contain spaces. Both string varieties coexist. I cannot use the ID terminal because some strings contain non-ASCII characters from French, Spanish, etc.

Having tried a great many approaches I find myself at a loss. Can you suggest how I should go about this problem?

Thanks a lot in advance

/stefan
Re: Catch Unquoted String [message #798618 is a reply to message #798574] Tue, 14 February 2012 22:42
Henrik Lindberg
You can solve this by redefining the ID terminal to contain a wider
range of characters. If you need to restrict the set of characters in
some places, you can add validation rules.
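
As a rough sketch of this first option: the grammar below overrides the inherited ID terminal so that it also accepts accented Latin-1 letters. The grammar name, the namespace URI, the Model and Value rules, and the chosen character ranges are all assumptions made for illustration; the real language will likely need different ranges. Newer Xtext versions may also expect an @Override annotation on the redefined terminal.

```xtext
// Hypothetical grammar; only the ID override at the bottom matters here.
grammar org.example.Unquoted with org.eclipse.xtext.common.Terminals

generate unquoted "http://www.example.org/Unquoted"

Model:
    values+=Value*;

// A value is either an unquoted string (the widened ID) or a quoted STRING.
Value:
    text=(ID | STRING);

// Same shape as the standard ID terminal, with the Latin-1 letter ranges
// (U+00C0..U+00FF, minus the multiplication and division signs) added.
terminal ID:
    '^'? ('a'..'z' | 'A'..'Z' | '_'
        | '\u00C0'..'\u00D6' | '\u00D8'..'\u00F6' | '\u00F8'..'\u00FF')
    ('a'..'z' | 'A'..'Z' | '_' | '0'..'9'
        | '\u00C0'..'\u00D6' | '\u00D8'..'\u00F6' | '\u00F8'..'\u00FF')*;
```

With this change every place that uses ID accepts the wider character set, which is why validation rules may be needed where plain identifiers are required.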

A second way is to define a datatype rule that consists of a
combination of ID and EXT_ID tokens. EXT_ID would be a terminal similar
to ID, but covering only the additional characters.

UnquotedString : (ID | EXT_ID)+ ;
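
A minimal sketch of how the EXT_ID terminal might look next to this rule, assuming the grammar inherits ID from org.eclipse.xtext.common.Terminals. The Latin-1 ranges and the hidden() clause are assumptions, not part of the suggestion above; hidden() is added so that whitespace cannot appear between the pieces of one unquoted string, and removing it gives the rule exactly as written above.

```xtext
// Datatype rule: it has no assignments or actions, so it returns a plain string.
// hidden() (an assumption) turns off hidden tokens such as whitespace inside
// the rule, so "caf é" is not accepted as a single unquoted string.
UnquotedString hidden():
    (ID | EXT_ID)+;

// Hypothetical range: only the accented Latin-1 letters that ID does not cover.
terminal EXT_ID:
    ('\u00C0'..'\u00D6' | '\u00D8'..'\u00F6' | '\u00F8'..'\u00FF')+;
```

Under these assumptions an input such as café would be lexed as ID (caf) followed by EXT_ID (é), and the datatype rule joins the two tokens back into one string.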

A more elegant solution would be to use an external lexer with a
terminal that allows all characters that are valid in the unquoted
string. Before returning a token, the lexer checks whether the text is
also a valid, more restrictive ID, and if so returns an ID token
instead. (You cannot do this with the Xtext grammar alone.)

terminal UNQUOTEDSTRING : [ ...long list of char ranges] ;
terminal ID : ... standard ID ...

(If an external lexer is not used, the ID token would never be found,
as UNQUOTEDSTRING has higher precedence, but you need to declare both
terminals in the grammar so that the lexer can return these tokens.)

The external lexer approach will be slightly more efficient, as the parse
tree will be smaller. The best case in favor of the external lexer would
be if every other character were a 'non-ASCII' char, because the datatype
rule would then need a separate token for nearly every character.

Hope that helps.

Regards
- henrik

Re: Catch Unquoted String [message #802743 is a reply to message #798618] Mon, 20 February 2012 11:39
stefan bosshard
Thanks Henrik,

Your advice, no surprise, worked well, even though I had to change a couple of other things as well. I now have what I wanted.

Regards
/stefan