Eclipse Community Forums
Context sensitive lexing [message #1188843] Fri, 15 November 2013 21:11
Andrew Gacek
I'm writing an Xtext grammar for a language which has built-in regular expression syntax. For example, the user can write

regex_match(/ab+c/, x)


My problem is how to parse the regular expression so that I can warn the user about syntax errors inside it. Naively, I might write a parser rule for regexes like this:

Regex:
  '/' RegexBody '/'
;

RegexBody:
  RegexBody '?'
| RegexBody '*'
| REGEX_TOKEN
;

terminal REGEX_TOKEN:
  ~('?' | '*' | '/')
;


Ignoring the LL(*) issues, the problem is that REGEX_TOKEN overlaps with ID. Indeed, something like /ab+c/ is lexed as '/' 'ab' '+' 'c' '/'.

What I would really like is a way to make the lexer context-sensitive so that between the '/' characters it would treat characters differently. Is there a good way to do this in Xtext? Or more generally, a good way to handle this kind of nested sublanguage of regular expressions within a more general language?
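
To make the idea concrete, here is a minimal sketch in plain Java of the kind of mode-switching lexer I have in mind. It is not Xtext or ANTLR API; the class name ModeLexer, the token labels, and the treatment of '+' as an operator are all made up for illustration. It also assumes that '/' always delimits a regex, i.e. the host language has no division operator.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy two-mode lexer: DEFAULT mode outside the slashes, REGEX mode between them. */
public class ModeLexer {
    private enum Mode { DEFAULT, REGEX }

    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Mode mode = Mode.DEFAULT;
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '/') {
                // Assumption: '/' always opens or closes a regex (no division operator).
                tokens.add("SLASH:/");
                mode = (mode == Mode.DEFAULT) ? Mode.REGEX : Mode.DEFAULT;
                i++;
            } else if (mode == Mode.REGEX) {
                // Inside a regex every character is its own token, so "ab" is never an ID.
                String kind = (c == '?' || c == '*' || c == '+') ? "REGEX_OP" : "REGEX_CHAR";
                tokens.add(kind + ":" + c);
                i++;
            } else if (Character.isWhitespace(c)) {
                // Whitespace is skipped only in DEFAULT mode.
                i++;
            } else if (Character.isLetter(c) || c == '_') {
                // Ordinary identifier: a letter or '_' followed by letters, digits, or '_'.
                int start = i;
                while (i < input.length()
                        && (Character.isLetterOrDigit(input.charAt(i)) || input.charAt(i) == '_')) {
                    i++;
                }
                tokens.add("ID:" + input.substring(start, i));
            } else {
                // Any other character becomes a single-character symbol token.
                tokens.add("SYM:" + c);
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("regex_match(/ab+c/, x)"));
    }
}
```

With tokens like these, the characters between the slashes never collide with ID, and a parser rule for the regex body could work on REGEX_CHAR and REGEX_OP tokens directly.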

Thanks,
Andrew

[Updated on: Fri, 15 November 2013 21:12]


Re: Context sensitive lexing [message #1188924 is a reply to message #1188843] Fri, 15 November 2013 22:09
Christian Dietrich
Hi, searching for "xtext + external lexer" might give you some hints.