Context sensitive lexing [message #1188843] |
Fri, 15 November 2013 21:11 |
Andrew Gacek Messages: 32 Registered: October 2011 |
Member |
I'm writing an Xtext grammar for a language which has built-in regular expression syntax. For example, the user can write
The problem I'm having is how to parse the regular expression so that I can warn the user about syntax errors inside it. Naively, if I try to write a parser rule for a regex, I might do something like:
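Regex:
'/' RegexBody '/'
;
// Left-recursive as written; this is the LL(*) issue set aside below.
RegexBody:
RegexBody '?'
| RegexBody '*'
| REGEX_TOKEN
;
// Lexer rule: any single character other than '?', '*' or '/'.
// Xtext terminal rules use '!' for negation (ANTLR's '~' is not accepted).
terminal REGEX_TOKEN:
!('?' | '*' | '/')
;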
Ignoring the LL(*) issues, the problem is that REGEX_TOKEN overlaps with ID. Indeed, something like /ab+c/ is lexed as '/' 'ab' '+' 'c' '/', because the lexer runs before the parser and has no idea it is inside a regex.
What I would really like is a way to make the lexer context-sensitive so that between the '/' characters it would treat characters differently. Is there a good way to do this in Xtext? Or more generally, a good way to handle this kind of nested sublanguage of regular expressions within a more general language?
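To make the idea concrete, here is a minimal sketch in Python (not Xtext) of the behaviour I have in mind: a hand-written lexer that flips into a "regex mode" at each '/' and emits one token per character until the closing '/'. The token names (SLASH, REGEX_CHAR, REGEX_OP, ID, OP) and the host-language patterns are made up for illustration, and it assumes '/' is only ever a regex delimiter and is never escaped inside the regex.

```python
import re

# Hypothetical token patterns for the host language (default mode).
DEFAULT_TOKENS = [
    ("ID", re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
    ("WS", re.compile(r"\s+")),
    ("OP", re.compile(r"[+*?=;]")),
]


def tokenize(text):
    """Lex text, switching to per-character tokens between '/' delimiters."""
    tokens = []
    in_regex = False
    pos = 0
    while pos < len(text):
        ch = text[pos]
        if ch == "/":
            # Assumption: '/' always opens or closes a regex (no escapes).
            tokens.append(("SLASH", ch))
            in_regex = not in_regex
            pos += 1
        elif in_regex:
            # Regex mode: every character is its own token, so 'ab' is
            # never merged into a single ID.
            kind = "REGEX_OP" if ch in "?*+" else "REGEX_CHAR"
            tokens.append((kind, ch))
            pos += 1
        else:
            # Default mode: first matching host-language pattern wins.
            for kind, pattern in DEFAULT_TOKENS:
                m = pattern.match(text, pos)
                if m:
                    if kind != "WS":  # whitespace is skipped
                        tokens.append((kind, m.group()))
                    pos = m.end()
                    break
            else:
                raise ValueError(f"unexpected character {ch!r} at offset {pos}")
    return tokens
```

With this, tokenize("/ab+c/") yields SLASH, REGEX_CHAR 'a', REGEX_CHAR 'b', REGEX_OP '+', REGEX_CHAR 'c', SLASH, whereas tokenize("ab + c") outside the slashes still yields ID 'ab', OP '+', ID 'c'. That mode switch is what I don't know how to express in an Xtext grammar.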
Thanks,
Andrew
[Updated on: Fri, 15 November 2013 21:12]