(no subject) [message #710334 is a reply to message #710202]
Thu, 04 August 2011 23:15
Henrik Lindberg
Registered: July 2009
There are three options (as I see it):
1) Define the grammar using a clever set of terminals and data rules in
your .xtext grammar
2) Override the generated lexer class, and do special processing for a
small selection of tokens
3) Use an external lexer
1) Sounds like you already ruled this out (it quickly gets messy with
overlapping terminals and whitespace handling). A pro is that if you do
specify the grammar, you get a lot for free, such as code completion
inside the special section.
2) This approach is not recommended for more than a couple of special
tokens. You can override the entry point to the lexer and thus take
alternate action on certain tokens, and prevent it from taking the
generated actions. You then simply use Guice to bind your implementation
instead of the generated one. (IIRC I use this approach in the Eclipse
b3 project, as I was not aware at the time that an external lexer could
be used.) As a starting point you can look at what is generated for a
small sample grammar, as that makes it straightforward to figure out how
it works.
3) You can use an "external lexer" with Xtext. That means that you can
replace the generated lexer with one generated from ANTLR source. In
ANTLR you can provide your own logic with the rules. It is not as
difficult as it sounds. I use this approach in Cloudsmith/Geppetto (at
github) as there are several difficult-to-parse features in the Puppet
language grammar (regular expressions and expression interpolation in
strings). The only tricky thing is the requirement to (manually) sync
the token enumerator values. Integrating the external lexer with MWE is
easy as it is a supported option.
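To illustrate the kind of custom logic that option 2 or 3 would carry for the nested-bracket case in the question, here is a minimal standalone Java sketch. The names BracketScanner and endOfBracketed are hypothetical and are not part of Xtext or ANTLR; how the result is turned into a token inside an overridden or external lexer is left out, and escapes or quoted brackets inside the expression are not handled.

```java
public final class BracketScanner {
    private BracketScanner() {}

    /**
     * Hypothetical helper: returns the index just past the ']' that matches
     * the '[' at position start, or -1 if there is no '[' at start or the
     * brackets are unbalanced.
     */
    public static int endOfBracketed(CharSequence input, int start) {
        if (start < 0 || start >= input.length() || input.charAt(start) != '[') {
            return -1;
        }
        int depth = 0;
        for (int i = start; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '[') {
                depth++;              // entering a nested pair
            } else if (c == ']') {
                depth--;              // leaving a pair
                if (depth == 0) {
                    return i + 1;     // matched the opening bracket
                }
            }
        }
        return -1;                    // ran out of input before matching
    }

    public static void main(String[] args) {
        String src = "[a[b]c] rest";
        int end = endOfBracketed(src, 0);
        // The text of the would-be EXPR token, brackets included.
        System.out.println(src.substring(0, end));
    }
}
```

A lexer using something like this would consume the characters up to the returned index and emit them as a single token, which the rest of the grammar then treats as an opaque string.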
Take a look at
- The PPOverridingLexer is of type 2, but only sets up a
"lastSignificantToken" to be used in the lexer.
- PPLexer.g is the external lexer in ANTLR.
Hope that helps.
On 8/4/11 10:06 PM, Ferdinando Villa wrote:
> in the DSL I'm developing, I'd like to support "expressions" that would
> look like regular strings to the language, delimited by unambiguous
> characters like open/close square brackets, to be returned as strings
> and parsed outside the grammar. The obvious choice would be something like
> terminal EXPR: '[' -> ']';
> but obviously, the contents of the expression may contain nested pairs
> of brackets, so this would stop at the first closed bracket. Obviously what I
> need is not a terminal, but I also wouldn't like to write a whole parser
> - just an extended lexer rule in Java that reads the input until the
> next matching bracket. I spent quite a bit of time with the docs without
> finding a way although I'm pretty sure this is possible.
> Advice please? Thanks so much in advance.