Re: Customized InternalMyDSLLexer [message #736001 is a reply to message #735947]
Thu, 13 October 2011 10:16
Henrik Lindberg Messages: 2509 Registered: July 2009
Senior Member
On 10/13/11 9:35 AM, Mathieu Garcia wrote:
> Yes Daniel,
> As I wrote above, I did it, but some methods in InternalMyDSLLexer are
> final. So, this solution is not very convenient.
>
> Using an external lexer seems to be the good solution but I don't know
> how implement it.
>
> Mathieu
Basically, you write the lexer .g file (in ANTLR syntax) and configure
the MWE workflow for your DSL to use the external lexer that is then
generated from it.
If you are looking at Geppetto - start in the runtime module:
    // contributed by org.eclipse.xtext.generator.parser.antlr.ex.rt.AntlrGeneratorFragment
    public com.google.inject.Provider<PPOverridingLexer> providePPOverridingLexer() {
        return org.eclipse.xtext.parser.antlr.LexerProvider.create(PPOverridingLexer.class);
    }
Look at PPOverridingLexer, which is needed to keep track of the last
seen token. It inherits from the generated PPLexer. If you don't need to
do anything special like this, use the generated lexer (PPLexer in my
case) directly.
The package org.cloudsmith.geppetto.pp.dsl.lexer contains the .g file
and some surrounding supporting stuff.
The mwe workflow contains this:
    // Use externally specified lexer
    fragment = parser.antlr.ex.ExternalAntlrLexerFragment {
        lexerGrammar = "org.cloudsmith.geppetto.pp.dsl.lexer.PPLexer"
        runtime = true
        antlrParam = "-lib"
        antlrParam = "${runtimeProject}/src-gen/org/cloudsmith/geppetto/pp/dsl/parser/antlr/lexer"
    }
Hope that helps.
If you don't know how the lexer works, I suggest you write a very small
grammar and look at what is generated.
The only thing I found to be a bit tricky is that the token values have
to be kept in sync. This is a bit messy if you are adding/removing
keywords or terminals in your grammar.
In short, you keep the terminals and keywords in your grammar. The
definitions of the terminals will never be used, but the generated token
values for your keywords and terminals are. A token is just a number
(plus the text that was lexed for it), so if the grammar thinks "if" is
34 and your lexer delivers 35 when it sees "if", you get very strange
results.
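
To make the sync problem concrete, here is a minimal sketch of a check
you could run after regenerating. It assumes you have the text of two
ANTLR .tokens files (lines of the form name=number), one from the
grammar and one from the external lexer. The class TokenSyncCheck and
its methods are made up for illustration, they are not part of Xtext or
Geppetto, and the token names in main are invented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Xtext or Geppetto.
public class TokenSyncCheck {

    /** Parses "name=number" lines, the layout used in ANTLR .tokens files. */
    public static Map<String, Integer> parseTokens(String text) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            line = line.trim();
            // Use the last '=' so that a literal such as '='=12 still parses.
            int eq = line.lastIndexOf('=');
            if (line.isEmpty() || eq < 0) {
                continue;
            }
            String name = line.substring(0, eq);
            int value = Integer.parseInt(line.substring(eq + 1).trim());
            result.put(name, value);
        }
        return result;
    }

    /** Lists every token whose number differs, or that the lexer lacks. */
    public static List<String> mismatches(Map<String, Integer> grammar,
                                          Map<String, Integer> lexer) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : grammar.entrySet()) {
            Integer other = lexer.get(e.getKey());
            if (other == null) {
                out.add(e.getKey() + ": missing in external lexer");
            } else if (!other.equals(e.getValue())) {
                out.add(e.getKey() + ": grammar=" + e.getValue()
                        + ", lexer=" + other);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Invented sample data mirroring the "if is 34 vs 35" case.
        String fromGrammar = "RULE_ID=4\n'if'=34\n";
        String fromLexer = "RULE_ID=4\n'if'=35\n";
        for (String m : mismatches(parseTokens(fromGrammar),
                                   parseTokens(fromLexer))) {
            System.out.println(m); // prints: 'if': grammar=34, lexer=35
        }
    }
}
```

An empty result means the two vocabularies agree on every token the
grammar knows about; anything else is worth fixing before you start
debugging the parser.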
It should be possible to figure out how it works in general from looking
at the Geppetto logic; it is not very complicated. If you really want to
dig into the string interpolation logic, you need to know that
double-quoted strings can contain ${expr}, $varName or ${varName}, and
that these constructs can be escaped with \.
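
As a rough illustration of those interpolation rules (this is not the
Geppetto lexer, just a standalone sketch), the following splits the body
of a double-quoted string into text, variable and expression segments.
The class name, the segment labels, and the rule that a variable name is
letters, digits and underscores are all assumptions made for the
example. Nested braces inside ${...} are not handled, and ${varName}
simply comes out as an expression segment.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not taken from Geppetto.
public class InterpolationSketch {

    /** Splits a double-quoted string body into TEXT, VAR and EXPR segments. */
    public static List<String> segments(String body) {
        List<String> out = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        int i = 0;
        while (i < body.length()) {
            char c = body.charAt(i);
            boolean hasNext = i + 1 < body.length();
            if (c == '\\' && hasNext) {
                // Escaped character: keep it as plain text, so "\$" never
                // starts an interpolation.
                text.append(body.charAt(i + 1));
                i += 2;
            } else if (c == '$' && hasNext && body.charAt(i + 1) == '{'
                    && body.indexOf('}', i + 2) >= 0) {
                // ${...}: everything up to the first closing brace.
                int close = body.indexOf('}', i + 2);
                flush(text, out);
                out.add("EXPR(" + body.substring(i + 2, close) + ")");
                i = close + 1;
            } else if (c == '$' && hasNext && isNameStart(body.charAt(i + 1))) {
                // $varName: consume the longest run of name characters.
                int end = i + 1;
                while (end < body.length() && isNamePart(body.charAt(end))) {
                    end++;
                }
                flush(text, out);
                out.add("VAR(" + body.substring(i + 1, end) + ")");
                i = end;
            } else {
                text.append(c);
                i++;
            }
        }
        flush(text, out);
        return out;
    }

    // Assumed naming rule for the sketch: letters, digits and underscore.
    private static boolean isNameStart(char c) {
        return Character.isLetter(c) || c == '_';
    }

    private static boolean isNamePart(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }

    /** Emits any pending plain text as one TEXT segment. */
    private static void flush(StringBuilder text, List<String> out) {
        if (text.length() > 0) {
            out.add("TEXT(" + text + ")");
            text.setLength(0);
        }
    }

    public static void main(String[] args) {
        // The string body here is: a $x \${y} ${1 + 2}
        System.out.println(segments("a $x \\${y} ${1 + 2}"));
        // prints: [TEXT(a ), VAR(x), TEXT( ${y} ), EXPR(1 + 2)]
    }
}
```

A real lexer has to do this while producing tokens with the right token
numbers, which is where keeping track of the last seen token comes in.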
You can also look at mwe2 - it is also using an external lexer, and it
is where I picked up how to do some of this.
- henrik