Supporting unquoted string with spaces [message #1639487] |
Fri, 27 February 2015 12:10  |
Eclipse User |
Hi guys,
I am trying to support unquoted strings with spaces, so I have built this grammar:
/*
 * Document
 */
Document:
    name=UNQUOTED_STRING;

/*
 * Terminals
 */
terminal WS:
    (' ' | '\t')+;

terminal EOL:
    LINE_BREAK;

terminal UNQUOTED_STRING:
    !NON_QUOTED_STRING_START -> NON_QUOTED_STRING_END;

terminal fragment LINE_BREAK:
    ('\r' | '\n');

terminal fragment NON_QUOTED_STRING_START:
    '"'|"'"|'0'..'9'|'!'|'#'|'$'|'('|')'|'*'|'+'|','|'-'|'.'|'/'|':'|'<'|'='|'>'|'?'|'['|']'|'{'|'}'|'|'|'%'|'^'|'@'|'\r'|'\n'|' '|'\t';

terminal fragment NON_QUOTED_STRING_END:
    ('!'|'#'|'$'|'('|')'|'*'|','|'.'|'/'|':'|'<'|'='|'>'|'?'|'['|']'|'{'|'}'|'|'|'%'|'^'|'\r'|'\n');
And then I tested the lexer using the following code:
InternalMyDslLexer lexer = new InternalMyDslLexer(new ANTLRStringStream("Data"));
Token token = lexer.nextToken();
// A token type of -1 marks the end of the input (EOF).
while (token.getType() != -1) {
    System.out.println(token);
    token = lexer.nextToken();
}
The lexer is not able to tokenize my string in every case; whether it works depends on the input.
Inputs:
Data --> Does not work
Data with spaces --> Works ok
Data\n --> Works ok
Do you have any idea why my lexer is not able to parse the first input, which is a single word?
Thanks in advance.
Regards,
Luis
Re: Supporting unquoted string with spaces [message #1644862 is a reply to message #1644143] |
Mon, 02 March 2015 02:53   |
Eclipse User |
Hi
Seems interesting. I nearly replied to your original message suggesting
that using Xtext for a lexing problem was crazy, but you seem to have a
new way of using Xtext that I do not understand.
Please elaborate on "you will need to split lexer and parser using the a
fragment because text by default use a lexer/parser in the same file".
I'm only aware of grammar splitting by the grammar...with... daisy
chain. I'm not sure which fragment you refer to: both of the existing
AntlrGeneratorFragments, your custom fragment or ...
Regards
Ed Willink
On 01/03/2015 23:35, Luis De Bello wrote:
> Hi guys,
>
> I am replying to my own question; maybe this can be useful to others. I
> was able to support unquoted strings with spaces using a custom lexer.
> Below are my Xtext terminals and the relevant portion of my lexer.
>
> Xtext file:
> terminal UNQUOTED_STRING:
> !NON_QUOTED_STRING_START !(NON_QUOTED_STRING_END)*;
>
> terminal fragment NON_QUOTED_STRING_START:
> '"'|"'"|'0'..'9'|'!'|'#'|'$'|'('|')'|'*'|'+'|','|'-'|'.'|'/'|':'|'<'|'='|'>'|'?'|'['|']'|'{'|'}'|'|'|'%'|'^'|'@'|'\r'|'\n'|' '|'\t';
>
> terminal fragment NON_QUOTED_STRING_END:
> ('!'|'#'|'$'|'('|')'|'*'|','|'.'|'/'|':'|'<'|'='|'>'|'?'|'['|']'|'{'|'}'|'|'|'%'|'^'|'\r'|'\n');
>
>
> Lexer grammar:
> RULE_UNQUOTED_STRING : {!isKeyword()}?=>
> ~(RULE_NON_QUOTED_STRING_START) ({!isIsolatedKeyword()}?=>
> ~(RULE_NON_QUOTED_STRING_END))*;
>
> isKeyword and isIsolatedKeyword are two methods I implemented to check
> for keywords using lookahead. Their details will depend on each
> implementation.
>
> I hope this will be useful for others also ,you will need to split
> lexer and parser using the a fragment because text by default use a
> lexer/parser in the same file.
>
> Regards,
> Luis
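
For readers following along, the revised terminal quoted above (`!NON_QUOTED_STRING_START !(NON_QUOTED_STRING_END)*`) can be modelled in plain Java. This is only a rough sketch of that single rule taken in isolation: it is not the lexer that Xtext or ANTLR generates, it leaves out the isKeyword and isIsolatedKeyword predicates, and the class and method names are made up for illustration. The two character sets are copied from the terminal fragments in the thread.

```java
final class UnquotedStringSketch {
    // Characters that may NOT begin an unquoted string (NON_QUOTED_STRING_START).
    static final String START_EXCLUDED =
        "\"'0123456789!#$()*+,-./:<=>?[]{}|%^@\r\n \t";

    // Characters that end an unquoted string (NON_QUOTED_STRING_END).
    static final String END_CHARS = "!#$()*,./:<=>?[]{}|%^\r\n";

    /**
     * Models !NON_QUOTED_STRING_START !(NON_QUOTED_STRING_END)* at the given
     * offset. Returns the number of characters matched, or 0 if the first
     * character is not allowed to start an unquoted string.
     */
    static int matchLength(String input, int offset) {
        if (offset >= input.length()
                || START_EXCLUDED.indexOf(input.charAt(offset)) >= 0) {
            return 0;
        }
        int end = offset + 1;
        // Zero or more characters that are not end characters; reaching the
        // end of the input simply stops the loop, no terminator is required.
        while (end < input.length()
                && END_CHARS.indexOf(input.charAt(end)) < 0) {
            end++;
        }
        return end - offset;
    }

    public static void main(String[] args) {
        String[] samples = {"Data", "Data with spaces", "Data\n", "42"};
        for (String sample : samples) {
            System.out.println(matchLength(sample, 0));
        }
    }
}
```

In this model a single word such as "Data" is matched in full even though no end character follows it, because the `*` loop accepts zero or more characters and stops quietly at the end of the input.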
Re: Supporting unquoted string with spaces [message #1647584 is a reply to message #1644862] |
Tue, 03 March 2015 09:05  |
Eclipse User |
Hi Ed,
What I meant is that you need to use a fragment provided by Xtext to split the lexer and parser grammars:
// Splits lexer and parser generation; this is used to replace the default fragment "parser.antlr.XtextAntlrGeneratorFragment"
fragment = org.eclipse.xtext.generator.parser.antlr.ex.rt.AntlrGeneratorFragment {}
After splitting the grammar, you can start adding some context to your lexer using one additional fragment:
// Uses ANTLR Tools to compile a custom lexer and will also add a binding in the runtime module to use the Lexer
fragment = parser.antlr.ex.ExternalAntlrLexerFragment {
    // A grammar file extension with .g will be expected in this package (Should be stored in src folder)
    lexerGrammar = "org.mule.tooling.dfl.parser.antlr.lexer.InternalDFLLexer"
    runtime = true
    antlrParam = "-lib"
    // This is the folder where the lexer will be created
    antlrParam = "${runtimeProject}/src-gen/org/mule/tooling/dfl/parser/antlr/lexer"
}
Now you have your own lexer grammar and you can add predicates and context using lookahead (LA) techniques. The only issue with predicates is that Xtext only handles NoViableAltException in its recovery mode, so you will have to override the nextToken method or use an awful hack such as replacing "FailedPredicateException" with "NoViableAltException". The hack works for me, but it is not a nice solution.
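
The exception translation described above can be illustrated with a small self-contained sketch. Everything in it is a hypothetical stand-in: FailedPredicate and NoViableAlt are local classes that only mimic the roles of ANTLR's FailedPredicateException and NoViableAltException, and TokenRule, SAMPLE_RULE and nextToken are invented names, not Xtext or ANTLR API. It shows the shape of the idea only: a recovery path that understands just one failure type, and a wrapper that converts the other failure type into it.

```java
final class PredicateFailureSketch {
    // Hypothetical stand-in for a failed semantic predicate (not the ANTLR class).
    static final class FailedPredicate extends Exception {
        FailedPredicate(String predicateText) { super(predicateText); }
    }

    // Hypothetical stand-in for the only failure the recovery code understands.
    static final class NoViableAlt extends Exception {
        NoViableAlt(String message) { super(message); }
    }

    // Hypothetical token rule: returns a token description or fails.
    interface TokenRule {
        String match(String input) throws FailedPredicate, NoViableAlt;
    }

    // Sample rule guarded by a predicate that rejects the made-up keyword "end".
    static final TokenRule SAMPLE_RULE = input -> {
        if (input.equals("end")) {
            throw new FailedPredicate("!isKeyword()");
        }
        return "UNQUOTED_STRING(" + input + ")";
    };

    // Runs the rule and converts a predicate failure into the failure type
    // that the recovery path below knows how to handle.
    static String matchTranslating(TokenRule rule, String input) throws NoViableAlt {
        try {
            return rule.match(input);
        } catch (FailedPredicate e) {
            throw new NoViableAlt("predicate failed: " + e.getMessage());
        }
    }

    // Recovery path: it only catches NoViableAlt, so without the translation
    // above a FailedPredicate would never reach it.
    static String nextToken(TokenRule rule, String input) {
        try {
            return matchTranslating(rule, input);
        } catch (NoViableAlt e) {
            return "INVALID(" + input + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(nextToken(SAMPLE_RULE, "Data"));
        System.out.println(nextToken(SAMPLE_RULE, "end"));
    }
}
```

In a real Xtext project the equivalent logic would live in the custom lexer itself, and the exact override points depend on the ANTLR and Xtext versions in use.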
I hope I have made myself clear and that this is useful for you.
Regards,
Luis