Re: parsing arguments without separating whitespace [message #1092311 is a reply to message #1092303] |
Thu, 22 August 2013 15:51 |
|
Sorry, I don't have the time to test this.
Both mechanisms are standard ways to solve the "Xtext finds a keyword" problem,
and your grammar seems to contain no whitespace at all (is this really wanted?),
since the parser is eager to eat everything up unless
Syntax: FOURX id=ID;
ID : '^'?(PRIMITIVE|NONPRIM|'_'| 'x')(PRIMITIVE|NONPRIM|NUMBER|'_'| 'x')*;
FOURX : 'x' 'x' 'x' 'x';
terminal PRIMITIVE: 'B'|'D'|'F'|'I'|'J'|'V'|'Z';
terminal NONPRIM: 'A'..'Z'|'a'..'z';
terminal NUMBER:'0'..'9';
Twitter : @chrdietrich
Blog : https://www.dietrich-it.de
|
Re: parsing arguments without separating whitespace [message #1095604 is a reply to message #1095552] |
Tue, 27 August 2013 09:22 |
Michael Schnupp Messages: 8 Registered: August 2013 |
Junior Member |
|
|
Claudio Heeg wrote on Tue, 27 August 2013 03:55
I'm sorry, but I can't quite follow.
You do want to have 'xxxx' (in this case) as a keyword somewhere in your language but also as a possible name of a function?
No, I want "end" as a keyword and "sendMessage" as a valid identifier (e.g. a method name).
Similarly, in the xxxx example I want "xxxx" as a keyword and "_xxxx_" as a valid identifier.
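
To make the requirement concrete, here is a minimal Python sketch (not Xtext code) of the behaviour being asked for: an identifier-shaped run of characters counts as a keyword only when the whole lexeme equals the keyword. The names `KEYWORDS`, `IDENT` and `classify` are made up for this illustration, and the identifier pattern is an assumption rather than the real grammar.

```python
import re

# Hypothetical keyword set, taken from the two examples in this thread.
KEYWORDS = {"end", "xxxx"}

# Assumed identifier shape: letters, digits and underscores, not starting with a digit.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def classify(text):
    """Label every identifier-shaped lexeme in text as KEYWORD or ID."""
    tokens = []
    for match in IDENT.finditer(text):
        lexeme = match.group()
        # Only an exact, whole-lexeme match makes a keyword, so the "end"
        # inside "sendMessage" never becomes a keyword on its own.
        kind = "KEYWORD" if lexeme in KEYWORDS else "ID"
        tokens.append((kind, lexeme))
    return tokens


print(classify("end sendMessage xxxx _xxxx_"))
```

This relies on whitespace (or some other non-identifier character) separating the lexemes, which is exactly what the grammar under discussion does not guarantee.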
[Updated on: Tue, 27 August 2013 09:23]
|
|
|
Re: parsing arguments without separating whitespace [message #1095662 is a reply to message #1095604] |
Tue, 27 August 2013 11:03 |
Claudio Heeg Messages: 75 Registered: April 2013 |
Member |
|
|
I see.
So the problem once again seems to be the greediness of the lexer: you want "II" to be read as method arguments, but (if ID is a terminal) the lexer packs as many characters as possible into one token, turning it into a single ID instead of two separate PRIMITIVEs.
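
The effect can be reproduced with a small, generic longest-match tokenizer in Python. This is only a sketch of the general principle, not the lexer that Xtext/ANTLR actually generates; the rule names mirror the terminals mentioned in this thread, while `RULES` and `lex_longest_match` are names invented for the illustration.

```python
import re

# Token rules in priority order (earlier rules win ties).
RULES = [
    ("PRIMITIVE", re.compile(r"[BDFIJVZ]")),
    ("ID", re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
    ("LPAREN", re.compile(r"\(")),
    ("RPAREN", re.compile(r"\)")),
]


def lex_longest_match(text):
    """Tokenize text, always taking the longest match at each position."""
    tokens = []
    pos = 0
    while pos < len(text):
        best = None
        for name, pattern in RULES:
            match = pattern.match(text, pos)
            # A strictly longer match replaces the current best;
            # on equal length the earlier rule is kept.
            if match and (best is None or match.end() > best[1].end()):
                best = (name, match)
        if best is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append((best[0], best[1].group()))
        pos = best[1].end()
    return tokens


print(lex_longest_match("add(II)I"))
```

For "add(II)I" the two argument letters come out as one ID token "II", because ID matches two characters where PRIMITIVE matches only one. The trailing "I" is still a PRIMITIVE, since both rules match one character there and the earlier rule wins.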
Is it possible for you to restrict identifiers so that they never begin with a PRIMITIVE, or can that not be changed within the language itself?
Otherwise I'm afraid I'm at a loss here and hope someone more knowledgeable will come along.
[Updated on: Tue, 27 August 2013 11:03]
|
Re: parsing arguments without separating whitespace [message #1096020 is a reply to message #1090904] |
Tue, 27 August 2013 21:36 |
Henrik Lindberg Messages: 2509 Registered: July 2009 |
Senior Member |
|
|
On 2013-21-08 17:41, Michael Schnupp wrote:
> Hello,
>
> I try to build an xtext grammar for parsing smali code.
>
> Method names are normal IDs, simple data types are just single letters
> "I" for Integer, "D" for double, etc.
>
> Method signatures are written like "add(II)I" which denotes a method
> with name "add" taking two integers and returning an integer.
>
> Another valid signature would be "III(III)I" - a method with name "III"
> taking three integers ("III").
>
> The most natural grammar would be:
>
> ID'('Type*')'Type
>
> Unfortunately the lexer always tags strings like "III" as an ID, even
> when Type* would be correct.
>
> What is the right way to build such a grammar?
I have followed the conversation that followed this post, and it seems
that however you try to work around the issues, no approach gets it
completely right.
I was in a similar situation with the Puppet language. The only way I
know to solve this well is to use an external lexer. That is, define the
grammar in a natural way, and solve all the lexing problems in an external
ANTLR-based lexer where you have full control over how the input is
tokenized. (Xtext supports this.)
It takes a bit of setup, but works really well otherwise. I would not be
able to handle the Puppet language without it.
My implementation is in cloudsmith / geppetto @ github
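
As a rough illustration of what "full control over the lexing" buys you, here is a minimal hand-written Python sketch. It is not the Geppetto implementation and not an ANTLR lexer; `PRIMITIVES` and `lex_signature` are names invented for this example, and it assumes signatures of the form ID'('Type*')'Type with only the single-letter primitive types from this thread (no object or array types, and no validation of the method name).

```python
# Single-letter primitive types, as listed in the PRIMITIVE terminal above.
PRIMITIVES = set("BDFIJVZ")


def lex_signature(text):
    """Tokenize a signature like 'add(II)I' using the position as context."""
    open_idx = text.find("(")
    close_idx = text.rfind(")")
    if open_idx < 1 or close_idx < open_idx:
        raise ValueError(f"not a method signature: {text!r}")

    # Everything before '(' is the method name, even if it looks like types.
    tokens = [("ID", text[:open_idx]), ("LPAREN", "(")]

    # Between the parentheses every character is its own type token.
    for ch in text[open_idx + 1:close_idx]:
        if ch not in PRIMITIVES:
            raise ValueError(f"unknown argument type {ch!r}")
        tokens.append(("PRIMITIVE", ch))
    tokens.append(("RPAREN", ")"))

    # Exactly one return type must follow the closing parenthesis.
    ret = text[close_idx + 1:]
    if len(ret) != 1 or ret not in PRIMITIVES:
        raise ValueError(f"bad return type {ret!r}")
    tokens.append(("PRIMITIVE", ret))
    return tokens


print(lex_signature("III(III)I"))
```

Because the lexer knows whether it is before, inside or after the parentheses, "III" is one ID as the method name and three PRIMITIVE tokens as the argument list.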
Regards
- henrik