parsing arguments without separating whitespace [message #1090904]
Tue, 20 August 2013 16:27
Eclipse User
Hello,
I am trying to build an Xtext grammar for parsing smali code.
Method names are normal IDs; simple data types are single letters: "I" for integer, "D" for double, etc.
Method signatures are written like "add(II)I", which denotes a method named "add" taking two integers and returning an integer.
Another valid signature would be "III(III)I": a method named "III" taking three integers ("III") and returning an integer.
The most natural grammar would be:
ID'('Type*')'Type
Unfortunately the lexer always tags strings like "III" as an ID, even where Type* would be correct.
What is the right way to build such a grammar?
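The behaviour described above can be reproduced outside Xtext. The following is only a minimal sketch in Python, not the generated lexer: it assumes a context-free tokenizer in which the longest match wins and rule order only breaks ties, and it assumes "I" and "D" are the only type letters. The names TOKEN_RULES and naive_tokenize are made up for this illustration.

```python
import re

# Hypothetical token rules for illustration only. The longest match wins;
# on a tie, the rule listed first wins.
TOKEN_RULES = [
    ("TYPE", re.compile(r"[ID]")),            # assumption: only I and D are types
    ("ID", re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
    ("LPAREN", re.compile(r"\(")),
    ("RPAREN", re.compile(r"\)")),
]


def naive_tokenize(text):
    """Tokenize without any context: every position is treated the same."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in TOKEN_RULES:
            match = pattern.match(text, pos)
            # Strictly longer matches replace the current best one.
            if match and len(match.group()) > best_len:
                best_name, best_len = name, len(match.group())
        if best_name is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


# The parameter list "II" comes out as one ID, so Type* never gets a chance.
print(naive_tokenize("add(II)I"))
```

Under these assumptions the argument list is a single ID token by the time the parser sees it, which is why no reordering of parser rules alone can fix the problem.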
Re: parsing arguments without separating whitespace [message #1096020 is a reply to message #1090904]
Tue, 27 August 2013 17:36
Eclipse User
On 2013-21-08 17:41, Michael Schnupp wrote:
> Hello,
>
> I try to build an xtext grammar for parsing smali code.
>
> Method names are normal IDs, simple data types are just single letters
> "I" for Integer, "D" for double, etc.
>
> Method signatures are written like "add(II)I" which denotes a method
> with name "add" taking two integers and returning an integer.
>
> Another valid signature would be "III(III)I" - a method with name "III"
> taking three Integer "III".
>
> The most natural grammar would be:
>
> ID'('Type*')'Type
>
> Unfortunately the lexer always tags strings like "III" as an ID, even
> when Type* would be correct.
>
> What is the right way to build such a grammar?
I have followed the conversation that followed this post, and it seems
that however you try to work around the issues, no approach gets it
completely right.
I was in a similar situation with the Puppet language. The only good way I
know to solve this is to use an external lexer. That is, define the
grammar in a natural way, and solve all the problems in an external
ANTLR-based lexer where you have full control over the lexing (Xtext
supports this).
It takes a bit of setup, but works really well otherwise. I would not be
able to handle the Puppet language without it.
My implementation is in cloudsmith / geppetto @ github
Regards
- henrik
Re: parsing arguments without separating whitespace [message #1096442 is a reply to message #1096020]
Wed, 28 August 2013 07:56
Eclipse User
Hi,
I also tried to solve a lexer problem by using datatype rules. It was a dead end because of similar side effects. My solution at the moment is a lexer with semantic predicates.
Maybe a semantic predicate for the token "Type" can help here, too. Whether it is possible to find a predicate that decides between ID and Type depends on the grammar.
Semantic predicates are not supported by the Xtext-generated lexer, so a custom lexer is required. I did not want to build a custom lexer, so I just tweak the Xtext-generated lexer each time within the MWE2 workflow (see http://www.eclipse.org/forums/index.php/mv/msg/494581/1073856/#msg_1073856 ; it is just quick and dirty and can obviously be improved a lot).
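To make the predicate idea concrete, here is a rough sketch in Python rather than in ANTLR or Xtext syntax. It is not the tweaked lexer from the linked post. It assumes that the input is a single method signature, that "I" and "D" are the only type letters, and that everything after the opening parenthesis is a type. The names PRIMITIVE_TYPES and tokenize_signature are invented for this example.

```python
PRIMITIVE_TYPES = {"I", "D"}  # assumption: only the letters named in the thread


def tokenize_signature(text):
    """Tokenize one signature, choosing between ID and TYPE by context."""
    tokens = []
    pos = 0
    seen_lparen = False  # the context that the "predicate" consults
    while pos < len(text):
        ch = text[pos]
        if ch == "(":
            tokens.append(("LPAREN", ch))
            seen_lparen = True
            pos += 1
        elif ch == ")":
            tokens.append(("RPAREN", ch))
            pos += 1
        elif seen_lparen and ch in PRIMITIVE_TYPES:
            # Predicate holds: after "(", each type letter is its own token.
            tokens.append(("TYPE", ch))
            pos += 1
        elif ch.isalpha() or ch == "_":
            # Predicate does not hold: consume a whole identifier.
            end = pos + 1
            while end < len(text) and (text[end].isalnum() or text[end] == "_"):
                end += 1
            tokens.append(("ID", text[pos:end]))
            pos = end
        else:
            raise ValueError(f"unexpected character {ch!r} at {pos}")
    return tokens


print(tokenize_signature("III(III)I"))
```

Whether such a simple predicate is enough depends on the rest of the grammar; if signatures can appear in the middle of other constructs, the context flag would need to be reset or replaced by something more precise.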
[Updated on: Wed, 28 August 2013 07:56] by Moderator