Eclipse Community Forums
Debugging lexer & parser [message #635699] Wed, 27 October 2010 14:24
Dénes Harmath
Hi all,

what is the quickest & simplest way to visualize the token stream & the AST of a file for debugging purposes?

Thanks in advance,
thSoft
Re: Debugging lexer & parser [message #635700 is a reply to message #635699] Wed, 27 October 2010 14:32
Meinte Boersma
Open the DSL instance (file) using the Sample Reflective Ecore Model editor (Open with > Other...). That gives you the actual EMF model (instance).

Re: Debugging lexer & parser [message #635708 is a reply to message #635699] Wed, 27 October 2010 15:10
Sebastian Zarnekow
Hi Dennis,

this one may help to visualize the node model:
https://github.com/ralfebert/org.eclipselabs.xtext.nodeoutline

Regards,
Sebastian
--
Need professional support for Eclipse Modeling?
Go visit: http://xtext.itemis.com

Re: Debugging lexer & parser [message #635724 is a reply to message #635699] Wed, 27 October 2010 16:33
Dénes Harmath
Thanks, both are fine ways for visualizing the node model. But how can I quickly see the output of the lexer, the token stream?
Re: Debugging lexer & parser [message #635734 is a reply to message #635724] Wed, 27 October 2010 17:31
Sebastian Zarnekow
Hi Dennis,

I'd write a unit test and invoke the lexer on sample input. The tokens
provide a useful toString, so adding all of them to an ArrayList and
inspecting it may be helpful. This (probably somewhat outdated)
blog post may provide some inspiration:
http://blogs.itemis.de/stundzig/archives/726

Regards,
Sebastian
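
A minimal sketch of the approach suggested in this reply, written in plain Java so that it runs without Xtext on the classpath. The class `TokenDump` and its `drain` helper are hypothetical names, not part of Xtext or ANTLR, and the token source in `main` is only a stand-in for a real lexer. The commented lines show how a generated lexer might be plugged in, assuming it is an ANTLR 3 lexer named `InternalMyDslLexer` (the actual name depends on the grammar).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.Supplier;

final class TokenDump {

    /**
     * Pulls tokens from the source until the EOF predicate matches.
     * The EOF token itself is not added to the result.
     */
    static <T> List<T> drain(Supplier<T> nextToken, Predicate<T> isEof) {
        List<T> tokens = new ArrayList<>();
        for (T t = nextToken.get(); !isEof.test(t); t = nextToken.get()) {
            tokens.add(t);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Stand-in token source: hands out fixed strings, then "<EOF>" forever.
        Iterator<String> words = Arrays.asList("entity", "Foo", "{", "}").iterator();
        Supplier<String> source = () -> words.hasNext() ? words.next() : "<EOF>";
        System.out.println(drain(source, "<EOF>"::equals));

        // With a generated lexer (class name is an assumption) this might become:
        //   Lexer lexer = new InternalMyDslLexer(new ANTLRStringStream("entity Foo {}"));
        //   List<Token> tokens = drain(lexer::nextToken, t -> t.getType() == Token.EOF);
        // and the list can then be inspected in a debugger or asserted on in a unit test.
    }
}
```

Keeping the loop independent of the lexer class makes it easy to reuse the same helper from a unit test and from an ad-hoc debugging session.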
Re: Debugging lexer & parser [message #993316 is a reply to message #635734] Sun, 23 December 2012 22:06
Barrie Treloar
Sebastian Zarnekow wrote on Wed, 27 October 2010 17:31:
> I'd write a unit test and invoke the lexer on sample input. The tokens
> provide a useful toString, so adding all of them to an ArrayList and
> inspecting it may be helpful. This (probably somewhat outdated)
> blog post may provide some inspiration:
> http://blogs.itemis.de/stundzig/archives/726

Thanks for this, it has been useful.

I've updated your example to support Xtext 2 at http://baerrach.blogspot.com.au/2012/12/lexer-and-parsers-tests-for-xtext.html (it also allows instantiation, so you can debug token streams from other classes).

I found that I could mostly get away with the results from ParserHelper2.parse() and traversing the DSL classes to unit test my grammar.

It wasn't until I tried some insane stuff with lexer rules and character negation to get "unquoted strings" (what I am stuck with) working in the grammar that the lexer stopped behaving correctly.

terminal fragment WS_CHAR:
    ' ' | '\t' | '\r' | '\n';

terminal UNQUOTED_STRING:
    !'"' (!('"' | WS_CHAR))*;


At this point being able to see what the tokens are is extremely useful.
It clearly shows that lexer rules are being hidden and swallowed up incorrectly by this rule.
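
To illustrate the kind of shadowing described here, the following is a small Java model, not the real Xtext/ANTLR lexer. It assumes a simplified strategy (longest match wins, ties go to the rule listed first) and simplified versions of the ID, INT and WS terminals; the class `ShadowingDemo` and its methods are hypothetical. `GREEDY` mirrors the negation-based UNQUOTED_STRING rule above, `RESTRICTED` mirrors an explicit character list like the one further down in this post.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ShadowingDemo {

    // !'"' (!('"' | WS_CHAR))*  : first char is anything but a quote (even whitespace).
    static final String GREEDY = "[^\"][^\" \\t\\r\\n]*";
    // Letter first, then letters, digits, '.', '_', '-', '\'.
    static final String RESTRICTED = "[a-zA-Z][a-zA-Z0-9._\\\\-]*";

    /** Ordered rule set; UNQUOTED_STRING is listed first (an assumption of this model). */
    static Map<String, Pattern> rules(String unquotedString) {
        Map<String, Pattern> rules = new LinkedHashMap<>();
        rules.put("UNQUOTED_STRING", Pattern.compile(unquotedString));
        rules.put("ID", Pattern.compile("[a-zA-Z_][a-zA-Z0-9_]*"));
        rules.put("INT", Pattern.compile("[0-9]+"));
        rules.put("WS", Pattern.compile("[ \\t\\r\\n]+"));
        return rules;
    }

    /** Tokenizes input: at each offset the longest match wins, ties go to the earlier rule. */
    static List<String> lex(Map<String, Pattern> rules, String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
                Matcher m = rule.getValue().matcher(input).region(pos, input.length());
                // Strictly longer only, so an earlier rule keeps a tie.
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestName = rule.getKey();
                    bestEnd = m.end();
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("no rule matches at offset " + pos);
            }
            tokens.add(bestName + ":'" + input.substring(pos, bestEnd) + "'");
            pos = bestEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(lex(rules(GREEDY), "x 42"));
        System.out.println(lex(rules(RESTRICTED), "x 42"));
    }
}
```

In this model the greedy rule yields `[UNQUOTED_STRING:'x', UNQUOTED_STRING:' 42']`: the leading `!'"'` also accepts whitespace, so both WS and INT disappear. The restricted rule yields `[UNQUOTED_STRING:'x', WS:' ', INT:'42']`; INT survives, although UNQUOTED_STRING still wins the tie against ID.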

In the end I couldn't resolve this with character negation, so I settled for an incorrect grammar, knowing the workaround was to quote the damn string.
Trying to include punctuation correctly causes the lexer rule to hide INT and the other "punctuation" keywords that get used.

terminal fragment UNQUOTED_STRING_FIRST_CHARACTER:
    ('a'..'z' | 'A'..'Z');

terminal fragment UNQUOTED_STRING_OTHER_CHARACTERS:
    (UNQUOTED_STRING_FIRST_CHARACTER | '0'..'9' | '.' | '_' | '-' | '\\');

terminal UNQUOTED_STRING:
    (UNQUOTED_STRING_FIRST_CHARACTER) (UNQUOTED_STRING_OTHER_CHARACTERS)*;
Re: Debugging lexer & parser [message #993372 is a reply to message #993316] Mon, 24 December 2012 02:21
Alexander Nittka
Hi,

is there a reason you can't make your unquoted string definition a datatype rule rather than a terminal rule?

Alex


Need training, onsite consulting or any other kind of help for Xtext?
Go visit http://xtext.itemis.com or send a mail to xtext@itemis.de
Re: Debugging lexer & parser [message #993416 is a reply to message #993372] Mon, 24 December 2012 05:11
Barrie Treloar
Alexander Nittka wrote on Mon, 24 December 2012 02:21:
> is there a reason you can't make your unquoted string definition a datatype rule rather than a terminal rule?


Because the common Terminals package does so for Strings?
I assumed that was the best way for unquoted strings.

I had looked at the documentation on parser rules, but it says:

> Character ranges, wildcards, the until token and the negation as well as the EOF token are only available for terminal rules.

I took that to mean these constructs are unavailable in datatype rules as well.

Since an unquoted string can be anything that does not contain a space, I couldn't work out how to build this with anything but a terminal rule. In the end I gave up, stripped the rule back to the bare bones, and made it work with the existing instance of the grammar I had to test against.

I'm still very new to all this, so if you can craft a Datatype Rule I'm willing to give it a go.
Re: Debugging lexer & parser [message #993446 is a reply to message #993416] Mon, 24 December 2012 07:14
Henrik Lindberg
Datatype rules are fine unless you end up with lots of tiny tokens,
have to maintain long lists of things to include (because 'not' is not
possible), or end up fiddling a lot with hidden on/off (which has
subtle unwanted effects in several places).

When you reach that point, you can write a more powerful external lexer.
I thought that would be good to know if you start feeling you are digging
yourself into a hole.

Regards
- henrik

Re: Debugging lexer & parser [message #993620 is a reply to message #993446] Mon, 24 December 2012 19:48
Barrie Treloar
Henrik Lindberg wrote on Mon, 24 December 2012 07:14:
> Datatype rules are fine unless you end up with lots of tiny tokens,
> have to maintain long lists of things to include (because 'not' is not
> possible), or end up fiddling a lot with hidden on/off (which has
> subtle unwanted effects in several places).
>
> When you reach that point, you can write a more powerful external lexer.
> I thought that would be good to know if you start feeling you are digging
> yourself into a hole.


Good to know I've found the limitations and am heading in the right direction, i.e. ignoring the holes in the grammar with suitable workarounds :)