Eclipse Community Forums: TMF (Xtext) » Debugging lexer & parser


Debugging lexer & parser [message #635699]

Wed, 27 October 2010 18:24

Dénes Harmath

Messages: 157
Registered: July 2009

Senior Member

Hi all,

what is the quickest & simplest way to visualize the token stream & the AST of a file for debugging purposes?

Thanks in advance,
thSoft

Re: Debugging lexer & parser [message #635700 is a reply to message #635699]

Wed, 27 October 2010 18:32

Meinte Boersma

Messages: 434
Registered: July 2009
Location: Leiden, Netherlands

Senior Member

Open the DSL instance (file) using the Sample Reflective Ecore Model editor (Open with > Other...). That gives you the actual EMF model (instance).

Re: Debugging lexer & parser [message #635708 is a reply to message #635699]

Wed, 27 October 2010 19:10

Sebastian Zarnekow

Messages: 3118
Registered: July 2009

Senior Member

Hi Dennis,

this one may help to visualize the node model:
https://github.com/ralfebert/org.eclipselabs.xtext.nodeoutline

Regards,
Sebastian
--
Need professional support for Eclipse Modeling?
Go visit: http://xtext.itemis.com

Am 27.10.10 20:24, schrieb Dennis Harmath:
> Hi all,
>
> what is the quickest & simplest way to visualize the token stream & the
> AST of a file for debugging purposes?
>
> Thanks in advance,
> thSoft

Re: Debugging lexer & parser [message #635724 is a reply to message #635699]

Wed, 27 October 2010 20:33

Dénes Harmath

Messages: 157
Registered: July 2009

Senior Member

Thanks, both are fine ways for visualizing the node model. But how can I quickly see the output of the lexer, the token stream?

Re: Debugging lexer & parser [message #635734 is a reply to message #635724]

Wed, 27 October 2010 21:31

Sebastian Zarnekow

Messages: 3118
Registered: July 2009

Senior Member

Hi Dennis,

I'd write a unit test and invoke the lexer on sample input. The tokens
provide a useful toString, so adding all of them to an ArrayList and
inspecting that list may be helpful. This (probably somewhat outdated)
blog post may provide some inspiration:
http://blogs.itemis.de/stundzig/archives/726

Regards,
Sebastian
--
Need professional support for Eclipse Modeling?
Go visit: http://xtext.itemis.com

Am 27.10.10 22:33, schrieb Dennis Harmath:
> Thanks, both are fine ways for visualizing the node model. But how can I
> quickly see the output of the lexer, the token stream?

Re: Debugging lexer & parser [message #993316 is a reply to message #635734]

Mon, 24 December 2012 03:06

Barrie Treloar

Messages: 55
Registered: July 2009

Member

Sebastian Zarnekow wrote on Wed, 27 October 2010 17:31

Hi Dennis,

I'd write a unit test and invoke the lexer on sample input. The tokens
provide a useful toString, so adding all of them to an ArrayList and
inspecting that list may be helpful. This (probably somewhat outdated)
blog post may provide some inspiration:
http://blogs.itemis.de/stundzig/archives/726

Thanks for this, it has been useful.

I've updated your example to support Xtext 2 at http://baerrach.blogspot.com.au/2012/12/lexer-and-parsers-tests-for-xtext.html (it also allows instantiation, so you can debug token streams from other classes).

I found that I could mostly get away with the results from ParserHelper2.parse() and traversing the DSL classes to unit test my grammar.

It wasn't until I tried some insane stuff with lexer rules and character negation to get "unquoted strings" (what I am stuck with) working in the grammar that the lexer stopped behaving correctly.

terminal fragment WS_CHAR:
    ' ' | '\t' | '\r' | '\n';

terminal UNQUOTED_STRING:
    !'"' (!('"' | WS_CHAR))*;

At this point being able to see what the tokens are is extremely useful.
It clearly shows that lexer rules are being hidden and swallowed up incorrectly by this rule.
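
To illustrate why dumping the tokens is so revealing, here is a toy model in plain Java. It is not the ANTLR lexer that Xtext generates. It assumes that terminal rules can be approximated by regular expressions, that the longest match wins, and that ties go to the rule declared first. The class name TokenDump, the rule table, and the simplified INT, ID, WS and '=' stand-ins are all hypothetical. Note that the leading !'"' also accepts whitespace, so under these assumptions the rule swallows the separators too.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy lexer model (hypothetical, NOT Xtext's generated ANTLR lexer).
class TokenDump {

    // Rules in declaration order. INT, ID, WS and '=' are simplified stand-ins.
    static Map<String, Pattern> rules(boolean withUnquotedString) {
        Map<String, Pattern> rules = new LinkedHashMap<>();
        rules.put("KEYWORD", Pattern.compile("="));
        if (withUnquotedString) {
            // Hand translation of: !'"' (!('"' | WS_CHAR))*
            rules.put("UNQUOTED_STRING", Pattern.compile("[^\"][^\" \\t\\r\\n]*"));
        }
        rules.put("INT", Pattern.compile("[0-9]+"));
        rules.put("ID", Pattern.compile("[a-zA-Z_][a-zA-Z_0-9]*"));
        rules.put("WS", Pattern.compile("[ \\t\\r\\n]+"));
        return rules;
    }

    // Returns tokens rendered as TYPE[text], so whitespace inside a token is visible.
    static List<String> lex(String input, boolean withUnquotedString) {
        Map<String, Pattern> rules = rules(withUnquotedString);
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestType = null;
            int bestLength = 0;
            for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // lookingAt() only matches at the start of the region.
                if (m.lookingAt() && m.end() - pos > bestLength) {
                    // Strictly longer wins, so on a tie the earlier rule is kept.
                    bestType = rule.getKey();
                    bestLength = m.end() - pos;
                }
            }
            if (bestType == null) {
                // No rule matched: emit a one-character ERROR token and move on.
                bestType = "ERROR";
                bestLength = 1;
            }
            tokens.add(bestType + "[" + input.substring(pos, pos + bestLength) + "]");
            pos += bestLength;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(lex("x = 42", false));
        System.out.println(lex("x = 42", true));
    }
}
```

Without the negation-based rule, the model prints [ID[x], WS[ ], KEYWORD[=], WS[ ], INT[42]]. With it, the same input becomes [UNQUOTED_STRING[x], UNQUOTED_STRING[ =], UNQUOTED_STRING[ 42]]: no ID, INT, keyword or whitespace token survives. The real lexer may split the input differently, but the list of tokens makes this kind of problem obvious at a glance.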

In the end I couldn't resolve this with character negation, so I settled for an incorrect grammar knowing the workaround was to quote the damn string.
Trying to include punctuation correctly causes the lexer rule to hide INT and the other "punctuation" keywords that get used.

terminal fragment UNQUOTED_STRING_FIRST_CHARACTER:
    ('a'..'z' | 'A'..'Z');

terminal fragment UNQUOTED_STRING_OTHER_CHARACTERS:
    (UNQUOTED_STRING_FIRST_CHARACTER | '0'..'9' | '.' | '_' | '-' | '\\');

terminal UNQUOTED_STRING:
    (UNQUOTED_STRING_FIRST_CHARACTER) (UNQUOTED_STRING_OTHER_CHARACTERS)*;
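
A quick way to sanity-check what this restricted rule accepts is to transcribe it into a regular expression and try a few strings. This is only a sketch: the class name and the sample inputs are hypothetical, and the regex is a hand translation of the three rules above, assuming '\\' in the grammar denotes a single backslash.

```java
import java.util.regex.Pattern;

// Hypothetical helper: a regex approximation of the restricted UNQUOTED_STRING rule.
class UnquotedStringCheck {

    // First character: 'a'..'z' | 'A'..'Z'
    // Other characters: first character | '0'..'9' | '.' | '_' | '-' | backslash
    static final Pattern UNQUOTED_STRING =
            Pattern.compile("[a-zA-Z][a-zA-Z0-9._\\-\\\\]*");

    // True if the whole string would be a single UNQUOTED_STRING under this approximation.
    static boolean matches(String text) {
        return UNQUOTED_STRING.matcher(text).matches();
    }

    public static void main(String[] args) {
        String[] samples = { "foo", "foo-bar.1", "a\\b", "42", "_x", "foo,bar" };
        for (String sample : samples) {
            System.out.println(sample + " -> " + matches(sample));
        }
    }
}
```

Digits-only input such as 42 no longer matches, so INT is not hidden by this rule. A plain word such as foo still matches, so it would compete with ID if the grammar also uses the ID terminal from the common Terminals grammar (an assumption here).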

Re: Debugging lexer & parser [message #993372 is a reply to message #993316]

Mon, 24 December 2012 07:21

Alexander Nittka

Messages: 1193
Registered: July 2009

Senior Member

Hi,

Is there a reason you can't make your unquoted string definition a datatype rule rather than a terminal rule?

Alex

Need training, onsite consulting or any other kind of help for Xtext?
Go visit http://xtext.itemis.com or send a mail to xtext@itemis.de

Re: Debugging lexer & parser [message #993416 is a reply to message #993372]

Mon, 24 December 2012 10:11

Barrie Treloar

Messages: 55
Registered: July 2009

Member

Alexander Nittka wrote on Mon, 24 December 2012 02:21

Hi,

Is there a reason you can't make your unquoted string definition a datatype rule rather than a terminal rule?

Because the common Terminals package does so for Strings?
I assumed that was the best way for unquoted strings.

I had looked at the documentation on parser rules, but it says

Quote:

Character ranges, wildcards, the until token and the negation as well as the EOF token are only available for terminal rules.

I assumed that restriction also applied to datatype rules.

Since an unquoted string can be anything that does not contain a space, I couldn't work out how to build this with anything but a terminal rule. In the end I gave up, stripped it back to the bare bones, and made it work with the existing instance of the grammar I had to test against.

I'm still very new to all this, so if you can craft a Datatype Rule I'm willing to give it a go.

Re: Debugging lexer & parser [message #993446 is a reply to message #993416]

Mon, 24 December 2012 12:14

Henrik Lindberg

Messages: 2509
Registered: July 2009

Senior Member

Use of datatype rules is good unless you end up with lots of tiny tokens,
have to have long lists of things to include (because 'not' is not
possible), or end up with lots of fiddling with hidden on/off (which has
subtle unwanted effects in several places).

When you reach that point, you can write a more powerful external lexer.
Thought that could be good to know if you start feeling you are digging
yourself into a hole.

Regards
- henrik

On 2012-24-12 11:11, Barrie Treloar wrote:
> Alexander Nittka wrote on Mon, 24 December 2012 02:21
>> Hi,
>>
>> is there a reason, you can't make your unquoted string definition a
>> datatype rule rather than a terminal rule?
>
>
> Because the common Terminals package does so for Strings? I assumed that
> was the best way for unquoted strings.
>
> I had looked at
> http://www.eclipse.org/Xtext/documentation.html#parser_rules but it says
>
> Quote:
>> Character ranges, wildcards, the until token and the negation as well
>> as the EOF token are only available for terminal rules.
>
>
> Which I assumed also meant datatype rules.
>
> Since unquoted string can be anything that does not contain a space in
> it, I couldn't work out how to build this with anything but a Terminal
> rule. Which in the end I just gave up and stripped it back to bare
> bones and made it work with the existing instance of the grammar I had
> to test against.
>
> I'm still very new to all this, so if you can craft a Datatype Rule I'm
> willing to give it a go.

Re: Debugging lexer & parser [message #993620 is a reply to message #993446]

Tue, 25 December 2012 00:48

Barrie Treloar

Messages: 55
Registered: July 2009

Member

Henrik Lindberg wrote on Mon, 24 December 2012 07:14

Use of datatype rules is good unless you end up with lots of tiny tokens,
have to have long lists of things to include (because 'not' is not
possible), or end up with lots of fiddling with hidden on/off (which has
subtle unwanted effects in several places).

When you reach that point, you can write a more powerful external lexer.
Thought that could be good to know if you start feeling you are digging
yourself into a hole.

Good to know I've found the limitations and am heading in the right direction, i.e. ignoring the holes in the grammar with suitable workarounds :)
