Eclipse Community Forums: TMF (Xtext) » endless Loop with extern lexer

endless Loop with extern lexer [message #964155]

Tue, 30 October 2012 09:52

Sebastian Glonner

Messages: 4
Registered: September 2012

Junior Member

Hello

I am writing an XML grammar with Xtext, and for that I am using an external lexer.
My problem appears when I invoke content assist while the cursor is inside a RULE_STRING inside a tag. Example:

<sometag someattr="somestring"> </sometag>

Pressing Ctrl+Space inside "somestring" causes an endless loop.

I tried to investigate why this is happening.

It seems to be related to the fact that I do not allow RULE_PCDATA inside tags, which I control with my tagMode variable.

This is the generated lexer method:

    public final void mRULE_PCDATA() throws RecognitionException {
        try {
            int _type = RULE_PCDATA;
            int _channel = DEFAULT_TOKEN_CHANNEL;
            // ../com.groupion.ixml.sdk.xtext/src/com/groupion/ixml/sdk/xtext/lexer/XMLLexer.g:39:13: ({...}? => (~ ( ( '<' | '%' ) ) )+ )
            // ../com.groupion.ixml.sdk.xtext/src/com/groupion/ixml/sdk/xtext/lexer/XMLLexer.g:39:15: {...}? => (~ ( ( '<' | '%' ) ) )+
            {
            if ( !(( !tagMode )) ) {
                throw new FailedPredicateException(input, "RULE_PCDATA", " !tagMode ");
            }
            // ../com.groupion.ixml.sdk.xtext/src/com/groupion/ixml/sdk/xtext/lexer/XMLLexer.g:39:31: (~ ( ( '<' | '%' ) ) )+
            int cnt2=0;
            loop2:
            do {
                int alt2=2;
                int LA2_0 = input.LA(1);

                if ( ((LA2_0>='\u0000' && LA2_0<='$')||(LA2_0>='&' && LA2_0<=';')||(LA2_0>='=' && LA2_0<='\uFFFF')) ) {
                    alt2=1;
                }


                switch (alt2) {
            	case 1 :
            	    // ../com.groupion.ixml.sdk.xtext/src/com/groupion/ixml/sdk/xtext/lexer/XMLLexer.g:39:31: ~ ( ( '<' | '%' ) )
            	    {
            	    if ( (input.LA(1)>='\u0000' && input.LA(1)<='$')||(input.LA(1)>='&' && input.LA(1)<=';')||(input.LA(1)>='=' && input.LA(1)<='\uFFFF') ) {
            	        input.consume();

            	    }
            	    else {
            	        MismatchedSetException mse = new MismatchedSetException(null,input);
            	        recover(mse);
            	        throw mse;}


            	    }
            	    break;

            	default :
            	    if ( cnt2 >= 1 ) break loop2;
                        EarlyExitException eee =
                            new EarlyExitException(2, input);
                        throw eee;
                }
                cnt2++;
            } while (true);


            }

            state.type = _type;
            state.channel = _channel;
        }
        finally {
        }
    }

The endless loop happens because the lexer runs into
throw new FailedPredicateException(input, "RULE_PCDATA", " !tagMode ");
and that seems to happen because the current character is not consumed.

For example, simply adding "input.consume()" before the exception is thrown makes the endless loop disappear.
Am I doing something wrong, or might this happen because external lexers / XML grammars are not completely supported? And what would be the best workaround for this problem?

I am not sure why the predicate is failing anyway.
RULE_PCDATA "is not allowed" inside tags, so it should not be called there at all?
Parsing and lexing work great without content assist and in all other cases!

I have attached my Xtext file and the external lexer.

Thanks for any suggestions.

lexer grammar XMLLexer;


@header {
package com.groupion.ixml.sdk.xtext.lexer;

// Hack: Use our own Lexer superclass by means of import. 
// Currently there is no other way to specify the superclass for the lexer.
import org.eclipse.xtext.parser.antlr.Lexer;
}

@members {
    boolean tagMode = true;
}


KEYWORD_9 : '<?xml version="1.0" encoding="UTF-8"?>';

KEYWORD_8 : { tagMode }?=> 'DOCTYPE ixml SYSTEM';

KEYWORD_7 : { tagMode }?=> 'ixml';

KEYWORD_4 : { tagMode }?=> '/>' { tagMode = false; } ;

KEYWORD_5 : '<!' { tagMode = true; } ;

KEYWORD_6 : '</' { tagMode = true; } ;

KEYWORD_1 : '<' { tagMode = true; } ;

KEYWORD_2 : { tagMode }?=> '=';

KEYWORD_3 : { tagMode }?=> '>' { tagMode = false; } ;



RULE_TAG_NAME : { tagMode }?=> RULE_ID ':' RULE_ID?;

RULE_PCDATA : { !tagMode }?=> ~(('<'|'%'))+;

RULE_ML_COMMENT : '<!--' ( options {greedy=false;} : . )* '-->';

RULE_IXML_VAR : '%' (RULE_IXML_IDENT ('[' (RULE_IXML_VAR|RULE_IXML_IDENT)? ']'|'.' RULE_IXML_IDENT)*)? '%'?;

fragment RULE_IXML_IDENT : ('a'..'z'|'A'..'Z'|'_'|'0'..'9')+;

RULE_CDATA : '<![CDATA[' ( options {greedy=false;} : . )*']]';


//define rules from org.eclipse.xtext.common.Terminals

RULE_ID : { tagMode }?=> '^'? ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*;

RULE_INT : { tagMode }?=> ('0'..'9')+;

RULE_STRING : { tagMode }?=> ('"' ('\\' ('b'|'t'|'n'|'f'|'r'|'u'|'"'|'\''|'\\')|~(('\\'|'"')))* '"'|'\'' ('\\' ('b'|'t'|'n'|'f'|'r'|'u'|'"'|'\''|'\\')|~(('\\'|'\'')))* '\'');

RULE_SL_COMMENT : '//' ~(('\n'|'\r'))* ('\r'? '\n')?;

RULE_WS : { tagMode }?=> (' '|'\t'|'\r'|'\n')+;

RULE_ANY_OTHER : { tagMode }?=> .;

grammar com.groupion.ixml.sdk.xtext.IXML with org.eclipse.xtext.common.Terminals

generate iXML "my url"

IXML: 
	'<?xml version="1.0" encoding="UTF-8"?>'
	'<!''DOCTYPE ixml SYSTEM' dtdUrl = STRING '>' 
	// Dirty fix: Allow pcdata to allow whitespaces here
	PCDATA	 
	rootTag = RootTag
;

RootTag: {rootContent}
	'<' 'ixml' '>'
		(content += TagContent)*
	'</' 'ixml' '>'	
;

Tag:
	StdTag | EmptyTag
;

StdTag:
	startTag = StartTag
		(content += TagContent)*
	endTag = EndTag
;

StartTag:
	'<' name = TagName WS? attributes += Attribute* '>'
;

EndTag:
	'</' name = TagName '>'
;

EmptyTag:
	'<' name = TagName WS? attributes += Attribute* '/>'
;

TagContent:
	Tag | {pcdata} PCDATA | {ixml_var} IXML_VAR | {cdata} CDATA 
;


Attribute:
	name = ID '=' (value = STRING) WS?
;	


TagName:
	ID | TAG_NAME
;

terminal TAG_NAME:
	ID':'ID?
;

terminal PCDATA:
	!('<'|'%')+
;


/*
 * change default comment to html comments
 */
terminal ML_COMMENT :
	'<!--' -> '-->'
;

terminal fragment IXML_IDENT :
	('a'..'z'|'A'..'Z'|'_'|'0'..'9')+
;
terminal IXML_VAR :
	'%'(IXML_IDENT( '[' (IXML_VAR | IXML_IDENT)? ']' | '.'IXML_IDENT )*)?'%'?
;

terminal CDATA:
	'<![CDATA[' -> ']]'
;
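
The failure mode described in this post (a rule that throws before consuming anything, and a caller that retries at the same position) can be reproduced in a toy model. The sketch below is plain Java and is not ANTLR or Xtext code; the class, the method names and the step cap are all hypothetical and only illustrate why consuming one character on failure lets the loop make progress.

```java
/** Toy model of a token loop; an illustration only, not ANTLR or Xtext code. */
class NoProgressDemo {

    /** Hypothetical rule that always fails, like a failed predicate, without consuming input. */
    static int failingRule(String input, int pos) {
        throw new IllegalStateException("predicate failed at " + pos);
    }

    /**
     * Calls the rule until the input is exhausted. When the rule fails, the loop
     * either skips one character (consumeOnError = true) or retries at the same
     * position. maxSteps is a safety cap so that the demo always terminates.
     * Returns the number of steps taken, or -1 if the cap was hit.
     */
    static int run(String input, boolean consumeOnError, int maxSteps) {
        int pos = 0;
        int steps = 0;
        while (pos < input.length()) {
            if (steps++ >= maxSteps) {
                return -1; // no progress: this would be the endless loop
            }
            try {
                pos = failingRule(input, pos);
            } catch (IllegalStateException e) {
                if (consumeOnError) {
                    pos++; // skip the offending character so the loop advances
                }
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println("with consume:    " + run("abc", true, 1000));
        System.out.println("without consume: " + run("abc", false, 1000));
    }
}
```

With consumption on error the three characters are skipped in three steps; without it the position never changes and only the cap stops the loop. Whether skipping a character is the right recovery for a real lexer is a separate question, since it hides the fact that no rule matched.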


Re: endless Loop with extern lexer [message #1155519 is a reply to message #964155]

Sat, 26 October 2013 00:17

Chris Ainsley

Messages: 78
Registered: March 2010
Location: UK

Member

Hello Sebastian,

This seems to be an identical problem to the problem I'm also having with a stateful lexer (for content assist only).

See my issue here : http://www.eclipse.org/forums/index.php/t/566064/

Assuming that this message finds you, could you let me know how you worked around this issue?

Chris


Re: endless Loop with extern lexer [message #1158529 is a reply to message #964155]

Mon, 28 October 2013 01:10

Henrik Lindberg

Messages: 2509
Registered: July 2009

Senior Member

Please note that there are several lexers generated; not only the main
lexer!

You need to replace them all with your custom lexer. This is slightly
trickier than just binding the same lexer multiple times as the API is
slightly different.

You can take a look at puppetlabs / geppetto @ github where I made this
work for Geppetto (same lexer used for all types of lexing).

Hope that helps
Regards
- henrik



Re: endless Loop with extern lexer [message #1159015 is a reply to message #1158529]

Mon, 28 October 2013 08:37

Chris Ainsley

Messages: 78
Registered: March 2010
Location: UK

Member

Hi Henrik,

I've looked at Geppetto before as an example of integrating custom lexers, and it very much helped to get my DSL working in Xtext. Everything is working, with the one exception of content assist within non-empty quoted text.

I understand that the lexer APIs are different between the three different lexers used by Xtext, but I was of the understanding that there was no problem in making a stateful custom lexer (until now).

I posted source code of the problem into the http://www.eclipse.org/forums/index.php/t/566064/ thread.

If there is an obvious problem with my lexer to cause such a crash, then it should be found within this project. My assumption at the moment is that my stateful lexer cannot return back the correct token if provided a partial character stream (as it would be in the wrong state), and that the content assist code is providing a partial fragment of my DSL, and somehow entering a deadly loop of no escape when no tokens can be matched.

If Xtext assumes a stateless lexer for content assist then there is nothing I can do to resolve this problem. Is there anyone on the Xtext team that can confirm or refute such a requirement?
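
To make the assumption above concrete, here is a minimal sketch of a stateful lexer that is handed a fragment instead of the whole document. It is a simplified stand-in loosely modelled on the tagMode flag from the first post (it ignores '%', identifiers and whitespace rules), and the class and token names are hypothetical. It only shows that the same fragment yields different tokens depending on the state the lexer starts in.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stateful lexer; an illustration only, not the lexer from this thread. */
class TagModeDemo {

    /** Returns the token type names for text, starting in the given mode. */
    static List<String> lex(String text, boolean initialTagMode) {
        List<String> tokens = new ArrayList<>();
        boolean tagMode = initialTagMode;
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (c == '<') {
                tagMode = true;
                tokens.add("LT");
                i++;
            } else if (!tagMode) {
                // Outside a tag everything up to the next '<' is character data.
                while (i < text.length() && text.charAt(i) != '<') {
                    i++;
                }
                tokens.add("PCDATA");
            } else if (c == '>') {
                tagMode = false;
                tokens.add("GT");
                i++;
            } else if (c == '"') {
                int end = text.indexOf('"', i + 1);
                if (end < 0) {
                    tokens.add("ANY_OTHER"); // unterminated string
                    i++;
                } else {
                    tokens.add("STRING");
                    i = end + 1;
                }
            } else {
                tokens.add("ANY_OTHER");
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String fragment = "\"abc\"> x";
        System.out.println(lex(fragment, true));  // [STRING, GT, PCDATA]
        System.out.println(lex(fragment, false)); // [PCDATA]
    }
}
```

If a caller re-lexes a fragment with a freshly constructed lexer, the initial value of the flag decides which of the two results it sees, regardless of where the fragment sat in the document.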

Chris


Re: endless Loop with extern lexer [message #1160450 is a reply to message #1159015]

Tue, 29 October 2013 06:11

Henrik Lindberg

Messages: 2509
Registered: July 2009

Senior Member


I think the lexers are always context free, in that they will be given
any snippet of source text. However, regions are also involved in this.
It cannot, for instance, safely assume that it can start at any position
(e.g. in the middle of a string).

IIRC I had to also tweak the region calculations.

The Geppetto lexer keeps track of previous significant token, but has
a special case of being fed something partial (no prior token seen).
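
As a rough illustration of that idea (and not a description of how Geppetto actually does it), a lexer could pick its starting mode from the previous significant token when there is one, and fall back to a guess when it is fed a partial input. Everything in this sketch is hypothetical: the mode names, the token names and the heuristic of looking for the first angle bracket in the fragment.

```java
/** Hypothetical start-mode selection for a stateful lexer; an illustration only. */
class InitialModeGuess {

    enum Mode { IN_TAG, IN_TEXT }

    /**
     * previousToken is the type of the last significant token, or null when
     * the lexer is fed a fragment and has seen no prior token.
     */
    static Mode initialMode(String previousToken, String fragment) {
        if (previousToken != null) {
            // Normal case: the mode follows from what was lexed before.
            boolean afterTagEnd = previousToken.equals("GT") || previousToken.equals("PCDATA");
            return afterTagEnd ? Mode.IN_TEXT : Mode.IN_TAG;
        }
        // Special case: no prior token, so guess from the fragment itself.
        for (int i = 0; i < fragment.length(); i++) {
            char c = fragment.charAt(i);
            if (c == '>') {
                return Mode.IN_TAG;  // a tag is closed before one is opened
            }
            if (c == '<') {
                return Mode.IN_TEXT; // text (possibly empty) precedes the next tag
            }
        }
        return Mode.IN_TAG; // arbitrary default, mirroring "tagMode = true"
    }

    public static void main(String[] args) {
        System.out.println(initialMode("GT", "abc"));
        System.out.println(initialMode(null, "attr=\"v\"> text"));
        System.out.println(initialMode(null, "text <b>"));
    }
}
```

The guess is easy to fool, for example by a '>' inside a quoted attribute value, so it should be read as a starting point for experiments rather than a fix.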

Hope that helps figuring out what is going on.

Regards
- henrik


Re: endless Loop with extern lexer [message #1163423 is a reply to message #1160450]

Thu, 31 October 2013 02:11

Chris Ainsley

Messages: 78
Registered: March 2010
Location: UK

Member

Hi Henrik,

Thanks for your ongoing tolerance of this very tricky issue.

Thanks to your previous response, I've managed to tweak the custom lexer further to track the previous token, but it seems that even with absolutely permissive rules on RULE_STRING, and the rule appearing before anything else, it still will not match RULE_STRING when I press Ctrl+Space inside a quoted string with one or more letters in it.

You mentioned "IIRC I had to also tweak the region calculations.". I think that I may have to perform the same tweaking.

Do you recall where and how this happens in Geppetto?

Chris


Re: endless Loop with extern lexer [message #1164391 is a reply to message #1163423]

Thu, 31 October 2013 16:51

Henrik Lindberg

Messages: 2509
Registered: July 2009

Senior Member

On 2013-31-10 3:11, Chris Ainsley wrote:
> Hi Henrik,
>
> Thanks for your ongoing tolerance of this very tricky issue.
>
np

> Thanks to your previous response, I've managed to tweak the custom lexer
> more to track the previous token but it seems that even with absolute
> permissive rules on RULE_STRING, and the rule appearing before anything
> else, it still will not match RULE_STRING when I press ctrl+space inside
> a quoted string with one or more letters in it.
>
> You mentioned "IIRC I had to also tweak the region calculations.". I
> think that I may have to perform the same tweaking.
>
> Do you recall where and how this happens in Gepetto?
>

It is this class: PPTokenTypeToPartionMapper. It is doing most of the
work. Also check out where it is used. IIRC there was an issue when
multiple tokens should be joined into the same partition (I used the
term "region" earlier, what I meant was "partition"). That happens in
PPPartitionTokenScanner. But you may need to look for other places where
it is also used.
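
The general idea of mapping token types to partitions and joining neighbouring tokens into one partition can be sketched in plain Java. This is not the Geppetto code and does not use any Xtext or Eclipse API; the Span class, the partition names and the mapping table are assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy token-to-partition mapping with merging; an illustration only. */
class PartitionMergeDemo {

    /** A typed range of text, used here for both tokens and partitions. */
    static final class Span {
        final String kind;
        final int offset;
        final int length;

        Span(String kind, int offset, int length) {
            this.kind = kind;
            this.offset = offset;
            this.length = length;
        }

        @Override
        public String toString() {
            return kind + "@" + offset + "+" + length;
        }
    }

    /** Hypothetical mapping from token type to partition type. */
    static String partitionOf(String tokenType) {
        switch (tokenType) {
            case "STRING":
                return "string";
            case "ML_COMMENT":
                return "comment";
            default:
                return "default";
        }
    }

    /** Maps each token to its partition and joins adjacent tokens of the same partition. */
    static List<Span> toPartitions(List<Span> tokens) {
        List<Span> result = new ArrayList<>();
        for (Span t : tokens) {
            String p = partitionOf(t.kind);
            Span last = result.isEmpty() ? null : result.get(result.size() - 1);
            if (last != null && last.kind.equals(p) && last.offset + last.length == t.offset) {
                // Same partition type and directly adjacent: extend the previous partition.
                result.set(result.size() - 1, new Span(p, last.offset, last.length + t.length));
            } else {
                result.add(new Span(p, t.offset, t.length));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Span> tokens = new ArrayList<>();
        tokens.add(new Span("LT", 0, 1));
        tokens.add(new Span("ID", 1, 3));
        tokens.add(new Span("WS", 4, 1));
        tokens.add(new Span("STRING", 5, 5));
        tokens.add(new Span("GT", 10, 1));
        System.out.println(toPartitions(tokens)); // [default@0+5, string@5+5, default@10+1]
    }
}
```

The partition boundaries, not the individual tokens, are what a higher layer would use to decide where a partial re-lex may start, which is why a string split into the wrong partitions could end up being lexed from the middle.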

Hope that helps

- henrik


Re: endless Loop with extern lexer [message #1166583 is a reply to message #1164391]

Sat, 02 November 2013 03:26

Chris Ainsley

Messages: 78
Registered: March 2010
Location: UK

Member

Hi Henrik,

There is so much specialisation in the Geppetto codebase that it's hard for me to understand what exactly is crashing Xtext. I still don't understand how the partition mapper relates to code completion at all, and looking at the Geppetto code makes it even more confusing, as I don't know exactly how your class overrides affect the default behaviour of Xtext (and I don't know how the default behaviour fully works, due to loose coupling and undocumented methods).

I suppose the point I'm making is that I started to destroy my codebase trying to merge in snippets extracted from Geppetto without being able to understand what the intent of the code is (as there is no real documentation about this low-level code anywhere). I had to revert all changes just to get the code back to the previous known crashing state.

I truly thank you for your assistance in this matter, but it may be quicker to look at the sample project I created (in the http://www.eclipse.org/forums/index.php/t/566064/ thread) and to suggest direct remedial action. Without detailed knowledge of the inner workings of ANTLR and Xtext, I can't adapt your Geppetto-specific lexer workarounds to the general case (it seems like you are solving far more than this type of issue in your own code).

I can't appreciate how busy you must be, but maybe if you could contact me to solve this single isolated issue, then I think that would be the only possible chance of me getting a working editor - ever. If you would assist me, I would take responsibility for documenting the solution, so that other people having the same problem (and I am not the first) would be able to understand it in layman's terms.

Sorry for my insufficient understanding here. Up until this point I had a good grasp of custom lexing, but it seems that the interface between the lexer and regions just got the better of me and I'm stuck with a program I can't fix.

Chris

[Updated on: Sat, 02 November 2013 03:27]


Re: endless Loop with extern lexer [message #1166630 is a reply to message #1166583]

Sat, 02 November 2013 04:13

Henrik Lindberg

Messages: 2509
Registered: July 2009

Senior Member

On 2013-02-11 4:26, Chris Ainsley wrote:
>
> Hi Henrik,
>
> There is so much specialisation in the Gepetto codebase, its hard for me
> to understand what exactly is crashing Xtext. I still don't understand
> how the partition mapper relates to code completion at all, and looking
> at the Gepetto code, it becomes even more confusing as I don't exactly
> know how your class overrides are affecting the default behaviour of
> XText (as I don't know how the default behaviour fully works due to
> loose coupling and undocumented methods).
>
> I suppose the point I'm making is that I started to destroy my codebase
> trying to merge in snippets extracted from Gepetto without being able to
> understand what the intent of the code is (as there is no real
> documentation about this low level code anywhere). I had to revert all
> changes just to get the code back to the previous known crashing state.
>
It gets complex quickly and most of Geppetto (except the parts under the
xtext namespace) was never intended to be reusable outside of
Geppetto. The parts under xtext were made as generic as possible - but I
also had deadlines for Geppetto to consider.

> I truly thank you for your assistance in this matter, but it may be
> quicker to look at the sample project I created ( in
> http://www.eclipse.org/forums/index.php/t/566064/ thread and to suggest
> direct remedial action.
>
I am sorry, but that will just take too much time before I am able to
get to the point where it is possible to pin-point the particular
problem you are seeing.

I spent many hours (days and weeks actually) debugging what was going on
in Geppetto. (Use of the debugger is really the only way to learn the
low level behavior, as you point out - there really is no documentation
at that level).

> I truly thank you for your assistance here, but without detailed
> knowledge of the inner workings of antlr and xtext, I can't adapt your
> Gepetto specific lexer workarounds to the general case (it seems like
> you are solving far more than this type of issue in your own code).
>
Yes, there are many issues that I worked around. They do not all relate
to use of an external lexer but rather the reasons why certain things
had to be solved in the lexer.

> I fully appreciate you are busy, but if you could contact me to solve
> this single isolated issue, then I think that would the only possible
> chance of me getting a working editor - ever. If you would assist me, I
> would take responsibility for documenting the solution, so that other
> people having the same problem (and I am not the first) would be able to
> understand it in laymans terms.
>
> Sorry for my insufficient understanding here. Up until this point I had
> a good grasp of custom lexing but it seems that the interface between
> the lexer and regions just got the better of me and I'm stuck with a
> program I can't fix.
>
I had the same feeling a couple of times when I got stuck. I can try
helping with advice.

The regions play an important role as IIRC they define demarcation of
positions the lexer will be given when doing partial parsing. The lexer
itself is really not involved until it is given something to lex - so
regions matter to the higher levels.

OTOH - I have no clue if that is the problem you are seeing or not.

I would look at the stacktrace for your NPE - and set a breakpoint at
some point higher up in the call stack and then just step through down
to the NPE. (I know it is painful and at first you will be stepping
through lots of low level code. Once you get familiar with the
completely uninteresting parts you can filter so that the debugger skips
those parts.)

Let's see, what else...

Since Xtext generates multiple lexers from the grammar and uses the
terminal rules as they are written in the grammar for these auxiliary
lexers, they will (naturally) be based on the wrong terminals (even if
you have terminal rules in your grammar that are close, the generated
lexer will still be wrong). I did not want to write multiple external
lexers, so I wrapped the one and only in a class that adapted it so it
can function as "the lexer" for all the lexing jobs (highlighting, etc).
That part of Geppetto should not be too hard to figure out.
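
The wrapping idea can be sketched without any Xtext types. In the following plain Java example one tokenizing function is adapted to two differently shaped, entirely hypothetical lexer interfaces (PullLexer and BatchLexer), so that every consumer sees tokens from the same implementation. The real Xtext lexer base classes and their bindings are not shown here and would have to be looked up in Geppetto or the Xtext sources.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

/** One core lexer behind several API shapes; an illustration only. */
class SharedLexerDemo {

    /** Hypothetical API of one consumer: pull tokens one at a time. */
    interface PullLexer {
        void setInput(String text);
        String nextToken();
    }

    /** Hypothetical API of another consumer: get all tokens at once. */
    interface BatchLexer {
        List<String> tokenize(String text);
    }

    /** Adapts a single tokenizing function to both APIs. */
    static final class Adapter implements PullLexer, BatchLexer {
        private final Function<String, List<String>> core;
        private Iterator<String> pending = null;

        Adapter(Function<String, List<String>> core) {
            this.core = core;
        }

        @Override
        public List<String> tokenize(String text) {
            return core.apply(text);
        }

        @Override
        public void setInput(String text) {
            pending = core.apply(text).iterator();
        }

        @Override
        public String nextToken() {
            return (pending != null && pending.hasNext()) ? pending.next() : "EOF";
        }
    }

    /** Stand-in for the one real lexer: splits on single spaces. */
    static List<String> splitOnSpaces(String text) {
        return Arrays.asList(text.split(" "));
    }

    public static void main(String[] args) {
        Adapter adapter = new Adapter(SharedLexerDemo::splitOnSpaces);
        adapter.setInput("a b");
        System.out.println(adapter.nextToken());
        System.out.println(adapter.nextToken());
        System.out.println(adapter.nextToken());
        System.out.println(adapter.tokenize("x y"));
    }
}
```

The point of the design is that token boundaries cannot drift apart between parsing, highlighting and content assist, because there is only one place where they are decided.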

I am currently working on other things than Geppetto (busy on the actual
language Geppetto is an IDE for) and it will be some time (weeks to
months) before I am going to be working on Geppetto again - meanwhile we
(Cloudsmith) have transitioned all the code to Puppet Labs and I am
currently not set up properly with a dev environment for the
transitioned code (everything moved to a new namespace). Once I am
working on Geppetto actively again it is easier for me to answer
questions etc. Meanwhile there will be plenty of IIRC remarks :-)

If you are in a hurry, I suggest contacting Itemis and buying some
consulting from them.

Another thing you could do is to try to run and debug Geppetto. Set a
breakpoint in the region calculation code and see from where it gets
called. Then find the corresponding place in your project.
Although it requires work to set up, it is fairly well automated with
Buckminster and there are instructions (although they may be behind).
Should you want to try this approach, we (me and Thomas Hallgren)
would help you out, since those instructions should be up to date. (If
everything works OK it should take you an hour or so to set that up.)

You can probably find the corresponding class in your project without
debugging Geppetto. There should be enough clues what it extends, where
it is mentioned in guice module configurations etc.

Once you started debugging and getting a better understanding of how the
various parts work it is also easier to ask specific questions about
what is going on, and this makes it far more likely that you get good
answers in this forum. (That is how I learned).

I hope some of those pointers may help you, and again I am sorry that I
don't have the bandwidth to dig deeper into your code.

Regards
- henrik


Re: endless Loop with extern lexer [message #1166636 is a reply to message #1166583]

Sat, 02 November 2013 04:16

Henrik Lindberg

Messages: 2509
Registered: July 2009

Senior Member

Doh, you were seeing an endless loop, not a NPE.

So start with a breakpoint someplace safe and step towards the void...
To find that place try invoking the code completion for a sample
language and break in code you know gets called from the code completion
- look at the stack there, and then do that for your own language.

- henrik


Re: endless Loop with extern lexer [message #1169451 is a reply to message #1166636]

Mon, 04 November 2013 02:04

Chris Ainsley

Messages: 78
Registered: March 2010
Location: UK

Member

Henrik Lindberg wrote on Sat, 02 November 2013 13:16

Doh, you were seeing an endless loop, not a NPE.

So start with a breakpoint someplace safe and step towards the void...
To find that place try invoking the code completion for a sample
language and break in code you know gets called from the code completion
- look at the stack there, and then do that for your own language.

- henrik

Thanks for taking the time for a long and reasoned response, and of course I understand you don't have the bandwidth to take on anything like this.

After dozens of hours spent on working on this from multiple angles - the debugger hasn't helped me with this thus far; without anyone familiar with the low level code to assist I think "the definition of insanity" applies here.

Consultancy is probably the answer to this but I'm not currently in a position to be able to acquire any more consultancy, so I guess I'll leave the issue here for now and document that the users should avoid auto-complete within quoted strings. I'm beat.

Thanks for all your help again Henrik.

[Updated on: Mon, 04 November 2013 02:09]

