Eclipse Community Forums
endless Loop with extern lexer [message #964155] Tue, 30 October 2012 05:52
Sebastian Glonner
Hello

I am writing an XML grammar with Xtext, and for that I am using an external lexer.
My problem appears when I start content assist while the cursor is inside a RULE_STRING inside a tag. Example:

<sometag someattr="somestring"> </sometag>


Pressing Ctrl+Space inside "somestring" causes an endless loop.

I tried to investigate why this is happening.

I do not allow RULE_PCDATA inside tags; this is controlled by my tagMode variable.

This is the generated lexer method for that rule:

    public final void mRULE_PCDATA() throws RecognitionException {
        try {
            int _type = RULE_PCDATA;
            int _channel = DEFAULT_TOKEN_CHANNEL;
            // ../com.groupion.ixml.sdk.xtext/src/com/groupion/ixml/sdk/xtext/lexer/XMLLexer.g:39:13: ({...}? => (~ ( ( '<' | '%' ) ) )+ )
            // ../com.groupion.ixml.sdk.xtext/src/com/groupion/ixml/sdk/xtext/lexer/XMLLexer.g:39:15: {...}? => (~ ( ( '<' | '%' ) ) )+
            {
            if ( !(( !tagMode )) ) {
                throw new FailedPredicateException(input, "RULE_PCDATA", " !tagMode ");
            }
            // ../com.groupion.ixml.sdk.xtext/src/com/groupion/ixml/sdk/xtext/lexer/XMLLexer.g:39:31: (~ ( ( '<' | '%' ) ) )+
            int cnt2=0;
            loop2:
            do {
                int alt2=2;
                int LA2_0 = input.LA(1);

                if ( ((LA2_0>='\u0000' && LA2_0<='$')||(LA2_0>='&' && LA2_0<=';')||(LA2_0>='=' && LA2_0<='\uFFFF')) ) {
                    alt2=1;
                }


                switch (alt2) {
            	case 1 :
            	    // ../com.groupion.ixml.sdk.xtext/src/com/groupion/ixml/sdk/xtext/lexer/XMLLexer.g:39:31: ~ ( ( '<' | '%' ) )
            	    {
            	    if ( (input.LA(1)>='\u0000' && input.LA(1)<='$')||(input.LA(1)>='&' && input.LA(1)<=';')||(input.LA(1)>='=' && input.LA(1)<='\uFFFF') ) {
            	        input.consume();

            	    }
            	    else {
            	        MismatchedSetException mse = new MismatchedSetException(null,input);
            	        recover(mse);
            	        throw mse;}


            	    }
            	    break;

            	default :
            	    if ( cnt2 >= 1 ) break loop2;
                        EarlyExitException eee =
                            new EarlyExitException(2, input);
                        throw eee;
                }
                cnt2++;
            } while (true);


            }

            state.type = _type;
            state.channel = _channel;
        }
        finally {
        }
    }



The endless loop happens because the lexer runs into
throw new FailedPredicateException(input, "RULE_PCDATA", " !tagMode ");
and that seems to repeat forever because the current character is not consumed.

For example, simply adding input.consume() before the exception is thrown makes the endless loop disappear.
Am I doing something wrong, or might this happen because external lexers / XML grammars are not completely supported? And what would be the best workaround for this problem?
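
To make the suspected mechanism concrete, here is a minimal stand-alone Java sketch. It is not the ANTLR or Xtext runtime: the class PredicateLoopSketch, the FailedPredicate exception and the driver loop in run() are all hypothetical stand-ins, and the sketch assumes the caller simply retries at the same input position after a failed predicate. Under that assumption, a rule that throws before consuming anything can never make progress, while consuming one character before throwing lets the loop terminate.

```java
class PredicateLoopSketch {
    /** Hypothetical stand-in for FailedPredicateException. */
    static class FailedPredicate extends RuntimeException {}

    static int pos;                  // current index into the input (stand-in for the char stream)
    static boolean tagMode = true;   // inside a tag, so PCDATA is not allowed

    /** Stand-in for mRULE_PCDATA: the gate is checked before any character is consumed. */
    static void pcdata(String input, boolean consumeOnFailure) {
        if (tagMode) {
            if (consumeOnFailure) {
                pos++;               // workaround: skip the offending character first
            }
            throw new FailedPredicate();
        }
        while (pos < input.length() && input.charAt(pos) != '<' && input.charAt(pos) != '%') {
            pos++;
        }
    }

    /**
     * Assumed driver loop: retries the rule until the input is exhausted.
     * Returns the number of attempts, or -1 if maxAttempts was reached
     * (which is how this sketch detects the would-be endless loop).
     */
    static int run(String input, boolean consumeOnFailure, int maxAttempts) {
        pos = 0;
        int attempts = 0;
        while (pos < input.length()) {
            if (attempts++ >= maxAttempts) {
                return -1;
            }
            try {
                pcdata(input, consumeOnFailure);
            } catch (FailedPredicate e) {
                // error is "reported" and the loop tries again at the current position
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        System.out.println(run("abc", false, 1000)); // never advances, hits the cap
        System.out.println(run("abc", true, 1000));  // advances one character per failure
    }
}
```

The attempt cap exists only so the sketch can be run safely; a real lexer loop has no such guard, which would match the hang seen in the editor.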

I am also not sure why the predicate fails in the first place:
RULE_PCDATA is not allowed inside tags, so should it be called there at all?
Parsing and lexing work fine without content assist and in all other cases.

I have added my external lexer and my Xtext file below.

Thanks for any suggestions.

lexer grammar XMLLexer;


@header {
package com.groupion.ixml.sdk.xtext.lexer;

// Hack: Use our own Lexer superclass by means of import. 
// Currently there is no other way to specify the superclass for the lexer.
import org.eclipse.xtext.parser.antlr.Lexer;
}

@members {
    boolean tagMode = true;
}


KEYWORD_9 : '<?xml version="1.0" encoding="UTF-8"?>';

KEYWORD_8 : { tagMode }?=> 'DOCTYPE ixml SYSTEM';

KEYWORD_7 : { tagMode }?=> 'ixml';

KEYWORD_4 : { tagMode }?=> '/>' { tagMode = false; } ;

KEYWORD_5 : '<!' { tagMode = true; } ;

KEYWORD_6 : '</' { tagMode = true; } ;

KEYWORD_1 : '<' { tagMode = true; } ;

KEYWORD_2 : { tagMode }?=> '=';

KEYWORD_3 : { tagMode }?=> '>' { tagMode = false; } ;



RULE_TAG_NAME : { tagMode }?=> RULE_ID ':' RULE_ID?;

RULE_PCDATA : { !tagMode }?=> ~(('<'|'%'))+;

RULE_ML_COMMENT : '<!--' ( options {greedy=false;} : . )* '-->';

RULE_IXML_VAR : '%' (RULE_IXML_IDENT ('[' (RULE_IXML_VAR|RULE_IXML_IDENT)? ']'|'.' RULE_IXML_IDENT)*)? '%'?;

fragment RULE_IXML_IDENT : ('a'..'z'|'A'..'Z'|'_'|'0'..'9')+;

RULE_CDATA : '<![CDATA[' ( options {greedy=false;} : . )*']]';


//define rules from org.eclipse.xtext.common.Terminals

RULE_ID : { tagMode }?=> '^'? ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*;

RULE_INT : { tagMode }?=> ('0'..'9')+;

RULE_STRING : { tagMode }?=> ('"' ('\\' ('b'|'t'|'n'|'f'|'r'|'u'|'"'|'\''|'\\')|~(('\\'|'"')))* '"'|'\'' ('\\' ('b'|'t'|'n'|'f'|'r'|'u'|'"'|'\''|'\\')|~(('\\'|'\'')))* '\'');

RULE_SL_COMMENT : '//' ~(('\n'|'\r'))* ('\r'? '\n')?;

RULE_WS : { tagMode }?=> (' '|'\t'|'\r'|'\n')+;

RULE_ANY_OTHER : { tagMode }?=> .;
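
The intended effect of the tagMode flag in the lexer grammar above can be traced with a small Java sketch. This is only an illustration under simplifying assumptions: the class TagModeTrace and its methods are hypothetical, only '<' and '>' toggle the flag, and a '>' inside an attribute string would be misread here (the real lexer matches RULE_STRING as one token).

```java
import java.util.ArrayList;
import java.util.List;

class TagModeTrace {
    /** Splits the input into "TAG ..." and "TEXT ..." segments by tracking a tagMode flag. */
    static List<String> segments(String input) {
        List<String> out = new ArrayList<>();
        boolean tagMode = true;              // same initial value as in the @members section
        StringBuilder cur = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (c == '<' && !tagMode) {      // text ends, a tag begins
                flush(out, cur, false);
                tagMode = true;
            }
            cur.append(c);
            if (c == '>' && tagMode) {       // tag ends, text may follow
                flush(out, cur, true);
                tagMode = false;
            }
        }
        flush(out, cur, tagMode);            // whatever is left at end of input
        return out;
    }

    /** Emits the current segment (if any) with its mode label and resets the buffer. */
    static void flush(List<String> out, StringBuilder cur, boolean tag) {
        if (cur.length() == 0) {
            return;
        }
        out.add((tag ? "TAG " : "TEXT ") + cur);
        cur.setLength(0);
    }

    public static void main(String[] args) {
        for (String s : segments("<sometag someattr=\"somestring\"> </sometag>")) {
            System.out.println(s);
        }
    }
}
```

In this trace the attribute string lies entirely within a TAG segment, which is why RULE_PCDATA would not be expected to be attempted at that position.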


grammar com.groupion.ixml.sdk.xtext.IXML with org.eclipse.xtext.common.Terminals

generate iXML "my url"

IXML: 
	'<?xml version="1.0" encoding="UTF-8"?>'
	'<!''DOCTYPE ixml SYSTEM' dtdUrl = STRING '>' 
	// Dirty fix: Allow pcdata to allow whitespaces here
	PCDATA	 
	rootTag = RootTag
;

RootTag: {rootContent}
	'<' 'ixml' '>'
		(content += TagContent)*
	'</' 'ixml' '>'	
;

Tag:
	StdTag | EmptyTag
;

StdTag:
	startTag = StartTag
		(content += TagContent)*
	endTag = EndTag
;

StartTag:
	'<' name = TagName WS? attributes += Attribute* '>'
;

EndTag:
	'</' name = TagName '>'
;

EmptyTag:
	'<' name = TagName WS? attributes += Attribute* '/>'
;

TagContent:
	Tag | {pcdata} PCDATA | {ixml_var} IXML_VAR | {cdata} CDATA 
;


Attribute:
	name = ID '=' (value = STRING) WS?
;	


TagName:
	ID | TAG_NAME
;

terminal TAG_NAME:
	ID':'ID?
;

terminal PCDATA:
	!('<'|'%')+
;


/*
 * change default comment to html comments
 */
terminal ML_COMMENT :
	'<!--' -> '-->'
;

terminal fragment IXML_IDENT :
	('a'..'z'|'A'..'Z'|'_'|'0'..'9')+
;
terminal IXML_VAR :
	'%'(IXML_IDENT( '[' (IXML_VAR | IXML_IDENT)? ']' | '.'IXML_IDENT )*)?'%'?
;

terminal CDATA:
	'<![CDATA[' -> ']]'
;