Re: Extraneous / mismatched input [message #1080037 is a reply to message #1078805] |
Mon, 05 August 2013 11:50 |
Claudio Heeg Messages: 75 Registered: April 2013 |
Member |
From what I see, the main problem is that the terminals CHARDATA and ID overlap.
As "somecontent" is a valid ID and ID is the first matching Terminal, it is lexed as an ID. Note the difference in behaviour when CHARDATA is defined first.
See also: http://zarnekow.blogspot.de/2012/11/xtext-corner-6-data-types-terminals-why.html
Content:
    {Content} (tag=Tag | charData=CharDataType);
CharDataType:
    ID | CHARDATA;
terminal WS:
    (' '|'\t'|'\r'|'\n')+;
terminal ID:
    ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*;
terminal CHARDATA:
    (!('<'|'&'|']]>'|'>'))*;
That works, but hell if I know whether that's best or even good practice.
Also, I don't know for sure, but is "/" allowed in CHARDATA? If it is not, simply disallow it in the terminal to fix the second error.
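To make the ordering effect concrete, here is a minimal Java sketch of a toy lexer. It is not the lexer Xtext generates; it only assumes a simplified model in which the longest match wins and, on a tie, the rule declared first wins. The class and method names (TerminalOrderDemo, lex, rules) are made up for illustration, and CHARDATA is simplified (no ']]>' exclusion, at least one character).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy model only, NOT the Xtext/ANTLR lexer: at each position every rule is
// tried; the longest match wins, and on a tie the rule declared first wins.
class TerminalOrderDemo {

    static List<String> lex(String input, Map<String, Pattern> rules) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
                Matcher m = rule.getValue().matcher(input).region(pos, input.length());
                // lookingAt() anchors at the region start; the strict '>' keeps
                // the earlier rule when two rules match the same length.
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestName = rule.getKey();
                    bestEnd = m.end();
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("no rule matches at " + pos);
            }
            tokens.add(bestName + "(" + input.substring(pos, bestEnd) + ")");
            pos = bestEnd;
        }
        return tokens;
    }

    // Builds the two rules in either declaration order (hypothetical helper).
    static Map<String, Pattern> rules(boolean idFirst) {
        Pattern id = Pattern.compile("[a-zA-Z_][a-zA-Z_0-9]*");
        Pattern charData = Pattern.compile("[^<&>]+"); // simplified CHARDATA
        Map<String, Pattern> rules = new LinkedHashMap<>(); // keeps insertion order
        if (idFirst) {
            rules.put("ID", id);
            rules.put("CHARDATA", charData);
        } else {
            rules.put("CHARDATA", charData);
            rules.put("ID", id);
        }
        return rules;
    }
}
```

In this model, "somecontent" comes out as an ID when ID is declared first and as CHARDATA when the order is swapped, which matches the behaviour described above.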
[Updated on: Mon, 05 August 2013 11:59]
Re: Extraneous / mismatched input [message #1082264 is a reply to message #1080037] |
Thu, 08 August 2013 10:07 |
Uli B Messages: 36 Registered: January 2012 |
Member |
Thank you very much, Claudio. Indeed, the CharDataType solves the first issue. (BTW: I came across that blog while googling before posting here. It states the problem, but does not provide a solution, does it?)
For the second issue: simply disallowing the "/" will not help. Firstly, it is allowed in CHARDATA, and secondly, the actual grammar is slightly more extensive and also contains:
Tag:
    {Tag} ('<' name=TagName attributes+=Attribute* ('/>' | ('>' content+=Content* ('</' nameEnd=[Tag|TagName] '>'))));
Attribute:
    name=ID '=' value=STRING;
I wonder why the analysis doesn't simply recognize the '<' ID '/>' sequence, since ID is defined before CHARDATA and does not allow the "/". I would expect it to stop at the "/" and take the ID, and likewise to stop at anything else that is not part of an ID, such as the WS between the ID and any attributes that follow. I tried introducing terminals for '<', '/>', etc., but that does not help.
What can I do?
[Updated on: Thu, 08 August 2013 10:12]
Re: Extraneous / mismatched input [message #1084891 is a reply to message #1082264] |
Mon, 12 August 2013 07:26 |
Claudio Heeg Messages: 75 Registered: April 2013 |
Member |
The problem is that the lexer is greedy, iirc.
That means it consumes as many characters as possible when forming a token.
I also don't know whether that behaviour can be changed directly, I'm afraid. Hope someone more knowledgeable comes along in a while.
Something along the lines of disallowing "/" at the end of CHARDATA might help?
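A rough Java sketch of what "greedy" means here, and of what the trailing-"/" idea would change. This is a toy longest-match lexer, not the generated Xtext/ANTLR lexer; GreedyLexerDemo and lex are made-up names, CHARDATA is passed in as a regex so two variants can be compared, and regex backtracking may well be more permissive than what a terminal rule can express.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy longest-match lexer, NOT the generated Xtext/ANTLR lexer.
class GreedyLexerDemo {
    static final String[] NAMES = {"KEYWORD", "ID", "CHARDATA"};

    static List<String> lex(String input, String charDataRegex) {
        Pattern[] rules = {
            Pattern.compile("</|/>|<|>"),               // keywords of the Tag rule
            Pattern.compile("[a-zA-Z_][a-zA-Z_0-9]*"),  // ID
            Pattern.compile(charDataRegex)              // CHARDATA variant under test
        };
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            int best = -1;
            int bestEnd = pos;
            for (int i = 0; i < rules.length; i++) {
                Matcher m = rules[i].matcher(input).region(pos, input.length());
                // Longest match wins; earlier rules win ties.
                if (m.lookingAt() && m.end() > bestEnd) {
                    best = i;
                    bestEnd = m.end();
                }
            }
            if (best < 0) {
                throw new IllegalArgumentException("no rule matches at " + pos);
            }
            tokens.add(NAMES[best] + "(" + input.substring(pos, bestEnd) + ")");
            pos = bestEnd;
        }
        return tokens;
    }
}
```

With the plain variant "[^<&>]+", the input "<a/>" lexes as KEYWORD(<), CHARDATA(a/), KEYWORD(>): CHARDATA matches two characters where ID matches one, so the '/>' keyword is never seen. With "[^<&>]*[^<&>/]" (no "/" as the last character), the same input lexes as KEYWORD(<), ID(a), KEYWORD(/>). That only covers this one input; whitespace and attributes after the tag name would still collide with CHARDATA.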
[Updated on: Mon, 12 August 2013 08:00]
Re: Extraneous / mismatched input [message #1085228 is a reply to message #1085035] |
Mon, 12 August 2013 16:12 |
Uli B Messages: 36 Registered: January 2012 |
Member |
Hm, unfortunately CharData is not CDATA. (The original definition from the XML spec is
CharData ::= [^<&]* - ([^<&]* ']]>' [^<&]*) )
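Read literally, that production says: any text without '<' or '&', minus any text containing ']]>'. A small Java sketch of that check (CharDataCheck and isCharData are made-up names, and this is plain string logic, nothing Xtext-specific):

```java
// Checks a string against the CharData production quoted above:
// no '<', no '&', and no "]]>" anywhere in the text.
class CharDataCheck {
    static boolean isCharData(String s) {
        return s.indexOf('<') < 0
            && s.indexOf('&') < 0
            && !s.contains("]]>");
    }
}
```

Note that a lone '>' and the '/' both pass this check, which is why they cannot simply be excluded from the terminal without deviating from the spec.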
To simplify things, I could define it as everything between a right angle bracket '>' and a left angle bracket '<', excluding the brackets. But as far as I can see, this is also not possible with Xtext/ANTLR, since the Until Token terminal always includes the surrounding keywords, right?
Currently I'm trying to subclass the Lexer class, following an idea described in http://www.eclipse.org/forums/index.php/t/200863/, to take the context into account (basically: allow CharData only when the last token was a '>', otherwise match an ID), but so far this ends up with a lot of "mismatched character '>' expecting set null" errors ... sigh!
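The context rule itself (CharData only directly after a '>' token) can be sketched as a standalone toy lexer in Java. This does not subclass or call any Xtext/ANTLR class and does not reproduce the approach from the linked thread; ContextLexerDemo, lex and at are made-up names, and the token patterns are simplified assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy context-sensitive lexer; independent of Xtext/ANTLR.
class ContextLexerDemo {
    static final Pattern KEYWORD = Pattern.compile("</|/>|<|>|=");
    static final Pattern ID = Pattern.compile("[a-zA-Z_][a-zA-Z_0-9]*");
    static final Pattern STRING = Pattern.compile("\"[^\"]*\"");
    static final Pattern WS = Pattern.compile("[ \\t\\r\\n]+");
    static final Pattern CHARDATA = Pattern.compile("[^<&>]+"); // simplified

    // Returns a matcher positioned on a match starting at pos, or null.
    static Matcher at(Pattern p, String input, int pos) {
        Matcher m = p.matcher(input).region(pos, input.length());
        return m.lookingAt() ? m : null;
    }

    static List<String> lex(String input) {
        List<String> tokens = new ArrayList<>();
        String last = ""; // text of the previous non-whitespace token
        int pos = 0;
        while (pos < input.length()) {
            // CHARDATA is only a candidate directly after a '>' token.
            String name = "CHARDATA";
            Matcher m = last.equals(">") ? at(CHARDATA, input, pos) : null;
            if (m == null) { name = "KEYWORD"; m = at(KEYWORD, input, pos); }
            if (m == null) { name = "ID"; m = at(ID, input, pos); }
            if (m == null) { name = "STRING"; m = at(STRING, input, pos); }
            if (m == null) { name = "WS"; m = at(WS, input, pos); }
            if (m == null) {
                throw new IllegalArgumentException("no rule matches at " + pos);
            }
            if (!name.equals("WS")) { // whitespace is treated as hidden
                last = m.group();
                tokens.add(name + "(" + last + ")");
            }
            pos = m.end();
        }
        return tokens;
    }
}
```

In this toy, "<a/>" yields KEYWORD(<), ID(a), KEYWORD(/>), and "<a>some/content</a>" yields CHARDATA(some/content) between the tags. One limitation of taking the rule literally: text following a '/>' (as in nested empty elements) is not treated as CharData.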
But maybe I am on completely the wrong path anyway. My original intention was not to write an Xtext grammar for XML. Rather, I want to be able to reference XML files from an Xtext-based DSL. This DSL is already working fine so far. Now I want to pick some names (content assist) from an XML file, having something like 'include abc.xml' in such DSL files. Is there a way to achieve this, possibly by creating a model using EMF and an .xsd schema? (I'm not asking how to do the include, but how to create a model that can be imported. Somewhere I read this is only possible when the included files are .xmi ... and that is how I ended up here ...)