Eclipse Community Forums
Using "weird" character literals [message #759883] Wed, 30 November 2011 13:31
Vlad Dumitrescu
Messages: 430
Registered: July 2009
Location: Gothenburg
Senior Member
Hi!

I am defining
terminal WS:
	('\u0000'..'\u000c' | '\u000e'..' ' | '\u0080'..'\u00A0')+
;

which should work, as ANTLR recognizes these literals. When generating, the following error is shown:

5762 [main] ERROR enerator.CompositeGeneratorFragment  - An invalid XML character (Unicode: 0x0) was found in the element content:
        at org.eclipse.emf.ecore.xmi.impl.XMLSaveImpl$Escape.convert(XMLSaveImpl.java:3400)
...
	at org.eclipse.xtext.generator.grammarAccess.GrammarAccessFragment.generate(GrammarAccessFragment.java:82)
	at org.eclipse.xtext.generator.CompositeGeneratorFragment.generate(CompositeGeneratorFragment.java:81)


The other steps seem to be ok, but I wonder if this is something that needs to be fixed in the generator, and if the generated code should be ok.

best regards,
Vlad
Re: Using "weird" character literals [message #759888 is a reply to message #759883] Wed, 30 November 2011 13:40
Vlad Dumitrescu
The generated code is incomplete and unusable.

Is there any good way to write these literal chars?

regards,
Vlad
Re: Using "weird" character literals [message #759901 is a reply to message #759883] Wed, 30 November 2011 14:03
Henrik Lindberg
Messages: 2509
Registered: July 2009
Senior Member
Afaik, there is no way around a 0x0 unicode character.
Did you try using '\u0001' instead of '\u0000' ?

- henrik

Re: Using "weird" character literals [message #759910 is a reply to message #759901] Wed, 30 November 2011 14:45
Vlad Dumitrescu
Hi,

Yes, I tried with \u0001 too and the result is similar.

Actually, I notice that not even '\b' works. It seems that some generation step doesn't escape these non-printable characters correctly? '\u0080' works fine, for example, so it's probably only those under 0x20 except \n \t \r.

regards,
Vlad
Re: Using "weird" character literals [message #759948 is a reply to message #759910] Wed, 30 November 2011 16:32
Sebastian Zarnekow
Messages: 3108
Registered: July 2009
Senior Member
Hi Vlad,

the grammar is persisted as XMI for runtime purposes. However, XML 1.0,
the format the XMI file is written in, cannot represent these control
characters, not even as escaped character references.

The parser should look good, though.

Regards,
Sebastian
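
Editorial sketch: the pattern observed earlier in the thread (failures only for characters below 0x20, except \t, \n and \r) matches the Char production of the XML 1.0 specification. The small Java check below restates that production; the class and method names are hypothetical and are not part of Xtext or EMF.

```java
// Hypothetical helper, not part of Xtext or EMF: restates the XML 1.0
// "Char" production so the failing literals can be checked by hand.
class Xml10Chars {

    /** Returns true if the code point may appear in an XML 1.0 document. */
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    public static void main(String[] args) {
        // Characters mentioned in the thread: \u0000, \u0001, \b, \t, \n, \r, ' ', \u0080
        int[] samples = {0x00, 0x01, 0x08, 0x09, 0x0A, 0x0D, 0x20, 0x80};
        for (int cp : samples) {
            System.out.printf("U+%04X -> %s%n", cp,
                    isXml10Char(cp) ? "allowed" : "forbidden");
        }
    }
}
```

Under this rule '\u0000', '\u0001' and '\b' are forbidden, while '\t', '\n', '\r', ' ' and '\u0080' are allowed, which is consistent with what was reported above.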
Re: Using "weird" character literals [message #759987 is a reply to message #759948] Wed, 30 November 2011 19:22
Vlad Dumitrescu
Hi Sebastian,

No, whenever this error occurs and I try to run the code, there are errors that imply that the runtime tries to read the xmi file and it is empty.

regards,
Vlad

Caused by: org.eclipse.emf.ecore.resource.impl.ResourceSetImpl$1DiagnosticWrappedException: org.xml.sax.SAXParseException: Premature end of file.
at org.eclipse.emf.ecore.resource.impl.ResourceSetImpl.handleDemandLoadException(ResourceSetImpl.java:315)
at org.eclipse.emf.ecore.resource.impl.ResourceSetImpl.demandLoadHelper(ResourceSetImpl.java:274)
at org.eclipse.emf.ecore.resource.impl.ResourceSetImpl.getResource(ResourceSetImpl.java:397)
at org.eclipse.xtext.resource.SynchronizedXtextResourceSet.getResource(SynchronizedXtextResourceSet.java:23)
Re: Using "weird" character literals [message #759988 is a reply to message #759987] Wed, 30 November 2011 19:26
Vlad Dumitrescu
To follow up, the xmi file contains for example

  <rules xsi:type="xtext:TerminalRule" name="WS">
    <type metamodel="//@metamodelDeclarations.0">
      <classifier xsi:type="ecore:EDataType" href="http://www.eclipse.org/emf/2002/Ecore#//EString"/>
    </type>
    <alternatives xsi:type="xtext:Alternatives" cardinality="+">
      <elements xsi:type="xtext:Keyword" value=" "/>
      <elements xsi:type="xtext:Keyword" value="&#x9;"/>
      <elements xsi:type="xtext:Keyword" value="&#xD;"/>
      <elements xsi:type="xtext:Keyword" value="&#xA;"/>
    </alternatives>
  </rules>


Why can't the value be &#xB; or &#x0; or whatever? It feels like a bug in the xml serializer...

regards,
Vlad
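
Editorial sketch: on the question of why the value cannot be written as &#xB; or &#x0;, an XML 1.0 parser rejects character references to forbidden characters as well, so the serializer has no legal escape to emit. The Java snippet below is one way to see this with the JDK's built-in JAXP parser; the class name and the element name `k` are made up for illustration and this is not how Xtext or EMF load the file.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: feeds small documents to the JDK's XML parser.
class CharRefDemo {

    /** Returns true if the given document text is accepted by the parser. */
    static boolean parses(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed, e.g. an illegal character reference
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] docs = {
            "<k value=\"&#x9;\"/>",                        // tab: legal in XML 1.0
            "<k value=\"&#xB;\"/>",                        // vertical tab: not a legal XML 1.0 Char
            "<k value=\"&#x0;\"/>",                        // NUL: not legal in any XML version
            "<?xml version=\"1.1\"?><k value=\"&#xB;\"/>"  // same reference, declared as XML 1.1
        };
        for (String doc : docs) {
            System.out.println(doc + " -> " + (parses(doc) ? "accepted" : "rejected"));
        }
    }
}
```

The last document is included only for comparison: XML 1.1 permits references to most control characters (but never to U+0000), so the outcome there depends on the parser's XML 1.1 support.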
Re: Using "weird" character literals [message #805047 is a reply to message #759988] Thu, 23 February 2012 09:20
Vlad Dumitrescu
Hi,

I'll revive this question, because I still have to handle files that contain stray control characters and maybe others would stumble on the problem - and the answer is easier than I thought.

Since XML 1.0 doesn't allow encoding control characters and I couldn't find a way to make it output XML 1.1, I realized that I can instead use the following whitespace definition that doesn't require entering the forbidden characters (since I am only concerned with ISO-8859-1 documents):

terminal WS:
	(!('!'..'~'|'\u00a0'..'\u00ff'))+
;


best regards,
Vlad
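
Editorial sketch: the negated rule is not character-for-character identical to the first WS definition in this thread. The Java snippet below models both rules as plain predicates over the ISO-8859-1 range and lists the code points where they disagree; the class and method names are hypothetical, and the predicates are a reading of the two grammar rules, not generated lexer code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical comparison of the two WS rules over U+0000..U+00FF.
class WsRuleComparison {

    // First definition: '\u0000'..'\u000c' | '\u000e'..' ' | '\u0080'..'\u00A0'
    static boolean originalWs(int c) {
        return (c >= 0x00 && c <= 0x0C)
            || (c >= 0x0E && c <= 0x20)
            || (c >= 0x80 && c <= 0xA0);
    }

    // Workaround: !('!'..'~' | '\u00a0'..'\u00ff')
    static boolean workaroundWs(int c) {
        return !((c >= '!' && c <= '~') || (c >= 0xA0 && c <= 0xFF));
    }

    /** Code points in the ISO-8859-1 range that the two rules classify differently. */
    static List<Integer> differences() {
        List<Integer> diff = new ArrayList<>();
        for (int c = 0; c <= 0xFF; c++) {
            if (originalWs(c) != workaroundWs(c)) {
                diff.add(c);
            }
        }
        return diff;
    }

    public static void main(String[] args) {
        for (int c : differences()) {
            System.out.printf("U+%04X original=%b workaround=%b%n",
                    c, originalWs(c), workaroundWs(c));
        }
    }
}
```

Within ISO-8859-1 the two rules differ only at U+000D and U+007F (whitespace under the workaround only) and U+00A0 (whitespace under the first definition only). Whether that matters depends on how the rest of the grammar treats carriage returns and non-breaking spaces.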
Re: Using "weird" character literals [message #805063 is a reply to message #805047] Thu, 23 February 2012 09:40
Jan Koehnlein
Messages: 760
Registered: July 2009
Location: Hamburg
Senior Member
I remember we tried to set the output to XML 1.1 by default, but it
caused errors in other places. So I assume you had better stick to
your workaround.

