Eclipse Community Forums
How to strip invalid XML characters when saving Resource [message #1448800] Mon, 20 October 2014 13:20
Phil Beauvoir
Messages: 33
Registered: October 2012
Member
In our application it is possible for a user to paste invalid characters (for example from a binary file) into a text field. When saving the content via an EMF Resource (Resource#save(null)) a RuntimeException is thrown:

java.io.IOException: java.lang.RuntimeException: An invalid XML character (Unicode: 0x18) was found in the element content:

at org.eclipse.emf.ecore.xmi.impl.XMLSaveImpl$Escape.convertText(XMLSaveImpl.java:3496)

However, by this point the saved file has zero bytes and the damage has been done.

What is the best way to deal with this?

1. Trap Paste events in the text field and somehow strip out all invalid characters? If so, how?
2. Check the validity of the contents in the EMF Resource content first?
3. Save to temp file first?

Thanks for any advice.
Re: How to strip invalid XML characters when saving Resource [message #1448828 is a reply to message #1448800] Mon, 20 October 2014 13:58
Ed Merks
Messages: 29653
Registered: July 2009
Senior Member
Phil,

Comments below.

On 20/10/2014 3:20 PM, Phil Beauvoir wrote:
> In our application it is possible for a user to paste invalid
> characters (for example from a binary file) into a text field. When
> saving the content via an EMF Resource (Resource#save(null)) a
> RuntimeException is thrown:
>
> java.io.IOException: java.lang.RuntimeException: An invalid XML
> character (Unicode: 0x18) was found in the element content:
Many more characters are valid with XML 1.1, so you could use
org.eclipse.emf.ecore.xmi.XMLResource.setXMLVersion(String) or the
org.eclipse.emf.ecore.xmi.XMLResource.OPTION_XML_VERSION option. But if
you read in a resource, it remembers the version in that resource, and
generally should be saved back with that same version...
>
> at
> org.eclipse.emf.ecore.xmi.impl.XMLSaveImpl$Escape.convertText(XMLSaveImpl.java:3496)
>
> However, by this point the saved file has zero bytes and the damage
> has been done.
>
> What is the best way to deal with this?
>
> 1. Trap Paste events in the text field and somehow strip out all
> invalid characters? If so, how?
Perhaps define your own EDataType that wraps java.lang.String and in its
createAbcFromString method you could strip out bad characters.
> 2. Check the validity of the contents in the EMF Resource content first?
That's possible too, but note that it's only invalid because it's an XML
resource. Other serialization formats might well support those
characters, e.g., BinaryResourceImpl. So it's a bit trickier in that
it's only invalid relative to the resource implementation (and then
relative to the XML version as well).
> 3. Save to temp file first?
If you use
org.eclipse.emf.ecore.resource.Resource.OPTION_SAVE_ONLY_IF_CHANGED, it
will serialize the resource to a temp file or an in-memory buffer and
compare that to the current bytes in the underlying resource. Because
serialization then fails before an output stream is opened on the
underlying resource, you'd avoid overwriting the existing content in the
case that the new content can't be serialized at all.
>
> Thanks for any advice.
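
A minimal sketch of the "check or strip" idea discussed above, in plain Java with no EMF dependency. The class and method names (XmlChars, isValidXml10, isValidXml11, firstInvalidIndex, strip) are hypothetical and not part of EMF; the ranges follow the Char productions of the XML 1.0 and XML 1.1 specifications. It only illustrates why 0x18 is rejected under XML 1.0 yet accepted under 1.1, and how a string could be cleaned before it reaches the model.

```java
/** Hypothetical helper for illustration; not an EMF class. */
final class XmlChars {
    private XmlChars() {}

    /** True if the code point matches the XML 1.0 "Char" production. */
    static boolean isValidXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /**
     * True if the code point matches the XML 1.1 "Char" production.
     * Most control characters are allowed here, but XML 1.1 requires
     * them to be written as character references in the document.
     */
    static boolean isValidXml11(int cp) {
        return (cp >= 0x1 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Index of the first character invalid in XML 1.0, or -1 if none. */
    static int firstInvalidIndex(String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (!isValidXml10(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }

    /** Copy of the text with every character invalid in XML 1.0 removed. */
    static String strip(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        // Unpaired surrogates show up as code points in D800..DFFF and are dropped.
        text.codePoints()
            .filter(XmlChars::isValidXml10)
            .forEach(sb::appendCodePoint);
        return sb.toString();
    }
}
```

Something like strip could be called from a custom EDataType's create-from-string method or from a text field listener, while firstInvalidIndex could back a validity check before saving. Which of those fits best depends on the application.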
Re: How to strip invalid XML characters when saving Resource [message #1448832 is a reply to message #1448828] Mon, 20 October 2014 14:03
Phil Beauvoir
Messages: 33
Registered: October 2012
Member
Thanks very much for taking the time to answer, Ed. Much appreciated! :)

I shall now explore these options.

Re: How to strip invalid XML characters when saving Resource [message #1448845 is a reply to message #1448832] Mon, 20 October 2014 14:24
Phil Beauvoir
Messages: 33
Registered: October 2012
Member
I tried your suggestion:

"Many more characters are valid with XML 1.1, so you could use org.eclipse.emf.ecore.xmi.XMLResource.setXMLVersion(String) or the org.eclipse.emf.ecore.xmi.XMLResource.OPTION_XML_VERSION option. But if you read in a resource, it remembers the version in that resource, and generally should be saved back with that same version..."

So when saving the resource I do this before saving it:

// Only XML-based resources have an XML version to set
if (resource instanceof XMLResource) {
    ((XMLResource) resource).setXMLVersion("1.1");
}

And that works!

But I'm concerned by your remark, "But if you read in a resource, it remembers the version in that resource, and generally should be saved back with that same version..." - what are the dangers here? And is version "1.1" non-standard?
Re: How to strip invalid XML characters when saving Resource [message #1449007 is a reply to message #1448845] Mon, 20 October 2014 19:41
Phil Beauvoir
Messages: 33
Registered: October 2012
Member
For anyone reading this thread, this is what I did:

I didn't save as XML version 1.1; I carried on as before. Instead, I wrapped Resource#save() in a more inclusive try/catch block to catch all exceptions. When saving, we now save to a temporary file first. I also added a VerifyListener to each text field that strips out any invalid XML characters.
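
A rough sketch of the "save to a temporary file first" step, in plain Java. This is not the code from the application described above: SafeSave and its Serializer callback are hypothetical names, and the callback stands in for whatever actually writes the model. The point is only that the existing file is left untouched unless serialization finishes without an exception.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper for illustration; not an EMF class. */
final class SafeSave {
    private SafeSave() {}

    /** Hypothetical callback standing in for the real serialization step. */
    interface Serializer {
        void writeTo(OutputStream out) throws IOException;
    }

    /**
     * Writes to a temp file next to the target and replaces the target
     * only after the serializer has completed without throwing.
     */
    static void save(Path target, Serializer serializer) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(dir, "save", ".tmp");
        try {
            try (OutputStream out = Files.newOutputStream(temp)) {
                serializer.writeTo(out);
            }
            // Reached only if serialization succeeded.
            Files.move(temp, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // No-op after a successful move; removes the partial file on failure.
            Files.deleteIfExists(temp);
        }
    }
}
```

With EMF, the callback would presumably delegate to Resource.save(OutputStream, Map), for example `SafeSave.save(path, out -> resource.save(out, null))`. A RuntimeException thrown during serialization propagates to the caller, so the surrounding try/catch still needs to handle more than IOException.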
Re: How to strip invalid XML characters when saving Resource [message #1770712 is a reply to message #1448828] Wed, 16 August 2017 12:00
Alain Picard
Messages: 217
Registered: July 2009
Senior Member
Thanks Ed, very useful advice even if close to 3 years old.

[Updated on: Wed, 16 August 2017 12:01]
