Re: How to strip invalid XML characters when saving Resource [message #1448828 is a reply to message #1448800]
Mon, 20 October 2014 13:58
Ed Merks Messages: 33217 Registered: July 2009
Phil,
Comments below.
On 20/10/2014 3:20 PM, Phil Beauvoir wrote:
> In our application it is possible for a user to paste invalid
> characters (for example from a binary file) into a text field. When
> saving the content via an EMF Resource (Resource#save(null)) a
> RuntimeException is thrown:
>
> java.io.IOException: java.lang.RuntimeException: An invalid XML
> character (Unicode: 0x18) was found in the element content:
XML 1.1 allows many more characters, so you could switch versions with
org.eclipse.emf.ecore.xmi.XMLResource.setXMLVersion(String) or the
org.eclipse.emf.ecore.xmi.XMLResource.OPTION_XML_VERSION option. But a
resource that has been read in remembers the version it was read with, and
it should generally be saved back with that same version...
>
> at
> org.eclipse.emf.ecore.xmi.impl.XMLSaveImpl$Escape.convertText(XMLSaveImpl.java:3496)
>
> However, by this point the saved file has zero bytes and the damage
> has been done.
>
> What is the best way to deal with this?
>
> 1. Trap Paste events in the text field and somehow strip out all
> invalid characters? If so, how?
Perhaps define your own EDataType that wraps java.lang.String; in its
createAbcFromString method you could strip out the bad characters.
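
The stripping itself might look something like the sketch below. It is plain Java with no EMF dependency; the class and method names (XmlTextSanitizer, stripInvalidXml10) are made up for illustration, and the ranges are the Char production of XML 1.0, so it would be too strict for a resource saved as XML 1.1.

```java
/** Hypothetical helper, not part of EMF. */
final class XmlTextSanitizer {
    private XmlTextSanitizer() {}

    /** True if the code point matches the Char production of XML 1.0. */
    static boolean isValidXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the text with every code point that XML 1.0 cannot hold removed. */
    static String stripInvalidXml10(String text) {
        if (text == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            // Walk by code point so that valid surrogate pairs are kept intact
            // and unpaired surrogates (which are invalid) are dropped.
            int cp = text.codePointAt(i);
            if (isValidXml10(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

The createAbcFromString method of the custom data type could then pass its incoming string through stripInvalidXml10 before returning it.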
> 2. Check the validity of the contents in the EMF Resource content first?
That's possible too, but note that the content is only invalid because
it's being saved to an XML resource. Other serialization formats, e.g.,
BinaryResourceImpl, might well support those characters. That makes it a
bit trickier: the content is only invalid relative to the resource
implementation (and then relative to the XML version as well).
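
To illustrate how validity depends on the XML version, here is a rough sketch of such a check in plain Java. The names (XmlCharCheck, firstInvalidIndex) are hypothetical and nothing here is EMF API; the ranges follow the Char productions of the XML 1.0 and XML 1.1 specifications. Note that XML 1.1 only admits the control characters 0x1 to 0x1F when the serializer writes them as character references, and 0x0 remains invalid in both versions.

```java
/** Hypothetical helper, not part of EMF. */
final class XmlCharCheck {
    private XmlCharCheck() {}

    /**
     * Returns the index of the first character that the given XML version
     * ("1.0" or "1.1") cannot represent, or -1 if the whole text is fine.
     */
    static int firstInvalidIndex(String text, String xmlVersion) {
        boolean xml11 = "1.1".equals(xmlVersion);
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (!isChar(cp, xml11)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }

    static boolean isChar(int cp, boolean xml11) {
        // Ranges shared by both versions.
        if ((cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF)) {
            return true;
        }
        if (xml11) {
            // XML 1.1 additionally admits 0x1..0x1F.
            return cp >= 0x1 && cp <= 0x1F;
        }
        // XML 1.0 admits only tab, line feed and carriage return below 0x20.
        return cp == 0x9 || cp == 0xA || cp == 0xD;
    }
}
```

With the character from the exception above, firstInvalidIndex("ab\u0018c", "1.0") reports index 2, while the same text passes for "1.1".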
> 3. Save to temp file first?
If you use
org.eclipse.emf.ecore.resource.Resource.OPTION_SAVE_ONLY_IF_CHANGED, it
will serialize the resource to a temp file or an in-memory buffer first
and compare that to the current bytes in the underlying resource. Because
serialization then fails early, before an output stream is opened on the
underlying resource, you'd avoid overwriting the existing content in the
case that the new content can't be serialized at all.
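
The idea behind that option can be sketched in plain Java as follows. This is not EMF's implementation, only an illustration of the buffer-first principle; BufferedSave and the Serializer interface are hypothetical stand-ins for whatever writes the model to a stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical illustration, not part of EMF. */
final class BufferedSave {
    private BufferedSave() {}

    /** Stand-in for the code that serializes the model. */
    interface Serializer {
        void writeTo(OutputStream out) throws IOException;
    }

    /**
     * Serializes into memory first and touches the target file only if
     * serialization succeeded and the bytes differ from what is on disk.
     * Returns true if the file was written.
     */
    static boolean saveOnlyIfChanged(Path target, Serializer serializer) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // If this throws, the target file has not been opened or truncated.
        serializer.writeTo(buffer);
        byte[] newBytes = buffer.toByteArray();

        if (Files.exists(target) && Arrays.equals(Files.readAllBytes(target), newBytes)) {
            return false; // already up to date
        }
        Files.write(target, newBytes);
        return true;
    }
}
```

If the serializer throws, as XMLSaveImpl does for the invalid character, the existing file keeps its previous content instead of ending up with zero bytes.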
>
> Thanks for any advice.
Ed Merks
Professional Support: https://www.macromodeling.com/