Problem serialization of model to encrypted binary file [message #1067279] |
Mon, 08 July 2013 08:02 |
Niels Brouwers Messages: 80 Registered: July 2009 |
Member |
Hi,
We have been serializing EMF models to binary files for some time now. The binary models are exported to the execution environment. Each binary file is compared with its predecessor so that the execution environment can be upgraded incrementally.
The problem we are facing now is that binary serializations of identical models sometimes differ from each other, even though the models have been verified as identical both by EMF Compare and by saving the binary models as XMI. Of the more than 100 models we serialize, about 20 turn out not to be identical to their predecessors. Unfortunately, the set of non-identical files is not constant; it seems to vary randomly.
Unfortunately, it is a strong requirement to only upgrade the execution environment when files are really functionally different.
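The upgrade decision described here comes down to a byte-for-byte comparison of a new binary file with its predecessor. A minimal sketch of such a check, assuming both files fit in memory; the class and method names are hypothetical and only illustrate the idea:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class BinaryModelDiff {

    /** Returns the index of the first differing byte, or -1 if the arrays are identical. */
    public static int firstDifference(byte[] previous, byte[] current) {
        int common = Math.min(previous.length, current.length);
        for (int i = 0; i < common; i++) {
            if (previous[i] != current[i]) {
                return i;
            }
        }
        // Either both arrays are equal, or one is a strict prefix of the other.
        return previous.length == current.length ? -1 : common;
    }

    /** True when the new file differs from its predecessor, i.e. an upgrade would be triggered. */
    public static boolean needsUpgrade(Path previous, Path current) throws IOException {
        return firstDifference(Files.readAllBytes(previous), Files.readAllBytes(current)) != -1;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-ins for two serialized models.
        Path a = Files.createTempFile("model-previous", ".bin");
        Path b = Files.createTempFile("model-current", ".bin");
        Files.write(a, new byte[] {1, 2, 3, 4});
        Files.write(b, new byte[] {1, 2, 9, 4});
        System.out.println(needsUpgrade(a, b)); // prints "true"
        Files.delete(a);
        Files.delete(b);
    }
}
```

The offset returned by firstDifference is also a convenient starting point when inspecting where two supposedly identical files diverge.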
Any help in tackling this problem is much appreciated. Can someone provide a solution, or point us in the right direction?
Thanks!
Kind regards,
Niels Brouwers.
Re: Problem serialization of model to encrypted binary file [message #1067285 is a reply to message #1067279] |
Mon, 08 July 2013 08:13 |
Ed Merks Messages: 33113 Registered: July 2009 |
Senior Member |
Niels,
Comments below.
On 08/07/2013 10:03 AM, Niels Brouwers wrote:
> Hi,
>
> We have been serializing EMF models to binary files for some while
> now. The binary models are exported to the execution environment.
> Binary files are compared with their predecessors to allow for
> incremental upgrading of the execution environment.
>
> The problem we are facing now is that binary serialization of
> identical models, validated by EMF Compare and by saving the binary
> models as XMI, are sometimes different to each other.
Hmmm. It would be hard to see what the difference is... Are there XMI
IDs involved? Those are kept in a HashMap, so the iteration order is not
well defined...
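To illustrate the point about iteration order, here is a small sketch in plain Java. The Element class is a hypothetical stand-in for a model object that, like most objects, does not override hashCode(); it is not EMF code. With such keys, a HashMap iterates in an order that depends on identity hash codes and can change between runs, whereas a LinkedHashMap always iterates in insertion order:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class IterationOrderDemo {

    /** Hypothetical stand-in for a model object; it relies on the identity hash code. */
    public static final class Element {
        final String name;

        Element(String name) {
            this.name = name;
        }
    }

    /** Creates elements named e0, e1, ... in that order. */
    public static List<Element> elements(int count) {
        List<Element> result = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            result.add(new Element("e" + i));
        }
        return result;
    }

    /** Puts the elements into the given map and returns their names in iteration order. */
    public static List<String> iterationOrder(Map<Element, String> map, List<Element> elements) {
        for (Element e : elements) {
            map.put(e, "id-" + e.name);
        }
        List<String> names = new ArrayList<>();
        for (Element e : map.keySet()) {
            names.add(e.name);
        }
        return names;
    }

    /** Iteration order when a LinkedHashMap is used: always the insertion order. */
    public static List<String> linkedOrder(int count) {
        return iterationOrder(new LinkedHashMap<>(), elements(count));
    }

    public static void main(String[] args) {
        // Order depends on identity hash codes and may differ from run to run.
        System.out.println(iterationOrder(new HashMap<>(), elements(4)));
        // Prints "[e0, e1, e2, e3]" on every run.
        System.out.println(linkedOrder(4));
    }
}
```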
> We have a set of about 100+ models which are being serialized, and
> from this set it seems that about 20 models of them are not identical
> to their predecessors. Unfortunately, this set of non-identical files
> is not constant, and seems to be randomly different.
> Unfortunately, it is a strong requirement to only upgrade the
> execution environment when files are really functionally different.
>
> Help is much appreciated to tackle this problem. Can someone please
> help us? Provide a solution, or guide us into the correct direction?
I guess I'll wait to know whether extrinsic IDs are involved...
>
> Thanks!
Ed Merks
Professional Support: https://www.macromodeling.com/
Re: Problem serialization of model to encrypted binary file [message #1067324 is a reply to message #1067285] |
Mon, 08 July 2013 10:03 |
Niels Brouwers Messages: 80 Registered: July 2009 |
Member |
Hi Ed,
Yes, extrinsic IDs are indeed involved:
@Override
protected boolean useUUIDs() {
    return true;
}
I believe the CRC of the file used as input for a QVTO transformation is used to determine the UUIDs. As such, if the input file is identical and the transformation produces deterministic output, the UUIDs of the objects should be deterministic as well.
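As an illustration only, deterministic IDs of this kind could be derived roughly as follows. This is a sketch under assumptions: the real scheme used by the transformation may well differ, and the class name, the method names, and the notion of a stable "element key" are all hypothetical. It combines a CRC-32 of the input with a per-element key and feeds the result to a name-based UUID, so identical input always yields identical IDs:

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;
import java.util.zip.CRC32;

public class DeterministicIds {

    /** CRC-32 checksum of the transformation input. */
    public static long crcOf(byte[] input) {
        CRC32 crc = new CRC32();
        crc.update(input);
        return crc.getValue();
    }

    /** Derives a name-based UUID from the input checksum and a stable per-element key. */
    public static UUID idFor(long inputCrc, String elementKey) {
        String name = Long.toHexString(inputCrc) + "/" + elementKey;
        return UUID.nameUUIDFromBytes(name.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Hypothetical input content and element key.
        long crc = crcOf("model content".getBytes(StandardCharsets.UTF_8));
        // The same checksum and key give the same UUID on every run.
        System.out.println(idFor(crc, "//Package/Class"));
        System.out.println(idFor(crc, "//Package/Class"));
    }
}
```

Note that the IDs are only as deterministic as the element key: if the key depends on anything that varies between runs, the UUIDs will vary too.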
Furthermore, these are load and save options of the RealXmiResourceImpl class, which is derived from XmiResourceImpl:
private void setOptions() {
    // Make binary data deterministic
    eObjectToIDMap = new LinkedHashMap<EObject, String>();
    URIHandler uriHandler = new XmiUriHandeler();

    // Update default de-serialization (load) options.
    Map<Object, Object> loadOptions = getDefaultLoadOptions();
    // Recommended load options for performance
    loadOptions.put(OPTION_DEFER_ATTACHMENT, true);
    loadOptions.put(OPTION_DEFER_IDREF_RESOLUTION, true);
    loadOptions.put(OPTION_USE_DEPRECATED_METHODS, false);
    loadOptions.put(OPTION_USE_PARSER_POOL, parserPool);
    loadOptions.put(OPTION_USE_XML_NAME_TO_FEATURE_MAP,
            nameToFeatureMap.get());
    // Other options
    loadOptions.put(OPTION_URI_HANDLER, uriHandler);

    // Update default serialization (save) options.
    Map<Object, Object> saveOptions = getDefaultSaveOptions();
    // Recommended save options for performance
    saveOptions.put(OPTION_CONFIGURATION_CACHE, true);
    saveOptions.put(OPTION_USE_CACHED_LOOKUP_TABLE, lookupTable.get());
    // Other options
    saveOptions.put(OPTION_URI_HANDLER, uriHandler);
    saveOptions.put(OPTION_KEEP_DEFAULT_CONTENT, true);
    saveOptions.put(OPTION_DECLARE_XML, true);
    saveOptions.put(OPTION_PROCESS_DANGLING_HREF,
            OPTION_PROCESS_DANGLING_HREF_RECORD);
    saveOptions.put(OPTION_SCHEMA_LOCATION, true);
    saveOptions.put(OPTION_USE_XMI_TYPE, true);
    saveOptions.put(OPTION_SAVE_TYPE_INFORMATION, true);
    saveOptions.put(OPTION_SKIP_ESCAPE_URI, false);
    saveOptions.put(OPTION_ENCODING, XMI_ENCODING);

    // Set XML encoding to the encoding defined for XMI, if necessary.
    if (!getEncoding().equals(XMI_ENCODING)) {
        setEncoding(XMI_ENCODING);
    }
    setIntrinsicIDToEObjectMap(new LinkedHashMap<String, EObject>());
}
The class actually used for (de-)serialization is derived from RealXmiResourceImpl and adds encryption and the BinaryResourceImpl options:
private void setOptions() {
    getDefaultLoadOptions().put(OPTION_BINARY, true);
    getDefaultLoadOptions().put(BinaryResourceImpl.OPTION_VERSION,
            Version.VERSION_1_1);
    getDefaultLoadOptions().put(
            BinaryResourceImpl.OPTION_STYLE_BINARY_ENUMERATOR, true);
    getDefaultLoadOptions().put(
            BinaryResourceImpl.OPTION_STYLE_PROXY_ATTRIBUTES, true);
    try {
        getDefaultLoadOptions().put(OPTION_CIPHER, new DESCipherImpl(getKey()));
    } catch (Exception e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }

    getDefaultSaveOptions().put(OPTION_BINARY, true);
    getDefaultSaveOptions().put(BinaryResourceImpl.OPTION_VERSION,
            Version.VERSION_1_1);
    getDefaultSaveOptions().put(
            BinaryResourceImpl.OPTION_STYLE_BINARY_ENUMERATOR, true);
    getDefaultSaveOptions().put(
            BinaryResourceImpl.OPTION_STYLE_PROXY_ATTRIBUTES, true);
    // Disable XML formatting
    getDefaultSaveOptions().put(OPTION_FORMATTED, false);
    try {
        getDefaultSaveOptions().put(OPTION_CIPHER, new DESCipherImpl(getKey()));
    } catch (Exception e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
    super.init();
}
Kind regards,
Niels Brouwers.
[Updated on: Mon, 08 July 2013 10:13]
Re: Problem serialization of model to encrypted binary file [message #1067412 is a reply to message #1067390] |
Mon, 08 July 2013 14:30 |
Ed Merks Messages: 33113 Registered: July 2009 |
Senior Member |
Niels,
Comments below.
On 08/07/2013 3:40 PM, Niels Brouwers wrote:
> Hi Ed,
>
> We've done additional testing. First test was to disable the
> encryption options mentioned in the code in my previous posts and run
> the same transformations in a batch twice. The same amount of output
> models were determined to be different compared to a previous run.
> Next, we set the output format to XMI and executed the transformations
> in a batch twice again. We have found all output models to be binary
> identical to the previous run.
>
> Our conclusion until now is that the difference is somehow caused
> during the serialization from the model in memory to the binary format.
I see.
> When comparing two non-identical binary models with each other, we see
> small differences at the places where model elements from another
> resource are being referenced.
The referenced resource has the same URI but somehow the proxy URI being
serialized is a little different?
> Sometimes the first part of the URI (containing the file) is missing
> and replaced by a token.
Which version of EMF are you using? Even in older versions of EMF,
org.eclipse.emf.ecore.resource.impl.BinaryResourceImpl.EObjectOutputStream.writeURI(URI, String)
generally uses a URI table, so it writes the full URI only the first time
and after that writes out a compressed int for each repeated occurrence.
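The effect of such a URI table can be sketched in a few lines of plain Java. This is not EMF's actual binary format: the class name, the -1 marker, and the use of DataOutputStream are assumptions made purely for illustration. The point is that the bytes written depend on the order in which URIs are first encountered, and that a URI appears either in full or as a small index depending on whether it was seen before:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class UriTableWriter {

    /** Encodes the URIs in order: full text on first occurrence, table index afterwards. */
    public static byte[] encode(List<String> uris) {
        Map<String, Integer> table = new HashMap<>();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (String uri : uris) {
                Integer index = table.get(uri);
                if (index == null) {
                    // First occurrence: write a marker and the full URI, and remember its index.
                    table.put(uri, table.size());
                    out.writeInt(-1);
                    out.writeUTF(uri);
                } else {
                    // Repeated occurrence: write only the index assigned earlier.
                    out.writeInt(index);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // Hypothetical URIs; the same set in a different order yields different bytes.
        byte[] first = encode(Arrays.asList("x.model", "y.model", "x.model"));
        byte[] second = encode(Arrays.asList("y.model", "x.model", "x.model"));
        System.out.println(Arrays.equals(first, second)); // prints "false"
    }
}
```

Under this scheme, two spellings of the same resource URI would each get their own table entry, so both the full text and the indexes written afterwards would differ between files.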
> Sometimes the token in both files is just different.
I would suggest adding print statements to the writeURI method to see if
each case produces the same sequence of URI/fragment pairs.
> At this point it may also be noteworthy that we created our own URI
> converter, which allows us to find models according to a certain
> scheme within two distinct execution environments.
Another thing to be sure about is whether the URIs of all the referenced
resources are identical in each case; of course those URIs are used to
encode proxy references for the cross document references. Such a
problem would be clear from the printed traces of all the calls to
writeURI...
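Once the traces of two runs have been captured, locating the first point of divergence is straightforward. A minimal sketch, assuming each trace has been collected as a list of strings of the form "URI#fragment"; the trace format and the class name are hypothetical:

```java
import java.util.Arrays;
import java.util.List;

public class TraceDiff {

    /** Index of the first entry at which the two traces diverge, or -1 if they are identical. */
    public static int firstDivergence(List<String> runA, List<String> runB) {
        int common = Math.min(runA.size(), runB.size());
        for (int i = 0; i < common; i++) {
            if (!runA.get(i).equals(runB.get(i))) {
                return i;
            }
        }
        // Either the traces are equal, or one is a strict prefix of the other.
        return runA.size() == runB.size() ? -1 : common;
    }

    public static void main(String[] args) {
        // Hypothetical trace lines, one per writeURI call.
        List<String> runA = Arrays.asList(
                "file:/env/a/lib.model#//Type",
                "file:/env/a/lib.model#//Operation");
        List<String> runB = Arrays.asList(
                "file:/env/a/lib.model#//Type",
                "file:/env/b/lib.model#//Operation");
        System.out.println(firstDivergence(runA, runB)); // prints "1"
    }
}
```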
>
> Any more ideas?
>
>
Ed Merks
Professional Support: https://www.macromodeling.com/
Re: Problem serialization of model to encrypted binary file [message #1068010 is a reply to message #1067992] |
Thu, 11 July 2013 16:13 |
Ed Merks Messages: 33113 Registered: July 2009 |
Senior Member |
Niels,
With URI mapping as used by URIConverter.normalize, it is possible that
two different URIs will normalize to the same final URI, and the resource
could have either of those URIs and behave much the same. Once you've
serialized an undesirable URI, it tends to keep showing up, because the URI on
the resource will be that of the first attempt to load it, i.e., the
first proxy resolve.
So one approach to consider is using EcoreUtil.resolveAll on the
resource set. Look at the URIs of all the resources. For any that you
don't consider the "canonical form of the URI", set the URI to what it should
be, and then save all the resources again.
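The idea can be sketched without EMF on the classpath. The following plain-Java stand-in uses strings for URIs and a single-pass prefix map in place of the converter's URI map; the class, the method names, and the example paths are all hypothetical, and the real normalize may behave differently in detail. It shows two spellings of the same resource collapsing to one canonical URI:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CanonicalUris {

    /** Rewrites a URI whose prefix matches a mapping entry; other URIs are returned unchanged. */
    public static String normalize(String uri, Map<String, String> prefixMap) {
        for (Map.Entry<String, String> entry : prefixMap.entrySet()) {
            if (uri.startsWith(entry.getKey())) {
                return entry.getValue() + uri.substring(entry.getKey().length());
            }
        }
        return uri;
    }

    /** Returns the resource URIs with every non-canonical one replaced by its normalized form. */
    public static List<String> canonicalize(List<String> resourceUris, Map<String, String> prefixMap) {
        List<String> result = new ArrayList<>();
        for (String uri : resourceUris) {
            result.add(normalize(uri, prefixMap));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical mapping: the same model is reachable under /env/a/ and /env/b/.
        Map<String, String> prefixMap = new LinkedHashMap<>();
        prefixMap.put("file:/env/b/", "file:/env/a/");
        System.out.println(normalize("file:/env/a/lib.model", prefixMap)); // prints "file:/env/a/lib.model"
        System.out.println(normalize("file:/env/b/lib.model", prefixMap)); // prints "file:/env/a/lib.model"
    }
}
```

In EMF terms, the corresponding steps would presumably be to call EcoreUtil.resolveAll on the resource set, compare each Resource's getURI() with its canonical form, call setURI where they differ, and save again.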
On 11/07/2013 5:13 PM, Niels Brouwers wrote:
> Hi Ed,
>
> I finally got time to modify the BinaryResourceImpl and use this
> version to serialize the models to a binary file and see what is
> causing the difference in the binary output.
>
> It seems that the difference is indeed caused when a URI is
> serialized. Apparently we have multiple references to the same model
> which can be reached through multiple paths on the filesystem.
> Sometimes the one path ends up in the URI, sometimes another. We are
> not sure if that is caused by our custom-written URI converter, or
> maybe a non-determinism in the transformation.
> So, I am pretty sure it is something we are doing wrong. Hopefully, we
> can fix it ourselves without any further assistance.
>
> Thanks for your help!
Ed Merks
Professional Support: https://www.macromodeling.com/