Effect of the load options? [message #1549238]
Tue, 06 January 2015 12:02
Hi,
The EMF book, existing usages and some resources scattered around the web (such as this doc page, though probably outdated) usually emphasize a few load options that "enhance the performance", but I found no real explanation of which cases they improve or when to use them. I've been trying for a while to measure the effect of five different load options, with no success: using the options or not makes no difference.
Namely, I've been focusing on:
- XMLResource.OPTION_USE_PARSER_POOL
- XMLResource.OPTION_USE_DEPRECATED_METHODS
- XMLResource.OPTION_USE_XML_NAME_TO_FEATURE_MAP
- XMLResource.OPTION_DEFER_ATTACHMENT
- XMLResource.OPTION_DEFER_IDREF_RESOLUTION
used through the following code (this does the bare minimum: set the options on my resource set, load the root of my model, then resolve all proxies to load all of the other parts of my model):
@Test
public void loadModel() throws IOException {
    ResourceSetImpl resourceSet = new ResourceSetImpl();
    // Attach a locator that maps URIs to already-loaded resources.
    new ResourceSetImpl.MappedResourceLocator(resourceSet);

    // Load options under test; comment out individual lines to compare.
    final Map<Object, Object> options = resourceSet.getLoadOptions();
    options.put(XMLResource.OPTION_USE_PARSER_POOL, new XMLParserPoolImpl());
    options.put(XMLResource.OPTION_USE_DEPRECATED_METHODS, Boolean.FALSE);
    options.put(XMLResource.OPTION_USE_XML_NAME_TO_FEATURE_MAP, new HashMap<Object, Object>());
    options.put(XMLResource.OPTION_DEFER_ATTACHMENT, Boolean.TRUE);
    options.put(XMLResource.OPTION_DEFER_IDREF_RESOLUTION, Boolean.TRUE);

    final String folder = "D:\\developpement\\eclipse-Luna-EMF-Compare\\eclipse\\workspace\\ecore.load\\src\\ecore\\load\\data\\medium";
    final String modelLocation = "__ROOT__.xmi";
    final File modelFile = new File(folder + '\\' + modelLocation);

    // Load the root resource, then resolve every proxy to pull in the other fragments.
    resourceSet.getResource(URI.createFileURI(modelFile.getAbsolutePath()), true);
    EcoreUtil.resolveAll(resourceSet);
    assertEquals(882, resourceSet.getResources().size());
}
Since I am trying to see the effect of each option, I'm commenting out the "options.put" lines and launching a new VM every time I test one. The problem is that this runs in exactly the same amount of time no matter which options I enable or disable.
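One way to cross-check such timings without relaunching a VM for each option is to repeat the load several times in the same VM and compare medians. The sketch below is plain Java with no EMF dependency; LoadTimer and medianNanos are hypothetical names, and the task passed in would be something like the body of loadModel() above. It is only a rough helper, not a replacement for a proper benchmark harness.

```java
import java.util.Arrays;

/** Rough timing helper (hypothetical name); measures wall-clock time only. */
final class LoadTimer {

    /**
     * Runs the task 'warmups' times without measuring, then 'runs' times
     * with measuring, and returns the middle sample in nanoseconds.
     */
    static long medianNanos(Runnable task, int warmups, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        for (int i = 0; i < warmups; i++) {
            task.run(); // results discarded; gives JIT and file caches time to settle
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2]; // upper middle sample when 'runs' is even
    }
}
```

It could be called once per option combination, for instance LoadTimer.medianNanos(() -> load(optionsA), 3, 10), where load is whatever method wraps the resource set creation and loading (also a hypothetical name here).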
I've used the same code with three different models:
- an ecore model, fragmented (either through containment or simple cross-references) in 882 different files (64MB on disk),
- a UML model fragmented (only through containment via the "control" command) in 929 different files on disk (44MB),
- the same UML model, but with all EObjects pulled up in the same UML file.
The same result was observed on all of these: no difference in performance whatever the options used.
When I do not resolve the proxies, i.e. if I replace my "resolveAll" line with the following:
final File containingFolder = new File(folder);
final File[] members = containingFolder.listFiles();
for (int i = 0; i < members.length; i++) {
    resourceSet.getResource(URI.createFileURI(members[i].getAbsolutePath()), true);
}
Then I see a difference, but only "OPTION_DEFER_IDREF_RESOLUTION" has any effect. It does work well, reducing the time of my test from 160 seconds to 26 seconds with the fragmented ecore model. However, adding a resolveAll call afterwards to make sure the proxies are resolved removes almost all of this difference.
None of this makes sense to me. I'd have expected a change, one way or the other, so either I'm misunderstanding or misusing something, or I'm measuring in the wrong way. I'm using a computer with SSD disks, if this has an impact, though these options shouldn't affect I/O. Attached is the unit test with which I was trying this, along with the ecore model I was using as a sample (it's only a generated model though).
What exactly should we be looking at to determine which load options are useful for us? For example, if I look at the doc page I mentioned earlier, it seems like using a parser pool is only ever useful if I'm planning on repeatedly loading the same resource?
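On the parser pool question, my understanding (an assumption, not something confirmed in this thread) is that a pool avoids creating a new SAX parser for every resource, so it would matter when many resources are loaded, not only when the same one is loaded repeatedly. The sketch below shows just that reuse idea with plain JAXP from the JDK. It does not use EMF or XMLParserPoolImpl, and ParserReuseDemo and countElements are hypothetical names.

```java
import java.io.StringReader;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Illustrates reusing one SAX parser for several documents (hypothetical name). */
final class ParserReuseDemo {

    /** Counts element start tags across all documents with a single parser instance. */
    static int countElements(List<String> documents) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                count[0]++;
            }
        };
        try {
            // Created once; this is the cost a pool would avoid paying per document.
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            for (String xml : documents) {
                // Same parser instance, next document (sequential use only).
                parser.parse(new InputSource(new StringReader(xml)), handler);
            }
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return count[0];
    }
}
```

Whether this saving is visible in practice depends on how large the per-parser setup cost is compared to the time spent parsing each file, which is exactly what a measurement would have to show.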
Laurent
[Updated on: Tue, 06 January 2015 12:03]
Re: Effect of the load options? [message #1585340 is a reply to message #1549238]
Mon, 26 January 2015 07:56
Hi Ed,
Any info on this? Even if it's because I made a stupid mistake in my test, I'd really like some insight on the matter. I wanted to find out what effect the load options have on which kinds of model, so as to better decide which options to use in which case. At the moment I've concluded that no option makes any difference in the nominal case, which I instinctively think is wrong.
Laurent
Re: Effect of the load options? [message #1754276 is a reply to message #1754255]
Thu, 16 February 2017 06:50
Ed Merks Messages: 33142 Registered: July 2009
Senior Member
In addition to deferring ID resolution, you should set org.eclipse.emf.ecore.resource.impl.ResourceImpl.setIntrinsicIDToEObjectMap(Map<String, EObject>) in the resource factory. Otherwise ID lookup must traverse the whole containment tree to find the resolution of any particular ID, resulting in performance that's likely O(n^2) rather than O(n). The OPTION_DEFER_IDREF_RESOLUTION option will help a bit all by itself, because at least such traversals won't be done early; when they are done early, forward references can't resolve yet and the full traversal must be repeated at the end. With an intrinsic map, the traversal is done once, building a map, and then each ID can be resolved in O(1) time via that map.
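The difference between the two lookup strategies can be sketched on a toy tree. This is plain Java, not EMF code: Node stands in for a model object with an ID, and IdLookupDemo, findByTraversal and buildIdMap are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy comparison of tree-walk ID lookup versus a prebuilt ID map (hypothetical names). */
final class IdLookupDemo {

    /** Stand-in for a model object with an ID and contained children. */
    static final class Node {
        final String id;
        final List<Node> children = new ArrayList<>();

        Node(String id) {
            this.id = id;
        }
    }

    /** Without a map: each lookup walks the containment tree, O(n) per ID, O(n^2) for n IDs. */
    static Node findByTraversal(Node root, String id) {
        if (root.id.equals(id)) {
            return root;
        }
        for (Node child : root.children) {
            Node hit = findByTraversal(child, id);
            if (hit != null) {
                return hit;
            }
        }
        return null;
    }

    /** With a map: one O(n) traversal up front, then each lookup is a hash lookup. */
    static Map<String, Node> buildIdMap(Node root) {
        Map<String, Node> map = new HashMap<>();
        fill(root, map);
        return map;
    }

    private static void fill(Node node, Map<String, Node> map) {
        map.put(node.id, node);
        for (Node child : node.children) {
            fill(child, map);
        }
    }
}
```

Resolving n references with findByTraversal costs up to n tree walks, while buildIdMap pays for one walk and answers every later lookup from the map.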
Ed Merks
Professional Support: https://www.macromodeling.com/