Eclipse Community Forums
Forum Search:

Search      Help    Register    Login    Home
Home » Modeling » Compare » On EMF Compare 2.X scalability (some thoughts and questions on 2.x scalability)
On EMF Compare 2.X scalability [message #1051942] Mon, 29 April 2013 08:04 Go to previous message
Victor Roldan Betancort is currently offline Victor Roldan Betancort
Messages: 468
Registered: July 2009
Senior Member
Hi there,

aside from getting used to the new apis, I've faced new challenges on the 2.X stream.
My main concern is current matching algorithm's scalability. I got surprised to see how test-cases that used to take 15 seconds, now need 20 minutes to finalize.

I've debugged the code and compared it to 1.X, and found some differences that look to me as possible responsibles to this poor performance. Aparently, the scope of comparison has heavily increased on the new match engine. Inspecting ProximityIndex.findTheClosest(Comparison, EObject, Side, Side, boolean) shows how now the engine tries to match each element with every other possible element on the model with the same type (ByTypeIndex takes care of that classifying by type). You can check how the "storageToSearchFor" list contains all elements with the same type in the model as the object to match.

Oppossed to that, 1.X used content matching strategies to narrow the scope of the comparison. This means: we will only compare elements that are around the same point of the models structure, by doing a structural/content match. There was even a search window parameter to narrow/broad even more the match scope in case necessary. Although this could lead to some false positives, it was way more efficient.

For big models, the 2.x behaviour end up in almost 2 * n * n comparisons in the worst case, where n is the number of elements, and a potential 2 due to double check mechanism. Obiously, the new approach is safer in the sense that nothing remains out of the match calculations, but this appears not to scale on big models.

I got surprised to realise this, when scalability has always been a concern to the project.

So, the questions here are:
- Is it true what I've found or I'm missing something?
- Why is the model flattened and compared, and why was the content match of 1.x discarded?
- Any recommended way to avoid this exponential number of comparisons?
- Are there any possible optimizations to know?

Thanks in advance,
VĂ­ctor.
 
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Previous Topic:UnmatchElement List vs total
Next Topic:[EMF Compare] Exchange ScopeProvider in UI plugin
Goto Forum:
  


Current Time: Tue May 28 10:33:21 EDT 2013

Powered by FUDForum. Page generated in 0.01900 seconds