[emf-dev] EMF Compare Name Similarity


At the moment I am reverse engineering EMF Compare and have already read a lot of material. I think I found some inconsistencies in that material and want to ask whether I understand things correctly.

These are the statements in question:
a) According to [1], EMF Compare uses the Levenshtein distance for string similarity.
b) According to [3], EMF Compare 1.3 is similar to [4]. In [4], the Dice coefficient (although it is not named explicitly) is used for string similarity.

After a code review of [2] and [5], I came to the following conclusions:
I) EMF Compare 1.x and 2.x use the Dice coefficient with bigrams for string similarity.
II) EMF Compare 2.x uses the Longest Common Subsequence to determine changes in multi-valued references of EObjects.
III) Statement a) is wrong/outdated.
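
For reference, here is a minimal sketch of the two measures named in I) and II). It is an illustration only and not code taken from [2] or [5]: the class and method names (SimilaritySketch, bigrams, dice, lcsLength) are hypothetical, and details such as the handling of repeated bigrams or very short strings are assumptions that may differ from what EMF Compare actually does.

```java
import java.util.ArrayList;
import java.util.List;

public class SimilaritySketch {

    // Collect all overlapping two-character substrings (bigrams) of s.
    static List<String> bigrams(String s) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i + 1 < s.length(); i++) {
            result.add(s.substring(i, i + 2));
        }
        return result;
    }

    // Dice coefficient: 2 * |common bigrams| / (|bigrams(a)| + |bigrams(b)|).
    // Assumption: repeated bigrams are matched one-to-one (multiset semantics).
    static double dice(String a, String b) {
        if (a.equals(b)) {
            return 1.0;
        }
        List<String> first = bigrams(a);
        List<String> second = bigrams(b);
        int total = first.size() + second.size();
        if (total == 0) {
            return 0.0; // both strings are too short to have any bigram
        }
        int common = 0;
        for (String gram : first) {
            // remove(Object) drops one matching occurrence and reports success
            if (second.remove(gram)) {
                common++;
            }
        }
        return 2.0 * common / total;
    }

    // Length of the longest common subsequence of two lists, computed with
    // the classic dynamic-programming table. Elements that are not part of
    // the LCS are the candidates for additions, deletions, or moves.
    static int lcsLength(List<?> a, List<?> b) {
        int[][] dp = new int[a.size() + 1][b.size() + 1];
        for (int i = 1; i <= a.size(); i++) {
            for (int j = 1; j <= b.size(); j++) {
                if (a.get(i - 1).equals(b.get(j - 1))) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                } else {
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }
        return dp[a.size()][b.size()];
    }

    public static void main(String[] args) {
        // "night" and "nacht" share only the bigram "ht": 2 * 1 / (4 + 4)
        System.out.println(dice("night", "nacht")); // prints 0.25
        // [a, b, c, d] vs. [a, c, d, e]: the LCS is [a, c, d]
        System.out.println(lcsLength(List.of("a", "b", "c", "d"),
                                     List.of("a", "c", "d", "e"))); // prints 3
    }
}
```

Unlike the Levenshtein distance from statement a), which counts single-character edits, the Dice coefficient only compares the sets of adjacent character pairs, so the two measures can rank the same pair of names differently.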

I would appreciate it if someone could confirm my conclusions.






