Equality matching and similarity flooding

This example contains a set of matching transformations that produce a weaving model. It implements an adapted version of the Similarity Flooding algorithm [1] and consists of three transformations:

  • The first transformation calculates similarity values between model elements and creates a propagation graph.
  • The second transformation propagates the similarity values through nodes that are connected (neighbors).
  • The third transformation selects the best matching results and creates a weaving model.

Accuracy of the weaving model

The final output weaving model contains links generated automatically by the matching transformations. Its accuracy depends mainly on two factors: how the similarities are calculated and how the best similarity values are selected.
In this simple example, the main matching criterion is name equality between elements (e.g., author = author, title = title). Many other criteria, such as dictionaries or string matching, are not covered. The best results are selected by looping over all the elements of the left model and choosing, for each one, the right element with the highest similarity value.
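The name-equality criterion and the selection loop can be illustrated with a minimal Python sketch. It is not the ATL code shipped with the example: the function names (name_similarity, select_best) and the sample element names are assumptions made for illustration only.

```python
def name_similarity(left_name: str, right_name: str) -> float:
    """Return 1.0 when the two names are equal, otherwise 0.0."""
    return 1.0 if left_name == right_name else 0.0


def select_best(left_names, right_names):
    """For each left element, keep the right element with the highest similarity."""
    links = {}
    for left in left_names:
        best = max(right_names, key=lambda right: name_similarity(left, right))
        # Skip left elements that have no match at all (similarity 0.0).
        if name_similarity(left, best) > 0.0:
            links[left] = best
    return links


# Hypothetical element names standing in for the two input models.
book = ["author", "title", "chapters"]
publication = ["title", "author", "year"]
links = select_best(book, publication)
```

With these sample names, author and title are linked to their namesakes, and chapters is left unmatched because no right element has the same name.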

Instructions

Executing the transformations

An Ant script (scripts/executeAll.xml) produces a weaving model (models/mw_refined_match.ecore) between two KM3 models (models/Book-KM3.ecore and models/Publication-KM3.ecore). The script performs the following actions:

  • Executes the PropagationGraph.atl transformation. This transformation creates the Cartesian product of the elements of the input models and calculates a similarity value for every pair of elements. The similarities are calculated using the name, cardinality and type. The output is a propagation model (models/propagation_model.ecore) conforming to metamodels/propagation_graph.ecore.
  • Executes the SimilarityFlood.atl transformation. It propagates the similarity values calculated by PropagationGraph.atl through the elements that are connected. This transformation may be executed more than once.
  • Executes the Selects_SF_amw.atl transformation. This transformation selects a set of nodes with the best similarity values and creates a weaving model (models/mw_refined_match.ecore) with equality mappings.
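The first two steps can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the logic of the ATL transformations: the initial similarity uses name equality only (cardinality and type are omitted), propagation is unweighted, values are normalized by the maximum after each pass, and the neighbors mapping and element names are hypothetical.

```python
from itertools import product


def build_pairs(left_names, right_names):
    """Cartesian product of the two models with an initial name-equality similarity."""
    return {
        (left_name, right_name): 1.0 if left_name == right_name else 0.0
        for left_name, right_name in product(left_names, right_names)
    }


def propagate(similarity, neighbors):
    """One pass: each pair adds the values of its neighbor pairs, then normalize."""
    updated = {
        pair: value + sum(similarity[n] for n in neighbors.get(pair, ()))
        for pair, value in similarity.items()
    }
    # Divide by the largest value; fall back to 1.0 when every value is zero.
    top = max(updated.values()) or 1.0
    return {pair: value / top for pair, value in updated.items()}


left = ["Book", "title"]
right = ["Publication", "title"]
similarity = build_pairs(left, right)

# Hypothetical propagation edges: a container pair and its attribute pair
# influence each other.
neighbors = {
    ("Book", "Publication"): [("title", "title")],
    ("title", "title"): [("Book", "Publication")],
}

# The propagation step may be repeated, as the Ant script does.
for _ in range(3):
    similarity = propagate(similarity, neighbors)
```

In this sketch the pair (Book, Publication) starts with a similarity of 0.0 because the names differ, but it gains similarity from its neighbor (title, title). This is the effect the propagation step is meant to produce.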

Loading the weaving model into AMW:

The weaving model (models/mw_refined_match.ecore) can be loaded by double-clicking on the file or by using the wizard.

Remarks:

  • The ATL transformations can also be executed separately by using the corresponding launch configuration scripts.
  • The PropagationGraph.atl transformation contains rules that match more than one input element. This feature requires the ATL 2006 compiler. Instructions on how to install it are available here.
  • The "for" task that executes the SimilarityFlood.atl transformation several times is not a standard Ant task; it is provided by the Ant-Contrib project. The "for" task must be installed as described here.
  • The SimilarityFlood.atl transformation has some performance issues when matching large models.

References:
1. Melnik, S. Generic Model Management: Concepts and Algorithms. Ph.D. Dissertation, University of Leipzig. Springer LNCS 2967, 2004.