Skip to main content

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]
[jgit-dev] "instable" diff algorithms lead to wrong merge results

Hi,

I am having hard times to make the merge results of JGit and C-Git
similar. During long investigations I detected very interesting
problems caused by the underlying diff algorithms and I am asking
myself whether C Git developers are aware of this problem. I think
that certain diff algorithms can lead to really wrong merge results
(wrong in the eyes of a human reader of the merge result). Here is the
problem:

Imagine you have a diff algorithm which is how I call it "not stable".
In an "not stable" diff algorithm you can never be sure which EDITS a
diff algorithm will produce for a isolated (surrounded by large blocks
of common lines) difference. Example (each line of content has as
prefix the line-number and a colon):

Lines 100-104 of Content BASE :
====
...
100: a() {
101: }
102:
103: b() {
104: }
====

Lines 100-107 of Content OURS :
====
...
100: a() {
101: }
102:
103: x() {
104: }
105:
106: b() {
107: }
====

Let's look at  two alternative diffs which can be reported by an diff
algorithm when comparing BASE and OURS:
a) OURS inserts its lines 101-103 before line 101 of BASE ( thats and
Insert Edit(101,101,101,103) )
b) OURS inserts its lines 103-105 before line 103 of BASE ( thats and
Insert Edit(103,103,103,105) )

My fear is that at least for MyersDiff whether a) or b) is reported is
determine on how lines 0-99 look like. If this is true the following
can happen: Imagine there is also a content THEIRS which differs from
OURS in lines 0-99 but which is exactly similar to OURS regarding
lines 100-107. The diff between BASE and OURS returns a) and the one
between BASE and THEIRS return b). During a merge operation where we
merge together OURS and THEIRS with common base BASE. Since OURS and
THEIRS do the same modification to BASE in lines 100-107 Semanticly I
expect that every good merge algorithm will not report a conflict and
simply take the new content regarding lines 100-107. But because of
the "not stable" diff the EDITS a) and b) are detected as
non-overlapping diffs which can be applied independently. We end up
with content were x() is added twice!!!

Lines 100-110 of the Merge Result :
====
...
100: a() {
101: }
102:
103: x() {
104: }
105:
106: x() {
107: }
108:
109: b() {
110: }
====

Am I missing something?

 If you want to see this problems in real world look try to merge
cb2e630 and 3e494ae of git://egit.eclipse.org/egit.git and look at the
files UIText.java and EgitUiEditorUtils.java. JGit and CGit behave
differently while merging also because of this problem.

Ciao
  Chris


Back to the top