Skip to main content

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]
RE: [egit-dev] Re: JGit merge algorithm (fwd)


On Fri, 11 Dec 2009, Halstrick, Christian wrote:

> thanks Johannes for this very helpful comments. They made me think again 
> about how we report the merge result and brought some simplifications. 
> After a lot of back and forth I am now happy with one proposal which I 
> published at,140. Please have a look. 

MergeChunk: after I sent my comments using the magic -1 - x thing to 
convey a boolean value together with an int, I regretted it; it might be 
seen as clever, but it is unreadable to the next developer.  So maybe it 
would be better to document clearly that

	4 * chunkNo + 0 = ConflictState.ordinal()
	4 * chunkNo + 1 = begin index in the base document
	4 * chunkNo + 2 = begin index in the document
	4 * chunkNo + 3 = end index in the document

I have a slight suspicion that you wanted to be able to describe N-way 
merges in the MergeChunk.  Then it would be better to have the 4*chunkNo+0 
number refer to the document, i.e. 0 = base, 1 = ours, 2 = theirs, 3 = 

This number would convey the nature of the chunk: 0 means there was no 
conflict, everything else is a conflict.

MergeAlgorithm: you often use the paradigm "edit = iter.hasNext() ? 
iter.nex() : null".  It might be more readable if you overrode the next() 
method in EditList to do that straight away.

In case 0 in merge(), it might be wiser to define the variables oursEndB 
and theirsEndB _after_ merging the overlapping Edits.

Also in case 0, their might not be a common part in the beginning: the 
first line could already conflict.

Oh, and I am still not certain if adding a compareTo() method to Edit is 
worth it.  IMHO it would be clearer to do the check explicitely, on 
oursEdit and theirsEdit.

MergeFormatter: Hmm.  Somehow, assuming that the sequences are RawText 
instances does not feel right.  I'd be happier if there was a method in 
Sequence that takes an OutputStream and writes a range of items.

Instead of using inConflict, lastConflictingName could be null. (My 
experience with redundant information is that it gets out-of-sync by 
mistake too easily, so it is better to avoid it.)

In spite of MergeChunk desiring to be able to reflect more than two 
non-base versions, the MergeFormatter handles them not really gracefully 
by repeating "=======" conflict separators, but only listing the name of 
the last one.

Besides, there is the diff3 format, which has output looking like this:


which some people are used to (you can convince C Git to generate the 
conflict markers like this, too), and which disagrees with the output of 
the MergeFormatter for more than 2 files.

> One question is still not so clear to me, maybe somebody knows. Should 
> the merge result contain references to the common base? One could easily 
> present the results of the merge just by copying ranges of text from 
> "ours" or "theirs". No need to refer to the common base text. On the 
> other hand I would like to express in the merge result where we have 
> areas which are common to both merged texts. In the current proposal 
> this is expressed by having merge chunks which refer back to the common 
> base text. Any comments on this?

It is nice, and it is actually preferable to the way C Git does it, 
because it leaves us with the option to do a white-space ignoring merge, 
for example: the diffs can ignore white space, and for non-conflicting 
edits (that change the indentation differently, though), we can still use 
the base version, even if both new versions disagree due to white space 


Back to the top