Skip to main content

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]
RE: [egit-dev] Re: JGit merge algorithm (fwd)

Hi,

On Fri, 11 Dec 2009, Halstrick, Christian wrote:

> thanks Johannes for this very helpful comments. They made me think again 
> about how we report the merge result and brought some simplifications. 
> After a lot of back and forth I am now happy with one proposal which I 
> published at http://egit.eclipse.org/r/#change,140. Please have a look. 

MergeChunk: after I sent my comments using the magic -1 - x thing to 
convey a boolean value together with an int, I regretted it; it might be 
seen as clever, but it is unreadable to the next developer.  So maybe it 
would be better to document clearly that

	4 * chunkNo + 0 = ConflictState.ordinal()
	4 * chunkNo + 1 = begin index in the base document
	4 * chunkNo + 2 = begin index in the document
	4 * chunkNo + 3 = end index in the document

I have a slight suspicion that you wanted to be able to describe N-way 
merges in the MergeChunk.  Then it would be better to have the 4*chunkNo+0 
number refer to the document, i.e. 0 = base, 1 = ours, 2 = theirs, 3 = 
theirs2...

This number would convey the nature of the chunk: 0 means there was no 
conflict, everything else is a conflict.

MergeAlgorithm: you often use the paradigm "edit = iter.hasNext() ? 
iter.nex() : null".  It might be more readable if you overrode the next() 
method in EditList to do that straight away.

In case 0 in merge(), it might be wiser to define the variables oursEndB 
and theirsEndB _after_ merging the overlapping Edits.

Also in case 0, their might not be a common part in the beginning: the 
first line could already conflict.

Oh, and I am still not certain if adding a compareTo() method to Edit is 
worth it.  IMHO it would be clearer to do the check explicitely, on 
oursEdit and theirsEdit.

MergeFormatter: Hmm.  Somehow, assuming that the sequences are RawText 
instances does not feel right.  I'd be happier if there was a method in 
Sequence that takes an OutputStream and writes a range of items.

Instead of using inConflict, lastConflictingName could be null. (My 
experience with redundant information is that it gets out-of-sync by 
mistake too easily, so it is better to avoid it.)

In spite of MergeChunk desiring to be able to reflect more than two 
non-base versions, the MergeFormatter handles them not really gracefully 
by repeating "=======" conflict separators, but only listing the name of 
the last one.

Besides, there is the diff3 format, which has output looking like this:

	<<<<<<<
	OURS
	=======
	BASE
	=======
	THEIRS
	>>>>>>>

which some people are used to (you can convince C Git to generate the 
conflict markers like this, too), and which disagrees with the output of 
the MergeFormatter for more than 2 files.

> One question is still not so clear to me, maybe somebody knows. Should 
> the merge result contain references to the common base? One could easily 
> present the results of the merge just by copying ranges of text from 
> "ours" or "theirs". No need to refer to the common base text. On the 
> other hand I would like to express in the merge result where we have 
> areas which are common to both merged texts. In the current proposal 
> this is expressed by having merge chunks which refer back to the common 
> base text. Any comments on this?

It is nice, and it is actually preferable to the way C Git does it, 
because it leaves us with the option to do a white-space ignoring merge, 
for example: the diffs can ignore white space, and for non-conflicting 
edits (that change the indentation differently, though), we can still use 
the base version, even if both new versions disagree due to white space 
changes.

Ciao,
Dscho



Back to the top