Skip to main content

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]
RE: [egit-dev] Re: JGit merge algorithm (fwd)


> MergeChunk: after I sent my comments using the magic -1 - x thing to
> convey a boolean value together with an int, I regretted it; it might be
> seen as clever, but it is unreadable to the next developer.  So maybe it

Yes, you're right. I'll store the state explicitly instead of trying to fold
this info somewhere inbetween our ints.

> would be better to document clearly that
> 	4 * chunkNo + 0 = ConflictState.ordinal()
> 	4 * chunkNo + 1 = begin index in the base document
> 	4 * chunkNo + 2 = begin index in the document
> 	4 * chunkNo + 3 = end index in the document

Aren't we loosing info here? Range in base and in the document don't have
to be of same length. Storing two starts and one end value is not sufficient
to compute the range in both docs.

> I have a slight suspicion that you wanted to be able to describe N-way
> merges in the MergeChunk.

Correct suspicion!

>                            Then it would be better to have the 4*chunkNo+0
> number refer to the document, i.e. 0 = base, 1 = ours, 2 = theirs, 3 =
> theirs2...

Currently I have coded the following
 	3 * chunkNo + 0 = which document (0=base, 1=firstDoc, 2=secondDoc, 
                          ...) combined with state info
 	3 * chunkNo + 1 = begin index in the document combined with other
                          state info
 	3 * chunkNo + 2 = end index in the document

Putting together all your comments and my ideas my proposal would be

	4 * chunkNo + 0 = index of the doc (0=base, 1=first doc(ours), 
                             2=second doc (theirs...)
 	4 * chunkNo + 1 = ConflictState.ordinal()
 	4 * chunkNo + 2 = begin index in the document
 	4 * chunkNo + 3 = end index in the document

For each conflict I would expect number-of-docs chunks and one chunk
(optionally) telling me something about the base (also to support diff3
Format). The conflictstate for the chunk for the base could be special
 (e.g. 3="base text for conflict") to distinguish it from really 
conflicting chunks.

> MergeAlgorithm: you often use the paradigm "edit = iter.hasNext() ?
> iter.nex() : null".  It might be more readable if you overrode the next()
> method in EditList to do that straight away.


> In case 0 in merge(), it might be wiser to define the variables oursEndB
> and theirsEndB _after_ merging the overlapping Edits.


> Also in case 0, their might not be a common part in the beginning: the
> first line could already conflict.

Right. I'll explicitly check for this situation and handle it. This causes
currently no problem, but currently I may report unneeded empty chunks.

> Oh, and I am still not certain if adding a compareTo() method to Edit is
> worth it.  IMHO it would be clearer to do the check explicitely, on
> oursEdit and theirsEdit.

Hmm, this compareTo was needed more in earlier intermediate states of this
Algorithm and could really be replaced by explicit calls now. I like the
Idea to have all the merge-algorithm-relevant computations directly in one

> MergeFormatter: Hmm.  Somehow, assuming that the sequences are RawText
> instances does not feel right.  I'd be happier if there was a method in
> Sequence that takes an OutputStream and writes a range of items.

Here I have a different opinion. The Sequence interface is so small and clean
offers exactly what a merge-algorithm needed. No access to specific element
but just a comparison method. This would be spoiled (in my eyes) if would
method to serialize elements to streams.
Another argument I have is that a MergeFormatter is anyhow something which is
tied to the kind of documents which have been merged. For Java Code in
RawTexts it's required to add texts "<<<", ">>>", "===" to mark conflicting
ranges. These markers have the same type as the merged docs, right? So, the
formatter was in my eyes so much tied to the type of the docs that is was ok
to write a specific RawText version of it.
But with one thing I agree: having only one formatter which is hardcoded to
RawText handling maybe not sufficient. We may need other formatters for other 
document types and for other ways how to represent merges (e.g. a diff3 
MergeFormatter). An interface for the formatter is needed in my eyes.

> Instead of using inConflict, lastConflictingName could be null. (My
> experience with redundant information is that it gets out-of-sync by
> mistake too easily, so it is better to avoid it.)


> In spite of MergeChunk desiring to be able to reflect more than two
> non-base versions, the MergeFormatter handles them not really gracefully
> by repeating "=======" conflict separators, but only listing the name of
> the last one.

Fully agree, I also detected this bug. Are there any standards how to
present n-way merges?


Back to the top