Skip to main content

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]
Re: [jgit-dev] Implementing shallow clones

>> Point taken.  At this point, I'm inclined to say screw it and just
>> implement DepthGenerator from scratch.  If I start with the code from
>> TopoSortGenerator, and modify it to only add parents if their depth is
>> low enough, I should basically get everything I need without a whole
>> lot of hassle.  I know copy-paste is evil and all, but
>> TopoSortGenerator is like fifty lines of fairly non-tricky code, so
>> maybe that wouldn't be TOO evil?  C Git does this with a little
>> one-off topological sort as well, so I don't think I'm entirely
>> off-base here.
>
> I'm fine with that, as you point out its only a small amount of
> code that is being duplicated.
>
Grr, ok, that's not going to work either.  TopoSortGenerator functions
by sucking the whole tree in so that it can figure out how many
children everybody has.  If I want to avoid doing that, I'll need to
use a different algorithm.

The most efficient way to do this (and what C git does) is the following:

1.  Set up a queue (FIFORevQueue in jgit parlance), and seed it with
all the root commits.  Mark them as depth 0.
2.  Pop a commit from the front of the queue, and examine its depth.
3.  Set all parents of this commit to depth n + 1, unless their depth
is already less than this.
4.  Add any parents whose depth is less than the cutoff to the back of
the queue.
5.  Produce the popped commit, and repeat from 2 until the queue is empty.

This works because it's basically a breadth-first descent into the
graph.  If you have a commit that can be arrived at by two different
routes, you'll always get there by the shortest route first, because
the queue is FIFO.  Therefore, once the commit gets to the front of
the queue, you know its depth tag is correct, and it's always safe to
produce.  Plus, you don't add a parent to the queue unless its depth
is low enough, so you always terminate at exactly the correct places.

PendingGenerator almost does this, but it's built around a
DateRevQueue, so it's not guaranteed to get the ordering right.  It
kind of feels like the right way to do this would just be to
completely forgo use of the PendingGenerator, and have DepthGenerator
start directly from the root commits and spin out its own tree.
That's kind of an invasive change to StartGenerator, though--would
something like that be ok with you, or is there some other way to
integrate this algorithm into the existing structure?

Thanks for your patience--I promise I'll eventually get this working,
and then you won't have to hear about it anymore. :)

--Matt


Back to the top