Re: [jgit-dev] RevWalk problems with shallow clones
On Fri, Aug 12, 2011 at 02:50, Marc Strapetz <marc.strapetz@xxxxxxxxxxx> wrote:
>> The way to handle the shallow stuff is to actually insert
>> pre-parsed RevCommit objects into the RevWalk object pool that have
>> the parents truncated away, similar to a graft (which we also don't
>> support). This will keep JGit from walking backwards into the commits
>> it doesn't have, and thus avoid the MissingObjectException you noted.
> That probably means parsing a lot of shallow commits which are never
> actually encountered for a specific RevWalk. Couldn't we detect shallow
> commits lazily -- would the following approach work?
> RevCommit.parseCanonical() uses RevWalk.lookupCommit() to resolve
> parents. We could introduce a new package-protected method:
> lookupCommit(final AnyObjectId id, final AnyObjectId child)
> On the other side we would have a new
> public RevWalk.markShallow(final AnyObjectId id)
> Now, when the new lookupCommit() receives a child that is among the
> commits marked as shallow, it will return null and
> RevCommit.parseCanonical() will then simply ignore that parent.
This will (massively) slow down the common case of not having shallow commits.
We don't actually need to parse the shallow commits. We just need to
load the shallow file and allocate the RevCommit object with the
parents field populated. IIRC RevCommit won't overwrite the parents if
they are already filled when it parses, which means we can set up a
shallow commit with an empty parent array. For non-shallow walks, this
has no additional penalty. For shallow walks, the penalty is only a
small hit up-front as we scan through the shallow file... which you
have to do no matter what anyway.
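
To make that concrete, here is a rough, self-contained sketch of the idea
in plain Java. It does not use the real JGit classes: ToyCommit,
ShallowWalkSketch, markShallow() and the in-memory odb map are
hypothetical stand-ins for RevCommit, the RevWalk object pool and the
object database, and parse() simply assumes the "don't overwrite parents
that are already filled" behavior described above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ShallowWalkSketch {
    /** Toy stand-in for RevCommit; parents == null means "not parsed yet". */
    static final class ToyCommit {
        final String id;
        ToyCommit[] parents;
        ToyCommit(String id) { this.id = id; }
    }

    static final ToyCommit[] NO_PARENTS = new ToyCommit[0];

    /** Toy stand-in for the walk's object pool: one instance per id. */
    private final Map<String, ToyCommit> pool = new HashMap<>();
    /** Toy object database: commit id -> parent ids as recorded in the commit. */
    private final Map<String, List<String>> odb;

    ShallowWalkSketch(Map<String, List<String>> odb) { this.odb = odb; }

    ToyCommit lookupCommit(String id) {
        return pool.computeIfAbsent(id, ToyCommit::new);
    }

    /** Seed the pool from the shallow list: allocate only, never parse. */
    void markShallow(Iterable<String> shallowIds) {
        for (String id : shallowIds) {
            lookupCommit(id).parents = NO_PARENTS;
        }
    }

    /** Assumed behavior: parsing leaves already-filled parents alone. */
    void parse(ToyCommit c) {
        if (c.parents != null) {
            return; // pre-seeded shallow commit, or already parsed
        }
        List<String> ids = odb.get(c.id);
        if (ids == null) {
            throw new IllegalStateException("missing object " + c.id);
        }
        ToyCommit[] ps = new ToyCommit[ids.size()];
        for (int i = 0; i < ps.length; i++) {
            ps[i] = lookupCommit(ids.get(i));
        }
        c.parents = ps;
    }

    /** Breadth-first walk from start, returning ids in visit order. */
    List<String> walk(String start) {
        List<String> out = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<ToyCommit> todo = new ArrayDeque<>();
        todo.add(lookupCommit(start));
        while (!todo.isEmpty()) {
            ToyCommit c = todo.poll();
            if (!seen.add(c.id)) {
                continue;
            }
            parse(c);
            out.add(c.id);
            todo.addAll(Arrays.asList(c.parents));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> odb = new HashMap<>();
        odb.put("C", Arrays.asList("B"));
        odb.put("B", Arrays.asList("A")); // "A" was never fetched
        ShallowWalkSketch walk = new ShallowWalkSketch(odb);
        walk.markShallow(Arrays.asList("B"));
        System.out.println(walk.walk("C")); // prints [C, B]
    }
}
```

Without the markShallow() call the same walk runs into the missing "A"
and throws, which is the toy equivalent of the MissingObjectException.
With it, the only extra work is one pool insert per shallow entry, and a
walk over a non-shallow repository never touches that path at all.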