Re: [jgit-dev] Gerrit clone time takes ages, because of PackWriter.searchForReuse()

On Fri, Oct 2, 2020 at 5:34 PM Jonathan Nieder <jrn@xxxxxxxxxx> wrote:
Martin Fick wrote:

I haven't looked at those measurements; do they take into account the actual
transferred bytes?

Yes, they measure bytes transferred over the wire.

Any wastefulness from "thickening" happens after the duplication of objects
reported by this metric. But since 10-20x as many duplicated objects are being sent,
if those objects are all deltified there are 10-20x more thickenings.

> It would be nice if new packfiles could be appended to existing packfiles so
> that they would not need to be thickened,

Our backend storage is write-once so it doesn't work for us. Updating an index
with 20M entries isn't going to be that efficient either. But I understand the appeal.

An eventual evolution of multi-pack indices might get us there. AIUI, Microsoft's
initial implementation in cgit still writes the individual pack files in the same way,
but when we do it in JGit we might avoid it.

See https://lore.kernel.org/git/20200603015314.GA253041@xxxxxxxxxx/ for more discussion of this behavior.

Thanks,
Jonathan
