Re: [jgit-dev] delta generation during packing
On Sat, Jul 10, 2010 at 2:32 PM, Robin Rosenberg
> I am impressed. I know Java is not slow in general, but I expected it
> to be here. Not so.
I know. I also expected performance to be worse than it showed here.
I was rather surprised the code did this well.
I actually tried to put a lot of thought into these lower level,
performance critical sections of the system. I'm glad it's paid off.
For example, I'm glad we built RevWalk/ObjectWalk/TreeWalk a few years
ago. They are a key part of why we can pack with
reasonable performance today. :-)
Unfortunately it's still taking a long time to convert people away from
Commit and Tree. *sigh*
I wouldn't say that we are done tuning the code here, but it's already
fairly optimized. It may be hard to beat what we are doing. Most of
the algorithms are the best available given the inputs we have on
hand. I did hope the linux-2.6 test would be closer to the C one,
though, given how we did with the jgit.git and git.git tests.
> How about memory usage? It is in my experience the area
> where Java suffers most compared to C. Raw performance can usually
> be achieved, but at the cost of memory usage, which may affect how
> well this works when we pack from within Eclipse.
Yeah, I know. Memory usage should actually rival the C implementation.
I didn't look at the JVM heap usage while it was running, but I know what
the code does, and how big the data structures involved are. I play a
lot of games to share fields within the ObjectToPack instance during
packing. As an example, the CRC field actually is used in three
different ways as the packing progresses through various phases:
initially it's the size of the object in an existing pack file, then
it's the size of the object fully inflated, and finally it's the
CRC in the output pack. I do the same thing with the path hash code
field. I know the C implementation actually uses different fields for
this in the struct, but they also use a single giant malloc block,
while we have to pay JVM object header overheads. By recycling the
fields we are able to "buy back" that header cost.
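A minimal sketch of that recycling trick (class and method names here
are illustrative, not JGit's actual ObjectToPack API):

```java
// Illustrative sketch: one int field reinterpreted as packing moves
// through its phases, as described above. Not the real ObjectToPack.
class RecycledField {
	// Phase 1: size of the object in an existing pack file.
	// Phase 2: size of the object fully inflated.
	// Phase 3: CRC-32 of the object written to the output pack.
	private int crcOrSize;

	void setPackedSize(int sz)   { crcOrSize = sz; }
	void setInflatedSize(int sz) { crcOrSize = sz; }
	void setCrc(int crc)         { crcOrSize = crc; }
	int value()                  { return crcOrSize; }
}
```

Each phase is completely finished with the field before the next phase
overwrites it, so a single 4-byte slot serves all three roles.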
I do the same thing with booleans: we store them as bits in a single
larger int field. I think that combines what would normally be about
20 bytes of memory into only 4 in JGit.
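That bit-packing can be sketched like so (the flag names are made up
for illustration; JGit's real flag set differs):

```java
// Several booleans stored as bits of one int instead of separate
// fields, so they all fit in a single 4-byte slot.
class PackFlags {
	static final int WANT_WRITE   = 1 << 0; // illustrative flag names
	static final int REUSE_DELTA  = 1 << 1;
	static final int DO_NOT_DELTA = 1 << 2;

	private int flags; // all of the booleans share this one field

	void set(int flag)    { flags |= flag; }
	void clear(int flag)  { flags &= ~flag; }
	boolean has(int flag) { return (flags & flag) != 0; }
}
```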
I think we pay more per window slot than the C implementation does,
but it's a tiny fixed constant per slot. We allocate one object per
window slot, and each one has 3 fields. So on a Sun HotSpot JVM where
we pay 8 bytes per object header, each window slot costs us roughly
40 bytes (assuming references are 64 bits). A window size of 5000
would require around 195 KiB, which isn't bad. The real memory cost
is in the indexes.
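Spelling the arithmetic out (assuming an 8-byte object header and
8-byte references, as on a 64-bit HotSpot JVM without compressed
pointers):

```java
// Back-of-the-envelope cost of the delta search window, per the
// estimate above: one object with 3 reference fields per slot.
class WindowCost {
	static long slotBytes() {
		long header = 8;     // object header on 64-bit HotSpot
		long fields = 3 * 8; // three reference fields, 8 bytes each
		long padding = 8;    // alignment rounds the total up to ~40
		return header + fields + padding;
	}

	static long windowKiB(int slots) {
		return slots * slotBytes() / 1024;
	}
}
```

With these assumptions, windowKiB(5000) comes out to 195 KiB, matching
the figure above.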
The block indexes that we need to encode deltas actually took a lot of
thought on my part. We use a single int and another long to
encode the entire block index, so that the overhead is the smallest
amount possible. It's actually harking back to C, where you have to
manage everything yourself. :-) Fortunately this means we need
exactly 12 bytes of memory to index each 16-byte block of a file, no
matter what the JVM is. I think the bigger challenge is just getting
the contiguous memory allocations for the arrays from the JVM.
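A sketch of how that works out to a fixed 12 bytes per block: keep the
index in parallel primitive arrays rather than one object per block
(the field layout here is hypothetical; JGit's actual delta index
differs in detail):

```java
// Block index held in parallel primitive arrays: one int plus one long
// per 16-byte block, i.e. exactly 12 bytes per block on any JVM.
class BlockIndex {
	private final int[] hashes;   // 4 bytes: hash of each 16-byte block
	private final long[] entries; // 8 bytes: packed offset/chain data

	BlockIndex(int blockCount) {
		hashes = new int[blockCount];
		entries = new long[blockCount];
	}

	void set(int i, int hash, long entry) {
		hashes[i] = hash;
		entries[i] = entry;
	}

	long indexBytes() {
		return 12L * hashes.length; // 4 + 8 bytes per block
	}
}
```

Because the data lives in two flat arrays, there is no per-block object
header overhead at all; the only JVM-specific cost is finding
contiguous memory for the arrays themselves.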
If we cap the window memory with the pack.windowMemory parameter, and
we set the pack.bigFileThreshold to a reasonable setting, we should be
able to pack even inside of an IDE. Those parameters will limit how
much memory the PackWriter can consume, at the cost of producing a
less optimal pack. Packing is inherently memory intensive and I can't
do much about that, other than to rewrite the entire algorithm to be
disk based, which would make it take a _lot_ longer.
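For reference, capping those limits in a repository's config might look
like this (the values are purely illustrative, not recommendations):

```
[pack]
	windowMemory = 64m
	bigFileThreshold = 1m
```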