Re: [jgit-dev] Memory usage for PackedObjectInfo

On Wed, Nov 30, 2011 at 14:36, Gonsolo <gonsolo@xxxxxxxxx> wrote:
> I get an OutOfMemory Exception and hprof shows the following:
>
> Class                                                   Instance Count  Total Size
> class org.eclipse.jgit.transport.PackedObjectInfo       285778          12574232
> class org.eclipse.jgit.transport.PackedObjectInfo[]     1               8936248
> ...

This is only about 20M of data. The average size of a PackedObjectInfo
on your JVM is only 44 bytes. Native C Git might need more; I just
estimated the size of the same data structure to be 60 bytes per
PackedObjectInfo-equivalent struct. So I think we are doing pretty
well on memory usage here in JGit.
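
The arithmetic behind those figures can be checked with a few lines of Java. This is only a back-of-the-envelope sketch: the class and method names are made up for illustration, the counts and sizes are copied from the hprof output quoted above, and the 60-byte native figure is the estimate from this message, not a measurement.

```java
// Back-of-the-envelope check of the numbers in the hprof output.
// PackedObjectInfoEstimate is a hypothetical name, not part of JGit.
final class PackedObjectInfoEstimate {
    /** Average bytes per instance, rounded down. */
    static long averageBytes(long totalBytes, long instanceCount) {
        return totalBytes / instanceCount;
    }

    public static void main(String[] args) {
        long count = 285_778L;          // PackedObjectInfo instances (from hprof)
        long objectBytes = 12_574_232L; // total size of those instances
        long arrayBytes = 8_936_248L;   // the single PackedObjectInfo[] array
        long nativePerObject = 60L;     // assumed size of the C struct (an estimate)

        System.out.println("avg bytes per object:  " + averageBytes(objectBytes, count));
        System.out.println("objects + array bytes: " + (objectBytes + arrayBytes));
        System.out.println("native estimate bytes: " + count * nativePerObject);
    }
}
```

The objects and the array together come to about 21.5 million bytes, which is where the "about 20M" comes from.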

> A pack file is received and written to disk.
>
> Is it possible to lower memory usage or bypass jgit and directly write
> the pack to disk?

Do you want to use the pack? Or are you just downloading it to waste disk space?

JGit writes the pack to disk and does not buffer the pack data itself.
However, any Git implementation receiving a pack over the network must
scan the pack and build an index to support random access to it.
Without the index, no Git implementation can read the data in the
pack, because there is no information about where each object is
placed. Unfortunately the only known way to build this index is to
keep a small amount of information about each object in memory. (It
may be possible to offload this to a disk-based database, but the code
for this would be HORRID, and throughput would crawl to <1 object
every 20 ms rather than thousands of objects per second.)
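
To make the "small amount of information about each object" concrete, here is a minimal sketch of the idea. It is not JGit's PackParser or PackedObjectInfo: the class, fields, and methods are hypothetical, object ids are plain strings, and a real index also stores things such as checksums. The point is only that every object contributes one small entry, and all entries have to be held until they can be sorted by id.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of building a pack index in memory.
final class PackIndexSketch {
    /** One small entry per object: its id and where it starts in the pack. */
    static final class Entry {
        final String objectId; // a hex SHA-1 in a real pack; any string here
        final long offset;     // byte position of the object within the pack

        Entry(String objectId, long offset) {
            this.objectId = objectId;
            this.offset = offset;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    /** Called once per object while scanning the incoming pack stream. */
    void onObjectScanned(String objectId, long offset) {
        entries.add(new Entry(objectId, offset));
    }

    /** Sorts by id so lookups can use binary search; needs all entries at once. */
    List<Entry> finish() {
        entries.sort(Comparator.comparing((Entry e) -> e.objectId));
        return entries;
    }

    /** Returns the pack offset for the id, or -1 if the id is not indexed. */
    static long lookup(List<Entry> sorted, String objectId) {
        int lo = 0;
        int hi = sorted.size() - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int cmp = sorted.get(mid).objectId.compareTo(objectId);
            if (cmp == 0) {
                return sorted.get(mid).offset;
            }
            if (cmp < 0) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return -1L;
    }
}
```

Objects arrive in pack order, not id order, which is why the entries accumulate in memory before the sorted index can be written.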

You could hack up JGit to read the stream from the network and write
it to disk, but skip the PackParser class that does the index
generation, and instead fork native C Git's `git-index-pack` program
to create the index file. But memory usage may be higher that way: at
60 bytes per object, native Git would have needed roughly 16.3M worth
of PackedObjectInfo equivalents for what JGit did in about 12M.
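
Forking the native indexer from Java could look roughly like the sketch below. It assumes a `git` executable on the PATH and a pack that has already been written to disk with a `.pack` extension; the class name and the file path passed in are placeholders, and nothing here is existing JGit code.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that delegates index creation to native Git.
final class NativeIndexPack {
    /** Builds the command line: git index-pack <pack file>. */
    static List<String> command(Path packFile) {
        return Arrays.asList("git", "index-pack", packFile.toString());
    }

    /** Runs the command and returns its exit code (0 means success). */
    static int run(Path packFile) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(packFile))
                .inheritIO() // show git's own output and errors on our console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: NativeIndexPack <file.pack>");
            return;
        }
        int exit = run(Paths.get(args[0]));
        System.out.println("git index-pack exited with " + exit);
    }
}
```

The index is then built in the child process, so its memory no longer counts against the JVM heap, but the machine as a whole still pays for it.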


Increase your JVM heap with the -Xmx flag (or whatever the equivalent is for your JVM).
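
To confirm that a heap setting actually took effect, a small check like the following can be run with the same options as the real application, for example `java -Xmx1g HeapLimit` (the 1g value and the class name are only examples).

```java
// Prints the maximum heap the running JVM is willing to use.
// HeapLimit is a hypothetical name used for this illustration.
final class HeapLimit {
    /** Maximum heap in MiB, as reported by the JVM itself. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("max heap: " + maxHeapMiB() + " MiB");
    }
}
```

The reported value is often slightly below the requested -Xmx size, depending on the JVM and garbage collector.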

