Re: [jgit-dev] Performance of commit preparation

On Fri, Oct 22, 2010 at 5:08 PM, Matthias Sohn
<matthias.sohn@xxxxxxxxxxxxxx> wrote:
> 2010/10/21 Shawn Pearce <spearce@xxxxxxxxxxx>
>> NIO2 in Java 7 might solve this problem.  But right now that isn't
>> available to us.  I keep thinking about doing an optional tiny JNI
>> layer for JGit that just offers us a handful of helper routines.
>> Exposing the basics (type, length, last modified time) of POSIX and
>> Win32 stat system calls is one of those.
> any hints on which other helpers we would need in addition and
> how this layer should look like ? I would be willing to try this ...

I had 5 things in mind that might be useful to JGit:

1)  symlinks:  Provide readlink() and symlink() so JGit can actually
process symlinks like native C Git does, assuming the JNI library can
be loaded and that you are on a POSIX system where symlinks work like
they should.

2)  lstat:  Provide the majority of the lstat() structure up to the
Java level.  The important part about this is first that it's lstat()
and not stat(), because then we read the status of a symlink itself
and not the target of the link.  The second part is being able to get
the st_mode, st_mtime and st_size fields in a single operating system
call.  If we want to be more compatible with the C implementation
we would honor tv_nsec for the nanosecond component of the time field,
so we can get more accurate times than just milliseconds.  We might
also want to honor the st_ctime, st_dev, st_ino, st_uid and st_gid
fields that C Git stores into the index record (see DirCacheEntry's
commented out static constants).

3)  readdir:  Provide a Linux-like readdir to replace
File.listFiles().  Some C libraries provide a d_type field in struct
dirent that is returned by readdir.  This field can have values like
DT_DIR, DT_REG, and DT_LNK, to hint about what the item is.  If we
have this data we don't need to perform a stat on a path in order to
know how to work with it.

4)  mmap and lookup of entries in pack index files.  PackIndexV[12]
classes are brutal in Java.  If we had native forms of these that use
mmap() to open the file and a C implementation of the binary search
algorithm, we might be able to do more efficient object lookup.  We
don't use NIO's mapped ByteBuffer code because it's slower than what we
have, and the GC isn't able to release the mmap region fast enough
when we decide we don't want that file anymore.  If we do this in
native code ourselves we can also provide explicit unmapping.

5)  mmap and inflation of objects from pack files.  PackFile is pretty
brutal, needing to load in slices of a pack into byte[] and then
inflate those byte[] chunks through the Inflater class.  If we do all
of this in C, and use explicit mmap calls, we can avoid the allocation
of JVM heap memory and just use the operating system's buffer cache
directly to read from the packs.  We can also shovel the chunks of
data into libz inflate() more efficiently, which means we can probably
read small objects more quickly, resulting in faster processing of
commits and trees.
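
For items 1 and 2, a minimal sketch of lstat-like behavior, assuming the
Java 7 NIO2 API (java.nio.file) mentioned in the quoted text is available.
This is not the JNI layer described above: the class name LstatSketch and
its Stat holder are hypothetical, only a few of the lstat() fields are
covered, and reading the link target costs a second call rather than being
part of a single system call.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

public class LstatSketch {
    /** Hypothetical holder for a small subset of the lstat() fields. */
    public static final class Stat {
        public final boolean symlink;
        public final boolean directory;
        public final boolean regular;
        public final long size;
        public final long mtimeMillis;
        public final String linkTarget; // null unless the path is a symlink

        Stat(boolean symlink, boolean directory, boolean regular,
                long size, long mtimeMillis, String linkTarget) {
            this.symlink = symlink;
            this.directory = directory;
            this.regular = regular;
            this.size = size;
            this.mtimeMillis = mtimeMillis;
            this.linkTarget = linkTarget;
        }
    }

    /** Reads attributes of the path itself, not of a symlink's target. */
    public static Stat lstat(Path p) throws IOException {
        // NOFOLLOW_LINKS is what makes this behave like lstat() and not stat().
        BasicFileAttributes a = Files.readAttributes(
                p, BasicFileAttributes.class, LinkOption.NOFOLLOW_LINKS);
        // Rough equivalent of readlink(); this is a separate call.
        String target = a.isSymbolicLink()
                ? Files.readSymbolicLink(p).toString()
                : null;
        return new Stat(a.isSymbolicLink(), a.isDirectory(), a.isRegularFile(),
                a.size(), a.lastModifiedTime().toMillis(), target);
    }
}
```

Calling LstatSketch.lstat(path) on a symlink reports the link itself along
with its target string, which is the distinction item 2 cares about.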

The last one might be harder now that we are trying to support large
objects.  But it could still make a good performance improvement.  A
lot of our resource cost right now is tied up in WindowCache and the
byte[] we had to allocate in order to copy the data in from the file.
If we can just convert those over to mmap() slices that are accessible
only from JNI, and expose in JNI just two small methods like:

  readObjectHeader()  -- first half of the load() method in PackFile
  inflateRegion()  -- the incremental decompression in WindowCursor

we might be able to do almost everything else at the Java level with
much lower overheads.
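
The incremental decompression that inflateRegion() would take over can be
sketched in plain Java with java.util.zip.Inflater.  This is an
illustration only, not JGit's WindowCursor code: the class InflateSketch,
its method names and the fixed slice size are assumptions, and the slices
here come from a byte[] where the proposal would use mmap() regions.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class InflateSketch {
    /** Inflates a zlib stream that is fed to the Inflater in small slices. */
    public static byte[] inflateInSlices(byte[] compressed, int sliceSize)
            throws DataFormatException {
        if (sliceSize <= 0) {
            throw new IllegalArgumentException("sliceSize must be positive");
        }
        Inflater inf = new Inflater();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int pos = 0;
        try {
            while (!inf.finished()) {
                if (inf.needsInput()) {
                    if (pos >= compressed.length) {
                        throw new DataFormatException("truncated stream");
                    }
                    // Hand over the next slice, as if it were the next window.
                    int n = Math.min(sliceSize, compressed.length - pos);
                    inf.setInput(compressed, pos, n);
                    pos += n;
                }
                int got = inf.inflate(buf);
                if (got == 0 && inf.needsDictionary()) {
                    throw new DataFormatException("preset dictionary required");
                }
                out.write(buf, 0, got);
            }
        } finally {
            inf.end(); // release the native zlib state explicitly
        }
        return out.toByteArray();
    }

    /** Helper that produces test input: compresses raw bytes with zlib. */
    public static byte[] deflate(byte[] raw) {
        Deflater def = new Deflater();
        def.setInput(raw);
        def.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!def.finished()) {
            int n = def.deflate(buf);
            out.write(buf, 0, n);
        }
        def.end();
        return out.toByteArray();
    }
}
```

Each slice has to exist as a byte[] on the JVM heap before Inflater can
consume it, which is the allocation cost the paragraph above attributes to
WindowCache.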

I think the above list is already sorted in priority order.  1
(symlink) and 2 (lstat) are needed just for good compatibility with C
Git and are fairly simple to implement.  3 (readdir with d_type) would
give us a small performance boost, but isn't that important.  4 and 5
are likely to provide some real performance improvements, but are a
lot more work.
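
For item 4, the lookup in question is a binary search over a sorted table
of 20-byte object ids.  A minimal Java sketch of that search over a
ByteBuffer follows, assuming a bare table of ids with nothing else in the
buffer.  The class IdTableSketch is hypothetical and is not the
PackIndexV[12] code; a real pack index also narrows the range with a
fan-out table first, which is omitted here.

```java
import java.nio.ByteBuffer;

public class IdTableSketch {
    static final int ID_LEN = 20; // length of a SHA-1 object id in bytes

    /**
     * Returns the position of key in a sorted table of 20-byte ids,
     * or -1 if it is absent.  The table is read from index 0 to limit().
     */
    public static int find(ByteBuffer table, byte[] key) {
        int low = 0;
        int high = table.limit() / ID_LEN; // exclusive upper bound
        while (low < high) {
            int mid = (low + high) >>> 1;
            int cmp = compare(table, mid * ID_LEN, key);
            if (cmp < 0) {
                low = mid + 1; // table entry sorts before key
            } else if (cmp > 0) {
                high = mid;    // table entry sorts after key
            } else {
                return mid;
            }
        }
        return -1;
    }

    /** Compares a table entry against key as unsigned bytes. */
    private static int compare(ByteBuffer table, int offset, byte[] key) {
        for (int i = 0; i < ID_LEN; i++) {
            int a = table.get(offset + i) & 0xff; // absolute get, no position change
            int b = key[i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        return 0;
    }
}
```

The same loop works over a heap buffer or a mapped one, so it says nothing
about the speed difference between them.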

