[jgit-dev] Memory-mapped PackIndexV2

I have been experimenting with a memory-mapped PackIndexV2 implementation, and so far the results look promising. For large index files and small operations, i.e. where the ratio of required index data to read index data is very small (e.g. parsing a single commit), a speedup of 100x and more is possible (see the experiments below). My patch is still very much work in progress. It enables optional use of the memory-mapped pack index and is meant as a basis for discussion:

https://git.eclipse.org/r/c/jgit/jgit/+/170675
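
To illustrate why a single lookup touches so little of a mapped index, here is a minimal sketch of an offset lookup that reads directly from a ByteBuffer laid out in the pack index v2 format (8-byte header, 256-entry fanout table, sorted 20-byte object names, CRC32 table, 32-bit offsets, 64-bit offsets). It is not the code from the patch: the class and method names are hypothetical, SHA-1 names are assumed, the header is not validated, and the index is assumed to be smaller than 2GB.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch; neither JGit's PackIndexV2 nor the code from the patch. */
final class MappedIndexLookupSketch {
    private static final int FANOUT_POS = 8;                   // after 4-byte magic + 4-byte version
    private static final int NAMES_POS = FANOUT_POS + 256 * 4; // after the fanout table
    private static final int ID_LEN = 20;                      // SHA-1 object name (assumption)

    /** Maps the whole index read-only; the mapping stays valid after the channel is closed. */
    static ByteBuffer map(Path idxFile) throws IOException {
        try (FileChannel ch = FileChannel.open(idxFile, StandardOpenOption.READ)) {
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    /** Returns the pack offset of the object with the given 20-byte id, or -1 if absent. */
    static long findOffset(ByteBuffer idx, byte[] id) {
        int first = id[0] & 0xff;
        // fanout[b] holds the number of objects whose first byte is <= b
        int lo = first == 0 ? 0 : idx.getInt(FANOUT_POS + (first - 1) * 4);
        int hi = idx.getInt(FANOUT_POS + first * 4);
        int count = idx.getInt(FANOUT_POS + 255 * 4);
        while (lo < hi) { // binary search within the bucket for this first byte
            int mid = (lo + hi) >>> 1;
            int cmp = compare(idx, NAMES_POS + mid * ID_LEN, id);
            if (cmp < 0) {
                lo = mid + 1;
            } else if (cmp > 0) {
                hi = mid;
            } else {
                return readOffset(idx, count, mid);
            }
        }
        return -1;
    }

    /** Unsigned bytewise comparison of the stored name at pos with the wanted id. */
    private static int compare(ByteBuffer idx, int pos, byte[] id) {
        for (int i = 0; i < ID_LEN; i++) {
            int c = (idx.get(pos + i) & 0xff) - (id[i] & 0xff);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    private static long readOffset(ByteBuffer idx, int count, int nth) {
        int offsets32 = NAMES_POS + count * (ID_LEN + 4); // skip names and CRC32 table
        int v = idx.getInt(offsets32 + nth * 4);
        if (v >= 0) {
            return v; // high bit clear: the value is the pack offset itself
        }
        int offsets64 = offsets32 + count * 4;
        // high bit set: the low 31 bits index into the 64-bit offset table
        return idx.getLong(offsets64 + (v & 0x7fffffff) * 8);
    }
}
```

A lookup reads at most three fanout entries, a handful of object names during the binary search, and one or two offset entries, so only the pages holding those bytes have to be loaded from disk.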

To decide whether/how to continue this work, I would very much appreciate feedback on the following open questions:

(1) How to safely "unmap" the MappedByteBuffer, once the PackIndex is closed?

As my colleague, Alexandr, has pointed out, sun.misc.Unsafe.invokeCleaner() can be used here. According to our experiments, this seems to work fine and for us using this API is acceptable as long as there is no better "supported" API.
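
For reference, the unmapping approach could look roughly like the sketch below. It assumes Java 9 or newer (where sun.misc.Unsafe.invokeCleaner(ByteBuffer) exists) and looks the method up reflectively so that the code still compiles on JDKs without it. The class and method names are hypothetical and this is not the code from the patch.

```java
import java.io.IOException;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch of eager unmapping via sun.misc.Unsafe (Java 9+ assumed). */
final class UnmapSketch {

    /** Releases the mapping eagerly; the buffer must never be accessed afterwards. */
    static void unmap(MappedByteBuffer buffer) throws ReflectiveOperationException {
        Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
        Field theUnsafe = unsafeClass.getDeclaredField("theUnsafe");
        theUnsafe.setAccessible(true);
        Object unsafe = theUnsafe.get(null);
        Method invokeCleaner = unsafeClass.getMethod("invokeCleaner", ByteBuffer.class);
        // Must be given the buffer returned by map(), not a slice or duplicate of it.
        invokeCleaner.invoke(unsafe, buffer);
    }

    /** Maps a small temp file, reads its first byte, unmaps it and deletes the file. */
    static int demoFirstByte() {
        try {
            Path file = Files.createTempFile("unmap-sketch", ".bin");
            Files.write(file, new byte[] {42, 1, 2, 3});
            MappedByteBuffer buf;
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
                buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            }
            int first = buf.get(0);
            unmap(buf);         // from here on, buf must not be touched
            Files.delete(file); // on Windows this typically fails while a mapping is still alive
            return first;
        } catch (IOException | ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The main risk with this approach is any access to the buffer after unmap(), which can crash the JVM, so the PackIndex would have to guarantee that no reader still holds the buffer when it is closed.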

(2) Should we handle index files larger than Integer.MAX_VALUE bytes?

This will make the implementation of PackIndexV2m more complex. Currently I'm not aware of any (public) repositories whose pack indexes are close to 2GB. On the other hand, real-world repositories like the Linux Kernel[1] (250M) or Chromium[2] (500M) are already approaching this order of magnitude. Hence, I would support >2GB index files from the very beginning.
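
One possible way to get past the limit, since a single MappedByteBuffer cannot cover more than Integer.MAX_VALUE bytes, is to map the file in several segments and address them with long positions. The following is only a sketch under that assumption; the class name, the segment size parameter and the demo method are hypothetical and not part of the patch.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch: a file mapped as several segments, addressed by long positions. */
final class SegmentedMappingSketch {
    private final ByteBuffer[] segments;
    private final int segmentSize;

    SegmentedMappingSketch(FileChannel ch, int segmentSize) throws IOException {
        this.segmentSize = segmentSize;
        long size = ch.size();
        int n = (int) ((size + segmentSize - 1) / segmentSize);
        segments = new ByteBuffer[n];
        for (int i = 0; i < n; i++) {
            long start = (long) i * segmentSize;
            long len = Math.min(segmentSize, size - start);
            segments[i] = ch.map(FileChannel.MapMode.READ_ONLY, start, len);
        }
    }

    byte get(long pos) {
        return segments[(int) (pos / segmentSize)].get((int) (pos % segmentSize));
    }

    /** Big-endian int, assembled bytewise so that it may straddle a segment boundary. */
    int getInt(long pos) {
        return (get(pos) & 0xff) << 24
                | (get(pos + 1) & 0xff) << 16
                | (get(pos + 2) & 0xff) << 8
                | (get(pos + 3) & 0xff);
    }

    /** Writes bytes 0..15 to a temp file, maps it in 6-byte segments, reads an int at 4. */
    static int demoStraddlingInt() {
        try {
            Path file = Files.createTempFile("segmented-sketch", ".bin");
            file.toFile().deleteOnExit();
            byte[] data = new byte[16];
            for (int i = 0; i < data.length; i++) {
                data[i] = (byte) i;
            }
            Files.write(file, data);
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
                // Tiny segment size only to exercise the boundary; 1 << 30 would be realistic.
                SegmentedMappingSketch m = new SegmentedMappingSketch(ch, 6);
                return m.getInt(4); // bytes 4,5 are in segment 0, bytes 6,7 in segment 1
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A real implementation would probably call getInt() on the segment directly whenever the value does not cross a boundary and fall back to the bytewise path only for the rare straddling reads.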

(3) Should we handle multi-threaded access to buffers?

The current patch asserts single-thread access, which is sufficient for our Git client. I haven't checked in detail, but from my understanding this should be true for most of JGit's own code, too. For the few multi-threaded usages, the current in-memory PackIndexV2 could be used.

It would be interesting to hear whether Gerrit/EGit and other projects are using JGit's Pack-API in a single-threaded or multi-threaded way.

Implementing thread-safety should be no big deal. We see the following options here:

(a) synchronizing all public methods of the new PackIndexV2m: straightforward implementation with probably more frequent synchronized-executions; or

(b) having separate buffers per thread: more complex implementation with probably less frequent synchronized-executions

We ourselves haven't experienced problems with frequent synchronized-executions, but I recall that JGit rather tries to avoid them, if possible?
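
Option (b) could be sketched as follows, assuming that the only per-thread state is the buffer's position/limit: each thread gets its own duplicate() of the shared buffer, which shares the underlying bytes but has an independent position. The class and method names are hypothetical, and a heap buffer stands in for the mapped index.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of option (b): one buffer view per thread over shared bytes. */
final class PerThreadBufferSketch {
    private final ThreadLocal<ByteBuffer> local;

    PerThreadBufferSketch(ByteBuffer shared) {
        // duplicate() shares the content but has its own position, limit and mark.
        // The only synchronized section runs once per thread, when its view is created.
        this.local = ThreadLocal.withInitial(() -> {
            synchronized (shared) {
                return shared.duplicate();
            }
        });
    }

    int readIntAt(int pos) {
        ByteBuffer b = local.get();
        b.position(pos); // relative read: mutates only this thread's view
        return b.getInt();
    }

    /** Reads a known pattern from four threads and returns the number of wrong values. */
    static int demoMismatches() {
        ByteBuffer shared = ByteBuffer.allocate(4 * 1024);
        for (int i = 0; i < 1024; i++) {
            shared.putInt(4 * i, i);
        }
        PerThreadBufferSketch idx = new PerThreadBufferSketch(shared);
        AtomicInteger mismatches = new AtomicInteger();
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 1024; i++) {
                    if (idx.readIntAt(4 * i) != i) {
                        mismatches.incrementAndGet();
                    }
                }
            });
            threads[t].start();
        }
        for (Thread th : threads) {
            try {
                th.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return mismatches.get();
    }
}
```

If the implementation only ever used absolute reads such as getInt(int), the per-thread views might not be needed at all, since absolute reads do not modify the buffer's position.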

(4) Come up with a more reasonable design, once the approaches for (1)-(3) are clarified

Experiments
===========

I have uploaded my benchmarking code at:

https://git.eclipse.org/r/c/jgit/jgit/+/170797

It requires a recent clone of the Linux repository with just a single pack-file to run. The benchmarks compare the current in-memory PackIndexV2 with the proposed memory-mapped PackIndexV2m.

Benchmarks were performed on my Windows 8.1 machine, quad-core, 8GB RAM, SSD. "Score" denotes the average execution time in ms. "useMmap=true" denotes the memory-mapped version of the benchmark.

Windows Results
---------------
PackIndexV2LoadCommitsBenchmark.testLoadRandomCommits:
(commitCount)  (useMmap)  Mode  Cnt    Score    Error  Units
            1      false    ss   20  164.271 ± 16.327  ms/op
            1       true    ss   20    1.779 ±  0.286  ms/op
           10      false    ss   20  165.841 ±  7.374  ms/op
           10       true    ss   20    3.057 ±  0.255  ms/op
          100      false    ss   20  164.650 ±  8.172  ms/op
          100       true    ss   20    8.830 ±  2.218  ms/op
         1000      false    ss   20  190.149 ±  8.033  ms/op
         1000       true    ss   20   49.824 ± 10.934  ms/op

PackIndexV2FindOffsetBenchmark.testFindSingleOffset:
(useMmap)  Mode  Cnt     Score    Error  Units
    false  avgt   20   157.933 ±  5.613  ms/op
     true  avgt   20     0.173 ±  0.053  ms/op

PackIndexV2FindOffsetBenchmark.testFindAllOffsets:
(useMmap)  Mode  Cnt     Score    Error  Units
    false  avgt   20   821.798 ± 16.820  ms/op
     true  avgt   20  1965.568 ± 11.618  ms/op

Linux Results (Ubuntu 18.04 VM, 4 cores, 4G RAM)
------------------------------------------------
PackIndexV2LoadCommitsBenchmark.testLoadRandomCommits:
(commitCount)  (useMmap)  Mode  Cnt     Score     Error  Units
            1      false    ss   20  1218.530 ±  13.141  ms/op
            1       true    ss   20     7.412 ±   1.619  ms/op
           10      false    ss   20   429.231 ± 353.918  ms/op
           10       true    ss   20    23.468 ±   9.429  ms/op
          100      false    ss   20   236.937 ±  76.485  ms/op
          100       true    ss   20     8.602 ±   5.417  ms/op
         1000      false    ss   20   208.833 ±  13.432  ms/op
         1000       true    ss   20    52.573 ±  30.591  ms/op

PackIndexV2FindOffsetBenchmark.testFindSingleOffset:
(useMmap)  Mode  Cnt     Score    Error  Units
    false  avgt   20   183.936 ± 10.847  ms/op
     true  avgt   20     0.046 ±  0.005  ms/op

PackIndexV2FindOffsetBenchmark.testFindAllOffsets:
(useMmap)  Mode  Cnt     Score    Error  Units
    false  avgt   20   890.854 ± 15.688  ms/op
     true  avgt   20  2047.881 ± 31.824  ms/op

Note that for the memory-mapped benchmarks, WindowCache will be switched to memory-mapped mode, too (this corresponds to what the JGit config option "core.packedgitmmap" does). This only affects PackIndexV2LoadCommitsBenchmark.

[1] https://github.com/torvalds/linux.git
[2] https://chromium.googlesource.com/chromium/src

Thanks for your ideas!

-Marc