[jgit-dev] Writing packs with bitmaps sometimes slower than without bitmaps?

Hi,

I am investigating a performance problem on non-public Gerrit servers.
The problem boils down to slow performance of PackWriter when using
bitmaps. When the usage of bitmaps is turned off, performance is at
least 5 times better.

I uploaded the change https://git.eclipse.org/r/141842 to demonstrate
that. The change adds some printfs to emit performance data for the
PackWriter class. Additionally, it introduces a system property,
PackWriterForceNoBitmap, that when set to true forces PackWriter not to
use bitmaps.
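
For readers who cannot open the change, here is a minimal sketch of what
such a switch and timing output could look like. Only the property name
PackWriterForceNoBitmap and the "TracePushPerf" prefix come from this
mail; the class name, the method names shouldUseBitmaps() and timed(),
and the exact output format are assumptions for illustration and are
not the code in the change.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of JGit.
final class PackWriterTraceSketch {
    // Property name as given in the mail; how the real change reads it is an assumption.
    static final String FORCE_NO_BITMAP = "PackWriterForceNoBitmap";

    /** True only if a bitmap index is available and the override is not set. */
    static boolean shouldUseBitmaps(boolean bitmapIndexAvailable) {
        // Boolean.getBoolean is true only when the system property equals "true" (ignoring case).
        return bitmapIndexAvailable && !Boolean.getBoolean(FORCE_NO_BITMAP);
    }

    /** Runs a step, prints its elapsed time in milliseconds, and returns the step's result. */
    static <T> T timed(String label, Supplier<T> step) {
        long start = System.nanoTime();
        T result = step.get();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.printf("TracePushPerf: %s runtime: %d%n", label, elapsedMs);
        return result;
    }

    public static void main(String[] args) {
        boolean useBitmaps = shouldUseBitmaps(true);
        // The lambda stands in for the real object search.
        String path = timed("findObjectsToPack()",
                () -> useBitmaps ? "bitmap path" : "non-bitmap path");
        System.out.println("took the " + path);
    }
}
```

Running it with -DPackWriterForceNoBitmap=true would select the
non-bitmap branch in this sketch.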

The problem is the performance of PackWriter#findObjectsToPack. That
method delegates to PackWriter#findObjectsToPackUsingBitmaps(), which is
in my case much slower than the default code that does not use bitmaps.
It looks like findObjectsToPackUsingBitmaps() first calculates all have
objects, then all want objects, and then calculates the difference. In
huge repos, calculating all have objects consumes 900 sec.
The non-bitmap code in findObjectsToPack() creates a walk in which have
and want objects are both used, so it has to walk over only very few
objects, which takes only 200 sec.
In the end both algorithms (bitmap-aware and non-bitmap-aware code) find
the same result: only one commit with one new blob has to be sent.
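
To make the difference between the two strategies concrete, here is a
toy model in plain Java. It is not JGit code and uses no bitmaps: the
object graph is a map of ids, the class and method names are invented
for illustration, and the combined walk is simplified so that it stops
at the have objects themselves. It only shows why "compute everything
reachable from the haves, then subtract" can touch far more objects
than a walk that starts at the wants.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model, not part of JGit.
final class FindObjectsSketch {
    /** Each object id maps to the ids it references (parent commit, blob). */
    final Map<String, List<String>> edges = new HashMap<>();
    /** Number of objects touched by the most recent strategy call. */
    int visited;

    /** Collects everything reachable from starts, without entering ids in stopAt. */
    Set<String> reachable(Set<String> starts, Set<String> stopAt) {
        Set<String> seen = new HashSet<>();
        Deque<String> todo = new ArrayDeque<>(starts);
        while (!todo.isEmpty()) {
            String id = todo.pop();
            if (stopAt.contains(id) || !seen.add(id)) {
                continue;
            }
            visited++;
            for (String next : edges.getOrDefault(id, List.of())) {
                todo.push(next);
            }
        }
        return seen;
    }

    /** Strategy as described for the bitmap path: all haves, all wants, then the difference. */
    Set<String> needByDifference(Set<String> haves, Set<String> wants) {
        visited = 0;
        Set<String> haveSet = reachable(haves, Set.of());
        Set<String> need = reachable(wants, Set.of());
        need.removeAll(haveSet);
        return need;
    }

    /** Simplified combined walk: start at the wants and stop at the haves. */
    Set<String> needByWalk(Set<String> haves, Set<String> wants) {
        visited = 0;
        return reachable(wants, haves);
    }

    /** Builds a linear history c0 <- c1 <- ... where commit ci adds one blob bi. */
    static FindObjectsSketch chain(int commits) {
        FindObjectsSketch g = new FindObjectsSketch();
        for (int i = 0; i < commits; i++) {
            g.edges.put("c" + i, i == 0 ? List.of("b0") : List.of("c" + (i - 1), "b" + i));
        }
        return g;
    }

    public static void main(String[] args) {
        FindObjectsSketch g = chain(1000);
        Set<String> haves = Set.of("c998");
        Set<String> wants = Set.of("c999");
        System.out.println(g.needByDifference(haves, wants) + " after visiting " + g.visited);
        System.out.println(g.needByWalk(haves, wants) + " after visiting " + g.visited);
    }
}
```

Both methods return the same two objects (one commit and one blob), but
the difference-based variant visits the whole history first. Whether
this is the actual cause of the slowdown in JGit is not established by
this sketch.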

Is anybody aware that, when working on packfiles of 2 GB size,
findObjectsToPackUsingBitmaps() is so much slower than the
non-bitmap-aware code?

Stats of the repo:
$ du -sh *.pack
1.1M    pack-097c243a771df372d4e1098af0d89a3d25be8de6.pack
 36K    pack-0bc2fecc4d3c070b82285b70a6394b15f87a9e50.pack
 64K    pack-57c34e10b1cc0305784f48bda175064e2ad8fa1a.pack
240K    pack-7cc06b124492dcab6fda3e797f3c16fe06929e2a.pack
2.0G    pack-e7a9322fa39225c670cab245b9e012c1aa6f3a61.pack


Here are the traces:

=================================================
added performance printfs, forced not to use bitmaps
=================================================
...
Counting objects:       1
Counting objects:       4
TracePushPerf: findObjectsToPush(): code not using bitmaps runtime: 251776
TracePushPerf: findObjectsToPack() runtime: 251916

Finding sources:         25% (1/4)
Finding sources:         50% (2/4)
Finding sources:         75% (3/4)
Finding sources:        100% (4/4)
Finding sources:        100% (4/4)

Getting sizes:           33% (1/3)
Getting sizes:           66% (2/3)
Getting sizes:          100% (3/3)
Getting sizes:          100% (3/3)

Compressing objects:     99% (8001/8060)
Compressing objects:    100% (8060/8060)
Compressing objects:    100% (8060/8060)

Writing objects:         25% (1/4)
Writing objects:         50% (2/4)
Writing objects:         75% (3/4)
Writing objects:        100% (4/4)
Writing objects:        100% (4/4)

remote: Updating references: 100% (1/1)
To /Users/d032780/git/repl_test/hana.dst.git/
   766c519..cdcc784  temp_d056507 -> temp_d056507

=================================================
added performance printfs, using bitmaps
=================================================
Counting objects:       1
Counting objects:       41507
Counting objects:       100967
Counting objects:       165489
...
Counting objects:       3946346
Counting objects:       3948655
TracePushPerf: findObjectsToPackUsingBitmaps() ms to find find haves: 929147

Counting objects:       4065807
TracePushPerf: findObjectsToPackUsingBitmaps()  ms to find find want: 638
TracePushPerf: findObjectsToPackUsingBitmaps()  ms to find find need: 8
TracePushPerf: findObjectsToPackUsingBitmaps()  ms to add needed: 2
TracePushPerf: findObjectsToPackUsingBitmaps() runtime: 929795

Counting objects:       4103019
TracePushPerf: findObjectsToPack() runtime: 938584

Finding sources:         25% (1/4)
Finding sources:         50% (2/4)
Finding sources:         75% (3/4)
Finding sources:        100% (4/4)
Finding sources:        100% (4/4)

Getting sizes:           33% (1/3)
Getting sizes:           66% (2/3)
Getting sizes:          100% (3/3)
Getting sizes:          100% (3/3)

Compressing objects:     99% (8001/8060)
Compressing objects:    100% (8060/8060)
Compressing objects:    100% (8060/8060)

Writing objects:         25% (1/4)
Writing objects:         50% (2/4)
Writing objects:         75% (3/4)
Writing objects:        100% (4/4)
Writing objects:        100% (4/4)

remote: Updating references: 100% (1/1)
To /Users/d032780/git/repl_test/hana.dst.git/
   766c519..cdcc784  temp_d056507 -> temp_d056507
