Re: [jgit-dev] Writing packs with bitmaps sometimes slower as without bitmaps?

Hi,
I found time to continue working on this issue.

I can now reproduce the issue with the public gerrit repo. When pushing one new commit to a local clone of the gerrit repo (a repo with a lot of refs), the push is 8 times slower if the sending repo has bitmaps. Most of the time is spent in org.eclipse.jgit.revwalk.BitmapWalker.findObjectsWalk(). The script whose execution log I append below can be downloaded from https://gist.github.com/chalstrick/864fecf5cc45c056e90225418d6b9c89


++ jgit --version
jgit version 5.1.8-SNAPSHOT
++ rm -fr gerrit.src.git gerrit.dst.git gerrit.client
++ git clone --bare --mirror https://gerrit.googlesource.com/gerrit gerrit.dst.git
...
++ cp -r gerrit.dst.git gerrit.dst.git.backup
++ git clone --bare --mirror gerrit.dst.git gerrit.src.git
Cloning into bare repository 'gerrit.src.git'...
done.
++ git clone gerrit.src.git gerrit.client
Cloning into 'gerrit.client'...
done.
++ cd gerrit.client
++ date
++ git add README.md
++ git commit -m 'modify README.md'
[master 7d26b6c82f8] modify README.md
 1 file changed, 1 insertion(+)
++ git push origin
...
++ cd gerrit.src.git

##### The fast push (2s) when we don't have bitmaps

++ jgit push origin HEAD:refs/heads/master
Counting objects:       3
Finding sources:        100% (3/3)
Getting sizes:          100% (2/2)
Compressing objects:    100% (4210/4210)
Writing objects:        100% (3/3)
remote: Updating references: 100% (1/1)To /Users/d032780/git/repl_test/gerrit.dst.git
   78e6f24..7d26b6c  HEAD -> master

real    0m2.173s
user    0m5.084s
sys     0m0.525s
++ rm -fr gerrit.dst.git
++ cp -r gerrit.dst.git.backup gerrit.dst.git
++ cd gerrit.src.git
### Force creating bitmaps
++ git repack -a -d -b
Enumerating objects: 1053003, done.
Counting objects: 100% (1053003/1053003), done.
Delta compression using up to 8 threads
Compressing objects: 100% (436944/436944), done.
Writing objects: 100% (1053003/1053003), done.
Selecting bitmap commits: 238041, done.
Building bitmaps: 100% (352/352), done.
Total 1053003 (delta 467117), reused 1052949 (delta 467065)

##### The slow push (9s) when we have bitmaps

++ jgit push origin HEAD:refs/heads/master
Counting objects:       528225
Finding sources:        100% (3/3)
Getting sizes:          100% (2/2)
Compressing objects:    100% (1142/1142)
Writing objects:        100% (3/3)
remote: Updating references: 100% (1/1)To /Users/d032780/git/repl_test/gerrit.dst.git
   78e6f24..7d26b6c  HEAD -> master

real    0m9.070s
user    0m12.130s
sys     0m1.015s

On Mon, May 13, 2019 at 2:43 PM Christian Halstrick <christian.halstrick@xxxxxxxxx> wrote:
Hi,
I was trying to reproduce it on a big public repo but up to now have had
no success with that. That's a very time-consuming task because I need
huge repos with a ton of refs. I tried with linux, no luck. Now I am
checking with chromium. I will let you know when I manage that.

But by just inspecting the code and looking at the trace output I see
a difference between the bitmap aware code and the non-bitmap-aware
code in PackWriter#findObjectsToPack. In my repeatable case
BitmapWalker.findObjects(have, null, true)
runs for 15 minutes :-(
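
To illustrate the difference described above, here is a minimal toy sketch in
plain Java. It is not JGit code: the class HaveWantSketch, its methods and the
tiny object graph are all hypothetical, and the "bounded walk" is a deliberate
simplification (it stops only at the have tips themselves, whereas a real
walk propagates the uninteresting flag to ancestors). It only shows why
computing the full "have" set first can visit far more objects than a walk
that uses have and want together, even though both produce the same result.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model (hypothetical, not JGit): each id maps to the ids it references. */
class HaveWantSketch {

    /** Number of objects visited since the counter was last reset. */
    static int visited;

    /** Full closure of everything reachable from the given start ids. */
    static Set<String> reachableFrom(Map<String, List<String>> graph,
            Collection<String> starts) {
        Set<String> seen = new HashSet<>();
        Deque<String> todo = new ArrayDeque<>(starts);
        while (!todo.isEmpty()) {
            String id = todo.pop();
            if (seen.add(id)) {
                visited++;
                todo.addAll(graph.getOrDefault(id, List.of()));
            }
        }
        return seen;
    }

    /** Strategy A: compute all haves, then all wants, then the difference. */
    static Set<String> needViaFullSets(Map<String, List<String>> graph,
            Collection<String> want, Collection<String> have) {
        Set<String> haveAll = reachableFrom(graph, have);
        Set<String> need = reachableFrom(graph, want);
        need.removeAll(haveAll);
        return need;
    }

    /** Strategy B (simplified): walk from want and stop at any have tip. */
    static Set<String> needViaBoundedWalk(Map<String, List<String>> graph,
            Collection<String> want, Collection<String> have) {
        Set<String> boundary = new HashSet<>(have);
        Set<String> need = new HashSet<>();
        Deque<String> todo = new ArrayDeque<>(want);
        while (!todo.isEmpty()) {
            String id = todo.pop();
            if (boundary.contains(id) || !need.add(id)) {
                continue; // already known to the receiver, or already queued
            }
            visited++;
            todo.addAll(graph.getOrDefault(id, List.of()));
        }
        return need;
    }

    public static void main(String[] args) {
        // One new commit c3 with one new blob on top of a short history.
        Map<String, List<String>> graph = Map.of(
                "c3", List.of("c2", "blobNew"),
                "c2", List.of("c1"),
                "c1", List.of("c0"));

        visited = 0;
        Set<String> a = needViaFullSets(graph, List.of("c3"), List.of("c2"));
        System.out.println("full sets:    need=" + a + " visited=" + visited);

        visited = 0;
        Set<String> b = needViaBoundedWalk(graph, List.of("c3"), List.of("c2"));
        System.out.println("bounded walk: need=" + b + " visited=" + visited);
    }
}
```

In this toy graph the gap is small, but the number of objects Strategy A
visits grows with the size of the whole history behind the have tips, while
Strategy B only grows with the amount of new work being pushed.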

On Fri, May 10, 2019 at 10:53 PM Ivan Frade <ifrade@xxxxxxxxxx> wrote:
>
> Hi Christian,
>
>  I have been working with bitmaps lately [1]. My changes don't affect PackWriter (they are not even committed yet!), but I am interested in everything bitmap-related.
>
>  Did you gather any more information about this issue? Any chance to reproduce it in a test?
>
>  Regards,
>
> Ivan
>
>  [1] https://git.eclipse.org/r/c/140958/
>
>
>
> From: Christian Halstrick <christian.halstrick@xxxxxxxxx>
> Date: Wed, May 8, 2019 at 4:04 PM
> To: JGit Developers list
>
>> Hi,
>>
>> I am investigating a performance problem on non-public gerrit servers.
>> The problem boils
>> down to slow performance of PackWriter when using bitmaps. When the
>> usage of bitmaps
>> is turned off then performance is at least 5 times better.
>>
>> I uploaded the change https://git.eclipse.org/r/141842 to
>> demonstrate that. That change
>> adds some printfs to emit performance data for the PackWriter class.
>> Additionally a System
>> property PackWriterForceNoBitmap is introduced that when set to true
>> forces PackWriter not
>> to use bitmaps.
>>
>> The problem is the performance of PackWriter#findObjectsToPack. That
>> method delegates to
>> PackWriter#findObjectsToPackUsingBitmaps() which is in my case much
>> slower than using
>> the default code not using bitmaps. It looks like
>> findObjectsToPackUsingBitmaps() is first calculating all have objects,
>> then all want objects and then calculates the difference. In huge
>> repos calculating all have objects is consuming 900sec.
>> The non-bitmap code in findObjectsToPack() creates a walk where have
>> and want objects are both used and has to walk only over very few
>> objects which takes only 200sec.
>> In the end both algorithms (bitmap and non-bitmap aware code) find the
>> same result: only one commit with one new blob has to be sent.
>>
>> Is somebody aware of the fact that when working on packfiles of 2GB
>> size findObjectsToPackUsingBitmaps() is so much slower than
>> non-bitmap-aware code?
>>
>> Stats of the repo:
>> $ du -sh *.pack
>> 1.1M    pack-097c243a771df372d4e1098af0d89a3d25be8de6.pack
>>  36K    pack-0bc2fecc4d3c070b82285b70a6394b15f87a9e50.pack
>>  64K    pack-57c34e10b1cc0305784f48bda175064e2ad8fa1a.pack
>> 240K    pack-7cc06b124492dcab6fda3e797f3c16fe06929e2a.pack
>> 2.0G    pack-e7a9322fa39225c670cab245b9e012c1aa6f3a61.pack
>>
>>
>> Here are the traces:
>>
>> =================================================
>> added performance printfs, forced not to use bitmaps
>> =================================================
>> ...
>> Counting objects:       1
>> Counting objects:       4
>> TracePushPerf: findObjectsToPush(): code not using bitmaps runtime: 251776
>> TracePushPerf: findObjectsToPack() runtime: 251916
>>
>> Finding sources:         25% (1/4)
>> Finding sources:         50% (2/4)
>> Finding sources:         75% (3/4)
>> Finding sources:        100% (4/4)
>> Finding sources:        100% (4/4)
>>
>> Getting sizes:           33% (1/3)
>> Getting sizes:           66% (2/3)
>> Getting sizes:          100% (3/3)
>> Getting sizes:          100% (3/3)
>>
>> Compressing objects:     99% (8001/8060)
>> Compressing objects:    100% (8060/8060)
>> Compressing objects:    100% (8060/8060)
>>
>> Writing objects:         25% (1/4)
>> Writing objects:         50% (2/4)
>> Writing objects:         75% (3/4)
>> Writing objects:        100% (4/4)
>> Writing objects:        100% (4/4)
>>
>> remote: Updating references: 100% (1/1)To
>> /Users/d032780/git/repl_test/hana.dst.git/
>>    766c519..cdcc784  temp_d056507 -> temp_d056507
>>
>> =================================================
>> added performance printfs, using bitmaps
>> =================================================
>> Counting objects:       1
>> Counting objects:       41507
>> Counting objects:       100967
>> Counting objects:       165489
>> ...
>> Counting objects:       3946346
>> Counting objects:       3948655
>> TracePushPerf: findObjectsToPackUsingBitmaps() ms to find find haves: 929147
>>
>> Counting objects:       4065807
>> TracePushPerf: findObjectsToPackUsingBitmaps()  ms to find find want: 638
>> TracePushPerf: findObjectsToPackUsingBitmaps()  ms to find find need: 8
>> TracePushPerf: findObjectsToPackUsingBitmaps()  ms to add needed: 2
>> TracePushPerf: findObjectsToPackUsingBitmaps() runtime: 929795
>>
>> Counting objects:       4103019
>> TracePushPerf: findObjectsToPack() runtime: 938584
>>
>> Finding sources:         25% (1/4)
>> Finding sources:         50% (2/4)
>> Finding sources:         75% (3/4)
>> Finding sources:        100% (4/4)
>> Finding sources:        100% (4/4)
>>
>> Getting sizes:           33% (1/3)
>> Getting sizes:           66% (2/3)
>> Getting sizes:          100% (3/3)
>> Getting sizes:          100% (3/3)
>>
>> Compressing objects:     99% (8001/8060)
>> Compressing objects:    100% (8060/8060)
>> Compressing objects:    100% (8060/8060)
>>
>> Writing objects:         25% (1/4)
>> Writing objects:         50% (2/4)
>> Writing objects:         75% (3/4)
>> Writing objects:        100% (4/4)
>> Writing objects:        100% (4/4)
>>
>> remote: Updating references: 100% (1/1)To
>> /Users/d032780/git/repl_test/hana.dst.git/
>>    766c519..cdcc784  temp_d056507 -> temp_d056507
