Re: [jgit-dev] Writing packs with bitmaps sometimes slower as without bitmaps?

Hi,
I was trying to reproduce it on a big public repo, but up to now I have
had no success. It is a very time-consuming task because I need
huge repos with a ton of refs. I tried with linux, no luck. Now I am
checking with chromium. I will let you know when I manage to reproduce it.

But by just inspecting the code and looking at the trace output I see
a difference between the bitmap aware code and the non-bitmap-aware
code in PackWriter#findObjectsToPack. In my repeatable case
BitmapWalker.findObjects(have, null, true)
runs for 15 minutes :-(
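
To make the difference concrete, here is a toy sketch of the two strategies. It is not JGit code: it uses plain Java sets instead of bitmaps, integer ids instead of ObjectIds, and all names (HaveWantSketch, needByDifference, needByBoundedWalk, buildChain) are made up for illustration. The bounded walk is also simplified: it only stops at commits that are directly listed as haves, which is enough for a linear history but is not how RevWalk propagates "uninteresting" in general.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class HaveWantSketch {
    // Toy history: commit id -> parent ids. These are not JGit types.
    public static final Map<Integer, List<Integer>> PARENTS = new HashMap<>();
    // Counts how many commits a strategy touches.
    public static long visited;

    // Builds a linear chain 0 <- 1 <- ... <- n, where 0 is the root commit.
    public static void buildChain(int n) {
        PARENTS.clear();
        PARENTS.put(0, List.of());
        for (int i = 1; i <= n; i++) {
            PARENTS.put(i, List.of(i - 1));
        }
    }

    // Full closure of everything reachable from the given tips.
    public static Set<Integer> reachable(Set<Integer> tips) {
        Set<Integer> seen = new HashSet<>();
        Deque<Integer> todo = new ArrayDeque<>(tips);
        while (!todo.isEmpty()) {
            Integer c = todo.pop();
            if (!seen.add(c)) {
                continue; // already expanded
            }
            visited++;
            todo.addAll(PARENTS.getOrDefault(c, List.of()));
        }
        return seen;
    }

    // Strategy A: compute all haves, then all wants, then the difference.
    public static Set<Integer> needByDifference(Set<Integer> haves, Set<Integer> wants) {
        Set<Integer> have = reachable(haves);
        Set<Integer> need = reachable(wants);
        need.removeAll(have);
        return need;
    }

    // Strategy B: walk from the wants and do not descend past a have.
    public static Set<Integer> needByBoundedWalk(Set<Integer> haves, Set<Integer> wants) {
        Set<Integer> need = new HashSet<>();
        Deque<Integer> todo = new ArrayDeque<>(wants);
        while (!todo.isEmpty()) {
            Integer c = todo.pop();
            visited++;
            if (haves.contains(c) || !need.add(c)) {
                continue; // boundary reached or already collected
            }
            todo.addAll(PARENTS.getOrDefault(c, List.of()));
        }
        return need;
    }

    public static void main(String[] args) {
        int n = 100_000;
        buildChain(n);
        Set<Integer> haves = Set.of(n - 1);
        Set<Integer> wants = Set.of(n);

        visited = 0;
        Set<Integer> a = needByDifference(haves, wants);
        long visitedA = visited;

        visited = 0;
        Set<Integer> b = needByBoundedWalk(haves, wants);
        long visitedB = visited;

        System.out.println("difference:   need=" + a.size() + " visited=" + visitedA);
        System.out.println("bounded walk: need=" + b.size() + " visited=" + visitedB);
    }
}
```

On a chain of n commits where the want is one commit ahead of the have, both strategies return the same single commit, but the difference approach touches 2n+1 commits while the bounded walk touches 2. Real bitmaps make the closure step cheap only where a bitmap is available for (or near) the starting commit, so this sketch only models the case where the haves have to be walked.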

On Fri, May 10, 2019 at 10:53 PM Ivan Frade <ifrade@xxxxxxxxxx> wrote:
>
> Hi Christian,
>
>  I have been working with bitmaps lately [1]. My changes don't affect PackWriter (they are not even committed yet!), but I am interested in everything bitmap-related.
>
>  Did you gather any more information about this issue? Any chance to reproduce it in a test?
>
>  Regards,
>
> Ivan
>
>  [1] https://git.eclipse.org/r/c/140958/
>
>
>
> From: Christian Halstrick <christian.halstrick@xxxxxxxxx>
> Date: Wed, May 8, 2019 at 4:04 PM
> To: JGit Developers list
>
>> Hi,
>>
>> I am investigating a performance problem on non-public Gerrit servers.
>> The problem boils down to slow performance of PackWriter when using
>> bitmaps. When the usage of bitmaps is turned off, performance is at
>> least 5 times better.
>>
>> I uploaded the change https://git.eclipse.org/r/141842 to demonstrate
>> that. That change adds some printfs to emit performance data for the
>> PackWriter class. Additionally, a system property
>> PackWriterForceNoBitmap is introduced that, when set to true, forces
>> PackWriter not to use bitmaps.
>>
>> The problem is the performance of PackWriter#findObjectsToPack. That
>> method delegates to PackWriter#findObjectsToPackUsingBitmaps(), which
>> is in my case much slower than the default code not using bitmaps.
>> It looks like findObjectsToPackUsingBitmaps() first calculates all
>> have objects, then all want objects, and then calculates the
>> difference. In huge repos calculating all have objects consumes
>> 900 sec.
>> The non-bitmap code in findObjectsToPack() creates a walk where have
>> and want objects are both used, and it has to walk over only very few
>> objects, which takes only 200 sec.
>> In the end both algorithms (bitmap and non-bitmap aware code) find the
>> same result: only one commit with one new blob has to be sent.
>>
>> Is somebody aware of the fact that when working on packfiles of 2GB
>> size findObjectsToPackUsingBitmaps() is so much slower than
>> non-bitmap-aware code?
>>
>> Stats of the repo:
>> $ du -sh *.pack
>> 1.1M    pack-097c243a771df372d4e1098af0d89a3d25be8de6.pack
>>  36K    pack-0bc2fecc4d3c070b82285b70a6394b15f87a9e50.pack
>>  64K    pack-57c34e10b1cc0305784f48bda175064e2ad8fa1a.pack
>> 240K    pack-7cc06b124492dcab6fda3e797f3c16fe06929e2a.pack
>> 2.0G    pack-e7a9322fa39225c670cab245b9e012c1aa6f3a61.pack
>>
>>
>> Here are the traces:
>>
>> =================================================
>> added performance printfs, forced not to use bitmaps
>> =================================================
>> ...
>> Counting objects:       1
>> Counting objects:       4
>> TracePushPerf: findObjectsToPush(): code not using bitmaps runtime: 251776
>> TracePushPerf: findObjectsToPack() runtime: 251916
>>
>> Finding sources:         25% (1/4)
>> Finding sources:         50% (2/4)
>> Finding sources:         75% (3/4)
>> Finding sources:        100% (4/4)
>> Finding sources:        100% (4/4)
>>
>> Getting sizes:           33% (1/3)
>> Getting sizes:           66% (2/3)
>> Getting sizes:          100% (3/3)
>> Getting sizes:          100% (3/3)
>>
>> Compressing objects:     99% (8001/8060)
>> Compressing objects:    100% (8060/8060)
>> Compressing objects:    100% (8060/8060)
>>
>> Writing objects:         25% (1/4)
>> Writing objects:         50% (2/4)
>> Writing objects:         75% (3/4)
>> Writing objects:        100% (4/4)
>> Writing objects:        100% (4/4)
>>
>> remote: Updating references: 100% (1/1)
>> To /Users/d032780/git/repl_test/hana.dst.git/
>>    766c519..cdcc784  temp_d056507 -> temp_d056507
>>
>> =================================================
>> added performance printfs, using bitmaps
>> =================================================
>> Counting objects:       1
>> Counting objects:       41507
>> Counting objects:       100967
>> Counting objects:       165489
>> ...
>> Counting objects:       3946346
>> Counting objects:       3948655
>> TracePushPerf: findObjectsToPackUsingBitmaps() ms to find find haves: 929147
>>
>> Counting objects:       4065807
>> TracePushPerf: findObjectsToPackUsingBitmaps()  ms to find find want: 638
>> TracePushPerf: findObjectsToPackUsingBitmaps()  ms to find find need: 8
>> TracePushPerf: findObjectsToPackUsingBitmaps()  ms to add needed: 2
>> TracePushPerf: findObjectsToPackUsingBitmaps() runtime: 929795
>>
>> Counting objects:       4103019
>> TracePushPerf: findObjectsToPack() runtime: 938584
>>
>> Finding sources:         25% (1/4)
>> Finding sources:         50% (2/4)
>> Finding sources:         75% (3/4)
>> Finding sources:        100% (4/4)
>> Finding sources:        100% (4/4)
>>
>> Getting sizes:           33% (1/3)
>> Getting sizes:           66% (2/3)
>> Getting sizes:          100% (3/3)
>> Getting sizes:          100% (3/3)
>>
>> Compressing objects:     99% (8001/8060)
>> Compressing objects:    100% (8060/8060)
>> Compressing objects:    100% (8060/8060)
>>
>> Writing objects:         25% (1/4)
>> Writing objects:         50% (2/4)
>> Writing objects:         75% (3/4)
>> Writing objects:        100% (4/4)
>> Writing objects:        100% (4/4)
>>
>> remote: Updating references: 100% (1/1)
>> To /Users/d032780/git/repl_test/hana.dst.git/
>>    766c519..cdcc784  temp_d056507 -> temp_d056507