Re: [jgit-dev] [Internet]Re: JGit Garbage Collection very slow on large repository

I noticed long clone times as soon as a repository contains one big (~150 MB) text file (CSV).
It would take ~25-30 minutes to clone such a repository using JGit.


On Friday, 4 November 2022 at 08:11:48 CET, kylezhao(赵柯宇) <kylezhao@xxxxxxxxxxx> wrote:

I think this is a known issue; the 2022 goals for Gerrit [1] mention it.


>  Improvement of JGit bitmaps for large number of refs


IIUC, in the current JGit version, the number of bitmaps to be built increases with the number of refs.

If the repository has too many refs, garbage collection becomes very slow because of it.


If your repository storage is FileRepository, using cgit for garbage collection instead may help.


Otherwise, you may need to call the lower-level interface PackWriter#preparePack() and use its “noBitmaps” parameter to exclude branches other than trunk. This change [2] introduced it.
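A minimal sketch of how the inputs for such a call could be prepared, under stated assumptions: the class BitmapRefSplit, its Split holder and the use of plain strings for object ids are hypothetical and only illustrate the partitioning. In real code the sets would hold ObjectId values read from the repository's refs, and they would be handed to the PackWriter#preparePack() overload that accepts a noBitmaps set (check the Javadoc of your JGit version for the exact parameter list).

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of JGit: splits branch tips into the set that
// is packed ("want") and the subset excluded from bitmap selection ("noBitmaps").
public class BitmapRefSplit {

    public static final class Split {
        public final Set<String> want = new LinkedHashSet<>();
        public final Set<String> noBitmaps = new LinkedHashSet<>();
    }

    // refTips maps a ref name to the id of the commit it points at.
    // trunkRef is the only ref whose tip stays eligible for bitmaps.
    public static Split split(Map<String, String> refTips, String trunkRef) {
        Split result = new Split();
        for (Map.Entry<String, String> ref : refTips.entrySet()) {
            // Every tip is still packed, whether or not it gets a bitmap.
            result.want.add(ref.getValue());
            if (!ref.getKey().equals(trunkRef)) {
                result.noBitmaps.add(ref.getValue());
            }
        }
        // If another branch points at the same commit as trunk,
        // keep that commit eligible for a bitmap.
        String trunkTip = refTips.get(trunkRef);
        if (trunkTip != null) {
            result.noBitmaps.remove(trunkTip);
        }
        return result;
    }
}
```

With 30000 branches, this would leave only the trunk tip (and its history) as a candidate for bitmap selection, while all branches are still written to the pack.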









From: jgit-dev <jgit-dev-bounces@xxxxxxxxxxx> On Behalf Of TINGTING ZHOU
Sent: 2022-11-04 7:37
To: Sohn, Matthias <matthias.sohn@xxxxxxx>; jgit-dev@xxxxxxxxxxx
Subject: [Internet]Re: [jgit-dev] JGit Garbage Collection very slow on large repository


Hi Matthias,


There is another thing that I have observed. We are blocked for around 3 hours after this log line: "Start Building bitmaps:", where the number is over 20,000 (2w). May I ask which factors may affect the number of bitmaps here? Is it related to the number of branches?






On Thu, Nov 3, 2022 at 12:57 PM TINGTING ZHOU <zhoutt96@xxxxxxxxx> wrote:

Thank you Matthias,


This is a private repository. I will try to reproduce this issue on a public repository and let you know. At the same time, I will try to grab the GC profile here. Regarding the heap size, it's 4GB on our host.






On Thu, Nov 3, 2022 at 1:35 AM Sohn, Matthias <matthias.sohn@xxxxxxx> wrote:

Is this repository publicly accessible so that we can reproduce this?

Without having access to the repository, we can’t find out what might be the cause.

Maybe you need more heap?

If the repository is private, maybe you can profile garbage collection, e.g. using JFR/JMC?
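One possible way to capture such a profile from inside the JVM is the jdk.jfr API, sketched below. The class GcProfiler and the method recordWhile are hypothetical names. The Runnable stands in for whatever code triggers the JGit garbage collection, and "profile" is the name of one of the predefined JFR configurations shipped with the JDK.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.text.ParseException;
import jdk.jfr.Configuration;
import jdk.jfr.Recording;

// Hypothetical helper: records a Java Flight Recorder profile while a task runs.
public class GcProfiler {

    public static Path recordWhile(Runnable work, Path out)
            throws IOException, ParseException {
        // Predefined JDK configuration with method sampling enabled.
        Configuration profile = Configuration.getConfiguration("profile");
        try (Recording recording = new Recording(profile)) {
            recording.setName("jgit-gc");
            recording.start();
            try {
                work.run(); // e.g. the code that runs the JGit garbage collection
            } finally {
                recording.stop();
            }
            // Write the recorded events to a .jfr file.
            recording.dump(out);
        }
        return out;
    }
}
```

The resulting .jfr file can be opened in JMC to see where the time is spent during the bitmap building phase.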





From: jgit-dev <jgit-dev-bounces@xxxxxxxxxxx> on behalf of TINGTING ZHOU <zhoutt96@xxxxxxxxx>
Date: Thursday, 3. November 2022 at 01:14
To: jgit-dev@xxxxxxxxxxx <jgit-dev@xxxxxxxxxxx>
Subject: [jgit-dev] JGit Garbage Collection very slow on large repository

Hi JGit team,


I am an engineer from Oracle, and recently we have been seeing some performance issues when doing garbage collection using JGit. The repository is around 400MB in size, with around 30000 branches and 50000 commits. When I checked the logs, I found that it takes around 2 hours 30 minutes to build the bitmap index. I am asking here to get more ideas from the professional team: why does GC take so long on this repository? Is that expected?



