Re: [jgit-dev] [Internet]Re: JGit Garbage Collection very slow on large repository

I think this is a known issue; the 2022 goals for Gerrit [1] mention it:

 

>  Improvement of JGit bitmaps for large number of refs

 

IIUC, in the current JGit version the number of bitmaps to build grows with the number of refs.

If the repository has too many refs, garbage collection becomes very slow as a result.

 

If your repository is stored as a FileRepository (a regular on-disk repository), running garbage collection with native C git (cgit) instead may help.

 

Otherwise, you may need to call the lower-level interface PackWriter#preparePack() and use its "noBitmaps" parameter to exclude branches other than trunk from bitmap selection. That parameter was introduced by this change [2].
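
To make the idea concrete, here is a minimal sketch of how the ref tips could be split before calling PackWriter. It uses plain strings instead of JGit's ObjectId so that it runs without JGit on the classpath. The class BitmapRefSelection, its Selection type, and the ref names are hypothetical names for illustration, not part of JGit.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper (not part of JGit): splits ref tips for a pack run. */
final class BitmapRefSelection {

    /** The two sets one would hand to the pack writer. */
    static final class Selection {
        /** Every tip that must end up in the pack. */
        final Set<String> want = new LinkedHashSet<>();
        /** Tips that should be packed but skipped during bitmap selection. */
        final Set<String> noBitmaps = new LinkedHashSet<>();
    }

    /**
     * @param refTips  ref name -> commit id (a String here; ObjectId in real JGit code)
     * @param trunkRef the only ref whose tip should stay eligible for bitmaps
     */
    static Selection select(Map<String, String> refTips, String trunkRef) {
        Selection s = new Selection();
        String trunkTip = refTips.get(trunkRef);
        for (String tip : refTips.values()) {
            s.want.add(tip);
            // A branch pointing at the trunk commit shares its id, so it is not excluded.
            if (!tip.equals(trunkTip)) {
                s.noBitmaps.add(tip);
            }
        }
        return s;
    }
}
```

In real code the two sets would hold ObjectId values and, assuming the five-argument overload from [2], would be passed as the "want" and "noBitmaps" arguments of PackWriter#preparePack(). Please check the exact signature against the JGit version you run.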

 

[1] https://gitenterprise.me/2022/01/10/2022-goals-for-gerrit/

[2] https://git.eclipse.org/r/c/jgit/jgit/+/97404

 

Regards,

Kyle

 

 

From: jgit-dev <jgit-dev-bounces@xxxxxxxxxxx> On Behalf Of TINGTING ZHOU
Sent: November 4, 2022 7:37
To: Sohn, Matthias <matthias.sohn@xxxxxxx>; jgit-dev@xxxxxxxxxxx
Subject: [Internet]Re: [jgit-dev] JGit Garbage Collection very slow on large repository

 

Hi Matthias,

 

There is another thing that I have observed. We are blocked for around 3 hours after this log line: "Start Building bitmaps:", and the number it shows is over 20,000. May I ask which factors affect the number of bitmaps here? Is it related to the number of branches?

 

 

Thanks.

 

 

On Thu, Nov 3, 2022 at 12:57 PM TINGTING ZHOU <zhoutt96@xxxxxxxxx> wrote:

Thank you Matthias,

 

This is a private repository, so I will try to reproduce this issue on a public repository and let you know. In the meantime, I will try to capture a GC profile. Regarding the heap size, it's 4 GB on our host.

 

 

Regards,

Tingting

 

On Thu, Nov 3, 2022 at 1:35 AM Sohn, Matthias <matthias.sohn@xxxxxxx> wrote:

Is this repository publicly accessible so that we can reproduce this?

Without access to the repository, we can't find out what might be the cause.

Maybe you need more heap?

If the repository is private, maybe you can profile garbage collection, e.g. using JFR/JMC?

 

-Matthias

 

 

From: jgit-dev <jgit-dev-bounces@xxxxxxxxxxx> on behalf of TINGTING ZHOU <zhoutt96@xxxxxxxxx>
Date: Thursday, 3. November 2022 at 01:14
To: jgit-dev@xxxxxxxxxxx <jgit-dev@xxxxxxxxxxx>
Subject: [jgit-dev] JGit Garbage Collection very slow on large repository


Hi JGit team,

 

I am an engineer at Oracle, and recently we have been seeing performance issues when running garbage collection with JGit. The repository is around 400 MB in size, with around 30,000 branches and 50,000 commits. When I checked the logs, I found that it takes around 2 hours 30 minutes to build the bitmap index. I am asking here to get more ideas from the team: why does GC take so long on this repository? Is that expected?

Regards,

Tingting

 
