[jgit-dev] jGit memory management and optimizations


From: Emilian Bold <emilian.bold@xxxxxxxxx>
Date: Fri, 16 Nov 2012 19:36:15 +0200

Hi,

I'm just starting to look at jGit but from my small tests, it is extremely liberal with RAM.

The only advanced guide I could find about this mentions very few tricks: http://help.eclipse.org/indigo/index.jsp?topic=%2Forg.eclipse.egit.doc%2Fhelp%2FJGit%2FUser_Guide%2FAdvanced-Topics.html

I haven't yet analyzed the source code itself very much, but I'll start with a simple question: how does one efficiently count the commits?

RevWalkUtils.count(...) calls find(walk, start, end).size(), which builds a huge ArrayList of all the commits. Counting by hand is better, but not by much: it still seems to consume lots of RAM (via the RevWalk itself, I assume).
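For reference, "counting by hand" here means roughly the following. This is a minimal sketch, not a tuned implementation: the class name CountCommits and the helper countFrom are made up for illustration, and the try-with-resources form assumes a JGit release where RevWalk and Repository are AutoCloseable (on older releases they would have to be released/closed explicitly). setRetainBody(false) asks the walk not to keep commit message buffers, which should reduce, but not eliminate, the memory held per commit.

```java
import java.io.File;
import java.io.IOException;

import org.eclipse.jgit.lib.ObjectId;
import org.eclipse.jgit.lib.Repository;
import org.eclipse.jgit.revwalk.RevCommit;
import org.eclipse.jgit.revwalk.RevWalk;
import org.eclipse.jgit.storage.file.FileRepositoryBuilder;

public class CountCommits {

    // Hypothetical helper: counts commits reachable from `rev` by iterating
    // the walk instead of collecting the commits into a list first.
    static int countFrom(Repository repo, String rev) throws IOException {
        ObjectId start = repo.resolve(rev);
        if (start == null) {
            return 0; // e.g. an empty repository where HEAD does not resolve
        }
        try (RevWalk walk = new RevWalk(repo)) {
            walk.setRetainBody(false); // do not keep message/author buffers
            walk.markStart(walk.parseCommit(start));
            int n = 0;
            for (RevCommit ignored : walk) {
                n++;
            }
            return n;
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the first argument is the path to a .git directory.
        File gitDir = new File(args.length > 0 ? args[0] : ".git");
        try (Repository repo = new FileRepositoryBuilder().setGitDir(gitDir).build()) {
            System.out.println(countFrom(repo, "HEAD"));
        }
    }
}
```

Even in this form the walk still holds one RevCommit object per visited commit, since it has to remember which commits it has already seen.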

What am I missing?

I'm starting to believe that perhaps I should read more about the Git file format (http://git-scm.com/book/en/Git-Internals-Packfiles ?) and parse it directly somehow -- at least for the whole repository, counting should be fast.

Is there something inherent in the Git design that makes this so RAM hungry? I realize we are doing a topological sort on a DAG, but this seems to be a rather particular kind of DAG (generally, each vertex has only one incoming/outgoing edge) and I somehow expected operations on it to be much more efficient in terms of both memory and time.
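To make the DAG concern concrete, here is a small library-free sketch of counting the vertices reachable from a tip. It is an illustration of the general problem, not of how JGit does it; the class DagCountSketch, the string commit ids and the parents map are all made up. The point is that as soon as merges exist, a commit can be reached along more than one path, so some "already seen" state proportional to the number of commits appears to be needed to avoid double counting.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class DagCountSketch {

    // Counts the commits reachable from `tip`, given a child -> parents map.
    static int countReachable(Map<String, List<String>> parents, String tip) {
        Set<String> seen = new HashSet<>();     // grows with the number of commits
        Deque<String> pending = new ArrayDeque<>();
        seen.add(tip);
        pending.push(tip);
        while (!pending.isEmpty()) {
            String id = pending.pop();
            for (String p : parents.getOrDefault(id, List.of())) {
                if (seen.add(p)) {              // true only on the first visit
                    pending.push(p);
                }
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // D is a merge of B and C, which both descend from A.
        Map<String, List<String>> parents = Map.of(
                "D", List.of("B", "C"),
                "B", List.of("A"),
                "C", List.of("A"),
                "A", List.of());
        System.out.println(countReachable(parents, "D")); // prints 4
    }
}
```

Without the seen set, A would be counted once via B and once via C. In a strictly linear history a single counter would do, which is why the mostly-linear shape of real histories makes the memory cost feel disproportionate.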

Any low-hanging fruit remaining? Perhaps some ideas about building some 'index' to speed up jgit operations?

--emi

Follow-Ups:
- Re: [jgit-dev] jGit memory management and optimizations
  - From: Shawn Pearce
