Re: [jgit-dev] listener for porcelain/plumbing commands?

Hi Alex,

In the while loop of the AddCommand#call() method, I see that every file is read twice from the hard disk:

1) long contentSize = f.getEntryContentLength();

2) InputStream in = f.openEntryStream();
   try {
       entry.setObjectId(inserter.insert(
               Constants.OBJ_BLOB, contentSize, in));
   } finally {
       in.close();
   }

f.getEntryContentLength() not only returns the content length but also reads the file into a 64K array. Is the contentSize really needed in the inserter.insert(...) method?
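
One likely reason the length is passed in: a Git blob id is the SHA-1 of a header of the form "blob <length>\0" followed by the content, so the length has to be known before the first content byte is hashed. The following is a minimal sketch of that format using only the JDK. It is not JGit's implementation, and the class name BlobIdSketch is made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of JGit.
final class BlobIdSketch {

    /** Returns the hex SHA-1 over "blob <length>\0" followed by the content. */
    static String blobId(byte[] content) {
        MessageDigest sha1;
        try {
            sha1 = MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        // The header carries the length, so it is hashed before any content byte.
        String header = "blob " + content.length + "\0";
        sha1.update(header.getBytes(StandardCharsets.US_ASCII));
        sha1.update(content);

        StringBuilder hex = new StringBuilder();
        for (byte b : sha1.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // Same id that `echo 'hello world' | git hash-object --stdin` reports.
        System.out.println(blobId("hello world\n".getBytes(StandardCharsets.US_ASCII)));
        // prints 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
    }
}
```

If the length were not known up front, the content would have to be buffered or read twice before hashing could start, which may be what the extra read is doing here.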


2013/7/24 Alex Blewitt <alex.blewitt@xxxxxxxxx>
When you do an add, it has to read through the files and generate a hash of the content. As such you will at a minimum find that it will take time proportional to the amount of time taken to read all of the files. Try timing cp * /dev/null or copy * NUL: and find out how long it takes to dump the files. A better estimate can be done by piping through sha1.
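
The same baseline can be measured from Java. The sketch below reads every regular file under a directory once and feeds it through SHA-1, then prints the elapsed time. It is only a rough lower bound for an add, not what JGit does internally, and the class name ReadAndHashTimer is made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of JGit.
final class ReadAndHashTimer {

    /** Reads each regular file under root once through SHA-1; returns the total bytes read. */
    static long hashAll(Path root) {
        try (Stream<Path> walk = Files.walk(root)) {
            List<Path> files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] buf = new byte[64 * 1024];
            long total = 0;
            for (Path file : files) {
                try (InputStream in = Files.newInputStream(file)) {
                    int n;
                    while ((n = in.read(buf)) != -1) {
                        sha1.update(buf, 0, n);
                        total += n;
                    }
                }
                sha1.digest(); // finish this file's hash and reset the digest
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        long start = System.nanoTime();
        long bytes = hashAll(root);
        long ms = (System.nanoTime() - start) / 1_000_000;
        System.out.println(bytes + " bytes read and hashed in " + ms + " ms");
    }
}
```

Running it twice on the same directory also gives a feel for how much of the time is cold-cache disk IO as opposed to hashing.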

Obviously the time taken to do this is proportional to the size of the files. Adding 10000 empty files on my machine takes less than a second. 

If your files are sourced from a network share then clearly you will be much more impacted by IO.


On 24 Jul 2013, at 22:12, Christian Trutz wrote:

Hello jgit developers,

I tested the JGit AddCommand performance with many files. Here are the results:

100 files  12 sec.
1000 files 91 sec.
10000 files ? (too long)

The results are a little sobering for me. But the "original" git add . command is only a little better:

100 files 10 sec.
1000 files 70 sec.
10000 files ? (too long)

I think the main bottleneck is the hard disk IO... I will profile the JGit AddCommand a little and try to find other possible bottlenecks ...


2013/7/24 Christian Trutz <christian.trutz@xxxxxxxxx>
Hi Tomek,

Ah OK, I understand: you mean that with something like "git status" I can get the changed files and then add them individually. OK, this should work. Thank you for your ideas ...


2013/7/24 Tomasz Zarna <tzarna@xxxxxxxxx>
I think what Alex meant is not to use "git add ." per se, but to get
the files that you want to stage and process them individually. This way you
should be able to provide feedback to the user and get all the files
staged at the same time.
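
A minimal sketch of that loop, with progress reporting and a cancel check, might look like the following. The class IncrementalStager and its parameters are made up for illustration. The staging step is passed in as a callback so the sketch runs without JGit; with JGit it would presumably wrap something like git.add().addFilepattern(path).call(), with the checked GitAPIException handled inside the callback.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;
import java.util.function.IntConsumer;

// Hypothetical helper, not part of JGit.
final class IncrementalStager {

    /**
     * Stages the paths one at a time, reports the percentage done after each
     * one, and stops early once cancelled returns true.
     * Returns the paths that were actually staged.
     */
    static List<String> stageAll(List<String> paths, Consumer<String> stageOne,
                                 IntConsumer onPercent, BooleanSupplier cancelled) {
        List<String> staged = new ArrayList<>();
        for (String path : paths) {
            if (cancelled.getAsBoolean()) {
                break; // the user aborted; nothing has been committed yet
            }
            stageOne.accept(path);
            staged.add(path);
            onPercent.accept(staged.size() * 100 / paths.size());
        }
        return staged;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList("a.xml", "b.xml", "c.xml", "d.xml");
        // Stand-in for the real staging call (assumption: one add per path).
        Consumer<String> stageOne = path -> System.out.println("staging " + path);
        stageAll(paths, stageOne,
                pct -> System.out.println(pct + "% ready"),
                () -> false);
    }
}
```

Since the commit only happens after the loop finishes, an abort part-way through leaves some files staged but creates no commit.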


On Wed, Jul 24, 2013 at 2:54 PM, Christian Trutz
<christian.trutz@xxxxxxxxx> wrote:
> Hi Alex,
> OK, thank you for your answers. But with "git add ." the problem remains the
> same. OK, I will do some performance tests with 1000, 5000 and 10000 files
> ... I am a little curious about the JGit performance :-)
> Regards
> Christian
> 2013/7/24 Alex Blewitt <alex.blewitt@xxxxxxxxx>
>> On 24 Jul 2013, at 09:50, Christian Trutz <christian.trutz@xxxxxxxxx>
>> wrote:
>> > Hi Alex,
>> >
>> > the use case is:
>> >
>> > we want to version many (1000+) configuration files (xml files)
>> > automatically. The purpose is to have the history and the ability to roll
>> > back configuration files. JGit is predestined for such a job ;-) The user
>> > has a "button" and he can say "Version now ...". This action invokes:
>> >
>> > git add .  (all configuration files)
>> > git commit
>> >
>> > But we also want to inform the user about the status of the "git commit"
>> > command (10% ready, 20% ready, etc) and the user should also have the
>> > possibility to abort the perhaps long running "git commit" command ...
>> If you are creating a single commit for all files then the commit will be
>> atomic. However the time will not be spent in the commit, it will be spent
>> in the add. So you can report based on that instead.
>> Alex
> --
> Christian Trutz
> Von-Flotow-Straße 24
> D-45772 Marl
> Festnetz (privat): +49 (0)2365 3840327
> E-Mail: christian.trutz@xxxxxxxxx