Re: [jgit-dev] Problem when trying to do diff on large files stored in LargePackedDeltaObject
I am looking forward to a fix :). Should I open a bug for this?
>> 1) If I understand correctly, the TeeInputStream is created so that a
>> loose object is created that can later be used instead of reading the
>> pack file again. But in this case, where only the size is wanted, it
>> instead creates a huge amount of overhead. Can't this be handled as a
>> special case, e.g. the TeeInputStream set up only when a read is performed?
>Oops. This was clearly a mistake. If all the caller wanted was the size we
>shouldn't have done this.
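
A minimal sketch of the "set up only when a read is performed" idea from point 1. This is not JGit's TeeInputStream or its actual fix; `LazyTeeInputStream`, `SinkOpener`, `getSize()` and `isSinkOpen()` are hypothetical names made up for illustration. The copy destination is opened on the first read that returns data, so a caller that only asks for the size never pays for the copy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical illustration; not JGit's TeeInputStream. */
class LazyTeeInputStream extends InputStream {
    /** Hypothetical callback that opens the copy destination on demand. */
    interface SinkOpener {
        OutputStream open() throws IOException;
    }

    private final InputStream src;
    private final long size;
    private final SinkOpener opener;
    private OutputStream sink; // stays null until the first successful read

    LazyTeeInputStream(InputStream src, long size, SinkOpener opener) {
        this.src = src;
        this.size = size;
        this.opener = opener;
    }

    /** A size query touches neither the source nor the copy. */
    long getSize() {
        return size;
    }

    boolean isSinkOpen() {
        return sink != null;
    }

    private OutputStream sink() throws IOException {
        if (sink == null) {
            sink = opener.open(); // the expensive setup happens here, once
        }
        return sink;
    }

    @Override
    public int read() throws IOException {
        int b = src.read();
        if (b >= 0) {
            sink().write(b);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = src.read(buf, off, len);
        if (n > 0) {
            sink().write(buf, off, n);
        }
        return n;
    }

    @Override
    public void close() throws IOException {
        try {
            if (sink != null) {
                sink.close();
            }
        } finally {
            src.close();
        }
    }
}

class LazyTeeDemo {
    public static void main(String[] args) throws IOException {
        byte[] data = new byte[] {1, 2, 3, 4, 5};
        ByteArrayOutputStream copy = new ByteArrayOutputStream();
        LazyTeeInputStream in = new LazyTeeInputStream(
                new ByteArrayInputStream(data), data.length, () -> copy);
        // Size only: the copy destination has not been opened yet.
        System.out.println(in.getSize() + " sinkOpen=" + in.isSinkOpen());
        in.read(new byte[3], 0, 3);
        // After a real read the copy exists and holds the bytes read so far.
        System.out.println("sinkOpen=" + in.isSinkOpen() + " copied=" + copy.size());
        in.close();
    }
}
```

In the real code the destination would presumably be the temporary loose-object file rather than a byte buffer; the point here is only where the setup cost is paid.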
>> 2) It takes a crazy long time to create this loose object. You can see a
>> objects/noz2787230184080961842.tmp that grows very slowly; the largest
>> file I have there is 86M and that took > 12h on a decent machine.
>> Can't this be improved?
>I don't know how to do this better. The problem is the delta is seeking
>randomly around in the base. The base needs to be created in order to
>support seeking. Even if the base is a loose object it is compressed
>and needs to be inflated in order to find the relevant section. If the
>delta skips backwards again the inflater is closed and reopened and
>uncompresses forward until the relevant spot is found.
I was afraid you would say this :)
>Setting the streamFileThreshold to a larger size allows the base to be
>fully in memory as a contiguous byte array which is randomly accessible
>in constant time. This matches what git-core does. It uses a lot of
>memory, but is faster.
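
To make the cost difference concrete, here is a rough sketch using only java.util.zip, not JGit's own classes. `RestartingInflateReader` and its methods are hypothetical names. It models a compressed base where a backward seek forces the inflater to be reopened and run forward again, and compares that with inflating once into a byte array that can then be indexed directly.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

/** Hypothetical sketch of seeking in a zlib-compressed base; not JGit code. */
class RestartingInflateReader {
    private final byte[] compressed;
    private InputStream in;
    private long pos;   // current offset within the inflated data
    long bytesInflated; // total bytes inflated so far, as a measure of work

    RestartingInflateReader(byte[] compressed) {
        this.compressed = compressed;
        reopen();
    }

    private void reopen() {
        in = new InflaterInputStream(new ByteArrayInputStream(compressed));
        pos = 0;
    }

    /** Returns the byte at the given inflated offset, or -1 past the end. */
    int readAt(long offset) throws IOException {
        if (offset < pos) {
            reopen(); // a stream cannot go backwards: start from the beginning
        }
        while (pos < offset) { // inflate and discard up to the target
            if (in.read() < 0) {
                return -1;
            }
            pos++;
            bytesInflated++;
        }
        int b = in.read();
        if (b >= 0) {
            pos++;
            bytesInflated++;
        }
        return b;
    }

    /** Compresses raw bytes with zlib, standing in for a stored base. */
    static byte[] deflate(byte[] raw) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream d = new DeflaterOutputStream(out)) {
            d.write(raw);
        }
        return out.toByteArray();
    }

    /** Inflates everything once into a contiguous array. */
    static byte[] inflateAll(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream s = new InflaterInputStream(
                new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = s.read(buf)) > 0) {
                out.write(buf, 0, n);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = new byte[1000];
        for (int i = 0; i < raw.length; i++) {
            raw[i] = (byte) (i % 251);
        }
        byte[] packed = deflate(raw);

        // Streaming base: every backward jump restarts inflation from offset 0.
        RestartingInflateReader r = new RestartingInflateReader(packed);
        for (long off : new long[] {900, 100, 800}) {
            r.readAt(off);
        }
        System.out.println("streamed: inflated " + r.bytesInflated + " bytes");

        // In-memory base: inflate once, then each lookup is an array index.
        byte[] base = inflateAll(packed);
        System.out.println("in memory: inflated " + base.length + " bytes");
    }
}
```

With only three lookups the streamed reader already inflates more bytes than the base contains; a delta that jumps backwards many times over a large base multiplies that work, which is consistent with the slow growth of the temporary file described above.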
If there is a fix for 1) and my application doesn't touch big files (other
than looking at the size) I assume that I can still get away with a