Re: [jgit-dev] Question on large object streams
On Mon, Oct 4, 2010 at 10:17 AM, Dmitry Neverov wrote:
> Why reading from large object streams is so slow?

It depends on how the large object is stored. :-)

If it's in a pack file and is stored as a delta to another object, it's
so slow that it's unusable. If it's stored whole in the pack (not as a
delta), or is a loose object, its performance is acceptable. It's
slower than the fast path, but it's still something that a user won't
mind waiting for.

> We have a file of size ~ 10Mb and reading its content from ObjectStream
> takes forever.
> I see a file 'noz5208794269214828797.tmp' in .git dir, its size grows
> slowly (~2Mb per hour).
> And whenever I pause execution I see this in stack trace:
> main@1, prio=5, in group 'main', status: 'runnable'
> java.lang.Thread.State: RUNNABLE
> locked <0xb05> (a java.util.zip.Inflater)
> locked <0x94b> (a java.io.BufferedInputStream)
> locked <0x603> (a jetbrains.buildServer.vcs.patches.PatchBuilderImpl)
> at java.util.zip.Inflater.inflateBytes(Inflater.java:-1)
> at java.util.zip.Inflater.inflate(Inflater.java:215)

Right, this object is a delta in a pack file. What's happening here
is you are inflating the base object into a temporary file, and then
doing random seeks on that temporary file in order to apply the delta.
If the delta is at the end of a delta chain that is say 15 objects
long, you need to do this 15 times before you can get to the data for
the requested object. That can be a lot of work.
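
To make the cost of a delta chain concrete, here is a minimal sketch in plain Java. It is not JGit's code and does not use Git's real delta encoding; the class `DeltaChainSketch`, the `Op` type, and the copy/insert instructions are simplified, hypothetical stand-ins. It only shows that a copy instruction may point anywhere in the base, so the base has to be fully available before the delta can be applied, and that each level of the chain has to be rebuilt in full before the next one.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DeltaChainSketch {
    /** One simplified delta instruction: copy a range of the base, or insert literal bytes. */
    static final class Op {
        final int offset;      // start of the range in the base (copy ops only)
        final int length;      // number of bytes to copy (copy ops only)
        final byte[] literal;  // non-null for insert ops

        private Op(int offset, int length, byte[] literal) {
            this.offset = offset;
            this.length = length;
            this.literal = literal;
        }

        static Op copy(int offset, int length) {
            return new Op(offset, length, null);
        }

        static Op insert(String text) {
            return new Op(0, 0, text.getBytes(StandardCharsets.UTF_8));
        }
    }

    /** Applies one delta. Copy ops can jump to any offset, so the whole base is needed. */
    static byte[] apply(byte[] base, List<Op> delta) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Op op : delta) {
            if (op.literal != null) {
                out.write(op.literal, 0, op.literal.length);
            } else {
                out.write(base, op.offset, op.length);
            }
        }
        return out.toByteArray();
    }

    /** Resolves a chain: each level is rebuilt completely before the next delta is applied. */
    static byte[] resolve(byte[] root, List<List<Op>> chain) {
        byte[] current = root;
        for (List<Op> delta : chain) {
            current = apply(current, delta);
        }
        return current;
    }

    public static void main(String[] args) {
        byte[] root = "hello world".getBytes(StandardCharsets.UTF_8);
        List<List<Op>> chain = new ArrayList<>();
        // Level 1 copies offset 6 before offset 0: a backwards seek in the base.
        chain.add(Arrays.asList(Op.copy(6, 5), Op.insert(", "), Op.copy(0, 5)));
        // Level 2 is expressed against the level 1 result, not against the root.
        chain.add(Arrays.asList(Op.copy(7, 5), Op.insert("!")));
        System.out.println(new String(resolve(root, chain), StandardCharsets.UTF_8));
    }
}
```

In this sketch everything fits in memory. For an object above the streaming threshold the rebuilt base goes to a temporary file instead, and the seeks land on disk, once per level of the chain.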
What version of JGit is this? Tip of master should be inflating these
objects into loose objects in the loose objects directory, such that
subsequent access is faster because it's just streaming from the loose
object rather than the packed form. But it's still slow for the
initial read. :-(
One thing we should do is teach IndexPack about this and have it cache
the large delta object as a loose object immediately during
fetch/clone so that during checkout we have fast access to that
content. But I hadn't thought about doing that until just now, so

> How can we speed it up?

Increase the core.streamFileThreshold in your WindowCacheConfig to a
value larger than the default. Right now the default is 5 MiB. But I
thought I had patches queued on Gerrit to increase this to 50 MiB.
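
If your application loads its WindowCacheConfig from a Git config file, a fragment along these lines might be enough. Treat it as a sketch under assumptions: the 50m value simply mirrors the 50 MiB figure above, and it only takes effect if the embedding application actually reads core.streamFileThreshold from config. An application that builds its WindowCacheConfig in code would need to set the threshold there instead.

```
[core]
	streamFileThreshold = 50m
```

Objects below the threshold are loaded whole into memory, so a larger value trades heap for speed.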
You can also use the -delta gitattribute when you pack the repository
to try and keep these "large" files from being delta compressed. The
resulting pack will be bigger, but JGit will perform better when
accessing it because the bigger objects can be directly streamed.
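
As a rough illustration, a .gitattributes entry like the following marks matching paths as not eligible for delta compression. The file patterns are hypothetical; substitute whatever matches your large files.

```
# hypothetical patterns for large binary files
*.bin -delta
*.iso -delta
```

Objects already stored as deltas in an existing pack will probably stay that way until the repository is fully repacked, so expect to repack after adding the attribute.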