Re: [jgit-dev] question on pack files

From: Shawn Pearce <spearce@xxxxxxxxxxx>
Date: Thu, 13 Jan 2011 12:01:57 -0800

On Thu, Jan 13, 2011 at 11:32, Dmitry Neverov <dmitry.neverov@xxxxxxxxx> wrote:
> I'm not sure it happened inside IndexPack.renameAndOpenPack()
> because I don't have stacktraces yet.
>
> We sync the repository periodically. From some point in time,
> every attempt to fetch has ended in an OutOfMemory error, which
> is a known problem. It also seems that each fetch leaves a new
> .pack file in objects/pack, and these eventually fill up the
> whole disk. I wonder, is it possible to get an OOM error during
> renameAndOpenPack() and then try to fetch the same pack again
> and again (but with a different name)?

Possibly.

If we get OOME during Repository.openPack() the pack won't open and
its objects won't be accessible, and because the Error is going up the
stack the FetchProcess will abort and won't update the local refs.
When fetch tries again, the objects it needs aren't available, so it
will redo the network stuff and download a new pack.  If there is at
least one new object from the remote side, the pack will have a
different name, so it will move into the repository again, and we'll
probably get an OOME again, and the whole process repeats itself.
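
To make that loop concrete, here is a toy model of it.  Nothing in it
is JGit API: FetchRetryModel, fetchOnce() and the byte budgets are all
made-up names and numbers, and the OOME is thrown by hand rather than
by a real allocation.  It only shows why every retry leaves one more
pack behind while the refs never move.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the retry loop; hypothetical names, not JGit classes.
final class FetchRetryModel {
    // Stand-in for the contents of objects/pack.
    final List<String> packsOnDisk = new ArrayList<>();
    // Only becomes true if a fetch gets past opening the new pack.
    boolean refsUpdated = false;

    private final long heapBudget;  // bytes assumed free for the new index
    private final long indexBytes;  // bytes the new index would need

    FetchRetryModel(long heapBudget, long indexBytes) {
        this.heapBudget = heapBudget;
        this.indexBytes = indexBytes;
    }

    // One fetch attempt: download, move the pack into place, open it.
    void fetchOnce(int remoteTip) {
        // The remote gained an object, so the pack gets a new name.
        String name = "pack-" + Integer.toHexString(remoteTip) + ".pack";
        packsOnDisk.add(name);  // the rename happens before the open
        if (indexBytes > heapBudget) {
            // Simulated failure while loading the index.
            throw new OutOfMemoryError("simulated: no heap for " + name);
        }
        refsUpdated = true;     // never reached when the open fails
    }

    // Run a number of scheduled fetches, swallowing the error each time
    // the way a periodic sync job would before trying again.
    static FetchRetryModel run(int attempts, long heapBudget, long indexBytes) {
        FetchRetryModel model = new FetchRetryModel(heapBudget, indexBytes);
        for (int i = 0; i < attempts; i++) {
            try {
                model.fetchOnce(i);
            } catch (OutOfMemoryError e) {
                // Fetch aborted; refs untouched; the pack file stays on disk.
            }
        }
        return model;
    }
}
```

With run(5, 100, 200) the model ends up with five packs on disk and
refsUpdated still false, which is the pattern described in the
original report.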

The issue here might be the PackIndex structures.  Each PackIndex is
completely loaded into memory.  The OOME might have occurred while
allocating the PackIndex during openPack(), because the JRE did not
have enough heap space left for that allocation to succeed.  There is
still enough heap to do other work and retry the fetch, though, which
is why you don't see subsequent allocations fail with OOME.
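
For a feel of how large that single allocation can be, the sketch
below adds up the on-disk layout of a version 2 pack index with SHA-1
names: an 8 byte header, a 256 entry fan-out table, 28 bytes per
object (20 byte name, 4 byte CRC, 4 byte offset), 8 bytes for each
object whose offset needs 64 bits, and two 20 byte checksums.  The
class and method names are hypothetical, and treating the in-memory
PackIndex as roughly file-sized is an assumption, not a measurement of
what JGit actually allocates.

```java
// Size arithmetic for a version 2 pack index (SHA-1 object names).
// Hypothetical helper; it estimates file size, not JGit's real heap use.
final class PackIndexEstimate {
    static final long HEADER = 4 + 4;           // magic + version
    static final long FANOUT = 256 * 4;         // 256 cumulative counts
    static final long PER_OBJECT = 20 + 4 + 4;  // name + CRC32 + 32-bit offset
    static final long TRAILER = 20 + 20;        // pack checksum + index checksum

    // objects: number of objects in the pack.
    // largeOffsets: how many of them need an extra 64-bit offset entry.
    static long estimateBytes(long objects, long largeOffsets) {
        if (objects < 0 || largeOffsets < 0 || largeOffsets > objects) {
            throw new IllegalArgumentException("invalid object counts");
        }
        return HEADER + FANOUT + objects * PER_OBJECT
                + largeOffsets * 8 + TRAILER;
    }
}
```

By this arithmetic a pack with one million objects and no large
offsets has an index of 28,001,072 bytes, so on the order of 27 MiB
has to fit in the heap for that one pack.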

If the repository has many packs, all of their PackIndexes might be
open in memory at once, because FetchProcess searches through all
local packs to see if the objects it wants are already local.  Since
PackIndex doesn't evict its data from memory when memory is low, we've
effectively used up all available heap space and cannot open another
PackIndex later.  Repacking this repository may help, because there
may be a lot of redundant objects due to thin-pack completion
occurring on each fetch.  Or it might not: if there are a lot of
unique objects, the resulting repack may create a single PackIndex
that is too big to open in the available heap space.
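
One rough way to check whether a repository is in this state is to
add up the sizes of the *.idx files in objects/pack and compare the
total with the JVM's maximum heap.  The sketch below does that with
plain java.nio.file calls.  PackIndexUsage and its methods are
made-up names, and using the on-disk size as a proxy for the heap
each PackIndex occupies is an assumption.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: sums pack index file sizes in a pack directory.
final class PackIndexUsage {
    // Total bytes of all *.idx files directly inside packDir.
    static long totalIndexBytes(Path packDir) throws IOException {
        long total = 0;
        try (DirectoryStream<Path> indexes =
                Files.newDirectoryStream(packDir, "*.idx")) {
            for (Path idx : indexes) {
                total += Files.size(idx);
            }
        }
        return total;
    }

    // True if indexBytes is at most the given fraction of the max heap.
    // The fraction is a guess at how much heap the indexes may take.
    static boolean likelyFits(long indexBytes, double fraction) {
        double limit = (double) Runtime.getRuntime().maxMemory() * fraction;
        return indexBytes <= limit;
    }
}
```

Calling totalIndexBytes() on the repository's objects/pack directory
before and after a repack should show whether the repack actually
reduced the amount of index data a fetch has to hold open.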

Eons ago JGit used paged windows on the PackIndex, just like it does
on the PackFile.  This was rather slow, so we replaced it with simpler
code that just loads the entire PackIndex structure into memory as
several int[].  The trade-off is that we greatly increased the chances
of running the heap out of space.

-- 
Shawn.
