Re: [jgit-dev] Problems with bitmaps and cloning and DFS back end
On Wed, Jun 19, 2013 at 12:02 PM, Alex Blewitt <alex.blewitt@xxxxxxxxx> wrote:
> On 19 Jun 2013, at 17:03, Shawn Pearce <spearce@xxxxxxxxxxx> wrote:
>
>> But my original idea for DFS backend was to encode the DfsPackDescription
>> data into the filename, and parse it back out when listing the packs
>> of the repository. Unfortunately this might not be possible if you
>> cannot rename a file, as some of the data arrives too late.
>
> Yes, I found that out too :-)
>
> However I found when the commitPack is called the info is there, so I
> write out a surrogate file with the data encoded in it.

Yes, our implementation does the same thing. The description data is
frozen just before commitPack() is invoked, so this is the point to
save it.

> The only real question is how much is needed; this is the first time
> I've needed anything and currently it works with only objectCount set.

Colby pointed out to me this morning that the object count is
necessary, and can be obtained from the PackIndex. Getting it from
there is not trivial. The DfsPackDescription coming out of listPacks()
needs the count set. The only way to get the object count from the
index is to load the entire index and call getObjectCount().
Unfortunately the PackIndex you get from this process won't be cached
properly for the DfsPackFile to use it, so now you have the index
being loaded twice. Yuck.

I think a reasonable fix in JGit is to modify DfsCachedPack so that it
uses the DfsPackFile to get the object count, rather than the
DfsPackDescription. The count is only used once we have chosen to
reuse the pack, and reuse required us to look at the bitmap index,
which required us to load the PackIndex. So DfsPackFile can safely
delegate to the cached PackIndex and get a fast answer from memory.

> I imagine that fileSize and lastModified would be useful to enable the
> correct priority sorting to work

Yes, but this is overridable in your DfsPackDescription subclass by
replacing compareTo(DfsPackDescription).

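To illustrate the compareTo() override mentioned above, here is a minimal, self-contained sketch. PackInfo is a hypothetical stand-in and not JGit's DfsPackDescription, and the ordering it uses (newest pack first, ties broken by smaller file size) is only an assumption about what a backend might want, not necessarily what JGit does by default.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a DfsPackDescription subclass; not the JGit class.
class PackInfo implements Comparable<PackInfo> {
    final String name;
    final long lastModified; // milliseconds since the epoch
    final long fileSize;     // bytes

    PackInfo(String name, long lastModified, long fileSize) {
        this.name = name;
        this.lastModified = lastModified;
        this.fileSize = fileSize;
    }

    // Assumed priority: newer packs sort first; ties go to the smaller pack.
    @Override
    public int compareTo(PackInfo other) {
        int cmp = Long.compare(other.lastModified, lastModified);
        if (cmp != 0) {
            return cmp;
        }
        return Long.compare(fileSize, other.fileSize);
    }
}

class PackOrderingSketch {
    // Returns pack names in search-priority order without mutating the input.
    static List<String> sortedNames(List<PackInfo> packs) {
        List<PackInfo> copy = new ArrayList<>(packs);
        Collections.sort(copy);
        List<String> names = new ArrayList<>();
        for (PackInfo p : copy) {
            names.add(p.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<PackInfo> packs = new ArrayList<>();
        packs.add(new PackInfo("old", 1000L, 10L));
        packs.add(new PackInfo("new-big", 2000L, 500L));
        packs.add(new PackInfo("new-small", 2000L, 50L));
        System.out.println(sortedNames(packs)); // prints [new-small, new-big, old]
    }
}
```

If a backend cannot persist fileSize or lastModified, the same pattern works with whatever ordering key it does have available.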
> but I'm not sure whether the statistics are relevant outside of a
> DfsGarbageCollect result. I'm not sure about deltaCount.

The deltaCount can be 0. It's only used in a stats line shown to a
client when they do a clone and the bitmap file kicked in a reuse of
the entire pack. This can safely be 0; it just yields a possibly
confusing line.

> The question is what subset is strictly necessary? Perhaps unnecessary
> fields (if any) could be annotated with "transient".

Documenting this is a fantastic idea.
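
To make the field discussion concrete, here is a rough sketch of writing the description data into a surrogate file's contents at commitPack() time and parsing it back when listing packs. Everything in it is hypothetical: the PackMeta class, the key=value text format, and the field names are assumptions for illustration and do not use any JGit API. It treats objectCount as required and lets the other fields default to 0, following the discussion above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical holder for the fields discussed in this thread; not a JGit type.
class PackMeta {
    long objectCount;  // required when the pack is listed
    long deltaCount;   // optional; only feeds a stats line, so 0 is acceptable
    long fileSize;     // optional; useful for priority sorting
    long lastModified; // optional; useful for priority sorting

    // Encode the fields as newline-separated key=value pairs.
    String encode() {
        return "objectCount=" + objectCount + "\n"
             + "deltaCount=" + deltaCount + "\n"
             + "fileSize=" + fileSize + "\n"
             + "lastModified=" + lastModified + "\n";
    }

    // Parse the surrogate contents; fields other than objectCount default to 0.
    static PackMeta parse(String text) {
        Map<String, Long> fields = new LinkedHashMap<>();
        for (String line : text.split("\n")) {
            int eq = line.indexOf('=');
            if (eq < 0) {
                continue; // skip blank or malformed lines
            }
            String key = line.substring(0, eq).trim();
            String value = line.substring(eq + 1).trim();
            fields.put(key, Long.parseLong(value));
        }
        if (!fields.containsKey("objectCount")) {
            throw new IllegalArgumentException("objectCount is required");
        }
        PackMeta meta = new PackMeta();
        meta.objectCount = fields.get("objectCount");
        meta.deltaCount = fields.getOrDefault("deltaCount", 0L);
        meta.fileSize = fields.getOrDefault("fileSize", 0L);
        meta.lastModified = fields.getOrDefault("lastModified", 0L);
        return meta;
    }
}

class SurrogateFileSketch {
    public static void main(String[] args) {
        PackMeta written = new PackMeta();
        written.objectCount = 42;
        written.fileSize = 1024;
        written.lastModified = 1371657720000L;

        // In a real backend this string would be stored next to the pack.
        String surrogate = written.encode();
        PackMeta read = PackMeta.parse(surrogate);
        System.out.println(read.objectCount); // prints 42
    }
}
```

A text format like this keeps the surrogate readable and tolerant of added fields, at the cost of one extra small read per pack when listing.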