|Re: [jgit-dev] Improving performance of looking up a path within a tree (TreeWalk.forPath)|
> Message: 2 > Date: Sat, 3 Mar 2012 10:08:24 +0100 > From: Pawel Kozlowski <pkozlowski.opensource@xxxxxxxxx> > > What I would like to ask is this: what would be the best approach to > improve the performance of paths look-ups given a tree. For me it > should involve caching uncompresed trees red from a disk (IO and > decompression seems to be hot spot here) but I'm not sure where is the > best place to plug such a cache? Does wrapping such cache around > ObjectReader makes sense? > I keep wanting to implement a working tree library that will version an arbitrary K/V store ; rather than accessing the data as native git objects, break the keyspace of the store into trees (automatically, maybe with a burst trie strategy, wrap the native KV library with an extension that keeps the modification times on these artificial tree objects up to date on value insertions and permits keeping an index for commits. In this way Git moves from a system that manages changes to a file system to a system that manages changes to a K/V store ; there isn't much semantic difference, the main points being that instead of the user managing the trees (folders / files), the software manages the trees, and that merging is some ways easier (you can probably discard the notion of renames as objects probably retain their key when they change states). You would have to check out your "trees", but then you can do native lookup on the actual keys for the objects instead of having to deal with the git trees ; since you only have to address them when you write an object, rather than when you doing reads, things should be faster. Alas, I just don't seem to have the time ; I have a mental sketch of the basics like splitting the keyspace using burst tries, but not enough time to dissect the JGit library enough to make the file based working tree support into a K/V working tree library.
Back to the top