Re: [jgit-dev] Improving performance of looking up a path within a tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]

Re: [jgit-dev] Improving performance of looking up a path within a tree (TreeWalk.forPath)

From: Adrian <adrian.wilkins@xxxxxxxxx>
Date: Sat, 3 Mar 2012 21:05:59 +0000
Delivered-to: jgit-dev@xxxxxxxxxxx
List-archive: <https://dev.eclipse.org/mailman/private/jgit-dev>
List-help: <mailto:jgit-dev-request@eclipse.org?subject=help>
List-subscribe: <https://dev.eclipse.org/mailman/listinfo/jgit-dev>, <mailto:jgit-dev-request@eclipse.org?subject=subscribe>
List-unsubscribe: <https://dev.eclipse.org/mailman/options/jgit-dev>, <mailto:jgit-dev-request@eclipse.org?subject=unsubscribe>

> Message: 2
> Date: Sat, 3 Mar 2012 10:08:24 +0100
> From: Pawel Kozlowski <pkozlowski.opensource@xxxxxxxxx>
>
> What I would like to ask is this: what would be the best approach to
> improve the performance of paths look-ups given a tree. For me it
> should involve caching uncompresed trees red from a disk (IO and
> decompression seems to be hot spot here) but I'm not sure where is the
> best place to plug such a cache? Does wrapping such cache around
> ObjectReader makes sense?
>

I keep wanting to implement a working tree library that will version
an arbitrary K/V store ;
rather than accessing the data as native git objects, break the
keyspace of the store
into trees (automatically, maybe with a burst trie strategy, wrap the
native KV library with
an extension that keeps the modification times on these artificial
tree objects up to date
on value insertions and permits keeping an index for commits. In this
way Git moves from
a system that manages changes to a file system to a system that
manages changes to
a K/V store ; there isn't much semantic difference, the main points
being that instead of
the user managing the trees (folders / files), the software manages
the trees, and that
merging is some ways easier (you can probably discard the notion of renames as
objects probably retain their key when they change states).

You would have to check out your "trees", but then you can do native
lookup on the actual
keys for the objects instead of having to deal with the git trees ;
since you only have to
address them when you write an object, rather than when you doing
reads, things should
be faster.

Alas, I just don't seem to have the time ; I have a mental sketch of the basics
like splitting the keyspace using burst tries, but not enough time to
dissect the
JGit library enough to make the file based working tree support into a K/V
working tree library.

Prev by Date: [jgit-dev] Improving performance of looking up a path within a tree (TreeWalk.forPath)
Next by Date: [jgit-dev] Project meta data is out of date for technology.jgit
Previous by thread: [jgit-dev] Improving performance of looking up a path within a tree (TreeWalk.forPath)
Next by thread: [jgit-dev] nio exception
Index(es):
- Date
- Thread

Breadcrumbs