Re: [jgit-dev] [RFC] Cassandra based storage layer for JGit
On Fri, Jan 28, 2011 at 13:34, Adrian <adrian.wilkins@xxxxxxxxx> wrote:
>
> I'm not sure I'm putting myself across correctly - my goal is to
> version control a set of objects which are themselves persisted as
> entries in a K/V store, rather than a bunch of source code files.
> So one K/V store is my working tree. The second is for loose objects.

OK, now I'm following you.

> I think a K/V backend is important for the loose objects because of
> the limitations inherent in trying to store several million objects
> as individual files in an ObjectDirectory implementation.

Yes, but you would never store millions of loose objects, you would
switch to pack files long before you got that many.

> I completely agree that packed objects are much more efficient and
> have lower overhead; the overhead of storing packs in a K/V store
> for a local-disk-only implementation is probably not worth it. So
> what I'm driving at is that for my purposes I'd like to try to get
> to a jgit stack where
>
> * The working "tree" is a K/V store

This would be nice for "client in a cloud" model, like what project
Orion is trying to do at Eclipse. We already want to abstract the
working "tree" APIs so that JGit can more directly use the EGit
IResource APIs when making changes to the workspace... but we're not
there yet, and I don't think our plans would handle millions of
working "tree" items that are treated like a normal source code
checkout. So you may be a bit more off in your own direction here.

> * The loose objects are in a K/V store
> ** Because millions of teensy files will stress most file systems adversely

Again, why not repack these?

Pack files are a K/V store, just more limited. They can only be
updated by completely rewriting them, and the keys must be SHA-1s.
But both Cassandra and Hadoop HBase implement their backends by doing
complete rewrites of segments of their K/V store when there are
sufficient changes to make a compaction worthwhile. Likewise... Git
pack files.
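
To make the "pack files are a K/V store" point concrete, here is a minimal sketch of the idea in Java. It is only an illustration under stated assumptions: the class and method names (PackAsKeyValueSketch, Segment, put, repack) are hypothetical and are not JGit APIs, the key is the SHA-1 of the raw bytes (real Git also hashes an object header), and the "pack" is just an immutable sorted map held in memory. What it shows is the shape of the argument: new objects land in a small mutable "loose" area, and a repack writes a complete new segment instead of updating the old one in place, much like a compaction.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical illustration only; none of these types exist in JGit. */
final class PackAsKeyValueSketch {

    /** Immutable, sorted segment: the stand-in for a pack file. */
    static final class Segment {
        private final SortedMap<String, byte[]> entries;

        Segment(Map<String, byte[]> source) {
            SortedMap<String, byte[]> copy = new TreeMap<>(source);
            this.entries = Collections.unmodifiableSortedMap(copy);
        }

        byte[] get(String sha1) { return entries.get(sha1); }
        int size() { return entries.size(); }
    }

    // Mutable area for recently written objects (the "loose objects").
    private final Map<String, byte[]> loose = new HashMap<>();
    // Current segment; replaced wholesale on repack, never edited.
    private Segment packed = new Segment(new HashMap<String, byte[]>());

    /** Stores content; the key is derived from it, callers cannot pick one. */
    public String put(byte[] content) {
        String key = sha1Hex(content);
        loose.put(key, content);
        return key;
    }

    /** Looks in the loose area first, then in the packed segment. */
    public byte[] get(String sha1) {
        byte[] value = loose.get(sha1);
        return value != null ? value : packed.get(sha1);
    }

    /** Compaction: build a complete new segment from old segment plus loose. */
    public void repack() {
        Map<String, byte[]> merged = new TreeMap<>(packed.entries);
        merged.putAll(loose);
        packed = new Segment(merged);
        loose.clear();
    }

    public int looseCount() { return loose.size(); }
    public int packedCount() { return packed.size(); }

    /** Assumption: hashes raw bytes only, without Git's object header. */
    static String sha1Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PackAsKeyValueSketch store = new PackAsKeyValueSketch();
        String key = store.put("hello".getBytes(StandardCharsets.UTF_8));
        store.repack();
        System.out.println(key + " loose=" + store.looseCount()
                + " packed=" + store.packedCount());
    }
}
```

The design choice worth noticing is that repack() never mutates the existing segment; it replaces it. That is the limitation described above (update only by complete rewrite), and it is also why the number of loose entries stays small as long as repacking happens regularly.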

> * Packed objects are in standard pack files
> ** Why bother with the overhead of storing them in another container
> ** Not doing this for the enterprise scaling or redundancy

Why use a K/V store if you want standard pack files in your local
filesystem?

-- 
Shawn.