Re: [jgit-dev] [RFC] Cassandra based storage layer for JGit

2010/10/23 Shawn Pearce <spearce@xxxxxxxxxxx>
On Fri, Oct 22, 2010 at 5:01 PM, Matthias Sohn wrote:
> 2010/10/22 Shawn Pearce <spearce@xxxxxxxxxxx>
>>
>> Last weekend I wrote a read-only storage layer for JGit that uses
>> Apache Cassandra[1] as the backend data store.  I have now posted the
>> source under the EDL here:
>>
>>  http://github.com/spearce/jgit_cassandra
>>
> That's exciting news; we are definitely interested in participating in
> that effort.

I think a Cassandra based storage server is silly for a small project.
 But when you start talking about a large corporate deployment of Git
that needs to support 1000 developers spread across 3 offices on two
different continents, it starts to become an interesting idea.  Most
companies don't have a Cassandra cluster already running.  But if you
are already dedicating 6 machines in 3 offices to running Git, you
might be able to absorb the cost of setting up a Cassandra cluster in
order to have no single point of failure, and also get fast local mirrors
of the repository.
 
That's why I am interested: our shop is a big one -- not yet in terms
of Git users, but the demand is growing and we are already looking
for high availability.
 
We have a _ton_ of work to do to make this a reality.  But I'd
certainly be interested in help from anyone who wants to contribute.

Yeah, I expected that there would be some work ahead ;-)
 
> I tried building it against current JGit d00420ae and Maven says:
> [INFO] Compilation failure
> /Users/d029788/src/git/jgit_cassandra/src/main/java/org/eclipse/jgit/storage/cassandra/Importer.java:[337,11]
> searchForReuse(org.eclipse.jgit.lib.ProgressMonitor) has private access in
> org.eclipse.jgit.storage.pack.PackWriter

I don't want this method to be part of our public API.  I exposed it
only in my local copy, so the hacky Importer class can use it.

> /Users/d029788/src/git/jgit_cassandra/src/main/java/org/eclipse/jgit/storage/cassandra/Importer.java:[450,29]
> cannot inherit from final org.eclipse.jgit.storage.pack.PackOutputStream

I also don't want this class subclassed; that is why it is final.  But
I made it non-final in my local copy so the hacky Importer class can
extend it, and override the output routines in order to redirect the
data in a way that it shouldn't be redirected.  :-)


The hacky Importer class needs to be taken out back and put out of its
misery.  The _correct_ way to use JGit is to somehow replace
IndexPack's interface to the storage system, so that the standard
Transport fetch code and ReceivePack server code just work.
Unfortunately this isn't possible yet, and I needed a way to get data
into Cassandra just to test the simpler reading code.  So, I hacked up
the Importer class.  That class needed two back doors into JGit that
shouldn't ever be exposed... so I made the hand edits locally.  I
won't push a change that makes these edits, because I _really_ don't
want them made to JGit, and pushing a change would just encourage
that.

It's only two edits, and it's easy enough to do given the compiler
errors above.  And when my IndexPack rework is done, this won't be
necessary, because we'll be able to delete the Importer class and just
push directly into the Cassandra repository over the standard Git
protocols.

I'll do the edits and start playing with JGit the Cassandra way.
 
--
Matthias
