Re: [jgit-dev] [RFC] Cassandra based storage layer for JGit

From: Adrian <adrian.wilkins@xxxxxxxxx>
Date: Fri, 28 Jan 2011 23:24:58 +0000
Delivered-to: jgit-dev@xxxxxxxxxxx

On Fri, Jan 28, 2011 at 10:56 PM, Shawn Pearce <spearce@xxxxxxxxxxx> wrote:
>
> Again, why not repack these?

I suspect I don't understand what triggers the code to pack loose
objects. My understanding of this part is based only on watching the
output of C git and is thus vague.

My concern is just that loose object files will accumulate beyond the limits of
the file system they are on, especially on Windows. I am working under
the assumption that packing only occurs after commits or when manually
invoked, so inserting, for example, 350,000 objects at a time into the
index to be committed might generate 350,000 loose object files which
I would anticipate to be a little slow.

These are just that, assumptions. I shall take a look at the relevant code.

By and large, this is a large set of objects with a slow rate of change
mediated by human editors, and loose object accumulation as files
probably isn't a concern.

But there are a few processes that deal with large sets of objects; the
initial construction of a commit graph would be one. There are some
batch processes that import objects from other systems (although I'd
anticipate the change count is much smaller than the actual object
count there).

>
>> * Packed objects are in standard pack files
>> ** Why bother with the overhead of storing them in another container
>> ** Not doing this for the enterprise scaling or redundancy
>
> Why use a K/V store if you want standard pack files in your local filesystem?

Only because of my concerns about loose objects. I guess I should just suck
it and see first.
