Re: [jgit-dev] JGit backup & synchronization
On Thu, May 3, 2012 at 7:30 AM, <epeters1111@xxxxxxxxx> wrote:
> (1) What's the recommended approach to backing up a repository that's being used by JGit?

Same as backing up any other Git repository.

> Based on some threads from Stack Overflow, it seems like the "right" way is to use "bundle" to create snapshots,
> http://stackoverflow.com/questions/2129214/backup-a-local-git-repository

This is one valid way to do it. Another way is to use git clone, which that thread discourages. git clone is a fine backup method; the issue raised on that thread is how you then back up the resulting directory of files. If your backup system has trouble with a directory, you might need to e.g. tar the clone first, at which point bundle might just be the better approach.

Another way is to store the repository on a filesystem that supports snapshots. If you snapshot the POSIX filesystem, the repository will be consistent as of that snapshot, and then you can back up the snapshot. E.g. btrfs on Linux or ZFS on Solaris/FreeBSD. JGit (and normal Git) always perform updates in a safe, ordered way to support this sort of approach.

It depends on how much data you are talking about. bundle/clone will produce a complete copy of the repository. For one repository done nightly, this isn't really a problem. For 1600 repositories that weigh in at over 200G total, it is. It's that latter case where snapshotting filesystems can be useful with Git.

> which seems to be supported in JGit through org.eclipse.jgit.transport.BundleWriter. Does this sound about right?

Yes.

> (2) The JGit Repository is thread-safe, so it seems like our app could support multiple threads interacting with the same repository.

This is correct. Gerrit Code Review runs the JGit Repository as a singleton across multiple concurrent threads, under very high user loads, and has been doing that for ~3 years in production at a lot of companies. It's thread-safe. :-)

> But it seems like there's still just one working directory

It's thread-safe... until you touch the working directory. A bare repository is thread-safe. I don't think we provide any assurances that the working directory is thread-safe. Some internal structures might be thread-safe enough that you can perform different working directory operations sequentially on different threads. But the working directory is not strictly thread-safe on its own. It's too hard to provide the right consistency guarantees across all of the files in the working tree.

You can run multiple working directories, but you would need to do things a bit more yourself, by tracking your own DirCache objects and their locations on disk, and tracking your own directory root for each user. This may mean you can't use all of the JGit API classes, because they assume the single working directory relationship with a Repository.

> (3) Is it ever safe to have multiple *processes* using the same repository?

Yes, this is explicitly supported to permit end-users to use JGit from within an embedded application (e.g. Eclipse IDE) and still use git-core on the command line.

> I'm thinking of two scenarios: (a) a user using command-line git at the same time our app is using JGit, and/or

Yes, like I just said, this was an explicit design goal with JGit.

> (b) two app processes with a shared disk.

Also works.
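
To picture what "safe, ordered" updates mean for the snapshot approach in (1), here is a minimal sketch of the general pattern Git uses for files such as refs: write the new content to a sibling lock file, then rename it over the target. This is an illustration only, not JGit's actual code; the class name AtomicFileUpdate and its write method are hypothetical and exist only for this example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper (not a JGit class) showing write-then-rename. */
final class AtomicFileUpdate {
    private AtomicFileUpdate() {}

    /**
     * Replaces {@code target} with {@code content} by first writing a sibling
     * "<name>.lock" file and then renaming it over the target.
     */
    static void write(Path target, String content) throws IOException {
        Path lock = target.resolveSibling(target.getFileName() + ".lock");

        // CREATE_NEW fails with FileAlreadyExistsException if another writer
        // already created the lock file, so only one writer proceeds.
        Files.write(lock, content.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
        try {
            // On POSIX filesystems the rename swaps the file in one step,
            // so the target is always either the old or the new version.
            Files.move(lock, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            // Do not leave a stale lock behind if the rename failed.
            Files.deleteIfExists(lock);
            throw e;
        }
    }
}
```

With this pattern, a concurrent reader or a filesystem snapshot taken at any moment sees either the complete old file or the complete new one, never a half-written file.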