[jgit-dev] Threading and MissingObjectException

I am using JGit for an internal tool here at Sony Mobile. To improve performance I started using multiple threads. Unfortunately, this produced intermittent MissingObjectExceptions:

org.eclipse.jgit.errors.MissingObjectException: Missing unknown 04f9cd20301800d51e3a164a4dc926e779bb4104
at org.eclipse.jgit.internal.storage.file.WindowCursor.open(WindowCursor.java:148)
at org.eclipse.jgit.lib.ObjectReader.open(ObjectReader.java:229)
at org.eclipse.jgit.revwalk.RevWalk.parseAny(RevWalk.java:809)
at org.eclipse.jgit.revwalk.RevWalk.parseCommit(RevWalk.java:722)
at com.sonyericsson.patchtool.server.db.GitDatabaseTest$GitLogger.call(GitDatabaseTest.java:215)
at com.sonyericsson.patchtool.server.db.GitDatabaseTest$GitLogger.call(GitDatabaseTest.java:1)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

My application basically creates a FileRepository and then spawns a bunch of threads that each create their own RevWalk (on the same FileRepository instance) and then call RevWalk.parseCommit(). I managed to create a test case that reproduces the problem; it looks something like this:

Main thread:
	repository = new FileRepository(...);
	ObjectId headObjectId = ObjectId.fromString("04f9cd20301800d51e3a164a4dc926e779bb4104");

Thread N:
	revWalk = new RevWalk(repository);
	revWalk.parseCommit(headObjectId); // <-- throws MissingObjectException

After some digging around in the JGit code I found some interesting things in FileObjectDatabase.java. If you look at openObjectImpl1:
final ObjectLoader openObjectImpl1(...) {
	ldr = openObject1(curs, objectId);
	if (tryAgain1()) { // Thread #1 stops for a moment here
		ldr = openObject1(curs, objectId);
		if (ldr != null)
			return ldr;
	}
	...
}
tryAgain1() in turn is implemented in ObjectDirectory.java:
boolean tryAgain1() {
	final PackList old = packList.get();
	if (old.snapshot.isModified(packDirectory))
		return old != scanPacks(old);
	return false;
}

Now, say we have two threads that call openObjectImpl1 on the same instance at the same time. Thread #1 pauses at the comment above. Thread #2 continues and executes tryAgain1(), which returns true (packList contained NO_PACKS before this, and scanPacks has replaced it with a "real" scanned one). openObjectImpl1 now returns successfully for thread #2. Then thread #1 resumes and enters tryAgain1(). Its "old" variable is now assigned the new packList (since thread #2 already scanned it), so isModified returns false and tryAgain1() returns false. Thread #1 therefore never repeats its lookup against the newly scanned packs, and openObjectImpl1 fails.
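
To make the interleaving concrete, here is a minimal sketch that replays it deterministically on a single thread. It does not use JGit at all: the PackList class, the directoryModified flag (standing in for old.snapshot.isModified(packDirectory)) and the replay() helper are simplified, hypothetical stand-ins, and only the shape of tryAgain1() follows the excerpt above.

```java
import java.util.concurrent.atomic.AtomicReference;

public class TryAgainRaceSketch {
    /** Hypothetical, heavily simplified stand-in for the real pack list. */
    static final class PackList {
        final boolean hasObject;

        PackList(boolean hasObject) {
            this.hasObject = hasObject;
        }
    }

    static final PackList NO_PACKS = new PackList(false);

    private final AtomicReference<PackList> packList = new AtomicReference<>(NO_PACKS);

    // Assumption: stands in for old.snapshot.isModified(packDirectory).
    private boolean directoryModified = true;

    /** Looks the object up in whatever pack list is currently published. */
    boolean openObject1() {
        return packList.get().hasObject;
    }

    /** Pretends to rescan the directory and publishes a list containing the object. */
    PackList scanPacks(PackList old) {
        PackList scanned = new PackList(true);
        packList.set(scanned);
        directoryModified = false;
        return scanned;
    }

    /** Same shape as the tryAgain1() excerpt. */
    boolean tryAgain1() {
        PackList old = packList.get();
        if (directoryModified)
            return old != scanPacks(old);
        return false;
    }

    /** Replays the interleaving; returns { thread #1 found it, thread #2 found it }. */
    static boolean[] replay() {
        TryAgainRaceSketch db = new TryAgainRaceSketch();
        boolean t1 = db.openObject1(); // thread #1 misses, then pauses
        boolean t2 = db.openObject1(); // thread #2 misses
        if (!t2 && db.tryAgain1())     // thread #2 rescans and retries
            t2 = db.openObject1();
        if (!t1 && db.tryAgain1())     // thread #1 resumes: no retry is offered
            t1 = db.openObject1();
        return new boolean[] { t1, t2 };
    }

    public static void main(String[] args) {
        boolean[] found = replay();
        System.out.println("thread #1 found object: " + found[0]);
        System.out.println("thread #2 found object: " + found[1]);
    }
}
```

In this model thread #2 finds the object and thread #1 does not, although the object is in the published pack list by the time thread #1 gives up.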

I might have missed something, but changing tryAgain1() to "return true" (just to test) makes my test case pass without any exceptions (I guess that this would have some unwanted performance impact).
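
A less blunt variant of that experiment might be to let the caller remember which pack list it actually searched, and to retry whenever a different list has been published since. The following is only a sketch of that idea under the same simplified model as the description above. It is not JGit code and not a proposed patch: the one-argument tryAgain1(PackList), currentPacks(), the directoryModified flag and replay() are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;

public class TryAgainFixSketch {
    /** Hypothetical, heavily simplified stand-in for the real pack list. */
    static final class PackList {
        final boolean hasObject;

        PackList(boolean hasObject) {
            this.hasObject = hasObject;
        }
    }

    static final PackList NO_PACKS = new PackList(false);

    private final AtomicReference<PackList> packList = new AtomicReference<>(NO_PACKS);

    // Assumption: stands in for old.snapshot.isModified(packDirectory).
    private boolean directoryModified = true;

    /** Hypothetical helper: lets a caller record which list it is about to search. */
    PackList currentPacks() {
        return packList.get();
    }

    boolean openObject1(PackList packs) {
        return packs.hasObject;
    }

    PackList scanPacks(PackList old) {
        PackList scanned = new PackList(true);
        packList.set(scanned);
        directoryModified = false;
        return scanned;
    }

    /** Retry if the list changed since the caller searched it, or if a rescan changes it. */
    boolean tryAgain1(PackList searched) {
        PackList current = packList.get();
        if (current != searched)
            return true; // another thread published a new list after our lookup
        if (directoryModified)
            return current != scanPacks(current);
        return false;
    }

    /** Replays the same interleaving; returns { thread #1 found it, thread #2 found it }. */
    static boolean[] replay() {
        TryAgainFixSketch db = new TryAgainFixSketch();
        PackList seenBy1 = db.currentPacks();
        boolean t1 = db.openObject1(seenBy1); // thread #1 misses, then pauses
        PackList seenBy2 = db.currentPacks();
        boolean t2 = db.openObject1(seenBy2); // thread #2 misses
        if (!t2 && db.tryAgain1(seenBy2))     // thread #2 rescans and retries
            t2 = db.openObject1(db.currentPacks());
        if (!t1 && db.tryAgain1(seenBy1))     // thread #1 notices the list changed
            t1 = db.openObject1(db.currentPacks());
        return new boolean[] { t1, t2 };
    }

    public static void main(String[] args) {
        boolean[] found = replay();
        System.out.println("thread #1 found object: " + found[0]);
        System.out.println("thread #2 found object: " + found[1]);
    }
}
```

Compared with always returning true, this only costs an extra lookup when the list really did change between the first lookup and the retry check, so a genuinely missing object would still fail after a single attempt.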