Re: [jgit-dev] Re: Understanding Racy Git Handling in JGit: persisting

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]

Re: [jgit-dev] Re: Understanding Racy Git Handling in JGit: persisting smudged entries

From: Matthias Sohn <matthias.sohn@xxxxxxxxxxxxxx>
Date: Wed, 14 Jul 2010 22:55:06 +0200
Delivered-to: jgit-dev@xxxxxxxxxxx
Domainkey-signature: a=rsa-sha1; c=nofws; d=googlemail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=Iv87XTKzMg4TrwdkVKnEd8i2PO0M9Ph16eYV6FIXB/fLUkxripSty7V02aRpCy+VHA He8NT3+bwQv2tGk7gcSa2zrqV4BOT5meExLys1os16bv0IRoZN6EO2dSRBrKDPYlDBa4 4wwO2+owjV1Dpl3husYDTXE4zTMt8FAKYHCZc=
List-archive: <https://dev.eclipse.org/mailman/private/jgit-dev>
List-help: <mailto:jgit-dev-request@eclipse.org?subject=help>
List-subscribe: <https://dev.eclipse.org/mailman/listinfo/jgit-dev>, <mailto:jgit-dev-request@eclipse.org?subject=subscribe>
List-unsubscribe: <https://dev.eclipse.org/mailman/listinfo/jgit-dev>, <mailto:jgit-dev-request@eclipse.org?subject=unsubscribe>

2010/7/14 Shawn O. Pearce <spearce@xxxxxxxxxxx>

Christian Halstrick <christian.halstrick@xxxxxxxxx> wrote:
> why does JGit smudge the modification time of index entries
> differently than native git? Native git truncates the cached length to
> 0, while JGit smudges the timestamp to a maximum value. Although we
> don't see any problem with that we are wondering why this has been
> done. We found the following comment in
> DirCacheEntry.smudgeRacilyClean() :
>
> // We don't use the same approach as C Git to smudge the entry,
> // as we cannot compare the working tree file to our SHA-1 and
> // thus cannot use the "size to 0" trick without accidentally
> // thinking a zero length file is clean.
> //
> // Instead we force the mtime to the largest possible value, so
> // it is certainly after the index's own modification time and
> // on a future read will cause mightBeRacilyClean to say "yes!".
> // It is also unlikely to match with the working tree file.
> //
> // I'll see you again before Jan 19, 2038, 03:14:07 AM GMT.
>
> We don't understand fully that comment. Additionally: shouldn't it be
> very cheap to detect whether a zero-length file is clean or not? The
> SHA-1 for an empty file should be constant?

Sorry the comment wasn't clear enough.

The problem I ran into is, DirCache and DirCacheEntry don't
necessarily have access to the working tree in order to examine
the file itself. If we set the size to 0 during smudge, we can't
verify that the file length is not 0 before we smudge.

But I think what you are trying to get to here is, can't we just
set the size to 0 to smudge the record and worry about it later?

If the file is non-empty and we smudge the length to 0, during a
future scan the length won't match, and we can recheck its contents
to repair the cached stat data.

If the file is empty, the cached stat data now matches, and we'll
consider it clean. That leave us open to missing an update where a
user truncates a file in the same filesystem step as the index was
changed. But the cached ObjectId won't be the empty blob ObjectId,
which is easy to check for zero length files.

OK, I think I get what you are saying.

- We change our smudge logic to set the length to 0.

- During a status check of a file, we have two special cases:

if (dirCacheEntry.length == 0) {
if (dirCacheEntry.objectId != emptyBlobObjectId) {
// This is a racily clean smudged entry. It can be repaired by
// checking the file's contents on disk to see if they match the
// dirCacheEntry.objectId. If they do, we can correct length and
// last modified to unsmudge it.
//

} else (file.length == 0) {
// This might have been racily clean, we don't know, but
// it doesn't actually matter.
//
}
}

Right?

Yeah, that's what we thought could replace the current implementation.

But we weren't sure if we don't overlook some aspect. Now with your

rephrased explanation of the idea it sounds quite straight forward to me.

This approach would be closer to the native implementation and also fix the

problem which would come up in around 28 years with the current

smudge implementation ;-)

--
Matthias

Follow-Ups:
- Re: [jgit-dev] Re: Understanding Racy Git Handling in JGit: persisting smudged entries
  - From: Shawn O. Pearce

References:
- [jgit-dev] Understanding Racy Git Handling in JGit: persisting smudged entries
  - From: Christian Halstrick
- [jgit-dev] Re: Understanding Racy Git Handling in JGit: persisting smudged entries
  - From: Shawn O. Pearce

Prev by Date: [jgit-dev] Re: Understanding Racy Git Handling in JGit: persisting smudged entries
Next by Date: Re: [jgit-dev] Re: Understanding Racy Git Handling in JGit: persisting smudged entries
Previous by thread: [jgit-dev] Re: Understanding Racy Git Handling in JGit: persisting smudged entries
Next by thread: Re: [jgit-dev] Re: Understanding Racy Git Handling in JGit: persisting smudged entries
Index(es):
- Date
- Thread

Breadcrumbs