Re: [jgit-dev] DirCacheEntry.mightBeRacilyClean correct?

"Lay, Stefan" <stefan.lay@xxxxxxx> wrote:
> In the method mightBeRacilyClean of DirCacheEntry there is the
> following:
>
> if (smudge_s < mtime)
>        return true;
> 
> Does this mean that an entry might be racily clean if its modification timestamp
> is after the timestamp of the index file, not regarding the time distance?
> 
> This would lead to the result that an entry is possibly racily clean
> even if the time distance is much more than one second.

No.  You are reading this wrong.

Here mtime is the cached modification time in the index.  It is
the time we observed from the file's lastModified() method the
last time we wrote the file out for the user during a checkout or
reset operation.


When you edit a file with your editor, and we later refresh the
index, *if the content differs*, we don't update the index timestamp
to the new lastModified() of the file.  We leave it alone.

The only way that smudge_s < mtime can be true is if we are seeing
some funny clock skew due to an NFS environment, or if there
are different cores reporting slightly different times, or if we
have had to smudge this record before.  (The way we smudge is by
setting the time far into the future, causing the record to report
mightBeRacilyClean again.)

The more likely case is for smudge_s == mtime, with both smudge_ns
and mtime_ns being 0.  This means that the index was written out
during the same second as the file itself, and the filesystem
probably has a time granularity of 1 second.

We obviously wrote the file first, as we always write files and then
the index.  But we can't be sure that the file wasn't modified after
the index was written, because the cached file timestamp is >= the
index timestamp.  In those cases we smudge the record to force a
refresh to occur later; that later refresh will check the file
content.
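
To make the comparison above concrete, here is a rough sketch.  It is
not the real DirCacheEntry code: the class name, the four-argument
signature, and the direct comparison of the nanosecond parts are all
assumptions made for illustration.

```java
// Illustrative sketch only; not the actual DirCacheEntry implementation.
final class RacyCheckSketch {
    /**
     * @param smudgeS  index file modification time, seconds
     * @param smudgeNs index file modification time, sub-second part
     * @param mtimeS   entry's cached modification time, seconds
     * @param mtimeNs  entry's cached modification time, sub-second part
     * @return true if the entry cannot be trusted as clean from its
     *         timestamp alone
     */
    static boolean mightBeRacilyClean(int smudgeS, int smudgeNs,
            int mtimeS, int mtimeNs) {
        // Cached file time is later than the index time: clock skew,
        // or a record that was smudged earlier.
        if (smudgeS < mtimeS)
            return true;
        // Same second: the file may have changed after the index was
        // written, unless the sub-second part says the index is newer.
        if (smudgeS == mtimeS)
            return smudgeNs <= mtimeNs;
        // The file was cached in an earlier second than the index.
        return false;
    }
}
```

With a 1 second filesystem granularity both sub-second parts are 0,
so any entry cached in the same second as the index reports true.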

Above I said we don't change the lastModified time in the index
during a refresh if the content differs.  But if the content is
identical, we do update the timestamp in the index.  That is what
fixes the smudged records.
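
The refresh rule can be sketched like this.  The Entry type and the
refresh() helper are hypothetical, and comparing raw bytes stands in
for whatever content check the real refresh performs.

```java
import java.util.Arrays;

// Illustrative sketch only; Entry and refresh() are hypothetical names.
final class RefreshSketch {
    static final class Entry {
        final byte[] cachedContent; // stands in for the content recorded in the index
        long cachedMtime;           // timestamp cached in the index, in seconds

        Entry(byte[] cachedContent, long cachedMtime) {
            this.cachedContent = cachedContent;
            this.cachedMtime = cachedMtime;
        }
    }

    /** Returns true if the entry is known to be clean after the refresh. */
    static boolean refresh(Entry e, byte[] fileContent, long fileMtime) {
        if (Arrays.equals(e.cachedContent, fileContent)) {
            // Identical content: adopt the file's real timestamp.  This
            // is also what replaces a far-future (smudged) timestamp.
            e.cachedMtime = fileMtime;
            return true;
        }
        // Content differs: leave the cached timestamp alone.
        return false;
    }
}
```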


Don't feel bad if you aren't getting this right away.  This was
a pretty difficult issue in C Git that took Linus and Junio some
time to work out.  I thought I understood what they were doing,
until I coded this stuff in DirCache/DirCacheEntry... and only then
did I fully understand it.  :-)

FWIW, we don't do the same thing C Git does when we smudge a record.
When writing an index file to disk, C Git actually refreshes the
index records for entries that were previously racily clean... by
checking their contents on disk.  We can't do that here because
DirCacheEntry has no access to the local working tree.  So we defer
the refresh, but still mark the entry invalid by forcing its
mtime to be far into the future.
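
A minimal sketch of that deferred approach at index write time
follows.  The names are hypothetical, only whole seconds are compared,
and FAR_FUTURE is an assumed marker value; the actual value used for
smudging is not stated here.

```java
import java.util.List;

// Illustrative sketch only; not the actual DirCache write path.
final class SmudgeSketch {
    // Assumed "far into the future" marker, chosen for illustration.
    static final int FAR_FUTURE = Integer.MAX_VALUE;

    static final class Entry {
        final String path;
        int mtime; // cached modification time, seconds

        Entry(String path, int mtime) {
            this.path = path;
            this.mtime = mtime;
        }
    }

    /**
     * Smudges every entry whose cached mtime is at or after the time
     * the index is being written.  No file content is read here; the
     * content check is deferred to a later refresh.
     *
     * @return the number of entries smudged
     */
    static int smudgeRacilyClean(List<Entry> entries, int indexWriteTime) {
        int smudged = 0;
        for (Entry e : entries) {
            if (indexWriteTime <= e.mtime) {
                e.mtime = FAR_FUTURE;
                smudged++;
            }
        }
        return smudged;
    }
}
```

An entry that was already smudged still satisfies the condition on
the next write, so it keeps reporting as racily clean until a refresh
fixes its timestamp.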

Unfortunately this has a 2038 bug: the current smudge trick won't
work past 2038.  But neither will the current 'DIRC' file format,
because its mtime field is a 32-bit integer.  :-\
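
For the arithmetic behind the 2038 limit, here is a small sketch.  It
assumes the field is read as a signed 32-bit count of seconds since
the Unix epoch; the class and method names are made up.

```java
import java.time.Instant;

// Illustrative sketch only; assumes a signed 32-bit seconds field.
final class Y2038Sketch {
    /** Last instant a signed 32-bit seconds-since-epoch field can hold. */
    static Instant lastRepresentable() {
        return Instant.ofEpochSecond(Integer.MAX_VALUE);
    }

    /** True if adding one more second wraps the value to a negative number. */
    static boolean wrapsAfterMax() {
        int next = (int) (Integer.MAX_VALUE + 1L);
        return next < 0;
    }

    public static void main(String[] args) {
        System.out.println(lastRepresentable()); // 2038-01-19T03:14:07Z
        System.out.println(wrapsAfterMax());     // true
    }
}
```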

-- 
Shawn.

