Re: [stellation-res] Recognizing changed files

On Thursday 22 August 2002 11:26 am, Jonathan Gossage wrote:
> I think I may have found another Windows-related problem. Currently, as I
> understand it, you detect that a file has changed outside of Stellation by
> checking the timestamp of the file against the timestamp recorded for the
> file when it was last checked out, and assuming it has been modified if the
> timestamp of the file is more recent than the base. Unfortunately there are
> a number of cases in Windows that break that assumption.
>
> In particular, whenever you use any of the Windows command line file copy
> commands or if you use Windows Explorer to copy a file, the copy operation
> preserves both the attributes and the timestamp of the original file. Thus
> if you replace a file in a repository with another file with an earlier
> timestamp you will not recognize that the file has changed when a checkin
> command is issued. What's more, in this scenario you delete the apparently
> older file from the workspace.
>
> This problem is at the root of all the queer behaviour I am currently
> observing while trying to test MySQL. A clean solution will take a bit of
> thought. One possible solution which would cover all scenarios on both
> Windows and UNIX would be to compute a hash for each file when it is
> checked out and store this hash in the project data. Then if the timestamp
> has changed in any way, compute the hash again on the same file and see if
> they are the same. If they are, the file can be treated as unchanged and
> the timestamp adjusted appropriately. The major problem with this approach
> is that it is somewhat resource intensive.

In fact, that's *roughly* what we do. 

For a while, we were using hashcodes (we called them signatures) exclusively 
for detecting what changed. The problem was, this gets *incredibly* slow when 
detecting what changed in a large workspace. (If I recall correctly, in one 
test, it could take over an hour to scan over the Linux kernel sources 
looking for changes.) So, as an optimization, we started using timestamps as 
a way to decide whether or not we needed to recompute the signature.

Here's my proposal for a fix: instead of checking if the timestamp is *newer* 
than what's recorded in the project, just check if it's *different*. If it's 
different, then do the signature comparison. That should cover this problem, 
right?
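
To make the proposed check concrete, here is a minimal sketch in Python. It is 
not Stellation's actual code: `Baseline`, `compute_signature`, 
`record_baseline`, and `has_changed` are hypothetical names, and SHA-256 is 
only a stand-in for whatever the real signature computation is.

```python
import hashlib
import os
from dataclasses import dataclass


@dataclass
class Baseline:
    """Hypothetical record of what was stored for a file at checkout."""
    mtime_ns: int
    signature: str


def compute_signature(path: str) -> str:
    """Hash the file contents in chunks (SHA-256 is an assumed stand-in)."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_baseline(path: str) -> Baseline:
    """Capture the timestamp and signature, as would happen at checkout."""
    return Baseline(os.stat(path).st_mtime_ns, compute_signature(path))


def has_changed(path: str, base: Baseline) -> bool:
    """Report whether the file content differs from the recorded baseline."""
    current_mtime = os.stat(path).st_mtime_ns
    # Cheap path: an identical timestamp means we skip the expensive hash.
    if current_mtime == base.mtime_ns:
        return False
    # Timestamp is *different* (newer or older), so compare signatures.
    if compute_signature(path) == base.signature:
        # Same content: refresh the recorded timestamp so the next scan
        # takes the cheap path again.
        base.mtime_ns = current_mtime
        return False
    return True
```

With `!=` in place of "newer than", a file copied in with an *earlier* 
timestamp still gets its signature recomputed, which is the Windows copy case 
described above. A file whose content changes while its timestamp stays 
exactly the same would still be missed.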

	-Mark


-- 
Mark Craig Chu-Carroll,  IBM T.J. Watson Research Center  
*** The Stellation project: Advanced SCM for Collaboration
***		http://www.eclipse.org/stellation
*** Work Email: mcc@xxxxxxxxxxxxxx  ------- Personal Email: markcc@xxxxxxxxxxx



