Coda File System

Re: Theory of writeback

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Fri, 23 Mar 2007 14:16:45 -0400

On Fri, Mar 23, 2007 at 01:22:41PM -0400, M. Satyanarayanan wrote:
> Venus could compute SHA-1 of file on fetch. Re-compute on close()
> and skip store if unchanged. Gives OS-independent solution, and
> avoids dependence on clocks. Does require faith in SHA-1.
> If Jan lets me, I might even implement it :-)
> 
> But seriously, is computing SHA-1 on fetch and each close() too much
> overhead on typical machines of people on codalist?

Always having a valid SHA-1 for any file that is not currently open for
writing is useful for several other reasons.

If we discard the data but keep the status around, we may find an
identical copy through lookaside. Also, if we want to send the file back
to the server, the server may find that someone else already reintegrated
an identical copy, so we don't actually have to ship any data.

Finally, the SHA-1 can be used during recovery to ensure that the
container files did not pick up any corruption. I've had fsck fix up the
metadata and pass the filesystem as correct when in reality we had lost
the contents of the container files; they just had random zero blocks.
Such a check is pretty expensive, but it could be run lazily in the
background while the client is already servicing kernel upcalls.
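
A lazy check along these lines could look roughly like the Python sketch
below. The names (verify_containers, the `expected` mapping of path to
recorded digest) are invented for this example and do not correspond to
anything in the Coda sources. Using a generator means the caller can
verify one container file at a time between other work.

```python
import hashlib

def sha1_of_file(path, chunk_size=64 * 1024):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_containers(expected):
    """Lazily yield (path, ok) for each container file.

    `expected` maps a container file path to the SHA-1 recorded while the
    file was last known to be good (an assumption of this sketch).
    """
    for path, want in expected.items():
        try:
            ok = sha1_of_file(path) == want
        except OSError:
            # A missing or unreadable container counts as corrupt.
            ok = False
        yield path, ok
```

A caller would pull one result at a time, for example with
next(checker), whenever it has nothing more urgent to do, and refetch
any file reported as not ok.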

But there are still cases where we need to write back even an unmodified
file. During the open we remove any pending stores for that file from the
CML (store optimization), and if that happened we have to write back the
file even if the current open for write didn't change anything, because
the data from the cancelled store never reached the server.
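
Put together, the close-time decision might be sketched as follows in
Python. This is only an illustration under assumed names: CacheEntry,
sha1_at_open, cancelled_pending_store and needs_store are hypothetical
and are not fields or functions from Venus.

```python
import hashlib
from dataclasses import dataclass

@dataclass
class CacheEntry:
    # SHA-1 of the contents when the file was fetched or last closed.
    sha1_at_open: str
    # True if opening for write removed a pending store from the CML.
    cancelled_pending_store: bool = False

def needs_store(entry, sha1_at_close):
    """Decide whether close() has to log a store for this file."""
    if entry.cancelled_pending_store:
        # The earlier store was dropped, so the server has never seen
        # these contents, even if this open changed nothing.
        return True
    # Otherwise an unchanged checksum means the store can be skipped.
    return sha1_at_close != entry.sha1_at_open
```

The checksum comparison only helps in the second branch; once a pending
store has been cancelled, the hash cannot tell us what the server has.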

Jan
Received on 2007-03-23 14:18:37