Coda File System

Re: Theory of writeback

From: Jan Harkes <>
Date: Fri, 23 Mar 2007 15:18:56 -0400
On Fri, Mar 23, 2007 at 07:47:53PM +0100, wrote:
> --- a different, related, matter:
> the checksums should include a per file random IV so that given a contents
> one could not predict the corresponding Coda hash.
> Otherwise the metainformation indirectly reveals the file contents.

No, never. What the server should do is not provide the SHA1 at all as
part of the metadata if the client is not allowed to read the file
contents in the first place.
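
A minimal sketch of that idea in Python, not the actual Coda server code.
The names Metadata and get_attr and the can_read flag are made up for
illustration; a real server would derive the access decision from the
ACL on the parent directory.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Metadata:
    """Hypothetical attribute record returned to a client."""
    name: str
    length: int
    sha1: Optional[str]  # None when the checksum is withheld


def get_attr(name: str, contents: bytes, can_read: bool) -> Metadata:
    """Return file attributes, including the SHA1 only for readers.

    can_read stands in for whatever access check the server performs.
    """
    digest = hashlib.sha1(contents).hexdigest() if can_read else None
    return Metadata(name=name, length=len(contents), sha1=digest)


# A reader gets the checksum, a non-reader gets no hash to compare against.
reader_view = get_attr("secret.txt", b"abc", can_read=True)
stranger_view = get_attr("secret.txt", b"abc", can_read=False)
```

With this shape no per-file IV is needed: a client that cannot read the
contents never sees a value it could match against a guessed plaintext.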

> The IV should be made available to the client along with the file
> contents, but not otherwise. (It would also be necessary for creation of
> usable lookaside data sets.)

This would make it too expensive to build or use a lookaside database.
Say I have a local copy of an FC5 install and I want to use that for
lookaside. Do I now have to send the contents of all these files back to
the server, so that the server can figure out which local files happen
to match and send back the modified checksums?

Or should the client send all the checksums of local data and have the
server return the modified ones? Still pretty expensive if I use a large
lookaside cache. E.g. I have ~10000 emails in /coda; do I really want to
send 200KB worth of SHA1s to get 200KB worth of lookaside keys back?
Also, the server would then need an extra lookup table to map from
unmodified SHA1s to its internal keys + IVs.
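
To make the cost concrete, here is a rough Python sketch. It assumes the
proposed scheme would compute SHA1(IV + contents) with a 16-byte random
IV; neither the construction nor the IV size is specified anywhere in
this thread, and the path and helper names are hypothetical.

```python
import hashlib
import os

SHA1_LEN = hashlib.sha1().digest_size  # 20 bytes per digest


def plain_key(contents: bytes) -> bytes:
    """Lookaside key as a plain SHA1 of the contents."""
    return hashlib.sha1(contents).digest()


def salted_key(iv: bytes, contents: bytes) -> bytes:
    """Assumed form of the proposed per-file keyed checksum."""
    return hashlib.sha1(iv + contents).digest()


def build_lookaside_index(local_files: dict) -> dict:
    """Index local files by plain SHA1; needs no contact with the server."""
    return {plain_key(data): path for path, data in local_files.items()}


# Hypothetical local copy of a file that also exists on the server.
local_files = {"/mnt/fc5/etc/motd": b"hello\n"}
index = build_lookaside_index(local_files)
server_contents = b"hello\n"

# Plain SHA1: the key from the server matches the offline-built index.
found_plain = plain_key(server_contents) in index

# Per-file IV: the server's key no longer matches anything built locally,
# so the client would need the IV (or a translated key) for every file.
iv = os.urandom(16)
found_salted = salted_key(iv, server_contents) in index

# Size of the checksum upload for ~10000 cached files.
upload_bytes = 10000 * SHA1_LEN  # 200000 bytes, i.e. about 200KB
```

The point of the sketch is only that an index built from local data stops
being usable once the key depends on a secret the client does not have yet.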

And what are we protecting against? Preventing someone who doesn't have
read access to a file from intercepting the SHA1? If that is possible we
have some bigger problems, and in that case how does this change make it
any more difficult for an attacker? If he can get at the lookaside key,
he can just as easily get the IV.

> Of course, there is another question whether the server has some place
> to store that extra information and if the protocol, current or a compatible
> one, can handle it?

The current server already has no room for the lookaside SHA1 hashes.
I'm still weighing different approaches to storing that additional
information persistently, to avoid having to recalculate it every time a
vnode is copied from RVM. The server currently uses an in-memory vnode
cache, which avoids some of the pain of recalculating the SHA1.
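
As an illustration of why a cache helps, here is a small Python sketch
of memoizing the checksum per cached object. This is not how the server's
vnode cache is implemented; the Sha1Cache class, the (fid, version) key
and the read_contents callback are all assumptions made for the example.

```python
import hashlib
from typing import Callable, Dict, Tuple


class Sha1Cache:
    """Hypothetical in-memory cache of SHA1 digests keyed by (fid, version)."""

    def __init__(self) -> None:
        self._digests: Dict[Tuple[str, int], str] = {}
        self.computations = 0  # how often the file actually had to be hashed

    def get(self, fid: str, version: int,
            read_contents: Callable[[], bytes]) -> str:
        """Return the digest, hashing the contents only on a cache miss."""
        key = (fid, version)
        if key not in self._digests:
            self._digests[key] = hashlib.sha1(read_contents()).hexdigest()
            self.computations += 1
        return self._digests[key]


cache = Sha1Cache()
first = cache.get("7f000001.1.1", 1, lambda: b"mail body")
second = cache.get("7f000001.1.1", 1, lambda: b"mail body")  # served from cache
third = cache.get("7f000001.1.1", 2, lambda: b"new body")    # new version, rehash
```

Anything kept only in memory like this is lost on restart, which is why
persistent storage for the hashes is still the open question.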

Received on 2007-03-23 15:21:10