Coda File System

Re: performance

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Mon, 22 May 2000 16:16:55 -0400
On Mon, May 22, 2000 at 01:19:09PM +0200, tholli wrote:
> Hello
> 
> I have been doing some performance checks on Coda comparing it with NFS
> and Samba. My results are good; in particular, I was impressed by how
> fast Coda could access the cache on the local disk.
> 
> There is, however, one thing bothering me.
> When I open a file for reading and writing (r+) and read the whole file
> without making any changes, Coda is much less efficient than the other
> file systems.

Not by much if you switch to write-disconnected operation, in which we
trickle the modifications back to the servers in the background.
 (cfs wd /path/to/volume)

Of course you then get the less desirable behaviour that you update a file
and tell your colleague where to find it, and he doesn't see it until 5
minutes later, which is why there is also a `cfs forcereintegrate/fr'
call to push all pending modifications back to the servers.

> I think the reason is that Coda contacts the server after the
> reads and stores the whole file, assuming that it was modified (this can
> be seen with codacon). I am not quite sure, but I think the other file
> systems detect that no changes have been made to the file, limiting the
> server-client communication.

NFS/Samba have a completely different model: they are block based. If
you read a very large uncached file in Coda, you have to wait until it
has been fetched completely. Once it is in the cache, read/write access
runs at pretty much local disk speed. So if you are doing several
sweeps through a 500MB file, the others will be far slower.

On the other hand, if you read that file just once and never ever look
at it again (e.g. unpacking a tarball), filesystems like NFS are more
likely to be useful.
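
To make the trade-off concrete, here is a rough back-of-the-envelope
model in Python. It is only a sketch: the function names are made up for
illustration, and it assumes the worst case for the block-based client,
namely that no block survives in its cache between sweeps (real NFS and
Samba clients do cache some blocks). The 4 KB block size is likewise an
arbitrary assumption.

```python
MB = 1024 * 1024


def bytes_transferred(file_size, sweeps, whole_file_cache):
    """Rough network bytes needed to read a file `sweeps` times."""
    if whole_file_cache:
        # Whole-file caching: fetched in full once, later sweeps are local.
        return file_size
    # Assumed worst case for a block-based client: every sweep refetches
    # every block because nothing stays cached in between.
    return file_size * sweeps


def bytes_before_first_read(file_size, whole_file_cache, block_size=4096):
    """Bytes that must arrive before the first read() can return."""
    if whole_file_cache:
        # The open blocks until the complete file is in the local cache.
        return file_size
    # A block-based client only needs the first block (assumed size).
    return min(block_size, file_size)


whole_file = bytes_transferred(500 * MB, sweeps=3, whole_file_cache=True)
block_based = bytes_transferred(500 * MB, sweeps=3, whole_file_cache=False)
```

Under these assumptions three sweeps over the 500MB file cost one fetch
with whole-file caching and three with the block-based client, while a
single pass costs the same on both and the block-based client can start
reading almost immediately.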

Also because of the session semantics, modifications are only propagated
back to the servers when the file is closed. This improves consistency
of the files, and simplifies both disconnected operation and conflict
resolution.
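
As an illustration of the session semantics, here is a toy Python
model. ToyServer and ToySession are hypothetical names invented for
this sketch and have nothing to do with the actual Coda client code. It
only shows the visible behaviour: changes stay local until close, and a
file opened for writing is stored on close whether or not it changed.

```python
class ToyServer:
    """Stand-in for a file server; counts how many stores it receives."""

    def __init__(self):
        self.files = {}
        self.stores = 0

    def store(self, name, data):
        self.files[name] = data
        self.stores += 1


class ToySession:
    """One open()..close() session working on a locally cached copy."""

    def __init__(self, server, name, mode):
        self.server = server
        self.name = name
        # Any mode that permits writing counts, including "r+".
        self.writable = any(c in mode for c in "wa+")
        # Whole-file fetch into the local cache at open time.
        self.data = bytearray(server.files.get(name, b""))

    def read(self):
        return bytes(self.data)

    def write(self, data):
        if not self.writable:
            raise IOError("file not opened for writing")
        self.data += data  # modifies the cached copy only

    def close(self):
        # The client does not compare contents: opened for writing
        # means the whole file goes back to the server at close.
        if self.writable:
            self.server.store(self.name, bytes(self.data))
```

Opening with "r+", reading, and closing produces one store in this
model, while opening with "r" produces none, which matches the
behaviour described in the question.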

> If my idea is right, is there any good explanation of why it is so.

We always send back a file that has been opened for writing. Any sane
application (except for Win95 ones) opens a file read-only if it
isn't going to update the file contents. We actually use this fact to
optimize away pending store operations in the CML when a file is opened
for writing: we know that in that case, when the file is closed, a new
store record will be generated.

These optimizations can really save a lot of network traffic. For instance,
intermediate files created during a compilation are likely to be
removed within 5 minutes and won't be sent to the server at all.
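
The two log optimizations mentioned above can be sketched with a toy
Python model. ToyCML and its method names are invented for this example;
the real CML is more involved, and the sketch does not model the delay
before records are sent to the server.

```python
class ToyCML:
    """Toy log of pending operations, oldest first, as (op, name) pairs."""

    def __init__(self):
        self.records = []

    def open_for_write(self, name):
        # A new store record is certain to follow at close, so a store
        # still pending for the same file is redundant and is dropped.
        self.records = [r for r in self.records if r != ("store", name)]

    def close_after_write(self, name):
        self.records.append(("store", name))

    def create(self, name):
        self.records.append(("create", name))

    def remove(self, name):
        if ("create", name) in self.records:
            # The file never reached the server: cancel every pending
            # record for it and log nothing at all.
            self.records = [r for r in self.records if r[1] != name]
        else:
            # The server knows the file: a pending store is pointless,
            # but the remove itself still has to be sent.
            self.records = [r for r in self.records if r != ("store", name)]
            self.records.append(("remove", name))


cml = ToyCML()
cml.create("foo.o")             # compiler creates an intermediate file
cml.open_for_write("foo.o")
cml.close_after_write("foo.o")
cml.remove("foo.o")             # removed again before it was ever sent
pending_after_build = list(cml.records)
```

In this model the create, store and remove for foo.o cancel each other
out, so pending_after_build ends up empty and nothing about that file
would go over the network.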

Jan

* CML is a log of pending operations kept during disconnected and
  write-disconnected operation.
Received on 2000-05-22 16:18:56