Coda File System

Re: large servers: please help

From: Peter J. Braam
Date: Wed, 20 Jan 1999 11:10:38 -0500 (EST)
On Wed, 20 Jan 1999, Robert Watson wrote:

> On Wed, 20 Jan 1999, Laszlo Vecsey wrote:
> > Isn't part of the problem the client, which afaik is not supposed to have a
> > large cache, say greater than 300 MB or so?
> My understanding was that part of the problem lies in the scalability of
> RVM as a transaction system; because it mmaps its log into address space,
> you run into address space limitations.  Similarly, as you increase load,
> a costly transaction system can be a problem.  However, as this is not my
> area of expertise, I'll leave answering that question to others. :)  I'm
> more interested in the following question.

It's a combination of things.  Startup time is one, the amount of RVM that
one can reasonably memory map another.  I think that 500,000 files per
server isn't so bad for now.

> > For example if you were to 'head' or 'tail' a large file, the entire file
> > needs to be transferred into the local cache first, and then finally the
> > data is made available to the running program. Makes mmap completely
> > useless too -- NFS deals with these situations, right? How far into Coda's
> > future is this ability? I think this is what's holding it back.
> AFS deals with this by 'chunking' -- that is, it demand-loads portions of
> files into the cache as they are needed; I believe it also uses an
> aggressive read-ahead policy.  The net result is more efficient use of the
> cache for partial file reads or writes, especially for mammoth files.

I just sent a message about this.
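
To make the chunking idea quoted above concrete, here is a minimal sketch of
a demand-loading chunk cache with read-ahead. It is not AFS or Coda code.
The names ChunkCache, make_fetcher, CHUNK_SIZE and the 64 KB chunk size are
all assumptions made up for this illustration.

```python
CHUNK_SIZE = 64 * 1024  # hypothetical chunk size, chosen only for the example


def make_fetcher(blob):
    """Return a function that serves fixed-size chunks of an in-memory 'file'."""
    def fetch_chunk(index):
        return blob[index * CHUNK_SIZE:(index + 1) * CHUNK_SIZE]
    return fetch_chunk


class ChunkCache:
    """Toy client cache: fetches only the chunks a read touches."""

    def __init__(self, fetch_chunk, read_ahead=1):
        self.fetch_chunk = fetch_chunk  # stands in for an RPC to the server
        self.read_ahead = read_ahead    # extra chunks to prefetch after a read
        self.chunks = {}                # chunk index -> bytes
        self.fetches = 0                # number of server round trips

    def _load(self, index):
        if index not in self.chunks:
            self.chunks[index] = self.fetch_chunk(index)
            self.fetches += 1
        return self.chunks[index]

    def read(self, offset, length):
        if length <= 0:
            return b""
        first = offset // CHUNK_SIZE
        last = (offset + length - 1) // CHUNK_SIZE
        # Demand-load exactly the chunks covering the requested range.
        data = b"".join(self._load(i) for i in range(first, last + 1))
        # Read-ahead: prefetch the next chunk(s) on the guess that the
        # reader will continue sequentially.
        for i in range(last + 1, last + 1 + self.read_ahead):
            self._load(i)
        start = offset - first * CHUNK_SIZE
        return data[start:start + length]
```

With this model, a 'head' of a large file costs one chunk plus the
read-ahead rather than a transfer of the whole file.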

> However, that raises consistency issues: currently the resolution of
> conflicts between file versions is that of entire file system objects
> (files or directories).  Dealing with fine-grained inconsistency severely
> complicates the repair process, I would guess; it is not even clear if the
> client would have access to the whole file version it is attempting to
> integrate.  For disconnected operation anyway, it seems like transferring
> the whole file is more useful, as the chances are high that if you access
> a bit of the file, you will access all of it (loading it into emacs,
> writing it out, etc).

Whoops, this is a good point.  However, the conflict resolution mechanisms
themselves would use the chunk fetching code, so it need not really be a

> > I was really hoping to have home directories mounted over coda, with inbox
> > being stored right in the accounts, (and also large procmail filtered
> > mailing-list archived mail folders) but that won't be feasible until at
> > least write-back caching is available in a connected state.
> > 
> > I just got coda running recently, but the initial excitement has faded
> > somewhat after discovering the above.. :(
> My suspicion is that the arrangement you describe will suffer from Coda's
> weak consistency model: if multiple clients are using write-back caching,
> then conflicts can occur.  

Write back caching will have the same semantics as connected Coda.
If another client comes along, then the one holding the write back token
will have to reintegrate first.
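
A toy model of that rule, as a sketch only: write back caching did not exist
in Coda at the time of this message, so the Server and Client classes and
their methods below are hypothetical and do not mirror any real Coda
interface. The model also assumes, for simplicity, that a read takes the
token just as a write does.

```python
class Server:
    """Holds the authoritative copy and tracks who owns each write back token."""

    def __init__(self):
        self.data = {}          # path -> contents as stored on the server
        self.token_holder = {}  # path -> client currently holding the token

    def acquire_token(self, path, client):
        holder = self.token_holder.get(path)
        if holder is not None and holder is not client:
            # Another client wants the object: the current holder must
            # reintegrate its pending writes before the token moves.
            holder.reintegrate(path)
        self.token_holder[path] = client

    def store(self, path, contents):
        self.data[path] = contents


class Client:
    def __init__(self, name, server):
        self.name = name
        self.server = server
        self.dirty = {}  # path -> contents written locally, not yet on server

    def write(self, path, contents):
        self.server.acquire_token(path, self)
        self.dirty[path] = contents  # write back: the server is not updated yet

    def read(self, path):
        self.server.acquire_token(path, self)
        return self.dirty.get(path, self.server.data.get(path))

    def reintegrate(self, path):
        if path in self.dirty:
            self.server.store(path, self.dirty.pop(path))
```

In this model the second client always sees the first client's data,
because the token cannot change hands until the pending write has reached
the server.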

Conflicts in Coda arise in connected mode just as easily as you would
overwrite data in AFS (last close wins in AFS).  The problem with receiving email
in Coda is locking to avoid conflicts.  I don't know how AFS does this,
but with NFS it is certainly possible to ruin your mailbox easily.

> Especially on objects accessed as frequently as
> inbox directories.  My greatest concern about Coda is the exposure of the
> consistency mechanism to a) users and b) unattended machines.  In the
> users' cases, it requires a fair amount of education, even on the concept
> of 'file consistency' and 'replication'.  For the machines' cases, it is
> not clear what the correct approach is--the Coda decision to allow the
> user to intervene, coupled with a lack of ability for the average UNIX
> machine to determine the semantic content of a MS Word document and
> integrate conflicting versions poses a challenge. :)  For multiple
> large-scale multi-user machines and unattended mail servers, disabling the
> replication mechanism and improving the consistency model are almost
> required; in other words, making it essentially behave like AFS.

The conflicts have more to do with simultaneous access than with
replication, I think.  Of course, through a partition of the Coda servers
you have even more opportunity to get conflicts, but for mail, two clients,
one on which mail is delivered and another on which it is read and
modified, both using a single server, is an excellent ground to get

It's a good puzzle to see if Coda's connected semantics allow for the
atomic creation of a lock file. Perhaps that is just possible.  On the
other hand, I don't really have much more faith in AFS or NFS without lock
daemons when it comes to my mail.
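
For reference, the usual local idiom for atomic lock file creation is an
exclusive create, sketched below. This shows only the POSIX behaviour on a
local file system. Whether Coda's connected semantics preserve that
atomicity across clients is exactly the open puzzle, and the sketch does not
answer it. The function names and the "inbox.lock" file name are
hypothetical.

```python
import os
import tempfile


def try_lock(path):
    """Try to create the lock file; return True only if we created it."""
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so on a local
        # POSIX file system at most one caller can succeed.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))  # record the owner for debugging
    return True


def unlock(path):
    os.unlink(path)


def demo():
    """Take the lock, fail to take it again, release it, take it once more."""
    with tempfile.TemporaryDirectory() as d:
        lock = os.path.join(d, "inbox.lock")
        first = try_lock(lock)
        second = try_lock(lock)
        unlock(lock)
        third = try_lock(lock)
        unlock(lock)
        return first, second, third
```

A mail delivery agent and a mail reader would both call try_lock before
touching the mailbox and back off when it returns False.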

> This is not to suggest that Coda is not useful in such an environment; 
> its real benefits come in the case of mobile computing.  It might be
> interesting to introduce the concept of different 'classes' of client: 
> that is, the semantics and consistency enforced for a particular client
> might depend on the role it was expected to play.  

Yup, unfortunately, that's a rather major project probably.

- Peter -

> For example, unattended
> mail servers require a high degree of consistency, at least in as much as
> they should never be required to resolve a conflict, and that ideally they
> should always be able to write to the file system if space remains.  On
> the other hand, mobile computing devices (like the notebook) would require
> far weaker consistency; in fact, they would encourage it.  The ability to
> go mobile and reintegrate changes later is clearly an advantage, and as a
> single-user attended machine, probably with a spiffy gui, this is
> acceptable (and better yet, encouraged under Coda :).
>   Robert N Watson 
> PGP key fingerprint: 03 01 DD 8E 15 67 48 73  25 6D 10 FC EC 68 C1 1C
> Carnegie Mellon University  
> TIS Labs at Network Associates, Inc.
> SafePort Network Services   
Received on 1999-01-20 11:11:42