Coda File System

Re: using coda for large servers

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Tue, 19 Aug 2003 11:30:54 -0400
On Sun, Aug 17, 2003 at 04:34:20PM -0700, Ronald Hilterfuge wrote:
> I am investigating whether or not coda is suitable for my application.
> I have two questions about the workings of coda. I would like to
> read-write mirror a rather large filesystem (about 800 gb) over a WAN
> to 4 distinct nodes.  I have sufficient RAM to be able to create a RVM

Ok, in a way this really is a FAQ, but I guess no one has ever tried to
really write it down as one.

There are clearly different types of replication. One is the type you
describe, where multiple widely distributed sites have a need to have
identical replicas of a common dataset. I don't really know if there is
a good term for this, but site-mirroring comes close.

The other type is more a failover type of replication, where data is
replicated across multiple machines at a single site. This is similar
to how a RAID5 array works: the load of read accesses is shared across
all replicas, and when one replica becomes unavailable the others can
transparently take over. Call it High Availability replication.

Coda's server-server replication is of the second type. It really
doesn't perform well when the various replicas are spread across a wide
area. The main reason for this is that most of the replication smarts
are not in the server, but on the client. It would almost be possible to
give a Coda client an HTTP backend and use a normal web-server as
fileserver. The only modification that would be necessary is a way to
securely synchronize the webservers when a client detects a difference
between servers.
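
To make the "smarts on the client" point a bit more concrete, here is a
minimal sketch of a client that reads from every replica, picks the
newest copy, and reports which replicas are out of date. This is not
Coda's actual code. The function name, the dict-based "servers" and the
single integer version per file are all assumptions for illustration; a
plain integer cannot tell concurrent updates apart the way a real
version scheme has to.

```python
def read_from_replicas(replicas, path):
    """Read `path` from all replicas and detect stale ones on the client.

    `replicas` maps a (hypothetical) server name to a dict of
    path -> (version, data). Returns (data, stale_server_names).
    """
    # Collect the answers from every replica that has the file at all.
    answers = {name: store[path] for name, store in replicas.items()
               if path in store}
    if not answers:
        raise FileNotFoundError(path)

    # The client, not the servers, decides which copy is the newest.
    newest_version, data = max(answers.values(), key=lambda vd: vd[0])

    # Any replica that is missing the file or holds an older version is
    # stale; this is the point where the servers would be told to sync.
    stale = sorted(
        name for name, store in replicas.items()
        if store.get(path, (-1, None))[0] < newest_version
    )
    return data, stale


replicas = {
    "srv-a": {"/doc.txt": (2, "new contents")},
    "srv-b": {"/doc.txt": (1, "old contents")},
    "srv-c": {"/doc.txt": (2, "new contents")},
}
data, stale = read_from_replicas(replicas, "/doc.txt")
print(data, stale)
```

Every read has to hear from all replicas before the client can compare
them, which is why this model suffers when the replicas sit at the far
end of a WAN link.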

Now it is possible to approximate the site-mirror type of replication if
you have all the servers in one location, and only use Coda clients from
the other locations. This is a bit inefficient, because the clients do
not share state, so each one independently has to fetch identical files.

Some interesting research has been done on the use of staging servers
and client cache sharing, where a recent copy of (most of) the data is
placed on a local machine, or clients are allowed to borrow data from
each other's caches. Of course trust and security come into play here.

    http://portal.acm.org/citation.cfm?id=566775&coll=portal&dl=ACM&CFID=11728998&CFTOKEN=73925472

    http://www.eecs.umich.edu/~jflinn/papers/sigops02.ps

> metadata partition which is still 4% of the total filesystem size.
>  
> 1) How would coda perform over such a situation? Can coda handle large
> filesystems?

Not very well over a WAN; you might have to look at things like rsync,
unison or omirr. Coda can handle extremely large filesystems as long as
your files are relatively large :) Our real limitation is the number of
objects whose metadata has to be stored in recoverable VM (basically
RAM+SWAP).

Each object takes a few hundred bytes, but it quickly adds up.
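
As a rough back-of-the-envelope sketch of how that adds up for 800 GB
of data: the 300 bytes per object below is only an assumed stand-in for
"a few hundred bytes", not a measured Coda number, and the estimate
counts files only and ignores directories.

```python
# Rough estimate of recoverable VM (RVM) needed for per-object metadata.
BYTES_PER_OBJECT = 300  # assumption standing in for "a few hundred bytes"
GB = 10**9


def rvm_bytes(total_data_bytes, avg_file_bytes,
              bytes_per_object=BYTES_PER_OBJECT):
    """Estimate metadata size from total data size and average file size."""
    n_objects = total_data_bytes // avg_file_bytes
    return n_objects * bytes_per_object


# Same 800 GB of data, three different average file sizes.
for avg in (10_000, 100_000, 1_000_000):
    est = rvm_bytes(800 * GB, avg)
    print(f"avg file {avg:>9} bytes -> ~{est / GB:.2f} GB of RVM")
```

Under these assumptions, 10 KB average files need about 24 GB of RVM,
while 1 MB average files need only about 0.24 GB, which is why the
average file size matters more than the total size.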

> 2)  Also how do the filesystem semantics work?  if someone attempts to
> open a file at location A, and someone else tries to open the same
> file at location B, how does the filesystem respond in this situation?

For reading or for writing? If both open for reading, no problem. Both
get an identical copy. If one reads and the other writes, the reader
will not see any updates until he closes and reopens the file _after_
the writer has closed his file. If both are writing, then the last one
to close his file descriptor gets a reintegration conflict which has to
be repaired by the user.
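
A toy model of these open/close semantics, just to make the three cases
explicit. ToyServer and Session are made-up names and this is not how
Coda is implemented; the only thing it captures is that a session works
on the copy it got at open time and that updates are sent at close.

```python
class ToyServer:
    """Holds file contents plus a version counter per path."""

    def __init__(self):
        self.data = {}     # path -> contents
        self.version = {}  # path -> integer version


class Session:
    """One open()...close() session on a single file."""

    def __init__(self, server, path):
        self.server = server
        self.path = path
        # Snapshot taken at open time; later server updates are not seen.
        self.base_version = server.version.get(path, 0)
        self.contents = server.data.get(path, "")
        self.dirty = False

    def read(self):
        return self.contents

    def write(self, contents):
        self.contents = contents
        self.dirty = True

    def close(self):
        if not self.dirty:
            return "ok"
        # Someone else stored a newer version since we opened the file.
        if self.server.version.get(self.path, 0) != self.base_version:
            return "conflict"
        self.server.data[self.path] = self.contents
        self.server.version[self.path] = self.base_version + 1
        return "ok"


server = ToyServer()
server.data["f"], server.version["f"] = "v1", 1

a = Session(server, "f")  # writer at location A
b = Session(server, "f")  # writer at location B
a.write("from A")
b.write("from B")
print(a.close())  # first to close wins
print(b.close())  # last to close has to repair
```

In this model the first close returns "ok" and the second returns
"conflict", and a reader that opened the file before the first close
keeps seeing "v1" until it reopens.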

Jan
Received on 2003-08-19 11:31:59