Coda File System

Re: Suggetion please

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Sun, 7 Sep 2003 18:31:24 -0400
On Sat, Sep 06, 2003 at 03:52:03PM +0530, Mahesh wrote:
> Hi all,
> I need a distributed setup where we have many machines, each
> containing their respective coda server and coda client with
> replication. Each machine can be added and removed from the
> distributed setup without modifications to any machines.

In a way that is a tough question, but I'll start with some answers.

First of all, you need to have several machines responsible for your
'realm'. Once this is set up correctly, all clients will be able to
connect even when either server A or B is offline. This is normally done
with IN SRV DNS records. Let's assume you have 2 machines, A.localrealm
and B.localrealm. The DNS configuration would look something like

    _codasrv._udp.localrealm	IN SRV 10 0 2432 A.localrealm.
				IN SRV 10 0 2432 B.localrealm.

Alternatively (if you can't add DNS records, or your DNS servers don't
support IN SRV type records, which is my situation here at CMU) you have
to specify this information in /etc/coda/realms on each client.

/etc/coda/realms:
    localrealm		A.localrealm B.localrealm
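
To make the layout of that file concrete, here is a rough Python sketch
that reads realms-style lines into a realm -> servers mapping. It is only
an illustration, not how the Coda client parses the file: the function
name parse_realms is made up, and treating '#' as a comment marker is an
assumption on my part.

```python
def parse_realms(text):
    """Map each realm name to the list of servers named on its line."""
    realms = {}
    for line in text.splitlines():
        # Assumption: anything after '#' is a comment.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # First field is the realm, the rest are its servers.
        realm, *servers = line.split()
        realms[realm] = servers
    return realms


example = "localrealm\t\tA.localrealm B.localrealm\n"
print(parse_realms(example))
```

A client that knows both A.localrealm and B.localrealm can still find the
realm when one of the two is down.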

Second, every server that we listed as responsible for the realm in this
way should have a replica of the root volume. There can be no more
than 8 replicas, but 2 or 3 is typically enough. I actually
don't have anything higher than triple replication.

Finally, to make conflict resolution more reliable, servers keep a log
of operations that haven't been committed by all other servers in a
replicated group. These logs have a finite size (afaik somewhere between
4000 and 8000 operations) and a server doesn't like to run out of the
resolution log. As a result, any machine can only be taken offline for a
limited period. The resolution log size can be enlarged (volutil
setlogparms) by the administrator but that really only postpones the
time to failure by a bit.
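
As a back-of-the-envelope illustration of why a bigger log only buys a
bit of time, here is a small Python sketch. It assumes that every update
made while a replica is offline uses exactly one log entry, which is a
simplification, and the update rate of 500 per hour is an invented
number. The function name is made up as well.

```python
def hours_until_log_full(log_entries, updates_per_hour):
    """Estimate how long a replica can stay offline before the
    resolution log of the remaining servers fills up.

    Assumption: one log entry per update, at a constant update rate.
    """
    if updates_per_hour <= 0:
        return float("inf")  # no updates, the log never fills
    return log_entries / updates_per_hour


# Hypothetical workload of 500 updates per hour on one volume.
for size in (4000, 8000):
    print(size, "entries:", hours_until_log_full(size, 500), "hours")
```

Doubling the log size only doubles the window, so on a busy volume the
limit is still reached fairly quickly.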

If all other volumes within your realm have at least 2 replicas, then
any one server can be taken offline for some period of time. If each
volume has three replicas then two servers can be taken offline.
Of course, as long as you make sure that at least one server for every
replicated volume stays available, you can bring even more servers
offline at the same time.
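
The rule can be checked mechanically. The following Python sketch takes a
hypothetical table of volumes and the servers holding their replicas, and
reports which volumes would have no replica left if a given set of
servers went offline. The volume names, the server names and the function
name are all invented for the example.

```python
def unavailable_volumes(volumes, offline):
    """Return the volumes whose replicas are all on offline servers."""
    offline = set(offline)
    return [
        name
        for name, replicas in volumes.items()
        if set(replicas) <= offline  # every replica is offline
    ]


# Hypothetical replica placement across three servers.
volumes = {
    "rootvol": ["A", "B", "C"],
    "home": ["A", "B"],
    "proj": ["B", "C"],
}

print(unavailable_volumes(volumes, ["A"]))       # nothing is lost
print(unavailable_volumes(volumes, ["B", "C"]))  # "proj" is lost
```

An empty result means the chosen servers can be taken down together, at
least until the resolution logs on the remaining servers fill up.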

Jan
Received on 2003-09-07 18:32:30