Coda File System

Re: CODA as fault tolerance system

From: Jan Harkes
Date: Wed, 14 Jan 2004 17:53:33 -0500
On Wed, Jan 14, 2004 at 06:30:41PM +0000, Ro Irving wrote:
> It all seems to work brilliantly except when I take one of the servers 
> off line. The first couple of accesses when one server dies seem to take 
> up to 10 seconds to resolve which server they should look at.

Actually that should be about 15 seconds. And it is not resolving which
server it should be looking at, but simply hoping that it will get a
reply from the lost server.

A Coda client will (for most operations) talk to all available servers
in parallel. When it started the operation it still thought the server
was alive, so it assumes that its request might have been lost on the
network, or that the response is just taking a while to get back. So it
resends the request a couple of times and finally gives up.
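
The resend-then-give-up behaviour can be sketched roughly as below. This
is only an illustration, not the actual client code: the function name,
the retry count and the even spacing between resends are assumptions,
and only the total of about 15 seconds comes from the explanation above.

```python
# Hypothetical sketch: the retry count and the even spacing are assumptions;
# only the roughly 15-second total is taken from the message.

def send_with_retries(server_alive, total_timeout=15.0, retries=4):
    """Simulate one request; return (got_reply, seconds_spent_waiting)."""
    interval = total_timeout / (retries + 1)   # wait after each send
    elapsed = 0.0
    for _attempt in range(retries + 1):        # first send plus the resends
        if server_alive:
            return True, elapsed               # reply arrives, nothing to wait for
        elapsed += interval                    # no reply: wait, then resend
    return False, elapsed                      # give up and treat the server as down


print(send_with_retries(server_alive=True))
print(send_with_retries(server_alive=False))
```

With a dead server the caller only learns about the failure after the
whole timeout has passed, which is the delay seen on the first couple of
accesses.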

And 15 seconds isn't really all that unusual. It is sometimes even a bit
on the quick side for dialup connections when there are a couple of
concurrent TCP transfers active.

> Is this due to a timeout setting somewhere?
> Or is it due to me only pointing the venus client to one of the 2 
> servers (venus-setup test2 30000)?

All that does is make venus ask only 'test2' about volume location and
such. The test2 server will then respond with a list of servers that
hold replicas of the volume, and from that point on the client will talk
to everyone in parallel.

> Can I tell the venus setup script to look at multiple servers?

I guess you must be using a pre-6.0 version. In that case you should be
able to list multiple 'rootservers' as 'test2,test3' (comma separated,
no spaces). Or edit the rootservers= line in /etc/coda/venus.conf,
because that is really the only thing that venus-setup does.
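
If you would rather script the edit than do it by hand, something like
the following might do. It is a minimal sketch that assumes venus.conf
consists of plain key=value lines; the helper name set_rootservers is
made up for this example, and the host names are the ones from this
thread.

```python
# Hypothetical helper: assumes the config is made of plain key=value lines.

def set_rootservers(conf_text, servers):
    """Return conf_text with its rootservers= line listing the given servers."""
    value = "rootservers=" + ",".join(servers)   # comma separated, no spaces
    lines = conf_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        if line.strip().startswith("rootservers="):
            lines[i] = value                     # overwrite the existing setting
            replaced = True
    if not replaced:
        lines.append(value)                      # no active setting found: add one
    return "\n".join(lines) + "\n"


print(set_rootservers("rootservers=test2\n", ["test2", "test3"]), end="")
```

To apply it, read /etc/coda/venus.conf, pass the text through the
function, write the result back, and restart venus.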

> My other question involves windows 2000/XP.
> I would quite like to have a couple of windows machines accessing the 
> CODA storage area but I cant get to NT client to work.
> "net start coda" works fine but
> "net start venus" gives me an error of  "System error 1067 has occurred" 
> and dies. Any ideas?

The NT client seems to have varying success. I'm not yet sure why. But in
my case, after I boot the machine, I start a cygwin shell and then run

    net start coda
    venus -init

I guess 'net start venus' would try to run venus as a system service,
but for me it seems to work better when it runs in a cygwin shell.
The Windows code definitely needs more work.

Received on 2004-01-14 17:54:34