Coda File System

From: Jan Harkes <jaharkes_at_cs.cmu.edu> Date: Fri, 18 May 2007 23:39:57 -0400

On Sat, May 19, 2007 at 12:00:35AM +0200, Enrico Weigelt wrote:
> * shivers_at_ccs.neu.edu <shivers_at_ccs.neu.edu> wrote:
> > Note that if coda were rock solid and scaled like this, 
> > then you would never have to waste disk space setting up 
> > RAID arrays on every disk system in your life -- you'd 
> > only need to RAID your coda *server*, while all your 
> > client boxes could run un-RAIDed.
> 
> AFAIK, if you run multiple mirrored servers, you probably can 
> even live w/o RAID. If one system fails, another one jumps in.

Right, our servers don't use RAID. If one server fails we just keep
running with a smaller available volume group. The only problem is that
resolution logs don't get truncated until the missing server returns, so
the remaining server(s) will at some point run out of resolution log
space and trigger an assert.

> But I didn't test this yet. Probably depends on Coda's behaviour
> on broken storage. 
> 
> So the big question is: what happens if an underlying FS fails?
> Does Coda properly detect this and drive to some clean state
> (i.e. take a damaged volume offline and kick clients to the
> next server)?

It depends on the failure. If the underlying FS just silently corrupts
data, then the files from that server are corrupted. Some clients may
never even consider that server the best connected, so they wouldn't
see any problems.

If the failure hits either RVM, or makes things fail with IO errors,
then the server tends to hit an assertion and as a result become
unreachable.

Most of the failures I've seen with our servers are really of the
hard-stop type. We've had unexpected shutdowns when a UPS failed, when a
power supply broke, and once when the CPU fan must have fallen off and a
server went into thermal shutdown. And then of course there is the
occasional hard drive that just refuses to spin up after a reboot.

I've replaced failed servers by installing an empty Coda server on a new
machine and giving it the same IP address. Then, with a list of all the
volume replicas that used to exist on the dead machine, I created
identical empty volumes. The only thing needed then is to tell a client
to recursively walk the tree. It starts at the root and notices that the
new server has an all-zero version vector, so it copies the root
directory from another replica and creates empty files for all
directory entries. Continuing down the tree this way, we
resolve/repopulate all volumes.
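
To illustrate why an all-zero version vector makes the new replica the one
that gets overwritten, here is a minimal sketch of the generic
version-vector comparison. It is not Coda's actual code; the function
names and the server names s1/s2 are made up for the example.

```python
from typing import Dict

# Hypothetical representation: server name -> update count seen by that replica.
VersionVector = Dict[str, int]


def compare(a: VersionVector, b: VersionVector) -> str:
    """Return 'equal', 'dominates', 'dominated', or 'conflict' for a versus b."""
    servers = set(a) | set(b)
    a_ahead = any(a.get(s, 0) > b.get(s, 0) for s in servers)
    b_ahead = any(b.get(s, 0) > a.get(s, 0) for s in servers)
    if a_ahead and b_ahead:
        return "conflict"      # concurrent updates, neither side wins outright
    if a_ahead:
        return "dominates"
    if b_ahead:
        return "dominated"
    return "equal"


def needs_repopulation(new_replica: VersionVector,
                       existing_replica: VersionVector) -> bool:
    """A freshly created (all-zero) replica is dominated by any replica
    that has seen at least one update, so it should be filled from it."""
    return compare(new_replica, existing_replica) == "dominated"
```

With a fresh replica at {"s1": 0, "s2": 0} and a surviving replica at
{"s1": 3, "s2": 2}, needs_repopulation() is true, which corresponds to the
"copy the directory from another replica" step described above.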

At first there will be a lot of errors in the logs from other clients
that are trying to access files that do not yet exist on the new server.
Sometimes I set up a firewall to block out all other clients until I've
resolved at least all directories.
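
The tree walk itself can be as simple as the sketch below, which visits
directories only on a first pass and everything on a second pass. The
function name is made up, and it is an assumption here that a stat of each
entry from the client is enough to trigger resolution; in practice a
recursive listing of the mounted volume may serve the same purpose.

```python
import os


def walk_tree(root: str, directories_only: bool = False) -> int:
    """Stat every entry below root and return the number of entries visited.

    With directories_only=True only directories are touched, which matches
    the "resolve at least all directories first" approach.
    """
    visited = 0
    for dirpath, dirnames, filenames in os.walk(root):
        try:
            os.lstat(dirpath)          # touch the directory itself
            visited += 1
        except OSError:
            continue                   # unreachable for now, move on
        if directories_only:
            continue
        for name in filenames:
            try:
                os.lstat(os.path.join(dirpath, name))
                visited += 1
            except OSError:
                pass                   # file not yet resolved, skip it
    return visited
```

A first call with directories_only=True followed by a full call on the same
root (some path under the client's Coda mount point) would mirror the
two-stage approach: directories first, then files.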

I've also used this when we upgraded the Coda servers to new hardware:
bringing up the new machines for some replicas, resolving until they
were in sync, and then replacing the remaining replicas and resolving
again. It takes a while, but it avoided the need to go back to backups.

Jan
