Coda File System

AW: Large servers ...

From: Tomalla, Wolfram <TM_at_ProCom.de>
Date: Fri, 22 Jan 1999 10:40:35 +0100
Hi again,

The problem is that there are no replicated databases that really
hold the data files, except Oracle. And an Oracle
installation would cost us $25,000 - $30,000 per system.
All the other databases actually only support a warm standby.
That means that once the database has committed a transaction, the
data is only on the machine the database system is running on.
The data will be copied to the other system soon, but if the
machine crashes immediately after the commit, there is wrong
data on the standby server. But it must not happen that our
database contains wrong data about the situation that is
being controlled.
Imagine you close a safety valve in a nuclear
power plant. If the controlling computer thinks it is closed,
it will probably nuke the plant, as it believes it is closed.

The missing lock() on the filesystem actually is not the problem.
If we hold the data on a dual-ported RAID we also have no locks
available. But if the database does an fsync() on a file, the
file really is available in its current state on the standby
server. And if the controlling system crashes, I do a filesystem
check on the RAID system, mount it and restart the database.
This works OK, but the filesystem check on a 4 or 8 GB ext2
partition takes far too long. So switching to the standby server
will take several minutes. This is OK for a Coke machine.
But even the Coke machine has to be sure that someone really
will come to bring new bottles if it tells the
controlling computer that it needs some. This message must not
get lost.
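
To make the fsync() behaviour above concrete, here is a minimal
sketch in Python of a write that only reports success after fsync()
has returned. It assumes a POSIX system; the function name
durable_append is hypothetical and not part of any database or of
Coda. It shows only the call sequence on the writing machine;
whether the data is then visible on a standby server depends
entirely on the storage underneath.

```python
import os


def durable_append(path: str, record: bytes) -> int:
    """Append record to path and return only after fsync() completes.

    Hypothetical helper for illustration: the caller may treat the
    record as committed only once this function has returned.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        written = 0
        # os.write may write fewer bytes than requested, so loop.
        while written < len(record):
            written += os.write(fd, record[written:])
        # Ask the OS to push the file's data to the storage device.
        os.fsync(fd)
        return written
    finally:
        os.close(fd)
```

A message such as "machine 7 needs bottles" would be acknowledged
to the sender only after durable_append() has returned.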
I think Coda will be our solution (in a few months).
The only points necessary to us are:
1) After an fsync() the data has to be available on the 
   standby server.
2) Locally consistent locks will speed up the database by far,
   as we can only start one thread for the database if
   this doesn't work.
But as far as I see, Coda seems to work correctly with locks
on the local machine.
As we will kill any remaining processes on the working
machine or even switch it off, there will never be two machines
accessing the filesystem simultaneously.
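
Point 2 only needs locks that are consistent between processes on
one machine. A minimal sketch of such a local lock in Python,
assuming a POSIX system where flock() works on the mounted
filesystem (this is an assumption, not something verified against
Coda here); the helper name exclusive_lock is hypothetical:

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def exclusive_lock(path):
    """Hold an exclusive advisory lock on path for the with-block.

    Hypothetical helper: the lock is advisory and local, so it only
    coordinates processes on the same machine that also take it.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Blocks until no other open file description holds the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        try:
            yield fd
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
```

Each database worker would wrap its update of a shared file in
"with exclusive_lock(...)". This gives no protection against a
second machine, which is why the other machine has to be killed or
switched off first.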

Ciao

Wolfram

> > Holding databases in a distributed file system requires locking.  Coda has
> > no facility to do this.  Sorry you'll have to use another system.
> > 
> > For true redundancy you should look into a replicated database, since
> > there is more to it than merely replicating the log and data file.
> 
> And needless to say, as a replicated database will have finer granularity,
> and possibly understanding of the dependencies between changes,
> performance should be dramatically better (specifically in the areas Wolfram
> mentioned :).
> 
> Something about an end-to-end argument...
> 
Received on 1999-01-22 04:43:16