Coda File System

Re: Problems and suggestions setting up a new Coda system

From: Jan Harkes <>
Date: Sun, 12 May 2002 21:57:24 -0400
On Sun, May 12, 2002 at 04:26:55PM -0700, Jim Carter wrote:
> Some users in the mailing list use the traditional client-server model,
> like for AFS or NFS, while others have a "peer to peer" organization.  I
> planned to have a Coda server on my laptop and another on the base
> machine, with data migration according to whichever I used most
> recently.  The docs imply that either model will work.

The peer-to-peer organization isn't all that useful. It is actually
better to rely on the client cache and reintegration than on
server-server resolution.

> Since I'm in a hostile environment (wireless network shared with
> students), I planned to use Kerberos with Coda. However, my distro comes

Kerberos only protects the key exchange between clog and the auth2
daemon. It doesn't add a reliable security layer to the rpc2 and sftp
(file) transfers. For a hostile environment you might want to look at
setting up an IPsec tunnel.

The CVS version has rewritten Kerberos support (still only on the
authentication path), and I tested compilation against heimdal,
mit-krb5, and kth-krb4. But the initial report is that it doesn't
really work yet, so there are still some bugs to be ironed out.

> I found the paper about security in the docs, but I couldn't ascertain
> whether the presently implemented Coda has client-server encryption
> turned on by default or by a conf file option.  I assume this is

Not at all, there are in fact places in rpc2 where encrypted parts of a
packet are 'rewritten' without decoding/re-encoding. Needless to say,
the received packets are unintelligible to the receiver.

> independent of Kerberos.  The legal status of encryption has changed a
> lot, for the better, since that paper was written.  Be the first on your
> block to use Rijndael :-)

Legal status might have changed, but as far as I can see no clear
'precedents' exist. And with the added paranoia after Sept 11, current
political policy could suddenly change again.

> The location of the runtime data needs to be straightened out. /usr/coda

Look at debian/coda-client.postinst for an example of how all these
paths can be redefined.

> For config files, I prefer key value pairs, one per line, with a
> provision for comments.  The location of all databases and writeable

Did you see /usr/coda/venus.conf?

> and that's good, but the -f option is more traditional.  It's really
> hard to understand the Venus config file, and some important parameters
> can't be set there.

? key-value pairs with extensive comments, and pretty much all important
parameters are set there.
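
For anyone who wants to read such a file from a script, here is a
minimal sketch in Python of a parser for "key=value, one per line, with
# comments". It is not code from the Coda tree, the key names in the
sample are hypothetical, and it assumes that values never contain a
'#' character.

```python
def parse_conf(text):
    """Parse key=value lines; '#' starts a comment, blank lines are skipped."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue                          # not a key=value line, ignore it
        settings[key.strip()] = value.strip().strip('"')
    return settings


# Hypothetical sample, for illustration only (not a real venus.conf).
SAMPLE = """\
# where the client keeps its cache
cachedir=/var/cache/coda   # trailing comment
logfile = "/var/log/coda/venus.log"

cacheblocks=100000
"""

settings = parse_conf(SAMPLE)
```

All values come back as strings, so the caller decides which ones to
convert to numbers.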

> located. SuSE has a lot of nice features in their startup scripts
> (supporting Linux Standard Base) which I would need to put in.

So does Debian. The main tarball is basically built around a mix of the
old CMU Coda installation on Mach and NetBSD, updated for RedHat 5.2.

> Some "cowboy programming" styles of log file rotation are helped if the
> log file is opened in append mode, so you can do "cp /dev/null logfile"
> and it will actually get rewound.  I didn't check if you already do this.

Every Coda binary seems to have its own method of logfile rotation
implemented, if at all.
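
The append-mode point is easy to check with a small experiment. The
sketch below is Python on a POSIX system, not anything from the Coda
sources. It writes to a temporary file, truncates it from outside the
way "cp /dev/null logfile" would, and writes again. With O_APPEND the
second write lands at the new end of the file. Without it the writer
keeps its old offset and leaves a hole of zero bytes in front of the
new data.

```python
import os
import tempfile


def write_truncate_write(append):
    """Log, truncate the file externally, log again; return the final size."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    flags = os.O_WRONLY | (os.O_APPEND if append else 0)
    log = os.open(path, flags)
    try:
        os.write(log, b"x" * 100)            # the daemon logs 100 bytes
        os.truncate(path, 0)                 # the admin empties the log file
        os.write(log, b"after rotation\n")   # the daemon logs 15 more bytes
        return os.stat(path).st_size
    finally:
        os.close(log)
        os.unlink(path)


size_append = write_truncate_write(append=True)    # file restarts at offset 0
size_plain = write_truncate_write(append=False)    # old offset is kept
```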

> "Monthly tapes are saved for eternity (Admin manual, sect. 12.4).  It's
> important to have a finite retention period to limit the amount of work
> you have to do in response to a subpoena, and it's important to formally
> adopt the policy and publish it (giving your adversaries constructive
> notice), to cover your ass in case of obstruction of justice charges.
> Think "Arthur Andersen".

I use Amanda for backups, a tapecycle is about 2 weeks.

> The procedure for flushing the client cache seems unreliable.  Here's a
> protocol suggestion:
>     a.  The servers maintain a "cache version number" which is incremented
> 	when the cache should be flushed.
>     b.  On reconnection, clients are told the current number, and they
> 	flush if they have the wrong one.
>     c.  When an emergency is declared, the servers poll the clients
> 	recently heard from, tell them the new number, and make them flush.
> 	But the servers won't deliver any files until repairs are finished.

Ehh, we revalidate the version of all cached objects. Objects that have
been updated at the server are flushed, and objects that have been
changed on the client are reintegrated. When an object has seen
concurrent updates, reintegration fails and the user needs to run repair.
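
As a rough model of that per-object decision, here is a sketch in
Python. It is a simplification and not how venus is written: a single
integer stands in for the object's version, and the names
(CachedObject, revalidate, Action) are made up for the illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    KEEP = "cached copy is still valid"
    FLUSH = "drop cached copy, refetch on demand"
    REINTEGRATE = "send local changes to the server"
    CONFLICT = "reintegration fails, user runs repair"


@dataclass
class CachedObject:
    name: str
    cached_version: int   # server version seen when the object was fetched
    dirty: bool           # True if the object was changed on the client


def revalidate(obj, server_version):
    """Decide what happens to one cached object on reconnection."""
    server_changed = server_version != obj.cached_version
    if obj.dirty and server_changed:
        return Action.CONFLICT        # concurrent updates on both sides
    if obj.dirty:
        return Action.REINTEGRATE     # only the client changed it
    if server_changed:
        return Action.FLUSH           # only the server changed it
    return Action.KEEP
```

The point is that the check is made per object, so there is no single
cache-wide number that forces everything to be thrown away.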

> Making changes to a replicated readonly volume is very intimidating.  It

That's why they aren't really supported anymore in recent versions. Just
use ACLs to protect a rw volume from updates.

> I noticed a 30 second timeout in one of the header files.  When I'm
> shutting down my two systems for the night, I would prefer not to have
> to wait 30 seconds for the second one to realize that the first one has
> gone down, before it too can exit.  Does a server that's exiting or

If you kill one of the servers and use TCP, the keepalive timeout is a
lot longer than 30 seconds. And on a congested 28K8 modem link, 30
seconds is often barely enough time for a ping to get through when a lot
of packets are queued in the network layers.
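
To put a number on "a lot longer": the arithmetic below uses the stock
Linux values of the net.ipv4.tcp_keepalive_* sysctls, and it only
applies to sockets that have SO_KEEPALIVE enabled. The function name is
made up for the example, and other systems ship different defaults.

```python
# Default Linux settings for TCP keepalive.
KEEPALIVE_TIME = 7200    # idle seconds before the first probe is sent
KEEPALIVE_INTVL = 75     # seconds between unanswered probes
KEEPALIVE_PROBES = 9     # unanswered probes before the peer is declared dead


def dead_peer_detection_seconds(idle, interval, probes):
    """Worst-case time for an idle connection to notice that its peer is gone."""
    return idle + interval * probes


default_detection = dead_peer_detection_seconds(
    KEEPALIVE_TIME, KEEPALIVE_INTVL, KEEPALIVE_PROBES
)
```

That comes to a bit over two hours for an idle connection.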

> becoming isolated poll all recently heard from clients and servers, and
> positively notify them that it's going down?  Similarly, Venus should
> notify the servers that it's disconnecting.

When I walk out of the range of my wireless network card I can hardly
expect venus to 'notify' the servers. The same goes for when someone
trips in the lab and pulls a network cable. The current method handles
this transparently. I never have to think about informing venus when I
suspend my laptop and head home.

> For RVM data and log, the docs mention that a plain file can be used
> (for trying it out), but certain consistency checks have to be omitted.

? Files have several advantages: they can be mmapped, significantly
reducing server startup times. On Linux even raw device access goes
through the pagecache, so the initial reasons for the decision to use
raw devices are not really valid on current Linux kernels.
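
To illustrate the mmap point in general terms (this is not how RVM or
the Coda server does it), here is a sketch in Python that preallocates
a zero-filled plain file and maps it. The size and the marker bytes are
arbitrary values picked for the demo.

```python
import mmap
import os
import tempfile

SIZE = 1440 * 512   # arbitrary demo size in bytes


def make_data_file(path, size):
    """Preallocate a zero-filled plain file of the given size."""
    with open(path, "wb") as f:
        f.write(b"\0" * size)


def map_and_touch(path):
    """Map the whole file, write a marker at the start, and read it back."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as m:   # length 0 maps the entire file
            m[0:4] = b"demo"
            m.flush()                         # push the dirty page to the file
            return len(m), bytes(m[0:4])


fd, demo_path = tempfile.mkstemp()
os.close(fd)
try:
    make_data_file(demo_path, SIZE)
    mapped_len, header = map_and_touch(demo_path)
finally:
    os.unlink(demo_path)
```

Mapping is cheap because pages are only read in when they are touched,
which is where the startup time saving comes from.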

> On Linux you can preallocate a plain file (dd if=/dev/zero of=plainfile
> count=1440) and then "mount" it using "losetup", causing a block device

Too much chance for deadlocks and inconsistencies. And it is unclear
what the writeback policy of the loopback device is.

> The kernel module needs a version string built in.  I'm not sure whether
> I prefer noisy or silent modules.  Probably a developmental module
> should syslog (kprintf) its identity and version when loaded.

Already there, been there for ages (probably since Linux 1.2.something).
The only problem is that Linus often drops 'cosmetic' patches, for
instance the ones that update the version number, grrr.

    Apr 29 19:13:00 mentor kernel: Coda Kernel/Venus communications, v5.3.15,

> I was surprised to find no per-cell subdirectory under /coda, like AFS
> has.  I have separately administered nets at home and at work, and I
> would frequently want to have both mounted at the same time.  Adding the
> "multi-homed" capability to Venus should get a high priority.

Cells don't exist yet. But you can run multiple concurrent venus daemons
that each have their own /dev/cfsX and /coda-X mounts. The problem is
that the cache is not 'shared' and that cfs and clog only know about
/coda so it is a bit hard to get tokens for the other cells.

Received on 2002-05-12 21:58:21