Coda File System

Re: coda funding

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Mon, 12 Nov 2001 14:59:58 -0500
On Sat, Nov 10, 2001 at 09:15:52AM +0800, Jeremy Malcolm wrote:
> Ivan Popov wrote:
> > 
> > It would be very interesting to get an estimation of the resources
> > (person*month) needed for making Coda usable "in the wild":
> > 
> > 1. Make it robust (may be we are already there? :)

No, I can still easily crash both clients and servers. It's getting
better, but the code itself isn't terribly robust yet. Even when we
reach a point where the system appears stable, a simple compiler,
filesystem or VM change in the operating system can introduce new bugs
or expose ones that were previously covered up.

> > 2. Relax server file number limitation to say 500Gb of 10k-files
> >    (even if by a cute configuration utility creating 10 servers at once?)

Hard problem. We are still implementing parts of the solution, such as
the changes to decrease memory usage during reintegration that went into
5.3.16 and 5.3.17, and still hashing out ideas about where we want to
end up and how to get there.

> > 3. Create a working client solution (client, gateway, samba setup,
> >    anything) for Win2k & similar

I could tell you how far Phil got over the summer, but someone would
have to kill you. In any case, we have some licensing issues to resolve
before we can make any claims in this area.

> > 4. Introduce real encryption and make both the server and clients
> >    basically resistant against spoofing, buffer overflows and other
> >    evident types of attack

As for buffer overflows, we're pretty safe as far as RPC2 is concerned.
However, it is still possible to send arguments along with an RPC2 call
that do not overflow any buffers but still cause headaches for a server
that tries to process the packet.

> > 5. Support (and may be -hint-hint- standardize) multiple mount points
> >    like afs and dfs do, even if it would rely on dns-names only

I've got some ideas here. We don't need to change anything in the
servers; it is just a matter of how a client interprets the 'magic
symlinks' that indicate a volume mountpoint. For example, currently I
would mount a volume in the same cell as
    cfs mkm /coda/usr/jaharkes vmm:u.jaharkes

and at some point I'd like to allow,
    cfs mkm /coda/usr/jaharkes-in-yourcell users/jaharkes@yourcell

The rules for interpreting volume names are simple. If there is no @ in
the volume name, assume the volume is in the same cell as the parent
volume. Whenever there is an @, the tail is a name identifying the
cell/realm (DNS?); if the tail is empty, use the client's configured
default cell. If there is nothing in front of the @, a ViceGetRootVolume
rpc (or perhaps ViceGetPublicRoot?) is made to the cell to obtain the
name of the volume to be mounted. Of course we need multiple user
identities (e.g. clog jaharkes@yourcell), but the volume/vsg/mgrp
changes have cleaned up a lot to make it easier to do this.
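
A minimal sketch of the naming rules above, in Python. Nothing here is
Coda code: the names MountTarget and parse_mount_target are hypothetical,
and splitting on the last @ is an assumption. A volume of None stands for
"ask the cell for its root volume" (the ViceGetRootVolume case).

```python
from typing import NamedTuple, Optional


class MountTarget(NamedTuple):
    volume: Optional[str]  # None: ask the cell for its root volume
    cell: str              # cell/realm the volume lives in


def parse_mount_target(name: str, parent_cell: str,
                       default_cell: str) -> MountTarget:
    """Interpret a mountpoint target such as 'users/jaharkes@yourcell'."""
    if "@" not in name:
        # No @ at all: the volume is in the same cell as the parent volume.
        return MountTarget(name, parent_cell)
    # Assumption: split on the last @, so the tail is the cell name.
    volume, _, cell = name.rpartition("@")
    # Empty tail: fall back to the client's configured default cell.
    # Empty head: no volume name given, so the root volume must be looked up.
    return MountTarget(volume or None, cell or default_cell)
```

With these rules a bare "@" comes out as (None, default_cell), i.e. the
root volume of the default cell.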

Of course we still need to find a way to map the cell name to the list
of rootservers for that cell. The simplest solution is a plaintext file
in /etc/coda; other solutions would be DNS lookups or ldap. Maybe we can
simply fork off a helper application that does the mapping in whatever
way it pleases. Perhaps an informal policy on volume names would help
coordinate cross-realm mounting; for instance, everyone could have a
public volume that can be used as a cross-realm entry point.
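
To make the plaintext-file option concrete, here is a sketch of a parser
for such a mapping. The file format is purely an assumption (one cell per
line, followed by its rootservers, with '#' starting a comment), as are
the function name parse_cell_map and the host names in the usage note.

```python
from typing import Dict, List


def parse_cell_map(text: str) -> Dict[str, List[str]]:
    """Parse 'cellname server1 server2 ...' lines into a lookup table."""
    cells: Dict[str, List[str]] = {}
    for raw in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        cell, *servers = line.split()
        # A cell without any rootservers is useless, so skip it.
        if servers:
            cells[cell.lower()] = servers
    return cells
```

A line such as "cs.cmu.edu server1.example server2.example" would then
map the cell cs.cmu.edu to those two (made-up) rootservers, and a cell
that is not listed could be handed to a DNS or ldap helper instead.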

So /coda would implicitly be the same as "@" (default cell, and use a
ViceGetRootVolume lookup to find the volume name). An AFS style layout
could be created by populating /coda with 'cfs mkm cs.cmu.edu
@cs.cmu.edu', 'cfs mkm andrew.cmu.edu @andrew.cmu.edu', etc.
Or 'cfs mkm your-own-local-spamdir tmp@'. Ah well, I guess you get the
point, and if not, more examples wouldn't help.

> 6. Easier, one-step setup.  Most of the mistakes that people seem to
>    make and most of the problems they encounter seem to be because they
>    made a mistake (some mistakes quite reasonable) in setting it up.

But then again, Coda is a complex system. Very different from existing
network filesystems, even compared to AFS. And we're doing too much
ourselves, such as the authentication. Ever set up a Kerberos realm?
That's not a one-step setup either.

Most mistakes are not even related to Coda, but were already there.
People simply never noticed that their routing/DNS is so messed up that
they can't even do 'telnet mylogin@thishost', because typically no one
needs to telnet to the local machine once they are already logged in,
and applications tend to use unix domain sockets or the loopback
interface of the machine. Another common problem is when
gethostbyname("server1") on server1 returns 127.0.0.1, and the server
happily tells any interested clients that they can reach it by talking
to their own loopback interface.
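
A quick way to check for the loopback problem before setting up a
server, sketched in Python with only the standard library. The function
name resolves_to_loopback is made up for this example, and the resolve
parameter exists only so the check can be tried without touching DNS.

```python
import ipaddress
import socket


def resolves_to_loopback(hostname: str, resolve=socket.gethostbyname) -> bool:
    """Return True if hostname resolves to a loopback address (127.0.0.0/8)."""
    address = resolve(hostname)
    return ipaddress.ip_address(address).is_loopback


if __name__ == "__main__":
    # Check the name this machine would announce to clients.
    name = socket.gethostname()
    if resolves_to_loopback(name):
        print(f"{name} resolves to a loopback address; clients cannot reach it")
    else:
        print(f"{name} resolves to a non-loopback address")
```

If this reports a loopback address on the server itself, the hosts file
or DNS setup probably needs fixing before Coda is involved at all.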

> 7. GUI (eg. GTK) interface to resolve reintegration conflicts, a la
>    Win2K.  "The file X was changed on your computer (date Y, size Z) 
>    and on the server (date A, size B) while you were offline.  Choose an
>    option: [ ] overwrite server [ ] overwrite client [ ] keep both".

Wow, since when does Win2K have reintegration conflicts??

In any case, for me conflicts tend to be of the category "when it rains,
it pours", and having a command-line interface makes it a lot easier to
fix them en masse (find . -type l -exec repair {} /tmp/fix).

> 8. Fix the problem with two replicated servers that I and at least a
>    couple of other people are having (gory details in the archives).

Right. I'm not sure whether it is a bug in Coda at all, especially since
we've been running doubly and triply replicated servers here for ages.
In an attempt to reproduce the reported problems I have also tried
setting up several doubly replicated servers from scratch, both in their
own 'cell' and as part of our existing setup. All of them worked without
any problems.

Perhaps the combination of the actually usable logging in 5.3.17 and the
new /usr/bin/getvolinfo program, which basically only makes a
ViceGetVolumeInfo call and pretty-prints all the (non)sense the server
sends back as a result, might help us find it this time.

Jan
Received on 2001-11-12 15:00:00