Coda File System

Re: The state of Coda

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Wed, 6 Dec 2006 11:34:35 -0500
On Mon, Nov 20, 2006 at 08:30:19AM -0500, Greg Troxel wrote:
> Currently I'm not using coda so heavily because there's a locking
> problem in the NetBSD kernel code.  But, I may get around to fixing
> it.  (For anyone inclined, I think it's in the VOP_LOOKUP
> implementation, where there are 3 separate rules for IS_DOT,
> IS_DOTDOT, and regular, and the IS_DOT* cases aren't handled right.)

You've been fighting those locking bugs for a while now,

    http://www.coda.cs.cmu.edu/maillists/codalist/codalist-2003/5023.html

Is that an older problem? It looks like that one was symlink related.

> Besides robustness, which is key, I think the next big thing to tackle
> is paging directory info out of rvm to container files on the server;
> this should remove the filesystem size limit that's now a big
> problem.  Perhaps the 256k/dir limit can go as well.

That is definitely one of the things I keep trying to find a good
solution for. One problem is that RVM provides us with reliable update
semantics; if we move the directories out of RVM, we still need to
either copy directories back into RVM when we want to change them, or
keep some persistent log to make sure the directory containers are
correct after a crash (re-apply committed changes or undo aborted ones).
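
The persistent-log idea could look roughly like the following sketch: a
redo log where each directory update is followed by a durable commit
marker, and recovery re-applies only committed records. All of the names
here (LOG, log_update, recover) are hypothetical illustrations, not
Coda's actual on-disk format.

```python
import json
import os

# Hypothetical redo-log sketch: directory updates are appended to a log
# with a commit marker; after a crash, committed records are re-applied
# to the container and trailing uncommitted records are discarded.
LOG = "dir.redo"

def log_update(txn_id, entries):
    """Append a directory update plus its commit marker, fsync'd so the
    record is durable before the container file is touched."""
    with open(LOG, "a") as f:
        f.write(json.dumps({"txn": txn_id, "entries": entries}) + "\n")
        f.write(json.dumps({"txn": txn_id, "commit": True}) + "\n")
        f.flush()
        os.fsync(f.fileno())

def recover(directory):
    """Re-apply committed updates; records without a commit marker are
    the aborted/interrupted transactions and are simply dropped."""
    if not os.path.exists(LOG):
        return directory
    pending = {}
    with open(LOG) as f:
        for line in f:
            rec = json.loads(line)
            if rec.get("commit"):
                directory.update(pending.pop(rec["txn"], {}))
            else:
                pending[rec["txn"]] = rec["entries"]
    return directory
```

The same log could also carry undo records instead, in which case
recovery would roll back aborted transactions rather than replay
committed ones; the sketch above shows only the redo flavor.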


Another problem that I really want to tackle is a series of problems
that all originate with servers identifying volume location by sending
IPv4 addresses back to the clients.

- servers cannot be expected to reliably resolve their own name;
  127.0.0.1 or the address of an internal backbone network is perfectly
  valid for the server, but useless to a client. Right now we have
  various checks to avoid some of these problems, but not all.

- Multi-homed servers don't have a single v4 address, and the server
  doesn't know which address is useful for its clients. It could even
  be that some clients should use one address while others need to use
  another. Also, a server may or may not respond from the same
  address/interface on which a request was received, depending on the
  routing tables, which in turn confuses the client.

- We can't use IPv6; those addresses don't fit in the available 32-bit
  space. RPC2 supports it, and all other clients and daemons (except for
  venus and codasrv) already seem to work correctly on a v6 network.

- Server migration: if we want to move a single server to a different
  network, all Coda servers in the same realm need to be restarted to
  force them to re-resolve that server's name to the new IP address, and
  all clients need to be reinitialized to make sure they forget about
  the old address.

The hard part right now is that those IPv4 addresses are used pretty
much everywhere, and the object that identifies a server isn't
persistent across client restarts, as it can be recreated from one of
those scattered IPv4 addresses. I'm sure that even once that is done,
other issues will surface, such as when we should re-resolve the name,
and whether we would need an asynchronous resolver to avoid blocking
the client.
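
An asynchronous resolver could be as simple as pushing getaddrinfo()
into a worker thread so the client's event loop never blocks on DNS.
The sketch below is purely illustrative; the class and method names
(AsyncResolver, resolve, poll) are assumptions, not venus internals.

```python
import queue
import socket
import threading

class AsyncResolver:
    """Sketch of a non-blocking resolver: lookups run in daemon threads
    and completed results are picked up from a queue by the main loop."""

    def __init__(self):
        self.results = queue.Queue()

    def resolve(self, name):
        # Kick off the lookup; returns immediately.
        threading.Thread(target=self._worker, args=(name,),
                         daemon=True).start()

    def _worker(self, name):
        try:
            infos = socket.getaddrinfo(name, None)
            addrs = sorted({info[4][0] for info in infos})
            self.results.put((name, addrs))
        except socket.gaierror as e:
            # Deliver the failure through the same queue.
            self.results.put((name, e))

    def poll(self, timeout=None):
        """Fetch one completed lookup, or None if none finished yet."""
        try:
            return self.results.get(timeout=timeout)
        except queue.Empty:
            return None
```

A real client would also want to decide when cached results go stale
and trigger re-resolution, which this sketch deliberately leaves out.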


Finally I have a vague idea that I'd like to try at some point, getting
rid of hard-links or at least the UNIX semantics of hard links.

Effectively, 'ln a b' could be used to trigger a copy operation on the
server, where it creates a file 'b' with the same contents as file 'a'
without requiring data transfers between the client and the server.
Maybe at some point it could even trigger a direct server->server data
transfer if the link happens to be between different volumes. If we
combine that with content-addressable storage for container files on
both the server and the client, copies within the same volume only
create a new vnode but use no extra disk space.
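
The content-addressable part of that idea can be sketched in a few
lines: container files are named by a hash of their contents, so two
vnodes with identical data share one container and a server-side "copy"
is just a new vnode pointing at an existing hash. The names here
(CASStore, put, get) are made up for illustration, not Coda's actual
container layout, and SHA-1 stands in for whatever hash would really be
used.

```python
import hashlib
import os

class CASStore:
    """Minimal content-addressable store: files are keyed by the SHA-1
    of their contents, so storing duplicate data costs no extra space."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def put(self, data):
        """Store data and return its content hash; if an identical
        container already exists, nothing new is written."""
        h = hashlib.sha1(data).hexdigest()
        path = os.path.join(self.root, h)
        if not os.path.exists(path):
            with open(path, "wb") as f:
                f.write(data)
        return h

    def get(self, h):
        with open(os.path.join(self.root, h), "rb") as f:
            return f.read()
```

With a store like this, 'ln a b' within a volume reduces to creating a
new vnode for 'b' that references the same content hash as 'a'.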

As far as I have found, there are two main reasons applications use
hard links. One is to conserve disk space, which is solved by using
content-addressable storage for container files. The second is to
implement a rename that does not remove an already existing target
object; in that case the source of the link is removed once the link
succeeds, so it doesn't matter if we implement the link as a copy.
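
That second pattern is the familiar link-then-unlink idiom: link()
fails if the target already exists, where a plain rename() would
silently overwrite it. A quick sketch (the function name is mine, the
pattern is standard POSIX):

```python
import os

def rename_noclobber(src, dst):
    """Non-clobbering rename via the hard-link idiom: link() raises
    FileExistsError if dst exists, unlike rename() which overwrites."""
    os.link(src, dst)   # fails with EEXIST if dst already exists
    os.unlink(src)      # drop the source name; only dst remains
```

Since src disappears immediately after the link succeeds, the outcome
is identical whether the server implemented the link as a true hard
link or as a copy.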

Of course, with a traditional UNIX hard link we create multiple
directory entries that point at the same object. If we use copies,
operations like chown, chmod, or truncate would only affect that one
copy. But not having hard links in the UNIX sense may simplify the code
a bit, and we could allow cross-directory hard-link operations without
having to worry about crossing security boundaries or resolution
related problems.

Jan
Received on 2006-12-06 11:37:11