Coda File System

From: Jan Harkes <jaharkes_at_cs.cmu.edu> Date: Thu, 11 Sep 2008 17:02:27 -0400

On Wed, Sep 10, 2008 at 01:13:52PM +0000, Markus Pfeiffer wrote:
> On Tue, Sep 09, 2008 at 12:06:38PM -0400, Jan Harkes wrote:
> > On Mon, Sep 08, 2008 at 09:40:51PM +0000, Markus Pfeiffer wrote:
> > > What is the current status of ipv6 support for coda? I have seen that rpc2 has
> > > ipv6 support and that some of the daemons seem to communicate via ipv6 but
> > > some other things obviously do not work.
> > 
> > Coda clients and servers use the ipv4 addresses as a 32-bit unique
> > identifier and the volume location query responses don't contain
> > hostnames, but ipv4 addresses.
> > 
> > There is quite a bit of work left to get the clients and servers to the
> > point where we can actually use ipv6 addresses.
> > 
...
> 
> Are there proposals or whitepapers discussing the necessary protocol changes?
> Is there anything I could do to bring such efforts forward? I have a bit of
> spare time and coding experience.

I started working on this a while ago, and implemented what is most
likely the only protocol-level change necessary by adding an rpc call,

    ViceGetVolumeLocation(IN VolumeId volid, OUT RPC2_BoundedBS HostPort);

The server-side code for this call was implemented almost 2 years ago,
and is available in every Coda server since Coda-6.9.1.

The idea of this call is that when the client queries the volume
information for a replicated volume, it currently gets a list of volume
replica identifiers as well as ipv4 addresses. We can then ignore the v4
addresses and use ViceGetVolumeLocation to obtain a string that contains
the hostname (or address) and optionally a port number, similar to how
the host and port are specified in a URL, e.g.

    just a hostname:		hostname
    hostname with port:		hostname:2432
    ipv4 address + port:	1.2.3.4:2432
    ipv6 address + port:	[2002:8002:ce58::42]:2432

(not sure if an ipv6 address without a port should use brackets).
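
As an illustration of the four forms listed above, here is a minimal
parsing sketch. The function name parse_hostport is hypothetical and not
part of Coda or rpc2. It also makes two assumptions about the open
question: a bracketed ipv6 address without a port is accepted, and a
string with more than one colon and no brackets is treated as a bare
ipv6 address.

```python
from typing import Optional, Tuple


def _port(text: str) -> int:
    """Convert a port string to an int, rejecting out-of-range values."""
    value = int(text)  # raises ValueError for empty or non-numeric text
    if not 0 < value < 65536:
        raise ValueError(f"port out of range: {value}")
    return value


def parse_hostport(s: str) -> Tuple[str, Optional[int]]:
    """Split 'host', 'host:port', 'a.b.c.d:port' or '[v6]:port'.

    Hypothetical helper; returns (host, port) with port None if absent.
    """
    s = s.strip()
    if s.startswith("["):
        # Bracketed form, used for ipv6 literals as in URLs.
        end = s.find("]")
        if end == -1:
            raise ValueError(f"missing ']' in {s!r}")
        host, rest = s[1:end], s[end + 1:]
        if not rest:
            return host, None  # assumption: '[v6]' without port is allowed
        if not rest.startswith(":"):
            raise ValueError(f"unexpected text after ']' in {s!r}")
        return host, _port(rest[1:])
    if s.count(":") == 1:
        # Hostname or ipv4 address followed by a port.
        host, port = s.split(":")
        return host, _port(port)
    # No colon: bare hostname or ipv4 address.
    # Several colons: assumed to be an unbracketed ipv6 address, no port.
    return s, None
```

For example, parse_hostport("[2002:8002:ce58::42]:2432") yields the
address and the port as separate values, ready to be handed to a
resolver.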

In any case, what the string could contain doesn't matter much at the
moment, because no client makes this rpc2 call yet and nobody is
actually using the results. The current server-side implementation
simply copies any hostnames as it finds them in the /vice/db/servers
file.

So now that we can get a hostname (and optionally port number) instead
of just a single ipv4 address, we get a nice level of indirection: this
name can map to any number of ipv4 and/or ipv6 addresses, and the DNS
results could even be changed on a geographic or network location basis
(intranet vs. public addresses for servers).
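
The indirection can be seen with a plain getaddrinfo call, which returns
every ipv4 and ipv6 address a name maps to. The sketch below is only an
illustration: candidate_addresses is a hypothetical name, the default
port 2432 is taken from the examples above, and SOCK_DGRAM is used on
the assumption that the connection runs over UDP.

```python
import socket
from typing import List, Tuple


def candidate_addresses(host: str, port: int = 2432) -> List[Tuple[int, str, int]]:
    """Return (family, address, port) for every address `host` maps to.

    Hypothetical helper. Note that socket.getaddrinfo blocks until the
    resolver answers.
    """
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_UNSPEC, type=socket.SOCK_DGRAM
    )
    result: List[Tuple[int, str, int]] = []
    for family, _type, _proto, _canonname, sockaddr in infos:
        # sockaddr is (addr, port) for ipv4 and (addr, port, flow, scope)
        # for ipv6; the first two fields are the same in both cases.
        entry = (family, sockaddr[0], sockaddr[1])
        if entry not in result:
            result.append(entry)
    return result
```

A caller would try the returned entries in order until one of them
answers, so a name with both A and AAAA records works on either kind of
network.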

Some of the client changes that are necessary have to do with the fact
that srvent (the data structure representing a Coda server) is currently
not stored persistently. All places in the code that want access to the
srvent use the 32-bit ipv4 address as a lookup key, which is the only
information needed to create a new srvent object if it is missing. But
with hostnames we probably want to store them as part of the srvent in
RVM and have all places that refer to srvents by ipv4 address use either
a srvent* or some randomly assigned lookup key. Either way, reference
counting is probably needed.
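
A rough sketch of what a hostname-keyed, reference-counted server table
could look like follows. It is not the venus code: ServerEntry and
ServerTable are hypothetical names, keys are handed out sequentially
rather than randomly, and the table lives in ordinary memory, so the RVM
persistence discussed above is not modeled.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class ServerEntry:
    """Hypothetical stand-in for a srvent identified by hostname."""
    key: int
    hostname: str
    port: Optional[int] = None
    refcount: int = 0


class ServerTable:
    """In-memory table of server entries with reference counting."""

    def __init__(self) -> None:
        self._by_key: Dict[int, ServerEntry] = {}
        self._by_name: Dict[Tuple[str, Optional[int]], ServerEntry] = {}
        self._next_key = itertools.count(1)  # opaque lookup keys

    def __len__(self) -> int:
        return len(self._by_key)

    def acquire(self, hostname: str, port: Optional[int] = None) -> ServerEntry:
        """Find or create the entry for hostname/port and take a reference."""
        ident = (hostname.lower(), port)  # hostnames compare case-insensitively
        entry = self._by_name.get(ident)
        if entry is None:
            entry = ServerEntry(next(self._next_key), hostname, port)
            self._by_name[ident] = entry
            self._by_key[entry.key] = entry
        entry.refcount += 1
        return entry

    def lookup(self, key: int) -> ServerEntry:
        """Map an opaque key back to its entry (KeyError if it is gone)."""
        return self._by_key[key]

    def release(self, entry: ServerEntry) -> None:
        """Drop one reference and remove the entry when none remain."""
        if entry.refcount <= 0:
            raise ValueError("release without matching acquire")
        entry.refcount -= 1
        if entry.refcount == 0:
            del self._by_key[entry.key]
            del self._by_name[(entry.hostname.lower(), entry.port)]
```

The point of the reference count is that an entry can no longer be
recreated from a 32-bit address alone, so it must stay alive for as long
as anything still holds its key.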

Once we have a hostname-based server identifier we can use the hostname
whenever we create a new rpc2 connection. Of course this means that the
client is now doing a name lookup whenever a new RPC2 connection is
created. These lookups are blocking, which is a hindrance for a
userspace threaded application because we cannot handle kernel or
network requests until the resolver is done.

So that is where I sort of got stuck, trying to find an asynchronous
resolver library that would allow linking with rpc2's LGPLv2 licensed
code and which is somewhat easy to use (like getaddrinfo) but still
allows a certain amount of control over caching. The caching of lookup
results is interesting because if we assume mobility of the clients the
DNS level timeout may be too generous, and we probably also need to
invalidate the cache whenever we move between networks. On the other
hand, maybe that is a non-issue if people already use a nameserver cache
like nscd or a DNS proxy such as dnsmasq. Another option would be to
fork off one or more helper processes, similar to what squid does (used
to do?), which can perform plain old blocking DNS lookups with an
off-the-shelf libresolv and maybe do some caching of results and such.
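
The helper idea can be sketched as follows. Note the substitutions: this
uses a pool of worker threads rather than forked helper processes, and
Python's socket.getaddrinfo rather than libresolv. LookupCache and its
parameters are hypothetical names. The cache applies its own TTL instead
of the DNS-level timeout, and invalidate() is meant to be called when
the client moves between networks.

```python
import socket
import threading
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, Tuple


class LookupCache:
    """Run blocking lookups on helper threads and cache the results."""

    def __init__(
        self,
        resolver: Callable = socket.getaddrinfo,
        ttl: float = 60.0,
        workers: int = 2,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._resolver = resolver
        self._ttl = ttl
        self._clock = clock
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._lock = threading.Lock()
        # (host, port) -> (expiry time, Future holding the lookup result)
        self._cache: Dict[Tuple[str, int], Tuple[float, Future]] = {}

    def resolve(self, host: str, port: int) -> Future:
        """Return a Future immediately; the caller is never blocked here."""
        key = (host, port)
        now = self._clock()
        with self._lock:
            hit = self._cache.get(key)
            if hit is not None and hit[0] > now:
                return hit[1]  # fresh entry, or a lookup still in flight
            future = self._pool.submit(self._resolver, host, port)
            # Simplification: failed lookups are cached until the TTL too.
            self._cache[key] = (now + self._ttl, future)
            return future

    def invalidate(self) -> None:
        """Forget everything, e.g. after moving to another network."""
        with self._lock:
            self._cache.clear()

    def close(self) -> None:
        """Stop the helper threads once outstanding lookups finish."""
        self._pool.shutdown(wait=True)
```

The main loop can poll future.done() or attach a callback with
future.add_done_callback() and keep servicing kernel and network
requests while the lookup is outstanding.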

On the server side there is a little bit of work related to callbacks
and such; however, none of this needs to be stored persistently, so any
necessary changes should, as far as I can tell, be considerably easier.

> Also, if there are other pressing issues which might be easier to tackle I
> would also offer my help.

I don't really have a good list lying around. There is the list of
known/reported bugs, but the hardest part there is often figuring out
if the problem is still relevant, what may have been the cause, and
finding a way to reproduce the problem.

    http://www.coda.cs.cmu.edu/trac/report/1

There are of course a lot of things that need to be done:

- Faster cache revalidations and reducing overhead of tracking
outstanding callbacks by keeping track of all updates in a log and
having clients query the log after reconnection, when they receive a
volume-level callback, or periodically.

- Improve the way directory data is stored. Avoid size issues by storing
them in container files instead of RVM. Use RVM to track uncommitted
updates to maintain the existing consistency guarantees. Teach the
kernel modules to read directly from the container-file representation
instead of having venus translate the in-RVM directory data structure
into a BSD-FFS 'inspired' on-disk representation, which is in turn
parsed by the kernel modules into whatever the VFS needs.

- Allow people to use LDAP instead of pdbtool to manage Coda users and
groups. This would enable users to manage their own groups and
possibly improve integration into existing systems. Some experiments
with how a Coda user could interact with an LDAP-based backend can be
found in /coda/coda.cs.cmu.edu/usr/jaharkes/ldap/.

There is a script 'codapts' which provides functionality similar to
AFS's pts (and Coda's pdbtool) commands. It is also hardcoded to use a
single server, which is not even running an ldap daemon anymore.

The 'token.py' script in the same directory was working towards creating
an openldap plugin that could return valid Coda tokens for authenticated
LDAP users. The idea being that it should be possible to even take auth2
out of the loop.

Jan


Re: IPv6 status?