Coda File System

Re: FUSE, again

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Sun, 25 Nov 2018 22:04:55 -0500
On Fri, Nov 23, 2018 at 05:22:24PM -0500, Mahadev Satyanarayanan wrote:
> We have been working on a FUSE-based mechanism for Coda, and should
> have a release in the near future.  There is a noticeable performance cost

It is actually 'mostly' released already.


The path there took a little while. Initially I was looking at
integrating Venus with the high-level libfuse API, but there were
multiple issues, such as userspace daemon lifecycle and threading. The
design that seemed most likely to work was having a 'Coda client cache
daemon' that starts when the system boots, and a FUSE client that
connects to the cache daemon when the file system is mounted.
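
As a very rough illustration of that split (this is not Coda's actual
code; the port number and the stubbed session handler are made up for
the example), the daemon side could look something like this minimal
C sketch:

    /* Minimal sketch of a long-lived cache daemon that starts at boot
     * and accepts file system protocol connections from clients as
     * they mount. Entirely illustrative, not Coda's implementation. */
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    static void serve_session(int fd)
    {
        /* A real daemon would run the file system protocol message
         * loop here, serving requests out of the local cache. */
        close(fd);
    }

    int main(void)
    {
        int srv = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr;

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        addr.sin_port = htons(564); /* assumed port, for illustration */

        bind(srv, (struct sockaddr *)&addr, sizeof(addr));
        listen(srv, 5);

        /* The daemon outlives any individual mount; file system
         * clients connect and disconnect as mounts come and go. */
        for (;;) {
            int client = accept(srv, NULL, NULL);
            if (client >= 0)
                serve_session(client);
        }
    }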

So I started looking at whether there was an existing simple file
system API for this connection between the FUSE client and the Coda
cache daemon, so as not to reinvent wheels, and ended up at the plan9
file system protocol (9P). An advantage here is that there already are
one or more FUSE client implementations, there is a native 9pfs Linux
kernel module, and there are several userspace libraries that can be
linked directly with applications.
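
With the native kernel module, for example, mounting boils down to a
single mount(2) call. A minimal sketch, assuming the cache daemon is
listening on the conventional 9P port on localhost (the address, port,
and options here are assumptions, not necessarily what a released
client uses):

    #include <stdio.h>
    #include <sys/mount.h>

    int main(void)
    {
        /* Roughly: mount -t 9p -o trans=tcp,port=564 127.0.0.1 /coda */
        if (mount("127.0.0.1", "/coda", "9p", 0,
                  "trans=tcp,port=564,version=9p2000.u") != 0) {
            perror("mount");
            return 1;
        }
        return 0;
    }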

Without extensions you only get access to files and directories, enough
to recover some data from Coda in an emergency but probably not enough
for general purpose usage. The specification is extensible, though, and
with the Unix and Linux extensions there are symlinks. I initially
implemented some basic read-only functionality; Alexandra Snoy took
this, added read-write functionality, and implemented support for both
the Unix and Linux extensions. The Unix extensions have been merged and
are in recent releases; I still have to pull a fix to translate fid
mappings after reintegration, as well as the support for the Linux
extensions.

We tested with a modified Andrew benchmark, untarring and building a
Coda source tree: locally, in Coda using the Coda kernel module, and in
Coda using the Linux 9pfs kernel module with and without caching.

- on local filesystem,                 1m 59s
- in Coda using Coda kernel module,    2m  9s
- in Coda using 9p2000,                67m 39s
- in Coda using 9p2000 with caching,   3m 51s
- in Coda using 9p2000.u,              70m 23s
- in Coda using 9p2000.L,              ~5m

There was an issue with 9p2000.u with caching; I forget whether that
got resolved, and I'm not sure whether the 9p2000.L run was with
caching or whether it implicitly enables caching.

So although the Unix or Linux extensions could give us ioctls, those
would miss the path->fid resolution provided by the Coda kernel module,
there are platforms that still wouldn't have ioctls anyway, and ioctls
don't exist in the core plan9 protocol, so... we reimplemented pioctl
handling by modifying the wrappers that normally hide the various
Unix/Windows differences.

The same way we fake a realm mountpoint when we look up a name that
looks like a domain name with a Coda server running, we create a
temporary file that exists only in the local cache when it is opened
with a special name. We then write the pioctl 'request' to the file and
close it. Only the user who created the file can reopen it for reading,
and if it is never reopened it will eventually get dropped from the
cache. When it is opened for reading, the actual pioctl 'upcall' is
triggered and the result is written back into the file before the open
returns. Because all of this is implemented in the pioctl wrapper, no
applications that use pioctls actually had to be changed, so this works
for cfs, clog, cunlog, ctokens, hoard, repair, filerepair, and
removeinc.
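
A minimal sketch of what that flow might look like inside the wrapper
(the special file name and the request encoding are made up here; the
real wrapper hides all of this from the tools above):

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int fake_pioctl(const char *dir, const void *req, size_t reqlen,
                    void *reply, size_t replylen)
    {
        char path[1024];
        int fd;
        ssize_t n;

        /* Hypothetical special name; opening it makes the client
         * create a temporary, cache-only file. */
        snprintf(path, sizeof(path), "%s/.pioctl", dir);

        /* Write the pioctl 'request' into the file and close it. */
        fd = open(path, O_WRONLY | O_CREAT, 0600);
        if (fd < 0)
            return -1;
        write(fd, req, reqlen);
        close(fd);

        /* Reopening for reading triggers the actual pioctl upcall;
         * the result is in the file by the time open() returns. */
        fd = open(path, O_RDONLY);
        if (fd < 0)
            return -1;
        n = read(fd, reply, replylen);
        close(fd);
        return n < 0 ? -1 : 0;
    }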

And... the last couple of released versions already include this
support, so your client may already be doing pioctls through temporary
files without actually going through the Coda kernel module's pioctl
handler.

Some of the things that are still needed are the pulls from Alexandra
to get the Linux extensions merged. We also need to namespace user
identities so that the same user ids in different plan9 mounts don't
share tokens; this will probably be a fair amount of work, because user
ids are pretty much everywhere.

Jan
Received on 2018-11-25 22:05:07