Coda File System

Re: Coda development roadmap

From: <>
Date: Sat, 2 Aug 2014 00:16:50 +0200
Hello Jan,

I am thankful for your feedback and sorry if you feel that your
comments have been ignored. I should have explicitly acknowledged that
your feedback has influenced my reasoning, and also that it is I who
bear the responsibility when not following your advice.

You know the algorithms and the internals of Coda a whole lot better than
probably anyone else. I am much less familiar with the code, but I see
Coda more from the deployment perspective and am thus better placed to
see the implications there. This leads our views and reasoning to differ.

On Fri, Aug 01, 2014 at 01:54:06PM -0400, Jan Harkes wrote:
> On Thu, Jul 31, 2014 at 03:19:01PM +0200, wrote:
> > The definition of the relevant core functionality is based on my
> > experience of using Coda for myself and deploying it at Aetey and Chalmers
> > for about 12 years - and of course on your comments or lack of those.
> I have actually been an active Coda user for 18 years and a developer
> for about 16 of those, and have been giving you feedback which has been
> completely ignored. So it seems completely pointless for me to give you
> feedback on any of this roadmap anyway, but here I go.

I consider it plainly necessary and fair to let the reader know the
background that led to the ideas and proposals.

Please do not feel offended. Coda would hardly have become the system I
fell in love with had it not been for your work. I never ignore your
opinions, even though I do not always share them.

> I disagree with quite a lot in your email and will highlight a
> few here.

Talking (as we do now) helps a lot to reduce disagreements.

> > in the order of diminishing importance:
> > 
> > - delivering software to *nix-like workstations and servers
> >   avoiding any dependency on locally installed software
> > - large scale administration of *nix-like workstation (solution
> >   originally developed at Chalmers, "hotbeat")
> > - storing the data to be used/published via web-like services
> > - accessing one's own personal and/or work-related data
> >   (aka homedir and alike)
> > - storing mail (in Maildir)
> It looks like in this list actual users are listed as #4 and #5, but the
> top priorities are sysadmin/package install stuff that can be done with
> something like rsync and a nightly cronjob.

This order of importance simply reflects my experience, which differs
from yours.

What you do with "rsync and a cronjob" I either do differently or do not
have to do at all - thanks to Coda.

You know very well how to do such things in a certain way, but I have
also compared "the traditional way" (rsync) with the approach Coda makes
possible. The latter makes a huge difference.

> > - storing mail (in Maildir)
> >   value of Coda: convenient, eliminates the need for an extra protocol
> >         (like IMAP) and extra authentication and authorization management,
> >         mail contents is consistently cached at the client/MUA,
> >         MXs can act in parallel instead of buffering/resending
> Ignoring the 'minor' inconveniences of 4096 entry directory limitations,
> this is actually only convenient if your email application treats a
> Maildir folder pretty much like an IMAP server because building a simple
> index of all the email in a folder requires an access to every email. My
> inbox currently has 16864 emails, and that doesn't even include
> mailinglist traffic which is placed in their own respective folders.

I agree, this would hardly be applicable to a usage pattern like yours.

Note that in the summary I listed the _positive_ role of Coda.

Getting rid of its _dis_advantages and limitations is the goal, for the
sake of the advantages it does offer.
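Jan's point about index building can be made concrete: enumerating a
Maildir only reads the directory, but building even a trivial index
forces one open() per message - which on a cold Coda client cache means
one fetch per message. A minimal sketch using Python's stdlib mailbox
module (the throwaway directory and message count are of course made up):

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage

# Create a throwaway Maildir and deliver a few messages into it.
root = tempfile.mkdtemp()
md = mailbox.Maildir(os.path.join(root, "inbox"), create=True)
for i in range(100):
    msg = EmailMessage()
    msg["Subject"] = f"message {i}"
    msg.set_content("body")
    md.add(msg)

# Counting messages only reads the directory entries: cheap.
count = len(md)

# Building even a trivial index (here: all subjects) opens and parses
# every single file - the per-message access cost Jan describes.
index = [m["Subject"] for m in md]

print(count, len(index))
```

With 16864 messages in an inbox, that second step is 16864 fetches on a
freshly installed client, which is exactly why MUAs grow their own index
caches - and inherit the conflict problem with them.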

> But luckily mutt, kmail, etc. do create their own index caches which
> significantly speed up loading large maildir folders. However the way
> these caches are updated often do not have the same lockless properties
> of maildir itself, so now instead of getting conflicts on the email
> folders, you get conflicts on the index. The one good thing is that at

You probably have a quite specific scenario/situation in mind.

By the way, it was you who once shared with me your experience with
putting mail on Coda. That knowledge helped me a lot when I decided
to try it too!

> least that doesn't prevent delivery... unless when you deliver on the
> same machine as where you read your email and there is a conflict on the
> index, everything gets appended to the CML and nothing is propagated
> back to the servers even if you have tokens. Guess how much email you
> can lose when you install a new client.

I have no reason to run mutt on the same computer where the mail server
lives. My infrastructure is to all appearances extremely different from
yours, so it is no surprise that we see different pictures, possibilities
and problems. I believe noticeable benefits can be had, and problems
avoided, by choosing appropriate practices.

> > - volume names to be treated as comments, meant for humans only,
> >   dropping the corresponding indirection layer and the related code
> The corresponding indirection layer is only used for humans. Internally
> Coda clients and servers use the volume id, the only places the name is
> used is for cfs makemount and when volume ids are mapped back to names
> when we display updates in f.i. codacon or cfs listvol.

Oh, thanks! A confusion of mine has been cured.
I believed mountpoints contained volume names.

> > - clients need to contact VSGs but servers only need to contact AVSGs,
> >   severs have also higher demands on reliability of the mapping
> >   AVSG (a set of server ids) => set of endpoints (ip:port),
> >   the mapping is to be implemented by a "db/servers"-lookalike
> Do you actually know what AVSG means? Because it isn't just a set of
> serverids.

Doesn't it mean "available volume storage group"? :) From the papers I
read it was not entirely clear which part of the related information
(like version vectors, or other state) must travel to/from the client
and which part can be handled entirely between the servers. I am doing
my best.

> > - volume ids shall be maintained realm-wise not server-wise
> >   (each replica of the same volume shall bear the same volume id),
> >   dropping the extra mapping from repvol to volreps and the corresponding
> >   code
> Then you cannot expand a conflict on the client and deal with the fact
> that we then have to compare different directory objects which will have
> identical object identifiers (realm.volume.vnode.uniquefier).

You seem to imply something I did not mean.
One can always _construct_ a unique identifier per replica by combining
the replicas' common id with the id of the hosting server.

Paraphrasing your notation above, each object would then be identified
by (serverid, realm.volume.vnode.uniquefier).

Today, if one server has a replica 01000023 and another has 02000023,
the two are distinguishable only by the server-id part of the volume id
anyway. What I mean is: at volume creation, ensure that the part of the
id other than the server id is equal between replicas. There is no need
to _store_ the server id there if we pass it along when necessary. No
logical difference, less code and less confusion for the next
deployer/newcomer.
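The construction I have in mind fits in a few lines. A hedged sketch -
the field widths are hypothetical, chosen only to mirror the
01000023/02000023 example, not taken from the Coda sources:

```python
# Hypothetical layout: top byte = server id, low 3 bytes = per-realm
# volume id that is identical across all replicas of a volume.
SERVER_SHIFT = 24

def stored_id(volume_id: int) -> int:
    """What would be stored on disk under the proposal: the common
    volume id only, with no server id baked in."""
    return volume_id & ((1 << SERVER_SHIFT) - 1)

def per_replica_id(server_id: int, volume_id: int) -> int:
    """Construct a per-replica identifier on demand - e.g. while
    expanding a conflict on the client - by combining the common
    volume id with the id of the hosting server."""
    return (server_id << SERVER_SHIFT) | stored_id(volume_id)

# Today's replica ids 0x01000023 and 0x02000023 share the volume part:
assert stored_id(0x01000023) == stored_id(0x02000023) == 0x000023
# ...yet distinct per-replica ids can still be produced when needed:
assert per_replica_id(1, 0x000023) != per_replica_id(2, 0x000023)
```

So replicas stay distinguishable whenever the server id travels with the
request; it just stops being duplicated inside the stored volume id.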

> The replicated volume and volume replica distinction goes much deeper
> than just making for confusing repvol vs. volrep naming.

Your description fully matches my idea of them, but I am grateful for
the summary; it helps to keep the relevant properties in mind while
reading the code and trying to make sense of it.

> > - the kernel part of the Coda client is to be simplified by dropping
> >   the pioctl part which should instead go via a plain socket out of
> >   the /coda name space, importing the change from Ulocoda
> The kernel part of the pioctl interface is tiny. text is 163 bytes and

It is not about the size, but about convenience and clarity.

The kernel module has the apparent and fundamental role of translating
file-access system calls into upcalls to Venus and back. Its other,
barely related function is to provide an authenticated IPC facility.

If such functionality is scarcely available on different platforms,
let's implement it, but then could the interface be made more general?
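On Linux at least, such a general-purpose facility already exists: a
Unix-domain socket where the kernel, not the client, vouches for the
caller's uid via SO_PEERCRED. A hedged, Linux-only sketch (the socket
path is hypothetical, and the pid/uid/gid layout of struct ucred is
Linux-specific):

```python
import os
import socket
import struct
import threading

SOCK_PATH = "/tmp/venus-ipc.sock"   # hypothetical rendezvous path

def serve_one(listener: socket.socket) -> int:
    """Accept one connection and return the uid the *kernel* reports
    for the peer. Unlike a uid field written into a pioctl request,
    the client cannot lie about this value."""
    conn, _ = listener.accept()
    creds = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                            struct.calcsize("3i"))
    pid, uid, gid = struct.unpack("3i", creds)   # struct ucred (Linux)
    conn.close()
    return uid

listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
if os.path.exists(SOCK_PATH):
    os.unlink(SOCK_PATH)
listener.bind(SOCK_PATH)
listener.listen(1)

results = []
t = threading.Thread(target=lambda: results.append(serve_one(listener)))
t.start()

# A client merely connecting is enough for the server to learn its uid.
client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
client.connect(SOCK_PATH)
client.close()
t.join()
listener.close()

print(results[0] == os.getuid())
```

Something of this shape is what I understand the Ulocoda change to do:
the pioctl traffic moves out of the /coda namespace onto an ordinary
socket, and the authentication question is answered by the kernel's own
peer-credential mechanism rather than by trusting the caller.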

I really wish I could use the same Venus binary as long as the kernel
provides the necessary general-purpose ABI. This would reduce my
concrete client maintenance burden by a factor of 3-4.

I assume this would be possible by delegating the mount() operation
(which is not mapped/mappable in the different implementations of the
Linux ABI) to a separate tiny "native" executable. Then Venus would
only use the "portable" system calls - iff the pioctl IPC interface is
platform-independent at run time, not only at compile time. AFAIK it is
not: I do not think pioctl() can be expected to be emulated by different
implementations of the Linux ABI.

> uid of the connecting client. So either relying on the politeness of the
> client to not lie about the identity of the user making the request, or
> assuming that every system is a single user system.

Looking at it from a different side, I would actually prefer to run a
Venus instance (and a cache instance) per local uid. This would make
Venus simpler - it would no longer be in the business of managing local
uids. Doesn't this look like a useful approach? (Modulo losing the
data-access correlation between users on the same computer - but that
is an optimization in the current design which costs Venus complexity
and results in a known deficiency in security isolation.)

> Moving to FUSE would be a much better approach if you want to get rid of
> kernel complexity. Patches are welcome.

That would certainly be more portable, but I expect we would lose
performance. I appreciate that the current implementation is very
efficient for read()/write()/seek().

Thanks again, Jan. I am concerned that you do not feel your comments
are appreciated. We sometimes talk past each other and may have
insufficient empathy for the situations we each imagine - but I always
take your comments very seriously, even when your arguments do not
convince me right away.

I hope you will be kind enough to enlighten me sometimes about the
implementation details, and that you will bear with the difference
between our perspectives on Coda.

Best regards,
Received on 2014-08-01 18:17:17