Coda File System

Re: Yellow zone, slowing down writer

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Thu, 19 Feb 2004 16:20:16 -0500
On Thu, Feb 19, 2004 at 01:22:26PM -0600, Jason A. Pattie wrote:
> I also noticed that it cannot/does not put the correct permissions on
> symlinks?  And apparently it cannot make hardlinks, either, even in the
> same volume.

Cross-directory hardlinks are not supported; the server-server
resolution has a hard enough time with cross-directory renames.

As far as I know, symlinks don't really have permissions in Unix. The
following is on a local ext2 filesystem,

    delft:~$ ln -s foo bar
    delft:~$ ls -l bar 
    lrwxr-xr-x    1 jaharkes jaharkes        3 Feb 19 15:43 bar -> foo
    delft:~$ chmod 777 bar
    chmod: failed to get attributes of `bar': No such file or directory

That is because the chmod operation always tries to follow the link, and
there is nothing like an lchmod operation (analogous to lstat).

Interestingly the initial bits for the link are still set based on your
current umask.

    delft:~$ (umask 0000 ; ln -s foo bar)
    delft:~$ ls -l bar 
    lrwxrwxrwx    1 jaharkes jaharkes        3 Feb 19 15:43 bar -> foo
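
A minimal sketch of the link-following behaviour, assuming a POSIX shell
with mktemp available. Unlike the session above, the target foo exists
here, so the chmod succeeds and the change shows up on the target rather
than on the link. The variable names are made up for illustration.

```shell
#!/bin/sh
# Sketch: chmod through a symlink changes the target, not the link.
# Runs in a throwaway directory; foo/bar mirror the names used above.
tmp=$(mktemp -d)
cd "$tmp" || exit 1

touch foo                 # unlike the session above, the target exists
chmod 600 foo
ln -s foo bar

link_before=$(ls -ld bar | cut -c1-10)
chmod 644 bar             # follows the link and updates foo
link_after=$(ls -ld bar | cut -c1-10)
target_after=$(ls -ld foo | cut -c1-10)

echo "link:   $link_before -> $link_after"
echo "target: $target_after"
```

The mode string of the link is the same before and after, while foo picks
up the new permissions.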


On Thu, Feb 19, 2004 at 01:29:02PM -0600, Jason A. Pattie wrote:
> Jan Harkes wrote:
> | I've said this many times before, there is no such thing as guaranteed
> | connected operation in Coda. If anything goes wrong during a write/store

Sorry about the ranting in my response there.

> So where do I begin the process of troubleshooting what went wrong?  I'm
> currently in the Red state.  The scenario is that I'm on the actual file

Ok, if you've hit the red state, it is pretty much certain that there is
a conflict and nothing is actually being reintegrated.

> server (scm) transferring files from one location on another partition
> to the /coda/<realm>/<dir> location using tar.  I.e., the file server is
> running both codasrv and coda-client (venus).  Is this a bad thing?

It is not really a bad thing, it just stresses your system somewhat.

We've got a tar process reading data off the disk, then opening the
(existing) files in Coda in order to overwrite them. But venus doesn't
realize you're completely replacing them, so it first fetches a copy from
the server. So the Coda server is reading off the disk and venus is
writing to the disk, all of this in relatively small blocks and a lot of
back and forth context switching between the client and the server.

Then the tar process gets a chance to overwrite the cached data in the
client. If the client is logging, a backup copy is made when the file is
closed to allow other processes to open/write the file without
interfering with the copy that is being reintegrated. We then send the
data to the server (read on the client, write on the server). So we
aren't really reading and writing the X megabytes of data just once. We
actually end up reading and writing about three times as much. (We only
write two times X when we're not reintegrating, because we don't need to
make the backup copy.)
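
To put rough numbers on that, here is a small sketch that applies the
approximate factors from the paragraph above (about 3x when the client is
reintegrating, about 2x when it is not). The helper estimate_io_mb is
hypothetical and not part of Coda; the factors are only the ballpark
figures given above, not measurements.

```shell
#!/bin/sh
# Rough estimate of data read/written when overwriting existing files
# in Coda. estimate_io_mb is a hypothetical helper, not a Coda tool.
estimate_io_mb() {
    size_mb=$1          # X: megabytes of file data being copied
    reintegrating=$2    # "yes" if the client is logging/reintegrating
    if [ "$reintegrating" = "yes" ]; then
        echo $((size_mb * 3))   # about three times X, backup copy included
    else
        echo $((size_mb * 2))   # about two times X, no backup copy
    fi
}

logged=$(estimate_io_mb 500 yes)
connected=$(estimate_io_mb 500 no)
echo "reintegrating: ~${logged} MB, connected: ~${connected} MB"
```

So a 500 MB copy turns into roughly 1500 MB of I/O while reintegrating,
before the server's fsync behaviour is even taken into account.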

Now for a lot of files this really isn't a problem because they are
small enough to fit in memory and the client doesn't actively sync
updates to disk. So on a machine with only a client all this reading and
writing is purely to/from the pagecache, and many files are deleted
before any of the data has even had time to hit the disk.

However the server does call fsync after every major modification and
the Linux kernel implementation doesn't really keep track of dirty
blocks on a per-file basis. Up to version 2.2 it in fact flushes every
single dirty block in the system to disk; I think 2.4 flushes all dirty
blocks associated with the same hard drive. Combined with the memory/swap
usage you're probably giving the poor machine quite a beating.

> directory.  My assumption was that I could reissue the tar command and
> replace all the files with the correct ownership and permissions.  If
> this is not the case, then I can easily delete all files and start the
> copy over again.

What I tend to do is use rsync locally instead of tar. Only the
differences between the two trees are copied that way, and it could be a
little slower/easier on Coda because of the additional time spent
checksumming. It also makes it easy to abort and restart the operation.

Then I have a 'codacon' process running in an xterm and I keep an eye on
that. Whenever it stops giving the SetAttr/Store messages the client
probably went disconnected and I suspend the rsync (^Z), run cfs
checkservers, and cfs writereconnect/cfs listvolume to make sure
everything is reintegrated before I continue the rsync.
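
A simplified sketch of that sequence follows. It is dry-run by default and
only prints the commands. The SRC and DST paths are placeholders, passing
the directory to cfs listvolume is an assumption about how you would
invoke it, and the interactive part (watching codacon and suspending rsync
with ^Z) is not modelled.

```shell
#!/bin/sh
# Dry-run by default: commands are printed rather than executed.
# Set RUN="" in the environment to really run them (needs a Coda client).
RUN=${RUN-echo}

# Placeholder paths, for illustration only.
SRC=${SRC:-/data/export/}
DST=${DST:-/coda/example.realm/dir/}

check_reintegration() {
    $RUN cfs checkservers          # probe the servers again
    $RUN cfs writereconnect        # ask for connected writes again
    $RUN cfs listvolume "$DST"     # inspect the volume state
}

steps=$(
    $RUN rsync -a "$SRC" "$DST"    # safe to abort and rerun
    check_reintegration
)
printf '%s\n' "$steps"
```

Because rsync only copies what still differs, rerunning the whole thing
after a disconnection picks up where the previous run stopped.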

It is a bit of hand holding, but when I have to split up a volume or
otherwise move a lot of data it is often because a server has run low or
out of space in a partition and it is important to get it there as
quickly and with as little hassle as possible.

Jan
Received on 2004-02-19 16:24:12