Coda File System

Re: /coda: Input/output error

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Thu, 26 Aug 1999 22:12:04 -0400

On Fri, Aug 27, 1999 at 12:30:38AM +0200, Torsten Foertsch wrote:
> hi,
> 
> I have 2 questions.
> 
> (1) Is there any way to avoid the venus cache overrun? I have tried 'cfs
> strong', 'venus -cf 10240 -c 102400 -vdds 300000000 -vlds 30000000'.
> 
> I thought I had made the cache large enough to hold the entire /usr filesystem
> (ca. 300MB). This blew up the venus client to 350 MB RAM but the problem
> still happens.

What is the reason that venus crashes? It should be logged in either
/usr/coda/venus.cache/venus.log, or /usr/coda/etc/console.

If the reason is running out of xxx then "cfs strong" should keep the
client connected, so no logs are being built up. It is also useful to
run "codacon" in a separate xterm, and suspend the copying when no
Create/Store messages are printed for a while.
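
As a rough sketch of the log check, something like the following prints the
tail of both files. The two paths are the ones named above; the helper names
(tail_lines, show_recent_logs) and the default of 20 lines are only
illustrative assumptions, not part of Coda.

```python
from collections import deque

# Log locations as named in this message; adjust if your install differs.
LOG_PATHS = ["/usr/coda/venus.cache/venus.log", "/usr/coda/etc/console"]


def tail_lines(path, count=20):
    """Return the last `count` lines of `path`, or [] if it cannot be read."""
    try:
        with open(path, "r", errors="replace") as handle:
            # A bounded deque keeps only the most recent `count` lines.
            return list(deque(handle, maxlen=count))
    except OSError:
        return []


def show_recent_logs(paths=LOG_PATHS, count=20):
    """Print the tail of each log so the crash reason is easy to spot."""
    for path in paths:
        print(f"==> {path} <==")
        for line in tail_lines(path, count):
            print(line.rstrip("\n"))


if __name__ == "__main__":
    show_recent_logs()
```

A missing or unreadable file simply shows up as an empty section, so the
script can be run before either log exists.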

> (2) Sometimes I get 'ls: ...: Connection timed out'. I tried 'cfs
> waitforever'. Yes, it waits forever. This timeout happens also on
> readonly access. If I cannot avoid this I can't use coda for my servers. 

Hmm, resolution must be stuck. I believe I just found a bad line of code
which is triggered when a new thread enters a volume that is currently
resolving, where it will wait forever. I actually have been stressing
the resolution/reintegration stuff lately and fixed several problems. I
hope to get some new binaries out of the door soon.

> Now I have a 'Local inconsistent object at /coda/bin' after such a
> connection timeout. How can I repair it?

Make sure you have tokens:
    # ctokens

Use the repair tool to fix the name-name conflicts in the bin directory.
    # repair /coda/bin /tmp/fix -owner 0

or...
    # repair
    repair > beginrepair /coda/bin
    repair > comparedirs /tmp/fix -owner 0
    repair > dorepair
    repair > quit
    #

> I have written a perl script to stress a filesystem. Are you
> interested in it? Coda is about 2-3 times slower than an ext2fs. This is
> a good result I think.

The overhead is mostly caused by the open and close syscalls, which
go back up into userspace (to venus), i.e. the script will run slower
when the average file size is smaller.
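
To see why a fixed cost per open/close hits small files hardest, here is a
back-of-the-envelope model. All the numbers (per-file overheads, throughput,
total size) are made up for illustration and are not measurements of Coda or
ext2fs; copy_time and slowdown are hypothetical helpers.

```python
# Hypothetical figures, chosen only to show the shape of the effect.
TOTAL_BYTES = 100 * 1024 * 1024        # amount of data written in one run
THROUGHPUT = 10 * 1024 * 1024          # bytes per second, same for both cases
LOCAL_OVERHEAD_S = 0.0001              # assumed open+close cost, in-kernel fs
UPCALL_OVERHEAD_S = 0.002              # assumed open+close cost with an upcall


def copy_time(total_bytes, file_size, per_file_overhead_s, throughput):
    """Estimated seconds to write total_bytes as files of file_size bytes."""
    file_count = total_bytes / file_size
    return file_count * per_file_overhead_s + total_bytes / throughput


def slowdown(file_size):
    """Ratio of the upcall case to the local case for a given file size."""
    local = copy_time(TOTAL_BYTES, file_size, LOCAL_OVERHEAD_S, THROUGHPUT)
    upcall = copy_time(TOTAL_BYTES, file_size, UPCALL_OVERHEAD_S, THROUGHPUT)
    return upcall / local


if __name__ == "__main__":
    for size in (4 * 1024, 64 * 1024, 1024 * 1024):
        print(f"{size // 1024:5d} KB files: slowdown {slowdown(size):.2f}x")
```

The transfer time is the same in both cases, so the ratio is driven entirely
by the number of files: the smaller the files, the more open/close pairs, and
the larger the slowdown.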

Jan
Received on 1999-08-26 22:13:01