Coda File System

Re: minimum Venus cache size

From: Jan Harkes
Date: Mon, 9 Dec 2002 10:59:03 -0500
On Sun, Dec 08, 2002 at 06:30:56PM -0500, Anthony Nicholson wrote:
> (1) What is the minimum cache size one can specify in venus-setup (the 
> value of "cacheblocks" in /etc/coda/venus.conf)?
> (2) I originally had my system configured with a ~20 MB cache
> (cacheblocks = 20000), and everything was working fine. I changed the
> cache size to 4092, restarted venus using -init, and it will run
> successfully for a bit before it coredumps with the following backtrace
> (from gdb):

About 20MB is the minimum that is known to work; you might be able to
get it smaller by tweaking things like the number of files or CML entries.

I would recommend keeping your cache size at no less than about twice
the size of the largest file, especially if you are trying to keep the
cache as small as possible.
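
As a rough worked example of that rule of thumb, here is a minimal
sketch. It assumes one cache block is 1 KB (inferred from cacheblocks =
20000 being about 20MB); suggested_cacheblocks and below_known_good are
hypothetical helper names, not part of Venus or venus-setup.

```cpp
#include <cstdint>

// Assumption: one cache block is 1 KB, inferred from the thread
// (cacheblocks = 20000 corresponds to roughly 20MB).
constexpr std::uint64_t kBlockBytes = 1024;

// Smallest cacheblocks value reported as known to work in this thread.
constexpr std::uint64_t kKnownGoodBlocks = 20000;

// Hypothetical helper: twice the largest file, rounded up to whole blocks.
constexpr std::uint64_t suggested_cacheblocks(std::uint64_t largest_file_bytes) {
    return 2 * ((largest_file_bytes + kBlockBytes - 1) / kBlockBytes);
}

// Hypothetical helper: true if a setting is below the known-good size,
// i.e. in the range where other limits may also need tweaking.
constexpr bool below_known_good(std::uint64_t cacheblocks) {
    return cacheblocks < kKnownGoodBlocks;
}
```

With a largest file of 2 MB this gives 4096 blocks, which is well below
the 20000 blocks that are known to work.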

> (gdb) bt
> #0  0x420292e5 in sigsuspend () from /lib/i686/
> #1  0x080ae5ad in SigChoke (sig=11) at
> #2  <signal handler called>
> #3  rec_olist::first (this=0x20081f08) at
> #4  0x080e4dfe in rec_olist_iterator::operator() (this=0x81b4650)
>    at
> #5  0x080e54ab in rec_ohashtab_iterator::operator() (this=0x15089ed8)
>    at
> #6  0x0809211e in volrep_iterator::operator() (this=0x15089ed8)
>    at
> #7  0x080929a3 in vdb::TakeTransition (this=0x20083f88) at
> #8  0x080925bd in VolDaemon () at
> #9  0x080a4d10 in vproc::main (this=0x812a068) at
> #10 0x080a459a in VprocPreamble (init_lock=0x812a0a8) at
> #11 0x400999be in Create_Process_Part2 () at lwp.c:792
> It crashes with signal 11:
> 18:32:16 Fatal Signal (11); pid 7430 becoming a zombie...
> 18:32:16 You may use gdb to attach to 7430
> 18:33:13 RecovTerminate: clean shutdown
> As you can see from the tail of venus.log below, it seems to crash in 
> BeginRvmFlush(). I really need to decrease my cache size to the absolute 
> minimum Coda allows for an experiment we are doing.

No, it clearly crashes in the volume iterator, which tries to enter and
exit all volumes to trigger a connectivity state transition. The small
cache might be causing volume replicas to be destroyed while we're
walking through the replicated volume list. The crash happens on the
first volume we try to get off of the list of volume replicas
(rec_olist::first()).

The iterator is initialized before we start walking the lists and
nothing is holding a reference on this first volume. Since entering and
leaving a volume are yielding operations, another thread could easily
destroy the volume, and the code then crashes dereferencing a dead
pointer. The same problem probably also exists with the 'next' pointer.

The repvol_iterator and volrep_iterator really should bump the refcount
on any volume objects they hold a pointer to.
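
To illustrate the idea, here is a minimal, self-contained sketch of an
iterator that holds a reference on the element it points at. It is not
the actual Venus code: Volume, VolumeList and HoldingIterator are
hypothetical stand-ins, and deferring destruction until the last
reference is dropped is an assumption about how the fix could behave.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a volume object (not Coda's real class).
struct Volume {
    int id;
    int refcnt;    // number of holders, e.g. iterators
    bool dying;    // destruction was requested while still referenced
    Volume* next;
};

// Hypothetical singly linked list that defers destruction of held volumes.
class VolumeList {
public:
    VolumeList() : head_(nullptr) {}
    ~VolumeList() {
        while (head_) { Volume* v = head_; head_ = v->next; delete v; }
    }
    VolumeList(const VolumeList&) = delete;
    VolumeList& operator=(const VolumeList&) = delete;

    Volume* push_front(int id) {
        head_ = new Volume{id, 0, false, head_};
        return head_;
    }
    Volume* first() const { return head_; }
    std::size_t size() const {
        std::size_t n = 0;
        for (Volume* v = head_; v; v = v->next) ++n;
        return n;
    }
    void hold(Volume* v) { ++v->refcnt; }
    // Drop a reference; reap the volume if destruction was deferred.
    void release(Volume* v) {
        assert(v->refcnt > 0);
        if (--v->refcnt == 0 && v->dying) unlink_and_delete(v);
    }
    // Request destruction; it only happens now if nobody holds the volume.
    void destroy(Volume* v) {
        v->dying = true;
        if (v->refcnt == 0) unlink_and_delete(v);
    }

private:
    void unlink_and_delete(Volume* v) {
        Volume** pp = &head_;
        while (*pp && *pp != v) pp = &(*pp)->next;
        if (*pp) *pp = v->next;   // keeps the predecessor's next pointer valid
        delete v;
    }
    Volume* head_;
};

// Iterator that keeps a reference on the volume it currently points at.
class HoldingIterator {
public:
    explicit HoldingIterator(VolumeList& l)
        : list_(l), cur_(nullptr), started_(false) {}
    ~HoldingIterator() { if (cur_) list_.release(cur_); }

    // Returns the next volume, or nullptr at the end of the list.
    Volume* operator()() {
        // Fetch the first element lazily rather than caching it at
        // construction time, when nothing would be holding it.
        Volume* nxt = started_ ? (cur_ ? cur_->next : nullptr) : list_.first();
        started_ = true;
        if (nxt) list_.hold(nxt);          // pin the new element first
        if (cur_) list_.release(cur_);     // may free the old element
        cur_ = nxt;
        return cur_;
    }

private:
    VolumeList& list_;
    Volume* cur_;
    bool started_;
};
```

Because the current volume stays linked for as long as the iterator
holds it, its 'next' pointer is kept up to date when a neighbour is
destroyed, so reading it on the following call is safe in this sketch.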

Received on 2002-12-09 11:03:03