Re: Venus Segfault

From: Alan Tam
Date: Thu, 10 Feb 2005 06:31:37 +0800
Jan Harkes wrote:

>>03:55:48 starting VDB scan
>>03:55:48 Fatal Signal (11); pid 6701 becoming a zombie...
>>03:55:48 You may use gdb to attach to 6701
>Between 'starting VDB scan' and the next message 'N volume replicas' is
>only a little bit of code. We iterate over the list of all known volumes
>and reset the non-persistent data. The only way that this could crash is
>if the linked list is somehow messed up. I don't know how your client
>got into that state since RVM should guarantee that any updates to this
>list are either atomically committed or aborted.
>So I have a pretty good idea where it crashed, but no idea how it
>managed to crash there.
Maybe it was caused by my manually editing these files [1] to correct
the wrongly detected machine names. I should probably remove everything
and install again. I have plenty of experience with that anyway.

My general impression is that Coda sometimes works and sometimes does
not, given nearly the same installation procedure on a couple of test
machines. Maybe the scripts didn't shut down the processes correctly.
Maybe it is the "strange" network configuration I have.

Still, sometimes I have no way to discover where the problems are. A
process can freeze without my knowing what it is waiting for [2]. And in
most cases, the messages logged are simply not enough to track down what
is configured wrong.

Thanks for your help anyway.

[1] Files I edited manually:
- hostname
- db/scm/
- db/servers
- db/vicetab
- vol/remote/*
- vol/BigVolumeList

[2] For example, this listing took 51 seconds:
sltam_at_beta:/coda$ date; ls -l; date
Thu Feb 10 06:23:35 HKT 2005
total 9
dr-xr-xr-x   2 root guest 2048 Dec 25 02:57 ./
drwxr-xr-x  25 root root  4096 Feb  5 19:12 ../
lrw-r--r--   1 root guest    9 Feb 10 03:29 ->
Thu Feb 10 06:24:26 HKT 2005
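
The session above brackets `ls -l` with two `date` calls, and the timestamps are 51 seconds apart. The same measurement could be scripted roughly as follows. This is only a sketch: `elapsed` is a hypothetical helper, not part of Coda, and it assumes a `date` that supports `+%s` (GNU and BSD do).

```shell
# Hypothetical helper (not part of Coda): print how many whole seconds a
# command takes, the same measurement as `date; ls -l; date`.
elapsed() {
    start=$(date +%s)        # seconds since the epoch, before the command
    "$@" >/dev/null 2>&1     # run the command, discarding its output
    end=$(date +%s)          # seconds since the epoch, after the command
    echo $((end - start))    # whole seconds elapsed
}
```

Running `elapsed ls -l /coda` in the situation above would be expected to print a value around 51. It only shows how long the call blocked, not what it was waiting for.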

