Coda File System

Re: Venus died

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Thu, 8 Sep 2005 17:25:50 -0400
On Wed, Sep 07, 2005 at 10:59:23AM +0300, Denis Chapligin wrote:
> Okay, I've upgraded the client to 6.0.11 and found a new bug :)
> If the server listens on one interface and a client tries to connect
> through a different interface on the server, the server tells the client
> the hostname of the correct interface. But if that interface isn't
> accessible from the client, venus will die with:
> 
> ====
> venus: refcounted.h:55: void RefCountedObject::PutRef(): Assertion `refcount > 0' failed.
> 10:43:11 Fatal Signal (6); pid 2296 becoming a zombie...
> 10:43:11 You may use gdb to attach to 2296
> 10:43:18 RecovTerminate: dirty shutdown (1 uncommitted transactions)
> ====
> 
> After that, venus never starts again; it fails during initialization.
> 
> ====
> 11:29:10 starting FSDB scan (4166, 100000) (25, 75, 4)
> 11:29:10        4 cache files in table (0 blocks)
> 11:29:10        4162 cache files on free-list
> venus: refcounted.h:55: void RefCountedObject::PutRef(): Assertion `refcount > 0' failed.
> 11:29:10 Fatal Signal (6); pid 2367 becoming a zombie...
> 11:29:10 You may use gdb to attach to 2367
> 11:29:23 RecovTerminate: dirty shutdown (1 uncommitted transactions)
> ====

Interesting, I don't really see how the failed realm resolution /
connection attempt and the assertion are related, especially across
client restarts. I would at least expect it to die during the VDB scan,
because that is where we try to rebuild the realm and server structures.

There must be a double PutRef call somewhere.

One possible cause is that some object was created for the realm and
believes it holds a valid reference when it really doesn't. When venus
then tries to destroy the runt object, the resulting PutRef on the realm
would push the realm's reference count below zero, which trips the
assertion. Because we crash, the pending RVM transaction is aborted, so
the removal of the bad file/directory object is reverted during RVM
recovery and the same crash repeats on the next startup.
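
To make the suspected failure mode concrete, here is a minimal sketch of an
unbalanced PutRef. It is not the code from refcounted.h: the names RefCounted,
RuntObject and destroy_underflows are hypothetical, and PutRef throws instead
of asserting so that the underflow can be observed without aborting the
process.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a reference-counted base class.
// This is an illustration only, not Coda's RefCountedObject.
class RefCounted {
public:
    void GetRef() { ++refcount_; }

    // The real code asserts refcount > 0; this sketch throws instead so
    // the caller can detect the unbalanced release.
    void PutRef() {
        if (refcount_ <= 0)
            throw std::logic_error("PutRef without matching GetRef");
        --refcount_;
    }

    int refs() const { return refcount_; }

private:
    int refcount_ = 0;
};

// Hypothetical "runt" object: it always releases a realm reference when
// destroyed, whether or not it ever acquired one.
struct RuntObject {
    RefCounted *realm;

    RuntObject(RefCounted *r, bool take_ref) : realm(r) {
        assert(r != nullptr);
        if (take_ref)
            realm->GetRef();
    }

    // Unconditional release, as a destructor would do.
    void release() { realm->PutRef(); }
};

// Returns true when destroying the object would underflow the realm's count.
bool destroy_underflows(bool took_ref) {
    RefCounted realm;
    RuntObject obj(&realm, took_ref);
    try {
        obj.release();
    } catch (const std::logic_error &) {
        return true;
    }
    return false;
}
```

With this sketch, destroy_underflows(true) completes a balanced GetRef/PutRef
pair, while destroy_underflows(false) hits the underflow check, which is the
situation the assertion in venus would be guarding against if this guess is
right.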

Jan
Received on 2005-09-08 17:26:22