Coda File System

Re: coda client hangs

From: Patrick Walsh <pwalsh_at_esoft.com>
Date: Fri, 27 May 2005 12:00:44 -0600
On Fri, 2005-05-27 at 13:44 -0400, Jan Harkes wrote:
> On Thu, May 26, 2005 at 04:32:13PM -0600, Patrick Walsh wrote:
> > 14:50:01 Coda token for user 0 has been discarded
> > 15:00:00 Coda token for user 0 has been discarded
> > 15:00:00 Coda token for user 0 has been discarded
> > 15:00:00 Coda token for user 0 has been discarded
> > 15:00:01 Coda token for user 0 has been discarded
> > 15:10:00 Coda token for user 0 has been discarded
> > 15:15:00 Coda token for user 0 has been discarded
> > 15:20:00 Coda token for user 0 has been discarded
> 
> I wonder why these tokens are being discarded; this message is only
> shown in 2 cases.
> 
> - A server believes the token is invalid (expired or unable to decrypt)
> - A user has explicitly called cunlog

	I just went back and checked our code.  I am really sorry.  A coworker
added a cunlog call that I wasn't aware was there.  I've removed this
call.  Do you think this may have been at the root of our problems?  It
seems like it could cause all kinds of trouble.

> Are you still creating an RPM package? It could be that rpmbuild
> implicitly strips everything before it packages the binaries. If you
> still have the build tree around somewhere you should be able to use the
> venus binary in the build tree even when the running venus is stripped.

	OK, that makes sense.  I'll do that.

> Soooo, now I have to go through the complete code and identify all
> places where we might be using these olist_iterators either directly or
> indirectly and check if they are already tracking the next ptr
> themselves (like we do when destroying connections) and if the objects
> are locked when we yield. And then I can remove the useless next-ptr
> bit.

	Yikes.  Sorry.
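
	For anyone following along, here is a minimal sketch of the
"track the next ptr yourself" pattern Jan describes. It is not Coda
code: Node, YieldingOp, visit_all and demo_destroy_all are hypothetical
names made up for illustration, and the callback merely stands in for
an operation that might yield (such as an RPC2 call) and free the
element being visited.

```cpp
#include <functional>

// Hypothetical stand-in for a list element; not Coda's own list classes.
struct Node {
    int id;
    Node *next;
};

// Hypothetical stand-in for an operation that may yield. While the
// caller is "yielded", the node passed in may be unlinked and freed.
using YieldingOp = std::function<void(Node *)>;

// Walk the list, saving the successor before calling op, so the walk
// survives even if op frees the node it was handed.
// Returns the number of nodes visited.
int visit_all(Node *head, const YieldingOp &op) {
    int visited = 0;
    Node *cur = head;
    while (cur != nullptr) {
        Node *next = cur->next;  // saved before the possible yield
        op(cur);                 // may delete cur
        ++visited;
        cur = next;              // cur itself is never touched again
    }
    return visited;
}

// Build a three-node list and destroy every node during the walk.
int demo_destroy_all() {
    Node *c = new Node{3, nullptr};
    Node *b = new Node{2, c};
    Node *a = new Node{1, b};
    return visit_all(a, [](Node *n) { delete n; });
}
```

	Note that the saved pointer only protects against the current
element going away. If the successor can also be removed during the
yield, the saved pointer is just as stale, which is presumably why the
locking question matters as well.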

> Quick fix for you, in coda-src/venus/user.cc around line 372.
> ...
> Change that Suicide(1) to Suicide(0), this way the client won't tell the
> server it is disconnecting, so we won't make an RPC2 call, and as a
> result will not yield.

	I'd rather wait for the real fix.  Our servers aren't live just yet
anyway.  Although we will be hitting them with some load pretty soon
here.

	Thanks for your help, Jan.

-- 
Patrick Walsh
eSoft Incorporated
303.444.1600 x3350
http://www.esoft.com/

Received on 2005-05-27 14:01:46