Coda File System

Re: Problem with Coda server on non-SCM machine

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Fri, 10 Sep 1999 10:51:18 -0400
Hi Foma,

It seems to be a segfault in the normal C++ memory allocator. I guess by
moving the RVM around, you put RVM in the same part of the address space
as the heap.

Could you cat /proc/`pidof codasrv`/maps, and verify this?

The normal layout is something like:
address
0x080..... - 0x081..... program code + data + heap (3 chunks)
0x15000000 - 0x15...... LWP stacks (many)
0x20000000 - 0x2....... RVM data (one big chunk)
0x40000000 - 0x4....... shared libraries and other mmaps.
0xbfff.000 - 0xc0000000 normal stack (grows down)
0xc0000000 - 0xffffffff kernel space (can't touch this)
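
If eyeballing the maps output is tedious, the overlap check can be scripted. This is only a rough sketch: it parses text in the usual /proc/<pid>/maps format and reports which existing mappings a candidate RVM range would collide with. The function names, the sample maps text, and the RVM start address and length are made up for illustration; substitute the values you actually pass to the server.

```python
def parse_maps(text):
    """Parse /proc/<pid>/maps text into (start, end, name) tuples."""
    regions = []
    for line in text.splitlines():
        fields = line.split(None, 5)
        if not fields:
            continue
        start_s, end_s = fields[0].split("-")
        # The pathname column is absent for anonymous mappings (heap, mmap).
        name = fields[5].strip() if len(fields) > 5 else ""
        regions.append((int(start_s, 16), int(end_s, 16), name))
    return regions


def overlapping_regions(regions, start, length):
    """Return the regions that intersect [start, start + length)."""
    end = start + length
    return [r for r in regions if r[0] < end and start < r[1]]


# Illustrative sample only, not taken from a real server.
SAMPLE_MAPS = """\
08048000-080f7000 r-xp 00000000 03:01 12345 /usr/sbin/codasrv
080f7000-0813e000 rw-p 000af000 03:01 12345 /usr/sbin/codasrv
0813e000-08400000 rwxp 00000000 00:00 0
40000000-40012000 r-xp 00000000 03:01 2222 /lib/ld.so
"""

if __name__ == "__main__":
    regions = parse_maps(SAMPLE_MAPS)
    # Hypothetical RVM placement that lands on top of the heap chunk.
    for start, end, name in overlapping_regions(regions, 0x08200000, 0x100000):
        print("collides with %08x-%08x %s" % (start, end, name or "(anonymous)"))
```

To run it against the live server, read the text from open("/proc/%d/maps" % pid).read() instead of the sample. Note that once RVM is mapped it shows up in the maps file itself, so the check is most useful for a start address you are about to try.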

On the other hand, the crash (in ResCommInit, according to your backtrace)
is in the same area of the code where the `other servers' are initialized.

Check whether /vice/db/servers and /vice/db/VSGDB are up to date (servers
should contain both the SCM and your new server). Maybe they haven't
been synced correctly from the SCM.
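
A quick way to check the servers file is a few lines of script. This is a sketch under an assumption: that each non-empty line of /vice/db/servers begins with the server's hostname followed by whitespace and its id. The hostnames and the function name below are hypothetical placeholders.

```python
def missing_servers(servers_text, expected_hosts):
    """Return the expected hostnames that do not appear in the servers file.

    Assumes each non-empty line starts with a hostname; anything after the
    first whitespace-separated field is ignored.
    """
    listed = set()
    for line in servers_text.splitlines():
        fields = line.split()
        if fields:
            listed.add(fields[0].lower())
    return [h for h in expected_hosts if h.lower() not in listed]


# Hypothetical file contents: only the SCM is listed.
SAMPLE_SERVERS = """\
scm.example.org        1
"""

if __name__ == "__main__":
    wanted = ["scm.example.org", "newserver.example.org"]
    for host in missing_servers(SAMPLE_SERVERS, wanted):
        print("not listed in servers file:", host)
```

Run it on the new server against the real file contents; if the new server is missing there but present on the SCM, the update daemons have probably not propagated the file.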

Make sure that updatesrv is running on the SCM and updateclnt is running
on your new server. (restart them with /etc/rc.d/init.d/update.init restart).

Jan


On Fri, Sep 10, 1999 at 09:02:34PM -0700, Fomin Alex wrote:
> Hi, people.
> There is a problem with coda-server.
> When I run the Coda server as a non-SCM server on Linux RH6.0 (coda-server
> 5.3.1), it writes to SrvLog (see attachment #1) and goes into a zombie state.
> I tried other parameters for RVM (starting address, heap length, etc.), but
> nothing helped.
> I attached to the codasrv process using gdb, and it printed the output in
> attachment #2.
> Please help. Any ideas on how to fix this problem?
> 
> Foma.
> 

> 05:31:10 New SrvLog started at Tue Sep  7 05:31:10 1999
> 
> 05:31:10 Resource limit on data size are set to 2147483647
> 
> 05:31:10 Server etext 0x80f62da, edata 0x813d3bc
> 05:31:10 RvmType is Rvm
> 05:31:10 Main process doing a LWP_Init()
> 05:31:10 Main thread just did a RVM_SET_THREAD_DATA
> 
> 05:31:10 Setting Rvm Truncate threshhold to 5.
> 
> 05:31:34 The server (pid 1693) can be controlled using volutil commands
> 05:31:35 "volutil -help" will give you a list of these commands
> 05:31:35 If desperate,
> 		"kill -SIGWINCH 1693" will increase debugging level
> 05:31:35 	"kill -SIGUSR2 1693" will set debugging level to zero
> 05:31:35 	"kill -9 1693" will kill a runaway server
> 05:31:35 Vice file system salvager, version 3.0.
> 05:31:35 SanityCheckFreeLists: Checking RVM Vnode Free lists.
> 05:31:35 DestroyBadVolumes: Checking for destroyed volumes.
> 05:31:35 Attached 0 volumes; 0 volumes not attached
> lqman: Creating LockQueue Manager.....LockQueue Manager starting .....
> 05:31:35 LockQueue Manager just did a rvmlib_set_thread_data()
> 
> done
> 05:31:35 ****** FILE SERVER INTERRUPTED BY SIGNAL 11 ******
> 05:31:35 ****** Aborting outstanding transactions, stand by...
> 05:31:35 Uncommitted transactions: 0
> 05:31:35 Uncommitted transactions: 0
> 05:31:35 Becoming a zombie now ........
> 05:31:35 You may use gdb to attach to 1693

> (gdb) bt
> #0  0x40111861 in __libc_nanosleep ()
> #1  0x401117ed in __sleep (seconds=1) at ../sysdeps/unix/sysv/linux/sleep.c:78
> #2  0x80f5e47 in coda_assert (pred=0x80f6467 "0", file=0x80f6460 "srv.cc", 
>     line=318) at coda_assert.c:45
> #3  0x804a824 in zombie (sig=11) at srv.cc:318
> #4  <signal handler called>
> #5  0x4016c000 in builtin_modules ()
> #6  0x809a760 in ResCommInit () at rescomm.cc:102
> #7  0x804af00 in main (argc=12, argv=0xbffffcd4) at srv.cc:515
> #8  0x4009acb3 in __libc_start_main (main=0x804a840 <main>, argc=12, 
>     argv=0xbffffcd4, init=0x8049ce8 <_init>, fini=0x80f62dc <_fini>, 
>     rtld_fini=0x4000a350 <_dl_fini>, stack_end=0xbffffccc)
>     at ../sysdeps/generic/libc-start.c:78
> 
Received on 1999-09-10 10:52:26