Coda File System

Re: found the root of the problem

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Wed, 6 Jun 2001 17:22:14 -0400
On Wed, Jun 06, 2001 at 03:44:36PM -0500, ctest_at_neural.dlsemc.com wrote:
> when i unsuccessfully connect from machine A.  (there are 2 routes, to take,
> and i want to take the route that keeps failing)
> 
> the fact that it gets posted here, and that i can ping between the
> machines on the "bad" addresses implies that the routes are ok.  The
> one that really screws with me is the client error logs when things go
> bad:
... 
> 06:08:19 CHILD: mount system call failed. Killing parent.
> ^^^^^^^^
> what could cause this?

Ok, venus forks itself and the child issues the mount system call. When
the mount syscall returns an error, the child kills the parent. One
reason for failure is when /coda is already mounted, or when another
venus process has /dev/cfs0 opened. But it is also possible that the
mount fails while setting up the superblock. During that setup the
kernel makes several upcalls to the original venus process to get
information about the object that will become the root of /coda.
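The fork-and-kill behaviour described above can be sketched roughly as
follows. This is only an illustration in Python, not venus's actual
code; child_after_mount is a hypothetical name, the mount call itself is
reduced to a boolean result, and the use of SIGKILL is an assumption.

```python
import os
import signal

def child_after_mount(mount_ok, parent_pid, kill=os.kill):
    """What the forked child does once the mount attempt has returned.

    mount_ok   -- True if the (hypothetical) mount call succeeded
    parent_pid -- pid of the original process that forked the child
    kill       -- injectable, so the sketch can be exercised without
                  actually signalling a process
    """
    if mount_ok:
        return "mounted"
    # Corresponds to the log line
    # "CHILD: mount system call failed. Killing parent."
    # The choice of SIGKILL here is an assumption.
    kill(parent_pid, signal.SIGKILL)
    return "killed parent"
```

In a real process the caller would pass os.getppid() as parent_pid; the
injectable kill argument is only there so the failure path can be tried
out safely.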

> other-stuff:  if i stop coda-client, then change servers in
> /etc/coda/venus.conf, then start it again, should it have any
> knowledge of past servers?

Yeah, at least the ip-addresses of any server that hosts a volume that
has any objects still in the venus cache.


> i install coda-client on computer A, and have it hook to the server
> giving the client the address of B1 (computer B, card 1).  we'll call
> this the evil route.  it hooks to the server, we see the ip address of
> A1 show up in SrvLog and all works fine.  
> 
> now, this time we want to take the other route.  so we "dpkg --purge
> coda-client" (note there is a little buggy somewhere in the --purge
> option of coda-client.postrm that kills the deinstalation before
> finishing, but it seems to be livable), then re-install it, and give
> it address B2 to hook to.  it connects, and all *looks* fine.  when i
> check SrvLog, i find the ip address of B2, but then i find the IP
> address of B1 again!  So i yank the cable out of B1 and A1, to get rid
> of the evil route entirely, and then it stops working.  

Ah, I see why it fails...

Volume location information. The VRDB and VLDB store volume names/ids
and server ip-addresses. There is only one ip-address per server, the
one that happens to coincide with the result of a gethostbyname for the
server name on the SCM.

So, I'm guessing that this ip-address happens to be the IP of B1. The
client initially connects to the server for a volume-location query
(same server process but logically a separate 'service'). The result of
this query is then used to connect to the server that hosts the volume.
The volume location information is retrieved from B2, but the volume is
accessed through B1.
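A toy model of that two-step lookup, as a sketch only. The addresses,
the volume name "example.vol" and the function addresses_contacted are
all made up for illustration; the real VLDB and client code look nothing
like a Python dict.

```python
# Hypothetical addresses standing in for the two server interfaces.
B1 = "10.0.1.2"
B2 = "10.0.2.2"

# Toy volume location database: one address per server. Assumed here to
# be B1, i.e. whatever gethostbyname returned for the server on the SCM.
VLDB = {"example.vol": B1}

def addresses_contacted(configured_server, volume, vldb=VLDB):
    """Return (address used for the location query,
               address used to access the volume)."""
    # Step 1: the location query goes to the server the client was given.
    location_query_addr = configured_server
    # Step 2: the volume is accessed at the address stored in the reply,
    # regardless of which address the client was configured with.
    volume_access_addr = vldb[volume]
    return location_query_addr, volume_access_addr
```

Calling addresses_contacted(B2, "example.vol") returns the pair (B2, B1):
the query goes out over the route you wanted, but the volume traffic
still ends up on B1.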

The only way I see around this is actually using an extra ip-address
(maybe alias on lo0?)

        A1--------B1
       /            \
     A0              B0
       \            /
        A2--------B2

/etc/hosts
x.x.1.1		client
x.x.2.1		client-if1
x.x.3.1		client-if2
x.x.1.2		server
x.x.2.2		server-if1
x.x.3.2		server-if2

and some routing like,
    route add server gw server-if1
    route add server gw server-if2

You don't need A0 (a 'universal-ip' for client) when masquerading is set
to '1' in /etc/coda/venus.conf, but you probably do when masquerading is
not enabled, because the client sends the server its ip-address so that
the server can set up the SFTP connection.

Jan
Received on 2001-06-06 17:22:19