Coda File System

Re: add servers to a Coda cell

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Tue, 15 Jan 2002 11:09:47 -0500
On Tue, Jan 15, 2002 at 08:11:38AM +0100, Ivan Popov wrote:
> > > It is no problem to set up a new non-scm machine, even create volumes on
> > > it (shared with the old server, too), as long as I don't add an entry
> > > for it in the "servers" file.
> >
> > How did you manage to create volumes on the non-scm machine if it isn't
> > in the servers file? I can only see possible corruption resulting
> > from that.
> 
> Well now when you say it and I can't test once more, I cannot argue.
> May be I managed to create volumes only while having both entries in
> "servers"?.. I realize I would need somewhere to pick the new
> server number from - but the information was present in "servers" on the
> non-scm machine (one and only line "<host> 2").

Yup, I just double-checked, and a server will not start when its own
hostname is not found in the servers file. However, it does look like
whenever the "ipaddress=" option is set in server.conf, all
hosts that resolve to 127.x.x.x are set to that address (bad, but I
don't see an easy way to fix it right now).

> > As the VRDB and VLDB files are created by the SCM, this all depends
> > greatly on whether the SCM managed to correctly resolve all servers in
> > the servers file to ip-addresses, any failed resolves will result in a
> > 0.0.0.0 address (I believe Shafeeq added a test to 5.3.17 to block a
> > server from starting up when that happens).
> 
> Yes the test *is* there, and I have checked the name resolution
> a-lot-more-than-double :)

Well, actually it was only testing for 127.x.x.x. I didn't see a 0.0.0.0
test, so I've added one, along with a check for duplicate server ids.
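
For anyone who wants to sanity-check a servers file by hand, the three
checks mentioned above (127.x.x.x, 0.0.0.0, duplicate ids) can be
sketched roughly as follows. This is only an illustration in Python, not
the code that went into the server. It assumes the file holds one
"<host> <id>" pair per line; the '#' comment handling and the names
parse_servers and check_servers are made up for this example.

```python
import ipaddress
import socket


def parse_servers(text):
    """Parse '<host> <id>' lines; blank lines and '#' comments are skipped.

    The comment syntax is an assumption made for this sketch.
    """
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        host, server_id = line.split()
        entries.append((host, int(server_id)))
    return entries


def check_servers(entries, resolve=socket.gethostbyname):
    """Return a list of human-readable problems found in the entries."""
    problems = []
    seen = {}  # server id -> hostname that first claimed it
    for host, server_id in entries:
        if server_id in seen:
            if seen[server_id] == host:
                continue  # identical repeated line, effectively a no-op
            problems.append(
                f"duplicate id {server_id}: {seen[server_id]} and {host}"
            )
        else:
            seen[server_id] = host
        try:
            addr = ipaddress.ip_address(resolve(host))
        except (OSError, ValueError):
            problems.append(f"{host}: cannot be resolved")
            continue
        if addr.is_unspecified:
            problems.append(f"{host}: resolves to 0.0.0.0")
        elif addr.is_loopback:
            problems.append(f"{host}: resolves to loopback address {addr}")
    return problems
```

Feeding it the contents of the servers file, on the SCM and on each
other server, and getting back an empty list means none of these three
conditions was hit on that machine. The resolve argument is only there
so the lookup can be replaced when testing.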

> Well, it may be my missing domainname that causes harm??
> In this cell I have all of the machines in /etc/hosts, private ip numbers
> and no domainnames - shouldn't really matter?

No, that should work as long as /etc/nsswitch.conf has something like
'hosts: files dns' (the 'files' entry is important here).
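
A quick way to confirm the 'files' entry is present, sketched in Python.
The helper name hosts_sources is invented for this example, and it only
looks at the first "hosts:" line it finds.

```python
import re


def hosts_sources(nsswitch_text):
    """Return the lookup sources listed on the 'hosts:' line, in order."""
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line.startswith("hosts:"):
            rest = line[len("hosts:"):]
            # Drop action items such as [NOTFOUND=return]; they are not sources.
            rest = re.sub(r"\[[^\]]*\]", " ", rest)
            return rest.split()
    return []
```

Calling hosts_sources(open("/etc/nsswitch.conf").read()) should return a
list that contains "files" if /etc/hosts is being consulted.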

> I have run before with two *identical* lines in "servers", probably some
> damaged vice-setup rerun created it, but it is probably totally
> irrelevant.

Yeah, as long as the hostname and the id are identical in both lines,
the second one is basically a no-op.

> Anyway, the system works despite all experiments as usual, as long as
> there is no mention of the second server in "servers".
...
> Just to point out - I don't have to create replicated volumes to get in
> trouble. In fact, I don't have to create *any* new volumes to get in
> trouble. Just add a line in "servers" and I'm sitting there.

That is interesting. I wonder what the difference is. Could you send me
logs at level 100 of the same server starting up, one run without the
additional line in 'servers' and one run with it?

Also run 'getvolinfo `cat /vice/db/ROOTVOLUME`' after the server has
started, which should get some of the volume lookup code to run.

> My first try was to setup the second machine and "servers" on scm and do
> all the things, including creating volumes - but without restarting scm.
> Then no crashes occurred, but the new volumes were not mountable.

Ok, but as the SCM hadn't read the updated servers and VSGDB files, it
cannot have created correct VRDB and VLDB files. Thinking about it, not
having the replication group id from the VSGDB might have resulted in an
error during volume creation.

> (It means I *may* have had the "servers" updated at times when I created
> the volumes)

So the scripts that were run on the SCM had the right information, as did
the non-SCM server. However, the SCM codasrv itself didn't know about the
new server or the new VSGs. The VRList should be ok, as well as the
VolumeList info from the non-SCM, but the VolumeList from the SCM and
the VRDB/VLDB files probably have some problems/inconsistencies.

> > > Two "unusual" things about my setup:
> 
> Well, it might be one more, kind of too evident thing... the servers run
> different versions - scm 5.3.15, the new one 5.3.17.

Nah, that shouldn't be a problem at all. I ran a similar combination all
the time while working on the new stuff in 5.3.17.

Jan
Received on 2002-01-15 11:10:05