Coda File System

Re: SCM promotion

From: root <coda_at_voidembraced.net>
Date: Thu, 08 Apr 2010 18:37:45 -0700
> On Wed, Apr 07, 2010 at 04:44:10AM -0700, root wrote:
>> >    /vice/vol/VRList. 
> 
> This file is kept in sync on all servers of a realm,
> if you installed with "coser".

I did!  So that file need not be touched. 


>> >If you have a copy of that file on another server, the other server
>> >can replace the SCM. The /vice/db/scm file needs to be updated on
>> >all servers, and the updateclnt/updatesrv daemons need to be
>> >restarted (/etc/rc.d/init.d/update.init restart)
> 
> Note that the contents of /vice/db/scm are compared to the contents
> of /vice/hostname, which is used as the server name in the context
> of the Coda realm. This is valid for "coser" installations.

This is unclear to me.  I do see that my coser-based install of the coda/vice 
server does, in fact, have both a /vice/db/scm and a /vice/hostname, and 
their contents match. 
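
A minimal sketch of that comparison, assuming both files hold a single
server name and that plain string equality is what matters (the thread
does not say exactly how the server compares them). The function name
and the VICE_ROOT override are hypothetical, for illustration only:

```sh
#!/bin/sh
# Sketch: does the name in <root>/db/scm equal the name in <root>/hostname?
# Assumption: each file contains one server name; whitespace is ignored.
# VICE_ROOT is a hypothetical override so the check can run on a copy.
VICE_ROOT="${VICE_ROOT:-/vice}"

scm_matches_hostname() {
    # $1 = vice root directory (e.g. /vice)
    scm=$(tr -d '[:space:]' < "$1/db/scm") || return 2
    host=$(tr -d '[:space:]' < "$1/hostname") || return 2
    # Returns 0 when the names are identical, 1 when they differ.
    [ "$scm" = "$host" ]
}

# Usage (on a server): scm_matches_hostname "$VICE_ROOT" && echo "names match"
```

A match is simply what I observe on my coser install; what it implies
about the host's role is the part I am asking about.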


>> Is this still true in general, and specifically for modular-clog based 
>> deployments? 
> 
> You probably mean the "coser" installer; clog is not relevant in this
> context.

Indeed.  I will use the "coser" term in the future (and "cocli" I suppose 
for the client). 


>> On which hosts must the updateclnt/updatesrv daemons be restarted?  I 
>> assume the newly promoted SCM and any remaining coda server (vice) hosts, 
>> but I wish to be certain. 
> 
> All server hosts in the realm need the update daemons to be (re)started
> after the modification of the /vice/db/scm files. It is best to take all
> of them down before any modifications and start when all modifications
> are done. Otherwise you have to understand how they interact and in which
> cases they might overwrite your modifications on the fly.

Makes sense.  Assuming our SCM is DOWN for the count, what is the 
update/bounce order to permit zero downtime SCM promotion of one of the peer 
servers? 

Just to be certain, to restart the update daemons on a coser install, all 
that need be done is a vice/bin/updateservice restart, correct?  (Or a stop 
& start, if we're running commands between the point that the update daemons 
go down and the time they come back up.) 


> Actually if you want to change which host is the scm while all servers
> are up, make the change of /vice/db/scm on the former scm, wait to let
> the change propagate and then restart the update daemons everywhere. Done.

That assumes we still have a working SCM.  I'm assuming that our SCM is dead 
and irreparable.  What's the easy-peasy way of promoting a new SCM without 
downtime? 

I suppose we must assume that there are 2 non-SCM coda servers up and 
running to permit a non-downtime promotion. 

Honestly, though, what breaks if the update daemons are down for a few 
seconds while we take them all down, update and start them again? 
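
To make the question concrete, here is a sketch of the sequence I have in
mind: stop the update daemons on every surviving server, rewrite
/vice/db/scm on each, then start the daemons again. It only prints the
steps and runs nothing. The function name is hypothetical, and the
updateservice path with its stop/start arguments is the unconfirmed
command asked about earlier in this message, not a verified interface:

```sh
#!/bin/sh
# Sketch only: prints the stop / edit / start plan, executes nothing.
# UPDATE_CMD is an assumption taken from the question in this thread.
UPDATE_CMD="${UPDATE_CMD:-/vice/bin/updateservice}"

print_scm_promotion_plan() {
    # $1 = name of the new SCM; remaining args = all surviving server hosts
    new_scm=$1
    shift
    # Phase 1: stop the update daemons everywhere, so no daemon can
    # overwrite an edit while the files are being changed.
    for h in "$@"; do
        echo "$h: $UPDATE_CMD stop"
    done
    # Phase 2: point every server at the new SCM.
    for h in "$@"; do
        echo "$h: echo $new_scm > /vice/db/scm"
    done
    # Phase 3: start the daemons only after all edits are in place.
    for h in "$@"; do
        echo "$h: $UPDATE_CMD start"
    done
}

# Usage: print_scm_promotion_plan srv2 srv2 srv3
```

Completing each phase on all hosts before starting the next follows the
advice quoted above to take everything down before modifying anything.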


Thanks as always! 


Regards,
 -Don
{void} 
Received on 2010-04-08 21:38:13