Coda File System

Re: a question: volume and SG properties

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Tue, 22 Jan 2002 14:31:59 -0500
On Sat, Jan 19, 2002 at 11:32:36AM +0100, Ivan Popov wrote:
> On Fri, 18 Jan 2002, Jan Harkes wrote:
> > restored backup volume and its original rw-replica. If they are
> > identical, it's just a matter of 'flipping a bit' in the volume header.
> 
> It would be great to have as a recovery solution. (See my note at the end
> below).

I know, it would be something like

    volutil -h server1 restore -f <volume dump file> /vicepa <volumename.0>
    volutil -h server1 make_rep <restored-volid> <replicated groupid>
    volutil -h server2 create_rep /vicepa <volumename.1> <replicated groupid>
    echo "<volumename> <replicated groupid> 2 <restored-volid> <server2-volid> 0 0 0 0 0 E00000xxx" >> /vice/db/VRList
    volutil makevrdb /vice/db/VRList

Only the make_rep step is missing. It would also be nice to remove the
explicit VSG declarations in the VRList/VRDB; I've already done that
for the client, where the 'VSG' is now derived from the list of
servers that store a volume's replicas. This will eventually allow us to
grow/shrink any single volume in a storage group.

> > volume entries in the VRList are reordered when creating the VRDB
> 
> > The behaviour can be fixed by removing the reordering (i.e. the call to
> > vre->Canonicalize()) when we're creating the VRDB, but... this will
> 
> Why that reordering came to existence then? There should have been some
> rationale behind that?

I don't know; the project started in 1986 and I only joined in 1998, so
there is still a lot of code that surprises me. I asked Satya and he
told me it might have been added to speed up some searches, but to me
that looks like over-optimization without a clear gain, especially as I
haven't yet found any code that relies on the reordering.

> > >  - is it possible to change the member list in a storage group?
> > >    modify, add, delete?
> 
> > The client is pretty much ready to deal with this. We just need to hook
> > a forced volume revalidation as a result of f.i. a volume callback.
> 
> Does it mean that it would work with current software - if I for instance
> 
> 1. for each volume in a storage group
>    create empty volume replicas on a new server (how would I do that?)

With the current software, only if the ip-address of the new server
sorts after all other ip-addresses. 'volutil create_rep' will create the
new empty volume replicas.
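
Following the create_rep syntax above, that would be something along
the lines of (the hostname and replica suffix here are just
placeholders):

    volutil -h newserv create_rep /vicepa <volumename.2> <replicated groupid>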

> 2. add the new server to an existing storage group by editing VSGDB like
>    that:
>    E000100 srv1 srv2 newserv

You also need to add the volume-ids of the new replicas to the VRList,
i.e.

    vio:u.clement 7f000596 1 ca00004f 0 0 0 0 0 0 0 E0000154
becomes
    vio:u.clement 7f000596 2 ca00004f de000001 0 0 0 0 0 0 E0000154
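
When many volumes live in that group you could script the VRList edit;
a minimal sketch, assuming the fields are name, group-id, replica
count, eight replica slots and the VSG id (as in the entry above), with
de000001 standing in for the new replica's volume-id:

    awk -v newid=de000001 '$NF == "E0000154" { $3 = $3 + 1; $(3 + $3) = newid } { print }' \
        /vice/db/VRList > VRList.new

and then regenerate the VRDB with 'volutil makevrdb' as above.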

> 3. restart all servers

Yup, the servers still need a restart to pick up these changes.

> 4. reinit all clients

Maybe even 'force a disconnect/reconnect' will work, but I'm sure
something will go wrong, so reinit is the safest solution right now.

> 5. ls -alR /coda

That will trigger runt resolution, which populates the new empty
replicas.


> 6. remove an old server from VSGDB like that:
>    E000100 srv2 newserv

Currently, you can only remove the last server entry, and maybe the one
with the ip-address that sorts first. You also need to change all the
VRList entries. Destroying the leftover replicas shouldn't really be
necessary, but it can be done with "volutil purge".
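
If I remember the argument order correctly, that would look something
like the following (check volutil's usage output if in doubt):

    volutil -h srv1 purge <replica-volid> <volumename.0>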

> Sorry, I can't afford broken things :) If there is no better way,
> I'll hack a script collecting acls from a volume and storing along with
> the data archive, then restoring both data and acls.
> Surprised there is no such utility yet. (see my note at the beginning :)

I believe 'afio' lets you add a script that runs during archiving and
extraction, which could be used to save the ACLs in the same archive as
the rest of the data. Otherwise, have a crawler (modified volmunge?)
that walks through a volume and writes out a shell script that can
restore the ACLs, and drop that script in the root of the volume before
you tar up the volume.
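
A rough sketch of such a crawler, assuming 'cfs listacl' prints one
"name rights" pair per line (the volume path and output file name are
just placeholders):

    find /coda/yourvolume -type d | while read dir; do
        cfs listacl "$dir" |
            awk -v d="$dir" 'NF == 2 { printf "cfs setacl %s %s %s\n", d, $1, $2 }'
    done > restore-acls.sh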

> So far I am just expanding the system, but I will have to move data
> around to be able to make safe server upgrades, both hard- and
> software. How do you solve it?

Our servers have pretty much always had the same name and ip-address,
even when the hardware got changed. Any server in a replicated group
can be brought down for a while without anyone noticing; if it is
anything that would take more than 10-15 minutes we typically try to
deal with it in the evening. Although the other day we ran our triply
replicated group with only 2 servers for about two full days before
finally fixing a problem.

There are also about 3 'alpha' servers that have an identical setup and
are used as guinea pigs for software/system upgrades. They are
configured pretty similarly, so in the worst case they could be used to
take over from a production machine. It's just a matter of changing the
ip-address/hostname, rebooting, re-creating the volume replicas that
were on the original production server and resolving.

Jan
Received on 2002-01-22 14:32:03