Coda File System

From: Gabriel B. <gabriel.barros_at_gmail.com> Date: Wed, 16 Mar 2005 11:26:37 -0300

On Tue, 15 Mar 2005 16:36:37 -0500, Jan Harkes <jaharkes_at_cs.cmu.edu> wrote:
> On Mon, Mar 14, 2005 at 10:05:31AM -0300, Gabriel B. wrote:
> > > And once codasrv is started it asks if it can create the rootvolume.
> > 
> > I'm using the .deb package from the CMU servers. It never asked me about
> > the root volume. I even opened a thread asking if the docs were outdated
> > because of this.
> 
> And in that thread I responded,
> 
> "Things have changed in the hope to simplify the initial setup."
>     http://www.coda.cs.cmu.edu/maillists/codalist/codalist-2005/7198.html
> 
> The change happened with the release of Coda-6.0.7,
> 
> "createvol_rep doesn't use the VSGDB anymore, instead of specifying a
> server group with the mystical E0000XXX numbers, the servers are now
> explicitly named; 'createvol_rep <volumename> <server>[/<partition>] ...'"
>     http://www.coda.cs.cmu.edu/pub/coda/Announcement.6.0.7
> 
> > > > pbtool
> > > > > nu bossnes
> > > > > ng www 1076 (bossnes id)
> > > >
> > > > createvol_rep / camboinha.servers/vicepa -3
> > > > That didn't work. I waited 2 hours and then hit Ctrl-C.
> > > 
> > > Where did you get that '-3'?
> > 
> > from the list command
> > GROUP www OWNED BY bossnes
> >   *  id: -3
> >   *  owner id: 1076
> >   *  belongs to no groups
> >   *  cps: [ -3 ]
> >   *  has members: [ 1076 ]
> 
> Right, so createvol_rep interprets that as a server named '-3', and
> because it is '-3' we fail to catch it with the following test,

Why is that?
camboinha# createvol_rep
bad args:  createvol_rep <volname> <server>[/<partition>] [<server>[/<partition>]]* [groupid]

How can I specify a group id then?

> 
>     # Validate the server
>     grep $SERVER ${vicedir}/db/servers > /dev/null 
>     if [ $? -ne 0 ];  then
> 	echo Server $SERVER not in servers file
> 	exit 1
>     fi
> 
> So we end up running
>     volutil -h "$SERVER" getvolumelist "/tmp/vollist.$$"
> 
> Which then tries to contact a server named '-3'. Now on my machine it
> quickly returns with '-3 is not a valid hostname'.

Hum, here it hangs. Running the script with bash -x, I see it stops at
++ sed 's/[^\/]*\(.*\)/\1/'
+ PART=
+ grep -3 /vice/db/servers

So it did indeed treat the -3 as a server/partition argument. The
partition is empty, and grep takes the -3 as an option rather than a
pattern (GNU grep reads -NUM as NUM lines of context). That leaves
/vice/db/servers as the pattern and no file to search, so grep sits
waiting on stdin. The fact that yours carries on suggests you are
probably on some other Unix.
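
The hang comes from the server name reaching grep unguarded. A minimal sketch of a safer check follows; server_listed is a made-up helper for illustration, not something the createvol_rep script actually contains, and it assumes the servers file path is passed in by the caller.

```shell
#!/bin/sh
# Hypothetical helper (not part of createvol_rep): check that a server name
# appears in a servers file without letting grep parse the name as an option.
server_listed() {
    server=$1
    servers_file=$2
    # An empty name or one starting with '-' cannot be a hostname; reject it
    # before it gets anywhere near grep.
    case $server in
        -*|'') return 2 ;;
    esac
    # -q: no output, -F: fixed string, -e: the next argument is the pattern.
    grep -q -F -e "$server" "$servers_file"
}
```

Called as "server_listed -3 /vice/db/servers", this returns status 2 right away instead of waiting on stdin.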

> 
> > > > volutil create_rep /vicepa / 00001
> > > > bldvldb.sh
> > > > (a valid workaround?)
> > > 
> > > No it is not, since this only creates the underlying volume replica
> > > (which should be named "/.0"). And again, where does that strange 00001
> > > number come from? The createvol_rep script does this first but then
> > > creates the replicated volume by dumping the existing (currently empty)
> > > VRDB into the /vice/db/VRList file, appending an entry that describes
> > > which replicas are part of the replicated volume, and recreating a new
> > > VRDB file from the data in the updated /vice/db/VRList.
> > 
> > hum... is it in binary form or human readable? can you send an example?
> 
> Again, where does that strange 00001 come from? That is setting the
> replicated volume id to 1, but you can't actually have a replicated
> volume with the volume id 1 as replicated volumes are supposed to always
> have a volume id that looks like 0x7f0000nn.
> 
> The first byte in the 4-byte volume id number is used to map to the
> specific server identifier in /vice/db/servers on which the volume
> replica is located, and 0x7f (127) is reserved to indicate that this is
> a replicated volume that isn't located on any particular single server
> but represents a group of individual replicas. We need this, because in
> some cases we get just the volumeid and we don't know if it is supposed
> to be a replicated volume or some underlying volume replica.
> 
> So although the VRList file is human readable and to a certain extent
> can be edited by hand, I don't think it would be wise to do so
> without really knowing how it is used to glue individual replicas
> together. There are a lot of constraints on what is considered valid,
> and a single misplaced character can break the parser that
> has to convert it back to the VRDB file.
> 
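To make the byte layout described above concrete, here is a small sketch that pulls the leading byte out of a 4-byte volume id and checks it against 0x7f. The function names are invented for this example and the ids are passed as 0x-prefixed hex; nothing here is taken from the Coda tools themselves.

```shell
#!/bin/sh
# Sketch only: split a 32-bit volume id and inspect its first byte.
# Both function names are made up for illustration.
volid_first_byte() {
    # Shell arithmetic accepts 0x-prefixed hex constants.
    echo $(( ($1 >> 24) & 0xff ))
}

is_replicated_volid() {
    # 0x7f (127) marks a replicated volume rather than a single replica.
    [ "$(volid_first_byte "$1")" -eq 127 ]
}
```

"volid_first_byte 0x7f000001" prints 127, while an id of 0x00000001 has a leading byte of 0 and so fails the is_replicated_volid check.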

Hum, nice info. Thanks!
Now that I can use createvol_rep, I'm sticking to your advice and not
editing those files by hand. The one exception: when I ran out of inodes
(I was using the 2M files ftree), I manually blanked all those files and
created a new vicepa with 16M files.

Now I can create the volumes by the book:
createvol_rep h.album.i.big camboinha.servers/vicepa

> > > The only 'worst case' that I know of is when we initially contact a
> > > realm since we are hit by multiple RPC2 timeouts, one when we try to get
> > > the rootvolume name, one when we fall back on getvolumeinfo, at this
> > > point the mount fails, but with a colorizing ls we get an additional
> > > readlink and getattr on the mountpoint both of which also trigger an
> > > attempt to mount the realm (i.e. another 4 timeouts). So we end up
> > > blocking for about 6 minutes if the realm is unreachable.
> > 
> > I left a "cfs lv /coda/camboinha.servers" running on Friday. It's
> > Monday now and I had to Ctrl-C it.
> > The server is still running, though.
> 
> On the client run,
> 
>     strace -e trace=network -p `pidof venus` (optionally add "-o strace.dump")
> 
> This should show all the network related stuff that the client is doing.
> Here is what I get when I run 'cfs lv /coda/coda.cs.cmu.edu'
> 
>     # strace -e trace=network -p `pidof venus`  
>     Process 17369 attached - interrupt to quit
>     sendto(8, ..., 92, 0, {... sin_port=htons(2432), sin_addr=inet_addr("128.2.222.111")}, 16) = 92
>     sendto(8, ..., 92, 0, {... sin_port=htons(2432), sin_addr=inet_addr("128.2.209.199")}, 16) = 92
>     sendto(8, ..., 92, 0, {... sin_port=htons(2432), sin_addr=inet_addr("128.2.191.192")}, 16) = 92
>     recvfrom(8, ..., 4360, 0, {... sin_port=htons(2432), sin_addr=inet_addr("128.2.222.111")}, [16]) = 156
>     recvfrom(8, ..., 4360, 0, {... sin_port=htons(2432), sin_addr=inet_addr("128.2.209.199")}, [16]) = 156
>     recvfrom(8, ..., 4360, 0, {... sin_port=htons(2432), sin_addr=inet_addr("128.2.191.192")}, [16]) = 156
>     Process 17369 detached
> 
> Now coda.cs.cmu.edu is mapped by an entry in /etc/coda/realms to a group
> of 3 servers, so we're sending the request to all three servers, and
> then get three replies back. The Coda client decides which reply to
> actually use.
> 
> If your server isn't responding you would only see sendto's at
> exponentially increasing intervals for about 60 seconds, at which point
> the RPC2 layer gives up. This then percolates up and we end up returning
> ETIMEDOUT to userspace.
> 
> > now, i did a cunlog and a clog, and "time ls -la /coda/"
> > <ctrl+c>
> > 
> > real    14m44.403s
> > user    0m0.001s
> > sys     0m0.003s
> 
> You could try to use /bin/ls, which shouldn't be colorizing and as such
> doesn't try to stat every entry in coda, readlink all unmounted
> mountpoints, and then stat every link destination.
> 
> Also, is there anything in the venus.log file which could indicate that
> we've already run out of worker threads? The message looks something like,
> 
>     DispatchWorker: out of workers (max 20), queueing message
> 
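A quick way to look for that message is to count matching lines in the log. This is only a sketch: count_out_of_workers is a made-up name, and the location of venus.log depends on the installation, so the path is left as an argument.

```shell
#!/bin/sh
# Sketch only: count "out of workers" messages in a venus log file.
# The log path is an argument because its location is not given here.
count_out_of_workers() {
    log=$1
    # grep -c prints the number of matching lines. It exits with status 1
    # when that number is 0, so rely on the printed count, not the status.
    grep -c -F -e 'DispatchWorker: out of workers' "$log"
}
```

Run it as "count_out_of_workers /path/to/venus.log"; any count above zero means venus had to queue requests at some point.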

Hum, sorry, I can't try those. I wiped that install (from the .deb) and
built from the sources. I don't have those weird errors anymore.

> > > > I then created two more volumes. Now venus reports "2 volume replicas"
> > > 
> > > Did you mount those volumes then? How would venus know about the newly
> > > created volumes? Those 2 replicas are probably the one that is at /coda
> > > and the one at /coda/camboinha.servers.
> > 
> > It's shown in the venus startup output, and I'm starting it with -init
> > every time. cfs hangs as well, so I will never know which volumes it
> > claims to have found.
> 
> Those volumes are 'CodaRoot@localhost' (the volume/directory that
> is mounted at /coda) and 'Repair@localhost', the volume that is used
> during local/global repair. Both are internal volumes that always exist.
> Both volumes are in the 'localhost' realm, which is an invalid name
> since localhost represents 'this machine' and as such is not usable in
> Coda's global volume naming scheme.
> 
> > Has anyone had success using the .deb version? I tried it with 3
> > sets of servers/clients, each more troublesome than the last.
> 
> Not counting 'testserver.coda.cs.cmu.edu' and the 6 servers responsible
> for 'coda.cs.cmu.edu' and the 2 servers for 'isr.coda', all running the
> server packages on debian-testing?
> 
> I am using both client and server debian packages on a machine at home,
> my laptop and my desktop at work (debian-unstable) although I tend to
> alternate with recent CVS builds. Also one of the students in our group
> is using the Coda client debian packages on something like 12-15 laptops
> to move data for his experiments.
> 
> Now a lot of things depend on whether you have a traditional /dev, devfs
> or udev, if your kernel is 2.4 or 2.6, if your machine has a static or
> dynamic IP address, if the network connection is permanent or
> intermittent, if you have a multi-homed machine and how exactly the
> multi-homing is set up, since there are about 3-4 different variations
> on that theme, if there are (possibly masquerading) firewalls in your
> network, and much more.
> 
> There are literally thousands of combinations that might make or break a
> seemingly simple setup. Coda servers are by far the most sensitive, since
> they are expected to be reachable through a single static IP address and
> to have reliable, fairly fast connections to each other (i.e.
> located in the same machine room), and there are some assumptions like
> 'gethostbyname(hostname())' returning a usable IP address that we can
> pass to a client instead of 127.0.0.1.
> 
> A Coda client is a lot less picky, as it is assumed to be mobile and as
> such can hop from one network to another and possibly has unreliable
> connections. My laptop switches quite a bit between various wired and
> wireless networks as well as running the Coda traffic through an openvpn
> tunnel. I used to use dialup almost daily, but nowadays it is a cable
> modem connection. But still, I've configured everything so that it never
> tries to route to the Coda servers over multiple networks at the same
> time as the servers wouldn't know where to send the replies to.

Nice to hear all that, truly. Unfortunately, I have almost everything
you pointed out above here, but the scripts never behaved nicely with the
.deb. Simply installing from source solved most of my problems. I'm even
running a client on the server. Now I only have to deal with the
hassle of using millions of files...
With the .deb I was constantly fighting with the scripts. Now I can
simply follow the docs :)

Thanks!

gabriel
