Coda File System

Re: reintegration failures over modem

From: Greg Troxel <gdt_at_ir.bbn.com>
Date: Mon, 17 Sep 2001 13:49:44 -0400
> >   rpc2 params are not taking effect on the backfetch (are they
> >   controlled by the server?)
> 
> I believe so, although the server typically already has more lenient
> timeouts than the client. I believe the overall rpc2 timeout is
> set to 60 seconds instead of 15 seconds.

This is all happening much faster than 60 seconds.

Does rpc2 do selective acks, so that if the server gets some of the
packets in the stream, the client only resends the missing ones?  And
does every iteration (send, ask for ack) where some progress is made
keep the transfer alive?  The transfers seem to be getting abandoned
very early on.
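
To make the question concrete, here is a toy model of the behaviour I am asking about. It is only a sketch and not a description of what rpc2 actually does: the function name, the `drop` callback and the `max_idle_rounds` limit are all made up for illustration. Each round resends only the packets the receiver has not yet reported, and the transfer is abandoned only after several consecutive rounds with no progress.

```python
def transfer(num_packets, drop, max_idle_rounds=3):
    """Toy selective-ack transfer (hypothetical, not the rpc2 implementation).

    drop(round_no, seq) returns True if packet `seq` is lost in that round.
    Returns (rounds_used, packets_sent) on success, or None if
    max_idle_rounds consecutive rounds delivered nothing new.
    """
    received = set()   # sequence numbers the receiver has acknowledged
    idle = 0           # consecutive rounds without progress
    rounds = 0
    sent = 0
    while len(received) < num_packets:
        # Selective ack: only the packets still missing are resent.
        missing = [s for s in range(num_packets) if s not in received]
        before = len(received)
        for seq in missing:
            sent += 1
            if not drop(rounds, seq):
                received.add(seq)
        rounds += 1
        if len(received) == before:
            idle += 1
            if idle >= max_idle_rounds:
                return None    # no progress for too long: give up
        else:
            idle = 0           # any progress keeps the transfer alive
    return rounds, sent


# Lose every odd-numbered packet in the first round only.
lossy_first_round = lambda round_no, seq: round_no == 0 and seq % 2 == 1
print(transfer(8, lossy_first_round))      # (2, 12)
print(transfer(8, lambda r, s: True))      # None
```

In this model a lossy link costs extra rounds but never kills a transfer that is still moving, which is the behaviour I would expect over a modem.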

What rpc2debug level do you suggest to see all the relevant events?
(Keep in mind that tcpdump is near useless since all the data traffic
is encrypted. :-)

> >   there is some sort of timeout problem, or something else, that is
> >   causing reintegration backfetches to fail over 28.8 modems
> 
> I'm very often using a 33.6K modem. Only occasionally the backfetches
> fail, but on the other hand, most of my traffic is related to reading
> email, so the client hoards/fetches new emails, and pretty much the only
> write traffic is renaming them between directories from new/ to cur/.

For me, this is reliably failing.  Try

cfs wd -age 60
dd if=/dev/zero of=foo bs=1k count=8
cfs lv .
cfs fr .

On my system, this will time out every time over the modem, but works
fine on the local net (which is 802.11 and one router hop).
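
As a rough sanity check, the raw time to push the 8 KB test file through the modem can be estimated as below. This is only back-of-the-envelope arithmetic under stated assumptions: it uses the nominal line rate and ignores rpc2 and IP/PPP overhead, modem compression, latency and retransmissions. The helper name is made up for illustration.

```python
def raw_transfer_seconds(payload_bytes, line_bits_per_second):
    """Idealized transfer time: payload bits divided by nominal line rate."""
    return payload_bytes * 8 / line_bits_per_second


payload = 8 * 1024  # bytes written by: dd bs=1k count=8

for rate in (28_800, 33_600):  # nominal modem rates in bit/s
    seconds = raw_transfer_seconds(payload, rate)
    print(f"{rate} bit/s: {seconds:.2f} s")
```

This gives roughly 2.3 seconds at 28.8K and 2.0 seconds at 33.6K. Even with generous allowance for overhead, that is well under both the 15 and 60 second rpc2 timeouts quoted earlier, so the file size alone should not be enough to cause a timeout.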

> I've come to the conclusion that the whole server->client path
> (callbacks, backprobes and backfetches) is very unreliable. While
> working over a modem connection I often get 'NAKed' by the servers,
> which indicates that a backprobe failed, the server cleaned out the
> client's information and the client has to reestablish all callbacks.

I am pretty sure there is something worse going on; this isn't
probabilistic failure.

  Probe ( 13:44:32 )
  BackProbe codasrv.ir.bbn.com ( 13:44:32 )

This is from cfs cs; I presume the order is reversed when it is the
server that wants to know whether the client is up?

It does seem that the backfetch is using the default rpc2 params, and
not the ones that venus/vice set.  But I can't figure things out
because rpc2 is pretty hairy and I haven't spent long enough trying yet.
Received on 2001-09-17 13:51:35