Coda File System

Re: Backing up coda volumes

From: Jan Harkes <>
Date: Tue, 24 Feb 2004 14:32:29 -0500
On Mon, Feb 23, 2004 at 07:35:37PM -0500, Jan Harkes wrote:
... <snipped description of using Amanda to perform Coda backups>
> I'm not yet 100% convinced about the reliability, it was working quite
> well during the experimental setup when my desktop was running the
> Amanda server, but since we moved that functionality to a machine in the
> lab a lot of incremental backups seem to fail. Haven't figured out yet
> what exactly is causing that.

Yay, found the culprit: the network link to the new backup server turned
out to be extremely lossy, which triggers a synchronization problem in
the UDP communication between the Amanda server (dumper?) and amandad
on the Amanda client.

What happens is that the ACK for a reply is lost when a sendbackup is
started, so amandad retransmits the REP packet a couple of times.
(These packets contain the data and mesg ports that dumper is supposed
to connect to.)

Now when the next volume is scheduled, the dumper sees an old REP
packet and assumes it belongs to the current REQ. So it ends up trying
to connect to ports that are long gone.

The following patch makes sure that we ignore REP packets that have an
incorrect sequence number, and thus are clearly not responses to our
currently outstanding request.


--- amanda-2.4.4/common-src/protocol.c.orig	2003-04-24 15:38:25.000000000 -0400
+++ amanda-2.4.4/common-src/protocol.c	2004-02-24 14:12:06.000000000 -0500
@@ -733,7 +733,7 @@
-	    else if(pkt->type == P_REP) {
+	    else if(pkt->type == P_REP && pkt->sequence == p->origseq) {
 		/* no ack, just rep */
 		p->state = S_REPWAIT;
@@ -764,7 +764,7 @@
 	    else if(action != A_RCVDATA)
 		goto badaction;
 	    /* got the packet with the right handle, now check it */
-	    if(pkt->type != P_REP) {
+	    if(pkt->type != P_REP || pkt->sequence != p->origseq) {
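
To illustrate the check the patch adds, here is a minimal standalone
sketch, not the actual Amanda code. The struct layouts, the enum and the
function name reply_matches are hypothetical; only the field names
type, sequence and origseq are taken from the patch. It assumes that a
reply carries the sequence number of the request it answers.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for Amanda's packet and
 * pending-request structures; only the field names `type`, `sequence`
 * and `origseq` come from the patch. */
enum pkt_type { PKT_REQ, PKT_REP, PKT_ACK };

struct packet {
    enum pkt_type type;
    int sequence;   /* sequence number carried by the packet */
};

struct pending {
    int origseq;    /* sequence number of the outstanding request */
};

/* Return true only for a reply that answers the outstanding request.
 * A retransmitted reply to an earlier request carries a different
 * sequence number and is rejected. */
bool reply_matches(const struct pending *p, const struct packet *pkt)
{
    return pkt->type == PKT_REP && pkt->sequence == p->origseq;
}
```

With an outstanding request whose origseq is 7, a late REP carrying
sequence 6 is dropped instead of being mistaken for the current reply,
so the stale data and mesg ports are never used.
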
Received on 2004-02-24 14:34:26