Coda File System

directory code problems

From: Greg Troxel <gdt_at_fnord.ir.bbn.com>
Date: 25 Apr 2000 11:52:56 -0400
I have a server which has been more or less stable (NetBSD/i386, code
from cvs, 256 MB RAM, log/data on raw partitions).  When I ran my
do-rfc-mirror script (on the same machine as the server):

  #!/bin/sh
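  # Braces rather than parentheses, so that exit ends the script itself
  # and not just a subshell.
  cd /coda/project/rfc || { echo "cd failed"; exit 1; }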
  wget -np -nH -P rfc --cut-dirs=3 -m ftp://ftp.normos.org/ietf/rfc/
  wget -np -nH -P internet-drafts --cut-dirs=3 -m ftp://ftp.normos.org/ietf/draft/ 

I found that venus and the server died horribly.  Restarting the
server choked on an assertion in dir/dirinode.c:152.  I cvs up'd fresh
code, and rebuilt everything. Still the assert (in DI_Pages).
Stepping through codasrv with gdb, I found that there was a pdi with
all 128 pointers filled in.  This seems ok - but the invariant isn't
written down in the file, and I couldn't figure out how it was
maintained when dirs are being written.  Also, there seems to be a
fencepost error: the loop checking for pointers should terminate at
i < DIR_MAXPAGES rather than i <= DIR_MAXPAGES, since 128 isn't a
valid index into an array of 128.

I applied the following patch (which I'm not at all sure about!!!) and
my server runs again.  I then removed some files in a huge directory.

Index: dirinode.c
===================================================================
RCS file: /coda-src/coda/coda-src/dir/dirinode.c,v
retrieving revision 4.11
diff -u -r4.11 dirinode.c
--- dirinode.c	1999/12/07 01:03:11	4.11
+++ dirinode.c	2000/04/25 15:45:03
@@ -145,11 +145,11 @@
 	int i = 0;
 	CODA_ASSERT(pdi);
 
-	while( pdi->di_pages[i] && (i <= DIR_MAXPAGES)) 
+	while( pdi->di_pages[i] && (i < DIR_MAXPAGES)) 
 		i++;
 	
 	/* check this guy is valid */
-	CODA_ASSERT(i< DIR_MAXPAGES);
+	CODA_ASSERT(i<= DIR_MAXPAGES);
 	return i;
 }
 
It appears that directories can only have 128 * 2048 bytes (256 KB) of
storage, and there is no indirect-block-like scheme (as FFS uses for
file lengths).

It seems like there is no intrinsic reason the last pointer can't be
used; I don't get the </<= sense of the test, or the reason for the
assert, since 128 seems like a valid outcome.
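
One further detail on the patched loop: it still evaluates
pdi->di_pages[i] before testing i, so with all 128 slots filled it
reads di_pages[128] once before stopping.  A minimal sketch of the same
count with the index test placed first follows.  The struct is a
stand-in and count_pages is a hypothetical name; only di_pages and
DIR_MAXPAGES come from dirinode.c, and the value 128 is the one
discussed above.

```c
#include <assert.h>
#include <stddef.h>

#define DIR_MAXPAGES 128   /* value taken from the discussion above */

/* Stand-in for the real directory inode; only di_pages is modeled. */
struct dirinode_sketch {
	void *di_pages[DIR_MAXPAGES];
};

/* Count the leading non-NULL page pointers; result is 0..DIR_MAXPAGES. */
static int count_pages(const struct dirinode_sketch *pdi)
{
	int i = 0;

	assert(pdi != NULL);

	/* Test the index first so di_pages[DIR_MAXPAGES] is never read. */
	while (i < DIR_MAXPAGES && pdi->di_pages[i] != NULL)
		i++;

	return i;
}
```

With this ordering a completely full directory yields 128, and no
trailing assert on i is needed because the loop cannot exceed
DIR_MAXPAGES.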

I'm also not sure how the server avoids writing more than 128 pages;
see DI_DhToDi - there is no check against DIR_MAXPAGES that I can see.
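
The kind of check that seems to be missing might look like the sketch
below.  This is not Coda code: set_page and the struct are hypothetical,
and returning -EFBIG is just one plausible way to report a directory
that has outgrown its page table.

```c
#include <errno.h>
#include <stddef.h>

#define DIR_MAXPAGES 128   /* value taken from the discussion above */

/* Stand-in for the real directory inode; only di_pages is modeled. */
struct dirinode_sketch {
	void *di_pages[DIR_MAXPAGES];
};

/*
 * Store a page pointer in slot idx.  Out-of-range indexes are refused
 * instead of being written past the end of the array.
 */
static int set_page(struct dirinode_sketch *pdi, int idx, void *page)
{
	if (pdi == NULL || idx < 0 || idx >= DIR_MAXPAGES)
		return -EFBIG;

	pdi->di_pages[idx] = page;
	return 0;
}
```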


        Greg Troxel <gdt_at_ir.bbn.com>
Received on 2000-04-25 11:56:00