Coda File System

Re: venus failure during file rename

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Tue, 28 Mar 2006 16:18:41 -0500

On Fri, Mar 24, 2006 at 11:45:10AM +0200, Denis Chapligin wrote:
> I've tried to rename file on coda using mv (mv sxsetup sxsetup.old)
> and this operation blocked for a while. When i've tired of waiting and
> interrupted mv, i've found that the old file is disappeared and new file
> is inaccessible:

Not really sure what could be going on there. Rename normally works, so
you must have hit some corner case.

> 10:32:46 DispatchWorker: signal received (seq = 36)

This is where you hit ^C.

> 10:32:47 worker::Return: message write error 3 (op = 14, seq = 36), wrote -1 of 12 bytes

And here is when the rename finally returned, but by that time the
kernel wasn't waiting for the reply anymore.

> chollya_at_tau:/coda/WEB/onlyslon.org/slonax/dl$  ls -l
> ls: sxsetup.old: No such file or directory

I don't know if this 'ls' is where the old file used to be, or the place
you moved it to. In this case, readdir returned the file name, but
either the client or the server couldn't find the object it refers to.
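
A minimal sketch of how one might list the entries that show this
symptom, i.e. names that readdir reports but that cannot be looked up.
It uses only the Python standard library; the function name
find_unresolvable_entries is made up for illustration and nothing here
is Coda-specific.

```python
import errno
import os


def find_unresolvable_entries(directory):
    """Return names that the directory listing reports but lstat cannot find."""
    missing = []
    for name in os.listdir(directory):
        try:
            # lstat does not follow symlinks, which matches what 'ls -l' does.
            os.lstat(os.path.join(directory, name))
        except OSError as exc:
            if exc.errno == errno.ENOENT:
                missing.append(name)
            else:
                # Any other error (EACCES, ETIMEDOUT, ...) is a different problem.
                raise
    return sorted(missing)
```

Running it against both the source and the destination directory would
show which of the two still carries a name without an object behind it.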

A possible reason could be that the client got a conflict on the rename
and went into disconnected mode. The rename operation is for the most
part an operation on the source and destination directories; the only
bit where the file itself is updated is the change of its parent
directory. If the rename was from one name to another in the same
directory, only the directory is updated and the file is untouched.
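
The "only the directory is updated" behaviour can be observed on an
ordinary local POSIX filesystem, which is an assumption here; this
sketch does not exercise Coda at all. The helper name rename_in_place is
hypothetical. It renames a file within one directory and returns the
inode number before and after, which should match because the file
object itself is not rewritten.

```python
import os
import tempfile


def rename_in_place(directory, old_name, new_name):
    """Rename within one directory; return (inode_before, inode_after)."""
    old_path = os.path.join(directory, old_name)
    new_path = os.path.join(directory, new_name)
    inode_before = os.lstat(old_path).st_ino
    os.rename(old_path, new_path)
    inode_after = os.lstat(new_path).st_ino
    return inode_before, inode_after


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        with open(os.path.join(scratch, "sxsetup"), "w") as handle:
            handle.write("payload\n")
        before, after = rename_in_place(scratch, "sxsetup", "sxsetup.old")
        # Same object, new directory entry.
        print(before == after, sorted(os.listdir(scratch)))
```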

I can't tell what returned the ENOENT error; it could be the Coda
client or the Coda server. If you have another client, you could check
whether the server side is correct, either in the state before the
rename or after the rename. It is most likely a caching issue; I haven't
really seen these things go terribly wrong on the server end.

It could be the kernel cache: possibly some program had a reference to
the renamed file and we failed to flush the directory or inode cache
objects. It is also possible that venus has the wrong data cached. A
disconnection/reconnection should trigger cache revalidation. I've also
seen some cases where a reference count got messed up in venus, and it
was unable to refresh the object with an up-to-date copy from the
servers.

> What i did wrong and how this can be fixed?

I don't think you necessarily did anything wrong. This is probably one
of the hard to trigger/hard to find/hard to debug corner cases that are
still lurking. If you can find a (reliable) way to reproduce the
problem, that would help tremendously.
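
One way to hunt for a reproduction could be a loop that renames a file
back and forth and checks the directory after every step. This is only a
sketch: the function name rename_until_failure and the ".old" suffix are
illustrative, and a rename that blocks (as in the original report) would
hang the loop, so it may need to be run under an external timeout.

```python
import os


def rename_until_failure(directory, name, attempts=100):
    """Rename name <-> name.old repeatedly; return the first bad attempt or None."""
    current = os.path.join(directory, name)
    other = current + ".old"
    for attempt in range(1, attempts + 1):
        os.rename(current, other)
        # After the rename the new name must resolve and the old one must not.
        if not os.path.lexists(other) or os.path.lexists(current):
            return attempt
        # The directory listing must agree with the lookup.
        if os.path.basename(other) not in os.listdir(directory):
            return attempt
        current, other = other, current
    return None
```

If it returns an attempt number, the venus log around that time together
with the number of iterations it took would be the interesting data.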

Jan
Received on 2006-03-28 16:19:43