Coda File System

Re: process suspension issues

From: Jan Harkes
Date: Thu, 20 May 2004 09:42:09 -0400
On Wed, May 19, 2004 at 02:50:30PM -0400, wrote:
>    Also, I don't know if cp behaves well if some of the read/write
>    syscalls return EINTR, or if it just gives up.  It is unclear to me
>    how and to what extent a good program is supposed to cope with this
>    and retry.
> Yep. When I designed the scsh syscall interface, I decreed that syscalls never
> return EINTR -- they loop & resume. Want to bail out on an interrupt? Scheme
> has exceptions, so have your signal handler throw out to a handler. EINTR is,
> in my opinion, very bogus. If I missed something important here, OS wizards
> are welcome to set me straight.

The only reason the kernel doesn't loop for you is that some
applications are using SIGALRM/EINTR to abort long-running operations,
and returning from the syscall is the only way to hand control back to
such an application. I would even go as far as to guess that the signal
handler in the application doesn't get run until the system call
actually returns. The kernel code could easily switch to looping
internally by returning ERESTARTSYS (which you'll never see in
userspace), but that might affect signal handling in the application.
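
To make the SIGALRM/EINTR pattern concrete, here is a minimal sketch of
how an application might use it to put a time limit on a blocking read.
It assumes a POSIX system; read_with_timeout and on_alarm are
hypothetical names made up for this illustration, not anything from
Coda. The handler is installed without SA_RESTART, the POSIX flag that
would otherwise ask for the call to be restarted after the handler
runs.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Empty handler: it exists only so SIGALRM interrupts the syscall
 * instead of killing the process. */
static void on_alarm(int signo)
{
    (void)signo;
}

/* Hypothetical helper: read up to len bytes, giving up after `seconds`.
 * Returns the byte count from read(), or -1 with errno == EINTR when
 * the alarm fired before any data arrived. */
ssize_t read_with_timeout(int fd, void *buf, size_t len, unsigned seconds)
{
    struct sigaction sa, old;
    ssize_t n;
    int saved_errno;

    assert(buf != NULL);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;              /* deliberately no SA_RESTART */
    if (sigaction(SIGALRM, &sa, &old) == -1)
        return -1;

    alarm(seconds);               /* arm the timer */
    n = read(fd, buf, len);       /* may come back with -1/EINTR */
    saved_errno = errno;

    alarm(0);                     /* disarm the timer */
    sigaction(SIGALRM, &old, NULL);
    errno = saved_errno;          /* the cleanup calls may clobber errno */
    return n;
}
```

With SA_RESTART set in sa_flags instead, the read would typically be
restarted after the handler returns and the timeout would never be
noticed by the caller, which is the application-visible side of the
trade-off described above. Note that this sketch has a small race: if
the alarm fires before read() is entered, the read is not interrupted.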

Some system calls are not restartable. If the kernel completed anything,
it will return partial success first, i.e. an interrupted write doesn't
return EINTR, but the number of bytes that got written. Another example
in the Coda case is close. Once the reference counts on the
file descriptor have been dropped, the fd is unusable and the
application can't call close again. By the time we queue the upcall to
venus, we've passed the point of no return, so we have to block
interrupts and wait for the request to complete.
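
From the application side, coping with both outcomes of an interrupted
write (a short count, or -1 with EINTR when nothing was written yet)
could look roughly like the sketch below. It assumes plain POSIX
write(); write_all is a hypothetical helper name, not something cp or
Coda is known to use.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all len bytes are
 * out. Returns 0 on success, -1 on a real error (errno is set). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL || len == 0);

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n == -1) {
            if (errno == EINTR)
                continue;         /* interrupted before any byte: retry */
            return -1;            /* anything else is a real failure */
        }
        /* Partial success: skip what was written and go around again. */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

There is intentionally no matching retry loop for close. As explained
above, once the descriptor has been released a second close is not
safe, so an application should call it only once.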

>    On BSD, grok sys/coda/*.c, the VFS rules, and signal delivery.  After
>    several weeks of just reading code, you will be enlightened and able
>    to fix any issues that are present :-)

And that's probably one of the best reasons why Coda isn't ready for
prime time. We can't expect everyone to go read through kernel code for
several weeks to figure out why some file didn't end up on disk.

Received on 2004-05-20 09:46:47