Coda File System

Re: CODA Scalability

From: Ivan Popov <pin_at_math.chalmers.se>
Date: Wed, 21 Aug 2002 10:17:09 +0200 (MET DST)

Hello Nick,

> I am in the process of setting up a _home_ fileserver with twin 80-gig
> disks (RAID-1 mirrored) and am looking for a distributed filesystem

It should be doable, depending on how big your files are, i.e. the
limitation is the

===> number of files <===

not the total size of the files.

The 4% estimate is based on a "typical" file size distribution, which can
vary a lot.

> drives I already have), scalability (up to the 200+ gig range) and

Depends on whether it suits you to have 200 G split across, say, 4 servers
(possibly running on the same machine). That is, the limitation lies in
addressability inside one client-server pair. By running multiple
server processes with independent configurations and dividing the data
between them, you can scale your installation.
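
To get a feeling for what such a split does, here is a rough Python sketch.
It assumes the data is divided evenly and that the ~4% metadata rule of thumb
from this thread holds for your files; both are assumptions, and the function
name is made up for illustration, not part of Coda.

```python
# Sketch only: the 4% ratio and the even split are assumptions,
# not constants taken from Coda itself.
def per_server_share(total_data_gb, servers, metadata_ratio=0.04):
    """Return (data_gb, metadata_gb) per server for an even split."""
    if servers < 1:
        raise ValueError("need at least one server")
    data_gb = total_data_gb / servers
    return data_gb, data_gb * metadata_ratio


if __name__ == "__main__":
    # Compare a few possible splits of 200 G of data.
    for n in (1, 2, 4, 5):
        data_gb, meta_gb = per_server_share(200, n)
        print(f"{n} server(s): {data_gb:.0f} G data, "
              f"~{meta_gb:.1f} G metadata each")
```

If your files are smaller than "typical", the real metadata share will be
higher than this estimate, so leave some margin when picking the number of
servers.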

> single namespace across my network (mostly 100baseT linux boxes but
> also some Windoze and one colocated server across an untrusted WAN).

The Windows client is considered to be in alpha stage, but I haven't seen
complaints on the list, so it may work rather well.

> So for an 80 gig server that would be 3.2 gigs of RVM metadata, but
> does it really need 3.2 gigs of main memory and/or 3.2 gigs of swap
> space? My server has only 512 megs of main memory, soon 1024.

I think you still have to stay below 2G of metadata. Not sure.
And you have to have more *virtual* memory than your RVM, that is,
the sum of your physical RAM and your swap has to exceed that size.
For 1.9G of RVM you would need, say, 1G RAM and 1G swap, giving 2G of
virtual memory.
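
The arithmetic above can be written down as a small Python check. This is
only a sketch: the 4% ratio is the rule of thumb quoted in this thread, the
2G ceiling is the limit I am not sure about, and the function names are
hypothetical, not anything Coda provides.

```python
GIB = 1024 ** 3  # bytes per gigabyte (binary)


def rvm_estimate(data_bytes, ratio=0.04):
    """Estimate RVM metadata size from data size (rule of thumb only)."""
    return int(data_bytes * ratio)


def fits(rvm_bytes, ram_bytes, swap_bytes, rvm_limit=2 * GIB):
    """True if the RVM is below the assumed limit and RAM + swap exceed it."""
    below_limit = rvm_bytes < rvm_limit             # assumed 2G ceiling
    enough_vm = ram_bytes + swap_bytes > rvm_bytes  # virtual memory > RVM
    return below_limit and enough_vm


if __name__ == "__main__":
    rvm_80g = rvm_estimate(80 * GIB)
    print(f"80G of data -> about {rvm_80g / GIB:.1f}G of RVM")
    print("80G server fits in one RVM:", fits(rvm_80g, 1 * GIB, 4 * GIB))
    print("1.9G RVM with 1G RAM + 1G swap:",
          fits(int(1.9 * GIB), 1 * GIB, 1 * GIB))
```

Under these assumptions a single 80G server comes out above the ceiling,
while the 1.9G example with 1G RAM and 1G swap passes.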

> investigating distributed filesystems for linux. At the moment
> the most appropriate choice for any replicated data seems to be
> Inter-Mezzo because it uses a real filesystem as a base (and a
> real or virtual filesystem on the client). However I can't
> replicate 80+ gigs on every client, so the LAN-connected

I have never tried InterMezzo, but I assume it does not cache all of the
data on each client.
It might be something for you.

> What would be nice is a transparent filesystem or union filesystem
> so that I can acquire data in different ways and put it all together
> into one namespace which never has to change (just gets bigger)
> but I think that linux doesn't have a union filesystem, and if
> it did, it's not clear how that would be distributed.

I suppose there would be all kinds of semantic and manageability
problems. (Not trying to FUD the idea - feel free to try it! It would be
exciting to hear about your experience.)
Union mounts are available for linux, see e.g.

http://kernelnewbies.org/status/latest.html

(there have been several different implementations of union mounts
and overlay filesystems before, but this one is in the mainstream
tree)

Good luck,
--
Ivan
Received on 2002-08-21 04:18:18