Coda File System

Re: partitions and partition sizes

From: Ivan Popov <pin_at_medic.chalmers.se>
Date: Mon, 4 Apr 2005 18:42:50 +0200
Hi Patrick,

> > > 	Our total storage need in coda will be around 40gb.
> > 
> > Then you want to run rvmsizer to check - probably you will be fine with
> > one server process, then use the maximal rvm size available, 1G.
> 
> 	rvmsizer suggests a rvm size of 70mb (and that's with a little cushion
> added by me).  Would you recommend bumping it up to 500mb or 1G anyway?

Did you run it on the whole dataset? If so, I guess you have big files.

> Note that these machines have 1GB of RAM and if the RVM must reside in
> memory, then it seems that it ought to be smaller than 1GB (provided the
> number of files and directories is sufficiently small).  Is that the
> correct thinking?  Or is the RVM metadata information no longer
> completely mapped into memory?

RVM is mapped into virtual memory, so (RAM + swap) should be bigger
than the RVM size. In practice it will hopefully not be fully used.

With that much memory your servers will be comfortable with 0.5G of RVM.
I would use a value (a lot) higher than rvmsizer's suggestion, because
1. you lose nothing except the rvm area on the disk
2. you may want to put more files on the server later, and resizing
rvm is not trivial
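For concreteness, a sizing check along those lines might look like this (the data path is made up, and I'm assuming the stock rvmsizer that scans a directory tree):

```shell
# Estimate RVM needs from the actual data set (path is an example).
rvmsizer /staging/coda-data

# When initializing the server, give RVM generous headroom over that
# estimate - e.g. answer 500M rather than 70M when vice-setup asks
# for the RVM data size, since resizing later is not trivial.
vice-setup
```

vice-setup is interactive; the point is just to over-answer the RVM data size question.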

> 	So volumes are completely arbitrary and should be kept small.  Is there

Yes.

> a rule of thumb regarding how small?  At what point is there a

There are different things to keep in mind - for example, file quotas
are set per volume.
I keep volumes so that they correspond to a "natural" clustering of data.
Each user gets 4 volumes with slightly different properties
(backup strategy, quota), and each software package gets its own volume.
If a volume grows extremely big compared to the rest, I try
to split it. Still, they vary a lot in size, from 0 to several gigabytes,
and from 0 to about 30000 files (not the same volumes as those containing
lots of data).

> performance hit?  Is extra administration required to manage more
> volumes?

Given a suitable policy and corresponding scripts it is easy.
In my experience, it is crucial to have a clear mapping between volumes
and their mountpoints. A 1:1 mapping is the most straightforward way,
and very convenient to script as well.
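As a sketch of such a 1:1 convention (all names, the realm, and the replication group are examples - adjust to your setup):

```shell
# Volume name mirrors the mountpoint: volume "u.alice" always lives
# at /coda/example.com/u/alice, so scripts can derive one from the other.

# On the SCM, create the replicated volume (VSG/partition are examples):
createvol_rep u.alice E0000100 /vicepa

# On a client, with admin tokens, mount it at the matching path:
cfs mkmount /coda/example.com/u/alice u.alice
```

With a scheme like this, per-user volume creation, quota setting and backup selection reduce to loops over user names.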

> 	Since I'm creating a partition just for file data for coda, is there a
> best-performing fs type to use?  ext2 or ext3?  Or does it make
> absolutely no difference?

It works well with ext3 for me. A nice feature is that you avoid the
long fscks... though you will probably get one anyway - since uptime is
usually longer than the interval between forced fscks, a power failure
gives you a fsck :)
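If those surprise full checks bother you, the periodic-fsck triggers can be tuned with standard e2fsprogs tools (device name is an example; whether to disable them is a policy decision, not a recommendation):

```shell
# Inspect the current mount-count and interval settings:
tune2fs -l /dev/sdb1 | grep -i -e 'mount count' -e 'check'

# Disable the mount-count and time-based forced checks on the
# Coda data partition, relying on the ext3 journal instead:
tune2fs -c 0 -i 0 /dev/sdb1
```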

> here with questions about backing up.  I'm a bit unclear as to why
> standard dump type utilities can't be run on a /coda filesystem.  Also,

Making backups on the client side is a bit tricky.
You have to make sure the client is very well connected, that
its cache is bigger than the data set you are backing up, and
that you have suitable acls and tokens - and even then you miss the
internal bookkeeping that the servers do, which makes server-side
backups a lot more efficient.
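If you do go the client-side route anyway, the checklist above translates roughly to this (user names and paths are hypothetical):

```shell
# Get tokens for an identity with read rights on the whole tree:
clog backupuser

# Verify the ACLs actually grant what you think they do:
cfs listacl /coda/example.com/u/alice

# The venus cache (cacheblocks in venus.conf) must exceed the data set,
# or the walk below will thrash the cache. Then a plain archiver works:
tar cf /backup/alice.tar -C /coda/example.com/u alice
```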

> I thought that having multiple replicating servers provided automatic
> backup.  But I'm not ready to tackle these questions fully just yet...

They protect against hardware failures, but not against unintentional
file removal (the "rm -r" you did not mean but only notice a week later...)
or overwriting, nor against data corruption.
Multilevel online backups would cover most of these issues, but
nobody has gone ahead and implemented them yet... (it is rather
straightforward).
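The server-side mechanism that does exist goes through volutil; very roughly, and with the volume id and paths being examples, the flow is:

```shell
# On the server: create a read-only backup clone of a replica
# (volutil reports the id of the clone it created):
volutil backup 7f000001

# Then dump the backup clone to a file that can go to tape etc.
# (the clone id below is an example of what the previous step reports):
volutil dump 7f000002 /backup/u.alice.dump
```

This uses the servers' own bookkeeping, which is why it beats dumping through a client.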

Good luck,
--
Ivan
Received on 2005-04-04 12:43:36