To quote Satya's paper: "The fundamental question is: Can agent
X perform operation Y on object Z?" The discussion here breaks
down into several components:
-
A protection database is used to group principals conveniently.
This goes beyond Unix users and groups in that transitive group
membership is taken into account.
-
Access Control Lists are attached to directory vnodes. Access
Control Lists allow much finer-grained access control, principally
through two mechanisms:
-
Access for an arbitrary number of principals can be specified,
not just for "owner", "group", "others"
-
Negative rights can be assigned which override positive
ones
In more detail, the structure of these two components is held
in:
-
The protection database, which has entries that are either users
or groups. The attributes of users and groups are:
-
Users and groups are identified by 32 bit integers, positive for
users, negative for groups. These integers should not be reused.
-
Users have a list of the groups to which they belong and of the
groups they own. Groups have a list of the groups to which they
belong and of the users and groups that are their members.
-
A precomputed Current Protection Subdomain, the CPS, for each
user and group. Informally, the CPS of a user or group X is the
set of groups of which X is a member, directly or indirectly,
including X itself.
-
An access list which defines which groups and users may and may
not change the entry in the protection database.
-
Some exceptional groups and users are defined, such as the groups
System:Administrators and System:Anyuser and the user Anonymous.
At the moment the precomputed CPS and access lists are not
used.
-
The counterpart is formed by the Access Control Lists or ACLs,
stored right behind the shared part of large Vnodes in server
RVM. ACLs consist of attribute pairs of
-
a right or negative right; such rights are read, write, lookup,
insert, delete, administer.
-
a group or user to which the right applies.
Note that transitive membership and exclusion-of-membership
relations among users and groups can be defined. Satya's paper
explains several important advantages and possibilities of a
system based on this scheme.
The total rights of a user are the union of the rights of the
members of that user's CPS. The negative rights are likewise a
union, and override positive rights in case of conflict.
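The rule is easy to express in code. The fragment below is a
minimal sketch, not the actual libal implementation; the types and
names are invented for illustration only.

/* Sketch: effective rights of a user given his CPS and an ACL.
 * Positive rights are the union over all CPS members that appear
 * in the ACL, negative rights likewise, and negative rights win
 * in case of conflict. */
typedef struct {
    int id;       /* user id (> 0) or group id (< 0) */
    int rights;   /* bit mask: read, write, lookup, insert, delete, administer */
} AclEntry;

int effective_rights(const int *cps, int ncps,
                     const AclEntry *plus, int nplus,
                     const AclEntry *minus, int nminus)
{
    int granted = 0, denied = 0;
    int i, j;

    for (i = 0; i < ncps; i++) {
        for (j = 0; j < nplus; j++)
            if (plus[j].id == cps[i])
                granted |= plus[j].rights;   /* union of positive rights */
        for (j = 0; j < nminus; j++)
            if (minus[j].id == cps[i])
                denied |= minus[j].rights;   /* union of negative rights */
    }
    return granted & ~denied;                /* negative overrides positive */
}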
When a client first contacts a server the RPC routine
ViceNewConnection (srvproc2.cc) is called. This calls BuildClient
(clientproc.cc). BuildClient gets the ViceId from the username if
necessary. A ClientEntry client is malloced and the routine
SetUsername is called to fill in more details. Finally
AL_GetInternalCPS is called, which builds up the CPS for the
client entry.
AL_GetInternalCPS gets to the guts of the protection database
file. It mallocs a buffer of the length needed to encode the CPS
and hangs it off the client entry.
The algorithm to check permissions is short and sweet and
explained in Satya's paper. The detailed checks are in libal, but
it is instructive to step briefly into the server code. Before
performing an operation the server calls a Check_XX_Semantics
routine which, among other things, checks permissions:
-
It gets a pointer to the ACL for the large Vnode.
-
It enters the GetRights routine, which does some complicated
checking of rights for anonymous users before calling
AL_CheckRights with the CPS attached to the client and the ACL
(see the sketch after this list).
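Condensed into pseudo-C, and reusing effective_rights from the
sketch above, the two steps amount to something like the following;
the names are hypothetical stand-ins, not the actual server
routines.

/* Sketch: does the client hold all the rights needed for an
 * operation on the object protected by this ACL? */
static int may_perform(const int *cps, int ncps,          /* CPS attached to the client */
                       const AclEntry *plus, int nplus,   /* ACL found behind the vnode */
                       const AclEntry *minus, int nminus,
                       int needed)                        /* rights the operation requires */
{
    int rights = effective_rights(cps, ncps, plus, nplus, minus, nminus);
    return (rights & needed) == needed;
}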
This system is ready for a rewrite, although the ideas are fine.
Let's step through some of the ingredients.
The .pdb .pcf mystery
When setting up a Coda server the system administrator can
combine existing Unix password and group files into a .pdb file
using the program pwd2pdb. This is merely a convenient tool to
quickly get a usable .pdb file, whose structure has been described
above.
A Perl script could do this equally well and could easily
precompute the CPS too. The .pdb file is a text file which has a
very useful comment at the top describing its structure:
############################
# VICE protection database #
############################
# Lines such as these are comments. Comments and whitespace are ignored.
# This file consists of user entries and group entries in no particular order.
# An empty entry indicates the end.
# A user entry has the form:
# UserName UserId
# "Is a group I directly belong to"_List
# "Is a group in my CPS"_List
# "Is a group owned by me"_List
# Access List
# ;
# A group entry has the form:
# GroupName GroupId OwnerId
# "Is a group I directly belong to"_List
# "Is a group in my CPS"_List
# "Is a user or group who is a direct member of me"_List
# Access List
# ;
# A simple list has the form ( i1 i2 i3 ..... )
# An access list has two tuple lists:
# one for positive and the other for negative rights:
# (+ (i1 r1) (i2 r2) ...)
# (- (i1 r1) (i2 r2) ...)
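To make the format concrete, here is a hypothetical pair of
entries (the names, ids and rights values are invented for
illustration): user jane (id 512) directly belongs to group -204,
her CPS contains -204 and -101, and she owns no groups; group
friends (id -204, owned by 512) directly belongs to -101 and has
direct members 512 and 541.
jane 512
( -204 )
( -204 -101 )
( )
(+ (512 63))
(- )
;
friends -204 512
( -101 )
( -204 -101 )
( 512 541 )
(+ (512 63))
(- )
;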
The program pcfgen is much more obscure, yet does something
elementary: it creates a file vice.pcf containing the following
arrays:
-
USeeks: the offset of each user entry in the pdb file.
-
GSeeks: the offset of each group entry in the pdb file.
-
LitPool: an array of concatenated, null-terminated strings of
usernames and groupnames. The names are sorted alphabetically.
-
USorted: contains in spot i the sorted entry number of the user
with uid i.
-
UOffset: the offset of the i-th user entry in the LitPool.
-
GSorted: contains in spot i the sorted entry number of the group
with gid i in the LitPool.
-
GOffset: the offset of the i-th group entry in the LitPool.
The AL_Initialize routine creates these arrays in VM for the
server to access. Finally, AL_GetInternalCPS builds a CPS by
parsing the .pdb file at the offsets given by the array entries,
using a lex parser for pdb files. Transitive group membership and
the unexplained exclusion of entries from groups are not
implemented.
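Schematically, and only as an illustration (the actual layout and
types in vice.pcf may differ), the tables amount to:

/* Rough picture of the tables pcfgen produces and AL_Initialize
 * loads into VM; field names follow the list above, the types are
 * guesses. */
struct pcf_tables {
    int   nusers, ngroups;
    long *USeeks;    /* USeeks[i]: byte offset of user entry i in vice.pdb   */
    long *GSeeks;    /* GSeeks[i]: byte offset of group entry i in vice.pdb  */
    char *LitPool;   /* concatenated, null-terminated, sorted names          */
    int  *USorted;   /* USorted[uid]: sorted position of user uid            */
    int  *UOffset;   /* UOffset[i]: offset of the i-th user name in LitPool  */
    int  *GSorted;   /* GSorted[gid]: sorted position of group gid           */
    int  *GOffset;   /* GOffset[i]: offset of the i-th group name in LitPool */
};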
Much of the AL implementation is now out of date or was
incomplete to start with. The following points give a possible
solution to this problem:
Local and Coda user ids.
It is not desirable that a userid should be the same on every
client, particularly in the case of access to multiple Coda
clusters. Venus should have access to a translation table on the
client (probably owned by root on the client) which translates
local ids to remote Coda ids for the various cells.
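As an illustration only, such a table might map a local uid to a
(cell, Coda id) pair; the format below is invented, not an
existing file:

# local uid    cell                 Coda id
1000           coda.example.org     512
1001           coda.other.example   3045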
The pdb and pcf database
This database describes a wonderful structure but the
implementation is poor:
-
It is undesirable to have to keep vice.pdb, vice.pcf and the
password database auth2.pwd in sync. The utilities to move these
databases synchronously indicate the history of such problems on
busy servers.
-
There is no way a client can modify the pdb. All au operations
only modify the auth2.pwd file.
-
The database is huge, and large index tables and a string table
might needlessly be held in the Coda server's VM.
-
The system is extremely difficult to understand (particularly
if one hasn't read this document or Satya's paper).
We therefore propose to make a pts server in one of two
forms:
-
The protection database could very conveniently be held in an
LDAP database (a hypothetical directory entry is sketched after
this list). These database servers could run on every fileserver
and the replicas can be held in sync using slurpd, the LDAP
replication tool. This is similar to, but more sophisticated than,
the replication of Kerberos databases between master and slave
KDCs.
-
Updates go to the master LDAP server, as in Kerberos.
-
The same LDAP database could serve GetVolInfo requests. I will
come back to this.
-
LDAP is already kerberized and has tight access control.
-
The list of routines to change is short and sweet:
-
AL_GetInternalCPS
-
AL_IdFromName
-
AL_NameFromId
-
some others
-
The list of things we can remove from AL is long and spicy:
-
the pdb stuff can go: it will be easy to create entries in the
LDAP pdb.
-
the pcf stuff and the internal versions of the seek arrays can
go. This eliminates the major hassles we have had with lex and
flex.
-
routines keeping cached pdb and pcf information up to date can
be eliminated.
-
A user of pts should be authenticated.
-
CPSs attached to ClientEntrys should be refetched when updated;
currently, when the .pdb file is modified, nothing is done to
update the existing CPSs in client entries. (How does ubiq do
this?) Possibilities:
-
Best: the LDAP server, when notified by slurpd of an update,
would prod the fileserver to refetch the CPS. This might not be
possible without modifying ldapd.
-
The pts client program gets all servers from LDAP and makes an
RPC2 call to all of them to drop the cached CPS structures which
it has just modified.
-
An LDAP server should contain a table of userids for which the
pdb has been modified. Every fileserver should periodically read
this list and refetch the client CPSs.
-
Client programs to modify entries could be short and sweet,
with one exception: they should recompute the precomputed CPS in
LDAP.
-
The implementation of a fully featured PTS server with a nice
client program is very simple: the LDAP server has all the basic
routines.
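Purely as an illustration (the schema and attribute names below
are invented, not an existing Coda or standard LDAP schema), a pdb
user entry might map onto a directory entry like this:

dn: codaId=512,ou=users,dc=coda,dc=example,dc=org
objectClass: codaUser
codaName: jane
codaId: 512
memberOfGroup: -204
cpsGroup: -204
cpsGroup: -101
ownsGroup: -305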
Alternatively, we stick with the distribution of pdb databases
through Update. The following modifications seem desirable:
-
We need a pts type server to modify the pdb database. This has
to run on the SCM and should change the auth2.pwd and pdb databases
in conjunction.
-
The database format for the pdb and pcf files is really not
great and should be replaced by a collection of (g)dbm database
files (a minimal sketch follows this list). This would also work
for the volume databases.
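For the second alternative, the following is a minimal sketch of
what storing a pdb entry in a gdbm file could look like; the key
and value encoding is an assumption made for illustration.

/* Store one pdb entry in a gdbm file, keyed by the 32 bit id. */
#include <gdbm.h>
#include <string.h>

int store_pdb_entry(const char *dbfile, int id, const char *entry_text)
{
    GDBM_FILE db = gdbm_open(dbfile, 0, GDBM_WRCREAT, 0600, NULL);
    if (db == NULL)
        return -1;

    datum key   = { (char *)&id, sizeof(id) };
    datum value = { (char *)entry_text, (int)strlen(entry_text) + 1 };

    int rc = gdbm_store(db, key, value, GDBM_REPLACE);
    gdbm_close(db);
    return rc;
}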
The disadvantage of the second approach is that we effectively
need to construct a networked database server for pdb files, while
LDAP already is such a tool. The advantage is that we will control
the code entirely rather than relying on third-party code.