There are two databases. The log database contains a record of operations and events. (Operation logging is optional.) The file database contains a record of known files. In general, the file database is configured with sqlite options favoring stability, while the log database is configured for speed, as operation logging tends to outnumber file operations by a large margin. FILES: id (unique key) path (varchar, if known) dev (integer) ino (integer) uid (integer) gid (integer) mode (integer) rdev (integer) deleting (integer) There are two indexes on the file database, one by path and one by device and inode. Earlier versions of pseudo ignored symlinks, but this turned out to create problems; specifically, if you had a symlink to a directory, and accessed a file through that, it could create unexpected results. Names are fully canonicalized by the client, except for functions which would operate directly on a symlink, in which case the last path component is not replaced. It is not an error to have multiple entries with the same device and inode. Updates to uid, gid, mode, or rdev are applied to every file with the same device and inode. Operations by name are handled by looking up the name to obtain the device and inode, then modifying all matching records. If a file shows up with no name (this should VERY rarely happen), it is stored in the database with the special name 'NAMELESS FILE'. This name can never be sent by the client (all names are sent as absolute paths). If a laterThe request comes in with a valid name, the 'NAMELESS FILE' is renamed to it so it can be unlinked later. The "deleting" field is used to track files which are in the midst of being possibly-deleted. The issue is that if you issue an unlink database operation only after removing a file, there is a window during which another program can genuinely create a new file with the same name or inode, and that request can reach the daemon before the unlink operation does. To address this, three operations are addded: may-unlink, did-unlink, and cancel-unlink. The may-unlink operation tags a file for possible deletion, setting "deleting" to 1. A clash involving a file marked for deletion is resolved by deleting the existing database entry without complaint. A did-unlink operation deletes a file ONLY if it is marked for deletion. A cancel-unlink operation unmarks a file for deletion. You can still have a race condition here, but it seems a lot less likely. The failure would be: Process A marks file X for deletion. Process A unlinks file X. Process B creates file X. Process B requests a link for X. The server unlinks the previously marked X. The server creates a new link for X, not marked for deletion. Process B marks file X for deletion. Process A's delete-marked-file request finally shows up. The server unlinks the newly-marked X. Process B fails to delete the file. Process B attempts to cancel the mark-for-deletion. This shouldn't be a common occurrence. Rename operations use a pair of paths, separated by a null byte; the client sends the total length of both names (plus the null byte), and the server knows to split them around the null byte. The impact of a rename on things contained within a directory is handled in SQL: UPDATE files SET path = replace(path, oldpath, newpath) WHERE path = oldpath; UPDATE files SET path = replace(path, oldpath, newpath) WHERE (path > oldpath || '/') && (path < oldpath || '0); That is to say, anything which either starts with "oldpath/" or is exactly equal to oldpath gets renamed, with oldpath replaced by newpath... The unusual constructions are to address two key issues. One is that an "OR" would prevent proper use of the index. The other is that a pattern, such as "LIKE oldpath || '/%'", would prevent use of the index (at least in sqlite). The gimmick is that the only things greater than 'a/' and less than 'a0' are strings which begin with 'a/' and have additional characters after it. LOGS id (unique key) stamp (integer, seconds since epoch) operation (id from operations, can be null) client (integer identifier) dev (integer) ino (integer) mode (integer) path (varchar) result (result id) severity (severity id) text (anything else you wanted to say) tag (identifier for operations) access (integer) The log database contains a primary table (logs). As of this writing it is not indexed, because indexing is expensive during writes (common, for the log database) and very few queries are usually run. The log database also contains, when created, tables of operations, result types, and severities. These exist so that queries can be run against a log database even if these values might have changed in a newer build of pseudo. The tables of operations and severities are just id->name pairs. No enforcement of the relation is currently provided. The log database "tag" field, added since the initial release of pseudo, is available for tagging operations. When a client connects to the pseudo server, it passes the value of the environment variable PSEUDO_TAG; this tag is then recorded for all log entries pertaining to that client. The log database "access" field, added since the initial release of pseudo, stores values which are bitwise masks of the values 1 (execute), 2 (write), 4 (read), and 8 (append), used to report the modes with which a file was opened. These values are not completely reliable. A value of 0 is typical for non-open operations, and a value outside the 0-15 range (usually -1) indicates that something went wrong trying to identify the mode of a given open.