Diffstat (limited to 'doc')
-rw-r--r-- | doc/database   | 81
-rw-r--r-- | doc/overview   | 96
-rw-r--r-- | doc/pseudo_ipc | 76
-rw-r--r-- | doc/utils      | 33
4 files changed, 286 insertions, 0 deletions
diff --git a/doc/database b/doc/database
new file mode 100644
index 0000000..cac7a3a
--- /dev/null
+++ b/doc/database
@@ -0,0 +1,81 @@
+There are two databases. The log database contains a record of operations
+and events. (Operation logging is optional.) The file database contains a
+record of known files. In general, the file database is configured with
+sqlite options favoring stability, while the log database is configured for
+speed, as operation logging tends to outnumber file operations by a large
+margin.
+
+FILES:
+    id (unique key)
+    path (varchar, if known)
+    dev (integer)
+    ino (integer)
+    uid (integer)
+    gid (integer)
+    mode (integer)
+    rdev (integer)
+
+There are two indexes on the file database, one by path and one by device
+and inode. Earlier versions of pseudo ignored symlinks, but this turned
+out to create problems; specifically, if you had a symlink to a directory,
+and accessed a file through that, it could create unexpected results. Names
+are fully canonicalized by the client, except for functions which would
+operate directly on a symlink, in which case the last path component is not
+replaced.
+
+It is not an error to have multiple entries with the same device and inode.
+Updates to uid, gid, mode, or rdev are applied to every file with the same
+device and inode. Operations by name are handled by looking up the name
+to obtain the device and inode, then modifying all matching records.
+
+If a file shows up with no name (this should VERY rarely happen), it is stored
+in the database with the special name 'NAMELESS FILE'. This name can never
+be sent by the client (all names are sent as absolute paths). If a later
+request comes in with a valid name, the 'NAMELESS FILE' is renamed to it so
+it can be unlinked later.
+
+Rename operations use a pair of paths, separated by a null byte; the client
+sends the total length of both names (plus the null byte), and the server
+knows to split them around the null byte.
+The impact of a rename on things
+contained within a directory is handled in SQL:
+    UPDATE files SET path = replace(path, oldpath, newpath) WHERE
+        path = oldpath;
+    UPDATE files SET path = replace(path, oldpath, newpath) WHERE
+        (path > oldpath || '/') AND (path < oldpath || '0');
+That is to say, anything which either starts with "oldpath/" or is exactly
+equal to oldpath gets renamed, with oldpath replaced by newpath. The
+unusual constructions are to address two key issues. One is that an "OR"
+would prevent proper use of the index. The other is that a pattern,
+such as "LIKE oldpath || '/%'", would prevent use of the index (at least
+in sqlite). The gimmick is that the only things greater than 'a/' and less
+than 'a0' are strings which begin with 'a/' and have additional characters
+after it.
+
+LOGS:
+    id (unique key)
+    stamp (integer, seconds since epoch)
+    operation (id from operations, can be null)
+    client (integer identifier)
+    dev (integer)
+    ino (integer)
+    mode (integer)
+    path (varchar)
+    result (result id)
+    severity (severity id)
+    text (anything else you wanted to say)
+    tag (identifier for operations)
+
+The log database contains a primary table (logs). As of this writing it
+is not indexed, because indexing is expensive during writes (common, for
+the log database) and very few queries are usually run.
+
+The log database also contains, when created, tables of operations, result
+types, and severities. These exist so that queries can be run against
+a log database even if these values might have changed in a newer build
+of pseudo. The tables of operations and severities are just id->name pairs.
+No enforcement of the relation is currently provided.
+
+The log database "tag" field, added since the initial release of pseudo,
+is available for tagging operations. When a client connects to the
+pseudo server, it passes the value of the environment variable PSEUDO_TAG;
+this tag is then recorded for all log entries pertaining to that client.
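
Reviewer sketch (not part of the patch): the rename statements described in
doc/database can be tried against an in-memory sqlite database. This is a
minimal illustration only. The one-column "files" table, the index name
"files_path", and the helper rename_path() are assumptions made for the
example, not pseudo's actual schema or code.

```python
import sqlite3


def rename_path(conn, oldpath, newpath):
    """Apply the two UPDATE statements from doc/database (hypothetical helper)."""
    params = {"old": oldpath, "new": newpath}
    # Exact match: the renamed object itself.
    conn.execute(
        "UPDATE files SET path = replace(path, :old, :new) WHERE path = :old",
        params,
    )
    # Range match: everything strictly between 'old/' and 'old0', which is
    # exactly the set of paths that start with 'old/' and have more after it.
    conn.execute(
        "UPDATE files SET path = replace(path, :old, :new) "
        "WHERE path > (:old || '/') AND path < (:old || '0')",
        params,
    )


conn = sqlite3.connect(":memory:")
# Assumed minimal schema: only the columns needed for the demonstration.
conn.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, path VARCHAR)")
conn.execute("CREATE INDEX files_path ON files (path)")
conn.executemany(
    "INSERT INTO files (path) VALUES (?)",
    [("/a",), ("/a/x",), ("/a/x/y",), ("/a.bak",), ("/ab",), ("/a0",)],
)

rename_path(conn, "/a", "/b")
paths = sorted(row[0] for row in conn.execute("SELECT path FROM files"))
print(paths)
```

Siblings such as /a.bak, /ab, and /a0 share the prefix "/a" but fall outside
the range, so they are left alone; only /a and the entries beneath it move.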
diff --git a/doc/overview b/doc/overview
new file mode 100644
index 0000000..7341e3c
--- /dev/null
+++ b/doc/overview
@@ -0,0 +1,96 @@
+Overview:
+
+The pseudo program and library combine to provide an environment which
+provides the illusion of root permissions with respect to file creation,
+ownership, and related functions. At this time, this does not extend to
+emulating chroot functions or a virtual password database, but these features
+may be added.
+
+The underlying mechanism of pseudo is a library inserted using LD_PRELOAD,
+which provides replacement symbols for core C library functions. At this
+time, the implementation is specific to modern glibc. Support for other
+systems is certainly possible, but not currently implemented or immediately
+planned. The symbols wrapped are generally those that are documented in
+section 2 of the manual -- the ones which are essentially system calls.
+
+The library works by replacing each real function with a wrapper function
+which obtains the addresses of "real" functions (those in the next library
+down in the chain, typically glibc) and then calls custom-written wrappers
+which alter the behavior of these functions and return results corresponding
+to the virtual environment.
+
+Underlying this is access to a server process, which is automatically
+spawned by the library if one is not available. The server process maintains
+a UNIX domain socket while it is active, and maintains a database (using
+sqlite) of files known to the system. Files are recorded in the database
+only if they are created within the virtualized environment or have been
+altered by it; files merely read are not added.
+
+There are four layers of logic for performing or wrapping any function,
+although not all functions involve all four layers:
+
+1. The generic wrapper, which handles details such as thread-synchronization.
+This function handles the mutex used to keep multiple threads from trying to
+write to the same socket at once, and also disables wrappers when a value
+called "antimagic" is set. The antimagic value is set internally by the
+pseudo client code, and the check for whether or not to use it is controlled
+by the mutex (actually by the mutex owner variable, which is protected by
+the mutex.) Without that, read operations in another thread during the
+"antimagic" part of an operation would bypass pseudo, yielding erratically
+wrong results!
+2. The wrapper function itself. This function may translate a single
+operation into two or more logical operations. This function has no awareness
+of the database, but can send queries to the general client code.
+3. The general client code. This code maintains additional data, such as
+a mapping of file descriptors to paths. In most cases, this code also
+forwards requests to the server code. (If the server is unavailable, the
+client can restart it.)
+4. The server code. This code is fairly simple; all it does is maintain
+the database of file information. Operations consist either of a request
+for information (e.g., a stat(2) call) or notification of a change. The
+server sends back failure or success notices.
+
+As a fairly typical example, the progress of a stat(2) call is:
+
+* The __xstat() wrapper is called. This wrapper checks the version argument
+  against the _STAT_VER constant in case we some day run into a system where
+  programs call stat with different versions of struct stat. (Hasn't happened
+  yet.)
+* The __xstat() wrapper calls the __fxstatat() wrapper, which in turn calls
+  the __fxstatat64() wrapper (this allows us to have only one copy of the
+  logic shared among all the path-based stat syscalls).
+* The __fxstatat64() wrapper calls the underlying __fxstatat64() function,
+  which has been mapped to the name real___fxstatat64(). (If this fails,
+  the wrapper function returns immediately.)
+* The __fxstatat64() wrapper passes the resulting stat buffer and path to the
+  client code and asks for a response.
+* The client code converts the stat buffer into a pseudo_msg_t message
+  object, and canonicalizes the path (resolving symlinks and eliminating
+  extra slashes, as well as references to . and ..).
+* The client code now sends the pseudo_msg_t object and converted path to
+  the server as a message.
+* The server receives the message. Since this is a stat() operation (using
+  a path, not a dev/inode pair, for identification), the server searches its
+  database for existing entries with the corresponding name.
+* If the server finds an object, it updates the contents of the pseudo_msg_t
+  with the recorded values for uid, gid, mode, and raw device number, and
+  sends the message back with status SUCCEED.
+* The server also performs sanity checks to see whether there may be other
+  suspiciously-similar entries in the database, in which case it emits
+  diagnostics. (Usually to pseudo.log.)
+* If the server finds no object, it sends the message back with status FAIL.
+* The client code returns the message to the wrapper function.
+* If the status was SUCCEED, the wrapper function copies the modified
+  fields back into its stat buffer; otherwise, it does not.
+* The wrapper function returns the original exit status from stat.
+
+Most of the functions wrapped are syscalls. There are a few exceptions, such
+as mkstemp, fopen, and freopen. These are wrapped because, in glibc, they
+call internal functions which make inline assembly syscalls, rather than
+calling the syscall entry points. In each case, the wrapper makes the real
+call without intervention, then snoops the results for a file descriptor to
+path mapping. (This would be done to opendir/fdopendir/closedir as well,
+but the DIR * is opaque and can't be snooped practically.
+This is why
+some versions of 'rm -r' can, at higher diagnostic levels, generate a slew
+of warnings about file descriptors being reopened when no close was
+observed.)
diff --git a/doc/pseudo_ipc b/doc/pseudo_ipc
new file mode 100644
index 0000000..6a73ec8
--- /dev/null
+++ b/doc/pseudo_ipc
@@ -0,0 +1,76 @@
+MESSAGE PASSING
+
+typedef struct {
+    pseudo_msg_type_t type;
+    op_id_t op;
+    res_id_t result;
+    int xerrno;
+    int client;
+    dev_t dev;
+    unsigned long long ino;
+    uid_t uid;
+    gid_t gid;
+    unsigned long long mode;
+    dev_t rdev;
+    unsigned int pathlen;
+    int nlink;
+    char path[];
+} pseudo_msg_t;
+
+This structure is used for every communication between the client and the
+server. The last field is optional (it's a C99ism called a flexible array
+member, allowing a single allocation to hold both the structure and the
+variable-length character data at the end).
+
+All messages contain items up through 'pathlen'. If pathlen is not zero,
+an additional pathlen bytes containing path are provided; path is
+null-terminated.
+
+Every message from client should get a response from server. The server
+never really sends a path, currently, but maybe it will someday. Note that
+all server responses will in general share a single message object,
+and future operations may cause that object to be reallocated; the same
+goes for messages received by the server. Basically, pseudo_msg_receive
+is not thread-safe; this is part of (but not all of) the reason that there's
+mutex stuff in the wrappers. (The other part is the "antimagic" being
+able to blow things up.)
+
+type is one of PING, OP, SHUTDOWN, ACK, or NAK. The client only sends PING or
+OP. The server should always send ACK. When run with '-S', the pseudo
+program runs as a client, sending a SHUTDOWN message to a server -- but only
+if it can find one; it does not start a new one.
+In this case, the server
+could respond with a NAK, in which case it sends a message in which "path"
+is a list of space-separated PIDs of currently-living clients, for the program
+to print out in an error message. The server will not shut down while there
+are living clients. (The request, though, causes it to shut down immediately
+when there are no more clients, rather than waiting for the timeout
+period.)
+
+result is the result of a particular operation. It applies only in replies
+to OP messages.
+
+client should be the client's PID on send, and the server's client number for
+that client on response. (The response isn't checked, and this is just a
+debugging feature.)
+
+dev/ino/uid/gid/mode/rdev/path are information about the file. They should
+all be provided on send if possible, but the server only generally changes
+uid/gid/mode/rdev on response, and never sends a path back. Dev and inode
+are currently changed by stat-by-path operations, but this may turn out to
+be wrong.
+
+xerrno is used to contain a changed errno if, at some point, the server wants
+to override the default errno. Normally, the client just uses its existing
+errno.
+
+nlink is used to forward the number of links. The server DOES NOT modify
+this. Rather, nlink is used to provide better diagnostics when checking
+paths against inodes.
+
+32/64 bit: This structure should have the same offsets for every element,
+including path, on both 64-bit and 32-bit machines. (Check with 'offsets.c'.)
+It is *not* an error if sizeof(pseudo_msg_t) is different; the padding
+happens after the path element. (Note: This is contrary to C99, TC1, but is
+correct according to the current standard. Anyway, gcc's always done it this
+way.) The data written are always pathlen + offsetof(pseudo_msg_t, path),
+and that's correct.
diff --git a/doc/utils b/doc/utils
new file mode 100644
index 0000000..35520d4
--- /dev/null
+++ b/doc/utils
@@ -0,0 +1,33 @@
+pseudolog
+    Displays or creates log entries.
+    This offers a quick first
+    approximation of the sorts of queries one is likely to need to
+    run.
+
+pseudo
+    The pseudo server. Run on the command line, the default behavior
+    is to set up a pseudo environment (LD_PRELOAD, etc) then run either
+    the specified command or a shell by default. The -d option specifies
+    a background daemon, and -f specifies a foreground daemon (which
+    may display output directly). The launcher function isn't really
+    32-bit/64-bit aware, but if you have both types of libraries in
+    suitably-named directories, it'll do the right thing anyway.
+
+    Path may be in environment as PSEUDO_PREFIX, specified on command
+    line with -P path, or inferred from the path to $0. (The last
+    generates a diagnostic.)
+
+    To stop the pseudo server, either wait a while (the default timeout
+    is 30 seconds, or whatever you specified with "-t" or PSEUDO_OPTS
+    when starting it) or run "pseudo -S". The server will not exit
+    while clients are active, but requesting a shutdown sets the timeout
+    to one second, so it will exit quickly after the last client
+    disconnects.
+
+libpseudo.so
+    The library providing the wrapper functionality, which spawns
+    the pseudo server automatically if needed. If the environment
+    variable PSEUDO_ENOSYS_ABORT is set, attempts to call missing
+    system calls will abort() rather than merely emitting a diagnostic.
+
+pseudodb
+    allows browsing and modification of db (not implemented)
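
Reviewer sketch (not part of the patch): the framing rule in doc/pseudo_ipc,
a fixed part running up to 'path' followed by pathlen bytes of null-terminated
path, can be illustrated as below. The four-field header here is a deliberately
simplified, hypothetical layout and is NOT the real pseudo_msg_t; the names
HEADER, encode(), and decode() are invented for the example, and it assumes
pathlen counts the terminating null byte.

```python
import struct

# Hypothetical simplified header (type, op, result, pathlen), each an
# unsigned 32-bit little-endian integer. The real pseudo_msg_t has more
# fields and uses the platform's native layout.
HEADER = struct.Struct("<IIII")


def encode(msg_type, op, result, path=None):
    """Build a frame: fixed header, then pathlen bytes of path if any."""
    # Assumption: pathlen includes the terminating null byte.
    payload = b"" if path is None else path.encode() + b"\0"
    return HEADER.pack(msg_type, op, result, len(payload)) + payload


def decode(frame):
    """Split a frame back into (type, op, result, path or None)."""
    msg_type, op, result, pathlen = HEADER.unpack_from(frame)
    if pathlen == 0:
        return msg_type, op, result, None
    raw = frame[HEADER.size:HEADER.size + pathlen]
    return msg_type, op, result, raw.rstrip(b"\0").decode()


frame = encode(1, 2, 0, "/tmp/a")
# Bytes on the wire are always header size plus pathlen, mirroring
# "pathlen + offsetof(pseudo_msg_t, path)" in the document.
print(len(frame), decode(frame))
```

A message with no path is just the fixed header with pathlen set to zero,
which matches the statement that all messages contain items up through
'pathlen'.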