Age | Commit message | Author |
|
The path logic had gotten gradually more baroque, originally
from the need to handle trailing slashes in paths (because
we need "*/" to be a glob that matches only directories),
then from the need not to handle "regularfile/." and the like.
The revised solution is to move the invariant from "partial
path always ends with slash" to "appending always adds a
slash". Because this implies wanting to know the type of
the current partial path, and we also check the type of the
new path whenever we append an element, we now have a shared
stat buf for those. (With some manual tweaking: an empty path
is implicitly root and thus a directory, and anything in which
we want to expand a symlink is treated as a directory.)
Signed-off-by: Seebs <seebs@seebs.net>
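The "appending always adds a slash" invariant from the commit above can be sketched roughly as follows. This is an illustration under stated assumptions, not pseudo's code: the name `path_append` is hypothetical, and the `cur_is_dir` argument stands in for whatever the real code reads from its shared stat buf.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not pseudo's real function.
 * Invariant: the partial path never carries a trailing slash; appending
 * an element is what adds the separator. cur_is_dir is an assumed
 * stand-in for the type recorded in the shared stat buf.
 * Returns the new length, or (size_t)-1 on failure. */
size_t path_append(char *buf, size_t bufsize, size_t len,
                   int cur_is_dir, const char *elem)
{
    size_t elen;

    assert(buf != NULL && elem != NULL);
    elen = strlen(elem);

    /* An empty partial path is implicitly root, hence a directory. */
    if (len == 0)
        cur_is_dir = 1;
    /* Only directories take children, so "regularfile/." is refused. */
    if (!cur_is_dir)
        return (size_t)-1;
    /* Need room for '/', the element, and the terminating NUL. */
    if (len + 1 + elen + 1 > bufsize)
        return (size_t)-1;

    buf[len++] = '/';
    memcpy(buf + len, elem, elen + 1);
    return len + elen;
}
```

Because the slash is added on append, a result such as "/usr/bin" never ends with a separator, and the caller decides from the file type whether anything may follow it.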
|
|
FASTOP is logically fine as long as there are no errors, but
if there could be errors, we lose error detection. Responding
instantly with a trivial ACK for FASTOP messages slightly
reduces performance but improves reliability and seems to work
better.
Signed-off-by: Seebs <seebs@seebs.net>
|
|
If you're running pseudo in docker, a script that creates a pseudo
daemon can exit, causing docker to kill pseudo before it's done writing
the database.
Since the client sending the shutdown request doesn't have its socket
closed explicitly by the server, we can just read from the socket in
the client to create a delay until the actual exit, which can take
a while if there's an in-memory DB.
Signed-off-by: Seebs <seebs@seebs.net>
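The delay described above relies on read() not reporting end-of-file until the other end of the socket goes away. A minimal sketch of that idea, assuming an already-connected descriptor; the helper name `wait_for_peer_exit` is hypothetical and this is not pseudo's client code:

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: block until the peer closes its end of fd.
 * read() returns 0 only at end-of-file, which for a stream socket means
 * the other side has closed it (explicitly, or by exiting). Any data
 * that arrives in the meantime is discarded.
 * Returns 0 on end-of-file, -1 on a real error. */
int wait_for_peer_exit(int fd)
{
    char scratch[64];
    ssize_t n;

    assert(fd >= 0);
    for (;;) {
        n = read(fd, scratch, sizeof scratch);
        if (n == 0)
            return 0;               /* peer is gone */
        if (n < 0 && errno != EINTR)
            return -1;              /* real error */
        /* n > 0 or EINTR: keep waiting */
    }
}
```

A client would call this right after sending its shutdown request, so that the calling script cannot finish before the server process has actually exited.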
|
|
Fix provided by Patrick Ohly <patrick.ohly@intel.com>. This resolves
the actual cause of the path length mismatches. It also explains what
I couldn't before: why the previous fix had only sometimes worked,
and why the problem showed up on directories but not plain files.
Signed-off-by: Seebs <seebs@seebs.net>
|
|
So we had this really strange problem where, sometimes but not always,
pseudo would have strange problems on startup, where the pseudo server
would end up running under pseudo. And this produced the most fascinating
thing, which was:
    unsetenv("LD_PRELOAD");
    assert(getenv("LD_PRELOAD") == NULL);
    for (int i = 0; environ[i]; ++i) {
        assert(strncmp(environ[i], "LD_PRELOAD=", 11));
    }
(pseudocode untested)
This would crash on the environ search. Because getenv() was not searching
environ.
WHAT.
So it turns out, *bash overrides getenv, setenv, and so on*. Under those
names. Hiding the glibc ones. And this creates horrible problems if you
assumed that your code could call those functions and expect them to work.
So as a workaround, pseudo now uses dlsym to find getenv, etc., from
glibc, and invokes those directly if possible. Also the client now uses
unwrapped fork/exec for spawning the server, which cleans up the
behavior of that code quite a bit.
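One way the dlsym workaround could look is sketched below. The details are assumptions: the commit does not say how the glibc handle is obtained, so this sketch opens "libc.so.6" by name, and the names `find_libc_getenv` and `real_getenv` are hypothetical. Older glibc versions may also need -ldl at link time.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdlib.h>

typedef char *(*getenv_fn)(const char *);

/* Hypothetical helper: fetch getenv from the C library itself, so that a
 * same-named function in the host program (as in bash) is bypassed.
 * Falls back to the ordinary getenv if the lookup fails. */
getenv_fn find_libc_getenv(void)
{
    void *libc = dlopen("libc.so.6", RTLD_LAZY);   /* assumed soname */
    getenv_fn fn = NULL;

    if (libc)
        fn = (getenv_fn)dlsym(libc, "getenv");
    return fn ? fn : getenv;
}

/* Look the function up once, then always call it directly. */
char *real_getenv(const char *name)
{
    static getenv_fn fn;

    assert(name != NULL);
    if (!fn)
        fn = find_libc_getenv();
    return fn(name);
}
```

The same pattern would apply to setenv, unsetenv, and the rest; only the symbol name and function type change.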
|
|
Improved/simplified logic for the client spawning servers, to make it
(I hope) easier to see what it's trying to do and when. Also clearer
diagnostics about what may have gone wrong, and I don't check the pid file
unless there's a problem.
|
|
Server process now waits for its forked child when daemonizing, allowing
us to yield meaningful exit status. Lock is now taken by the child, since
it has a way to tell the parent about the exit status. (We send SIGUSR1 to
the server to cause the wait loop to stop when the client is ready to go.)
This allows us to switch to fcntl locking, which should in theory allow us
to run with the pseudo directory NFS-mounted. Woot!
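For reference, taking an exclusive fcntl lock on a whole file looks roughly like this. It is a generic sketch of the POSIX call, not the server's locking code, and the helper name `try_lock_file` is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to take an exclusive lock on the whole file
 * without blocking. Returns 0 on success, -1 if another process holds a
 * conflicting lock or on any other error (see errno). */
int try_lock_file(int fd)
{
    struct flock lk = {0};

    assert(fd >= 0);
    lk.l_type = F_WRLCK;      /* exclusive (write) lock */
    lk.l_whence = SEEK_SET;
    lk.l_start = 0;
    lk.l_len = 0;             /* 0 means "through end of file" */
    return fcntl(fd, F_SETLK, &lk);
}
```

fcntl locks belong to the process that takes them and are not inherited across fork(), which fits the commit's note that the child is now the one taking the lock.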
Also mark a couple of overly spammy messages as PDBGF_VERBOSE to reduce the
volume of uninteresting dup spam when looking at client behaviors.
Client now uses execve to spawn server to work around a very strange behavior
of unsetenv.
Signed-off-by: Peter Seebach <peter.seebach@windriver.com>
|
|
This is the big overhaul to have the server provide meaningful exit status
to clients.
In the process, I discovered that the server was running with signals blocked
if launched by a client, which is not a good thing, and prevented this from
working as intended.
Still looking to see why more than one server spawn seems to happen.
|
|
Improve event logging a little bit more, increase default event log size,
reduce retries (we shouldn't need that many if nothing's wrong), and make
the server log timestamps during database cleanup, since I'm suspicious of
that as a possible source of delays. Also cause server to emit a useful exit
status if it can't get a lock, and client to check server exit status when
spawning server.
|
|
For debugging the client/server startup, add an event logger to allow
better recording of events that we may, or may not, want to dump out
listings of later.
|
|
First, if aborting, display message even when no debugging is set, because
that's probably a big deal.
Second, if you use "pseudo <cmd>", try to die with the same signal that killed
the child process, if it died from a signal rather than exiting cleanly. (You
can't just pass the exit status out in that case, because exit(N) doesn't work
for N outside the range of non-signal exit statuses.)
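A rough sketch of "die with the same signal" follows, assuming a status obtained from waitpid(). The function names are hypothetical and this is not the code from the commit.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the signal that killed the child, or 0 if it exited normally. */
int fatal_signal_of(int status)
{
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

/* Hypothetical helper: end this process the way the child ended.
 * Does not return. */
void exit_like_child(int status)
{
    int sig;

    assert(WIFEXITED(status) || WIFSIGNALED(status));
    sig = fatal_signal_of(status);
    if (sig) {
        signal(sig, SIG_DFL);   /* make sure the default action applies */
        raise(sig);             /* normally terminates us right here */
    }
    /* Clean exit, or the signal could not be delivered. */
    exit(WIFEXITED(status) ? WEXITSTATUS(status) : 1);
}
```

Re-raising the signal lets the caller's own waitpid() see WIFSIGNALED, which a plain exit(N) cannot reproduce.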
|
|
There's a possible race condition if multiple clients try to start while
the server's down, especially if it's shutting down and thus holding a lock
but ignoring them. Logic altered to retry more often, at greater intervals.
Also, we are fine with being unable to spawn the server, because that can
happen if another client spawned it successfully. So we just retry sending
the message in a bit if we couldn't spawn a server, or immediately if we
could. (Because "could" spawn a server includes successfully communicating
with the newly-spawned server; the server-side code makes sure that the
child process won't exit before we expect such attempts to work, even if
they take a while.)
|
|
Race conditions exist when the server shutdown takes long enough for
three attempts to access the server to fail. Solution: Add a slight
delay to the retry. Delay is variable (using getpid()%5). (Not "random"
because I have no evidence that the process the client is running in
will have seeded RNG, and I don't want to seed it and possibly screw
them up).
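The pid-based delay can be sketched in a few lines. Only the getpid()%5 term comes from the commit; the base value, the unit, and the name `retry_delay` are assumptions made for illustration.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: vary the retry delay between clients using the
 * pid instead of rand(), so the host program's RNG state is never
 * touched. "base" and the unit of the result are assumed. */
unsigned int retry_delay(pid_t pid, unsigned int base)
{
    assert(pid >= 0);
    return base + (unsigned int)(pid % 5);
}
```

A client would call retry_delay(getpid(), base) before each retry; clients with different pids mostly end up with different delays, which is all the commit asks for.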
|
|
In some cases, there can be a race with multiple clients trying to
start a server at once, and they should just retry their messages,
rather than aborting. I haven't been able to consistently reproduce
this, so it's not very well tested, but it seems reasonable.
|
|
Add some debug messages useful for tracking down xattr behaviors.
Signed-off-by: Peter Seebach <peter.seebach@windriver.com>
|
|
When xattr emulation is used to store extended attributes, dummy
entries get made in the db using whatever UID/GID were in the real
stat buffer if no entry already existed. Change these to -1, and
treat -1 uid/gid as a missing entry for stat purposes.
xattrdb was not merging existing uid/gid values. Change this by
loading existing values to merge them in when executing chown/chmod
commands.
Newly-created files could end up with a filesystem mode of 0 if
you used umask, but this breaks xattrdb.
Signed-off-by: Peter Seebach <peter.seebach@windriver.com>
|
|
Dropping the alloc from file paths meant that pseudo_exec_path
could end up just returning its original argument, which was
const-qualified, meaning its return should also be const-qualified.
Signed-off-by: Peter Seebach <peter.seebach@windriver.com>
|
|
OP_OPEN and OP_EXEC are used only when logging. The server can now
tell the client (in response to initial ping) whether or not it is
logging, and if it isn't, the client doesn't send those messages.
Signed-off-by: Peter Seebach <peter.seebach@windriver.com>
|
|
This is a moderately experimental feature which stores values in an
extended attribute called 'user.pseudo_data' instead of in the database.
Still missing: Database<->filesystem synchronization for this.
For at least some workloads, this can dramatically improve performance.
Signed-off-by: Peter Seebach <peter.seebach@windriver.com>
|
|
Instead of allocating (and then freeing) these paths all the time,
use a rotating selection of buffers of fixed but probably large enough
size (the same size that would have been the maximum anyway in
general). With the exception of fts_open, there's no likely way to
end up needing more than two or three such paths at a time. fts_open
dups the paths since it could have a large number and need them for
a while. This dramatically reduces (in principle) the amount of allocation
and especially reallocation going on.
Signed-off-by: Peter Seebach <peter.seebach@windriver.com>
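The rotating-buffer idea above can be sketched as below. The buffer count, the buffer size, and both function names are assumptions; the commit only says "a rotating selection of buffers of fixed but probably large enough size".

```c
#include <assert.h>
#include <stdio.h>

#define PATH_BUF_COUNT 4      /* assumed count */
#define PATH_BUF_SIZE  4096   /* assumed "large enough" size */

/* Hypothetical helper: hand out the next static buffer in round-robin
 * order. A returned pointer stays valid until PATH_BUF_COUNT further
 * calls have been made. Not thread-safe as written. */
char *next_path_buf(void)
{
    static char bufs[PATH_BUF_COUNT][PATH_BUF_SIZE];
    static unsigned int cur;
    char *b = bufs[cur];

    cur = (cur + 1) % PATH_BUF_COUNT;
    b[0] = '\0';
    return b;
}

/* Build "dir/name" in a rotating buffer; NULL if it does not fit. */
const char *join_path(const char *dir, const char *name)
{
    char *b;
    int n;

    assert(dir != NULL && name != NULL);
    b = next_path_buf();
    n = snprintf(b, PATH_BUF_SIZE, "%s/%s", dir, name);
    return (n < 0 || n >= PATH_BUF_SIZE) ? NULL : b;
}
```

Callers that need a path for longer than a few calls must copy it, which is the reason the commit gives for fts_open duplicating its paths.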
|
|
A partially-implemented profiler for client time, which basically just
inserts (optional) gettimeofday calls in various places and stashes data
in a flat file containing one data block per pid.
Signed-off-by: Peter Seebach <peter.seebach@windriver.com>
|
|
Having the same logic twice was sorta bugging me. Now the
function-like-macro is sorta bugging me, and I'll just let
it.
|
|
This is derived in significant part from contributions to oe-core
by Peter A. Bigot. I reworked the path routine a bit to use an
already duplicated string instead of allocating copies of parts of
it.
The first issue was just that there was a missing antimagic() around
some of the path operations. The second is that we wanted to have
a way to provide a fallback password file which isn't the host's,
but which can be used in the case where the target filesystem hasn't
got a password file yet, for bootstrapping purposes. (So there's a minimal
password file that just has root, basically.)
Also, I noticed a design flaw, which is that if you ended up
calling pseudo_pwd_lck_open() twice in a row, the second time
through, pseudo would first check whether it had a path name
for the file (it does), and thus not allocate one, then call
the close routine (which frees it and nulls the pointer), then
open a new one... and not have a file name, so the next attempt
to close it wouldn't unlink the file. This shouldn't ever
come up in real code, but it was bugging me.
Signed-off-by: Peter Seebach <peter.seebach@windriver.com>
|
|
|
|
We used to rely on filesystem operations to apply the umask when
appropriate, but when we started masking out 022, that stopped working.
Start watching umask.
|
|
Wait until the server has finished processing all of our messages
before exiting. Otherwise, it's possible for a command which sends
a no-response message and then exits to be followed by another
command which assumes the first one's done, and the second command's
messages can get processed first.
|
|
The xattr first-pass implementation was allocating a buffer to
hold the name and value for a set operation, then pseudo_client was
allocating *another* buffer to hold the path and those two values.
pseudo_client_op develops more nuanced argument handling, and also
uses a static buffer for the extended paths it sometimes needs. So
for the typical use case, only occasional operations will need to
reallocate/expand the buffer, and we'll be down to copying things
into that buffer once per operation, instead of having two alloc/free
pairs and two copies.
And of course, that wasn't two alloc/free pairs, it was one alloc/free
pair and one alloc without a free. Whoops.
Signed-off-by: Peter Seebach <peter.seebach@windriver.com>
|
|
Issue #1: If an operation came in for an item with no path
provided by the wrapper, the client would not construct the
combined "path" value. Fixed, and missing paths are now
consistently handled as 0-byte paths.
Issue #2: The database code was assuming the values were
strings, and ignoring a specified length.
Issue #3: The computation of the length of the stored value
was off by one, because it was including the extra terminating
null the client added in case the value was a path.
With this in place, "cp -a" on CentOS is consistently
duplicating the system.posix_acl_access fields as expected,
but unfortunately not handling their permissions too.
(Intent is to translate a system.posix_acl_access setxattr
into corresponding permissions whenever possible.)
Signed-off-by: Peter Seebach <peter.seebach@windriver.com>
|
|
Initial, incomplete, support for extended attributes. Extended
attributes are implemented fairly naively, using a second table
in the file database using the primary file table's id as a
foreign key. The ON DELETE CASCADE behavior requires sqlite 3.6.19
or later with foreign key and trigger support compiled in.
To reduce round-trips, the client does not check for existing
attributes, but rather, sends three distinct set messages;
OP_SET_XATTR, OP_CREATE_XATTR, OP_REPLACE_XATTR. A SET message
always succeeds, a CREATE fails if the attribute already
exists, and a REPLACE fails if the attribute does not already
exist.
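The three set semantics can be illustrated with a small in-memory table. This is a sketch, not the sqlite-backed implementation: the enum names stand in for OP_SET_XATTR, OP_CREATE_XATTR, and OP_REPLACE_XATTR, the error codes are borrowed from setxattr(2) as an assumption, and values are plain C strings here for brevity.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Stand-ins for OP_SET_XATTR, OP_CREATE_XATTR, OP_REPLACE_XATTR. */
enum xattr_op { XOP_SET, XOP_CREATE, XOP_REPLACE };

#define MAX_ATTRS 8
struct attr { char name[64]; char value[64]; int used; };
static struct attr xattr_table[MAX_ATTRS];

/* Returns 0 on success or an errno-style code. SET succeeds whether or
 * not the attribute exists (apart from this sketch's size limits),
 * CREATE fails if it exists, REPLACE fails if it does not. */
int xattr_store(enum xattr_op op, const char *name, const char *value)
{
    struct attr *found = NULL, *free_slot = NULL;
    int i;

    assert(name != NULL && value != NULL);
    if (strlen(name) >= sizeof xattr_table[0].name ||
        strlen(value) >= sizeof xattr_table[0].value)
        return ERANGE;
    for (i = 0; i < MAX_ATTRS; i++) {
        if (xattr_table[i].used && strcmp(xattr_table[i].name, name) == 0)
            found = &xattr_table[i];
        else if (!xattr_table[i].used && !free_slot)
            free_slot = &xattr_table[i];
    }
    if (op == XOP_CREATE && found)
        return EEXIST;
    if (op == XOP_REPLACE && !found)
        return ENODATA;
    if (!found) {
        if (!free_slot)
            return ENOSPC;
        found = free_slot;
        strcpy(found->name, name);
        found->used = 1;
    }
    strcpy(found->value, value);
    return 0;
}
```

Sending the intent as part of the operation is what saves the round-trip: the existence check happens on the storing side in the same step as the write.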
The /* flags */ feature of makewrappers is used to correct
path names appropriately, so all functions are already working
with complete paths, and can always use functions that work
on links; if they were supposed to dereference, the path
fixup code got that.
The xattr support is enabled, for now, conditional on
whether getfattr --help succeeds.
Not yet implemented: Translation for system.posix_acl_access,
which is used by "cp -a" (or "cp --preserve=all") on some
systems to try to copy modes.
Signed-off-by: Peter Seebach <peter.seebach@windriver.com>
|
|
In some cases, we'd rather pseudo fail than fall back to using
/etc/passwd or /etc/group. Make the determination of what to fall
back to when neither PSEUDO_PASSWD nor a chroot directory contains
passwd/group files controllable by a configure-time flag, controlled
by --with-passwd-fallback= or --without-passwd-fallback.
|
|
|
|
This is a moderately intrusive change. The basic overall effect:
Debugging messages are now controlled, not by a numeric "level",
but by a series of flags, which are expressed as a string of
letters. Each flag has a single-letter form used for string
specifications, a name, a description, a numeric value (1 through N),
and a flag value (which is 1 << the numeric value). (This does mean
that no flag has the value 1, so we only have 31 bits available.
Tiny violins play.)
The other significant change is that the pseudo_debug calls
are now implemented with a do/while macro containing a conditional,
so that computationally-expensive arguments are never evaluated
if the corresponding debug flags weren't set. The assumption is
that in the vast majority of cases (specifically, all of them
so far) the debug flags for a given call are a compile-time constant,
so the nested conditional will never actually show up in code
when compiled with optimization; we'll just see the appropriate
conditional test.
The VERBOSE flag is magical, in that if the VERBOSE flag is
used in a message, the debug flags have to have both VERBOSE and
at least one other flag for the call to be made.
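A compact sketch of the flag scheme and the conditional macro follows. The flag names, numbers, and macro names are made up for illustration and do not match pseudo's real table; the VERBOSE rule is implemented as read from the description above (VERBOSE plus at least one of the message's other flags must be enabled).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical flags. Numeric values start at 1 and the flag value is
 * 1 << number, so bit 0 is never used. */
#define DBG_VERBOSE (1u << 1)
#define DBG_FILE    (1u << 2)
#define DBG_IPC     (1u << 3)

static unsigned int debug_flags;   /* flags currently enabled */

void set_debug_flags(unsigned int flags)
{
    assert((flags & 1u) == 0);     /* no flag has the value 1 */
    debug_flags = flags;
}

/* Nonzero if a message tagged with flags x should be emitted. */
#define DEBUG_ENABLED(x) \
    (((x) & DBG_VERBOSE) \
        ? ((debug_flags & DBG_VERBOSE) && \
           (debug_flags & (x) & ~DBG_VERBOSE)) \
        : (debug_flags & (x)))

/* The arguments are only evaluated when the test passes, so an
 * expensive argument costs nothing while its flags are off. */
#define DEBUG_MSG(x, ...) \
    do { \
        if (DEBUG_ENABLED(x)) \
            fprintf(stderr, __VA_ARGS__); \
    } while (0)
```

When x is a compile-time constant, an optimizing compiler can resolve the VERBOSE branch at build time and leave a single test of debug_flags at each call site.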
This should dramatically improve performance for a lot of cases
without as much need for PSEUDO_NDEBUG, and improve the ability of
users to get coherent debugging output that means something and is
relevant to a given case.
It's also intended to set the stage for future development work
involving improving the clarity and legibility of pseudo's diagnostic
messages in general.
Old things which used numeric values for PSEUDO_DEBUG will sort
of continue to work, though they will almost always be less verbose
than they used to. There should probably be a pass through adding
"| PDBGF_CONSISTENCY" to a lot of the messages that are specific
to some other type.
|
|
|
|
Some filesystems have buggy semantics where stat(2) will return incorrect
sizes for files for a while after some changes, sometimes, unless they've
been fsync'd. We still want to disable fsync most of the time, but enabling
it for specific programs can be useful.
Signed-off-by: Peter Seebach <peter.seebach@windriver.com>
|
|
Most pseudo operations don't actually USE the server's response. So
why wait for a response?
This patch introduces a new message type, PSEUDO_MSG_FASTOP. It
also tags pseudo operation types with whether or not they need to
give a response. This requires updates to maketables to allow non-string
types for additional columns, and the addition of some quotes to the
SQL query enums/query_type.in table.
A few routines are altered to change their behavior and whether or not
they perform a stat operation. The only operations that do wait are
OP_FSTAT, OP_STAT, OP_MKNOD, and OP_MAY_UNLINK. Rationale:
You can't query the server for replacement information and not wait for
it. Makes no sense.
There's extra checking in mknod, because we really do want to fail out
if we couldn't do that -- that implies that we haven't created a thing
that will look like a node.
The result from OP_MAY_UNLINK is checked because it's used to determine
whether we need to send a DID_UNLINK or CANCEL_UNLINK. It might be cheaper
to send two messages without waiting than to send one, wait, and maybe
send another, but I don't want to send invalid messages.
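Tagging operation types with whether they need a response could be as simple as a lookup table. In this sketch the enum is a small subset built from names mentioned in these commit messages; the numeric values, the table layout, and the function name are assumptions, and the real table is generated by maketables.

```c
#include <assert.h>

/* Subset of operations for illustration; values are made up. */
enum sketch_op {
    OP_STAT, OP_FSTAT, OP_MKNOD, OP_MAY_UNLINK,
    OP_DID_UNLINK, OP_CANCEL_UNLINK, OP_OPEN, OP_EXEC,
    OP_COUNT
};

/* 1 = the client must wait for the server's reply. */
static const unsigned char op_waits[OP_COUNT] = {
    [OP_STAT] = 1,
    [OP_FSTAT] = 1,
    [OP_MKNOD] = 1,
    [OP_MAY_UNLINK] = 1,
    /* everything else defaults to 0: send and move on */
};

int op_needs_response(enum sketch_op op)
{
    assert((int)op >= 0 && (int)op < OP_COUNT);
    return op_waits[op];
}
```

The client would consult op_needs_response() to choose between an ordinary message and a PSEUDO_MSG_FASTOP one.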
This is highly experimental.
|
|
There were a couple of cases where pseudo built against GLIBC_2.7 or
newer was ending up with dependencies on symbols which required
GLIBC_2.7. With these gone, it appears that a libpseudo.so can be
used on an older host in some cases. None were particularly important
or intentional:
1. pseudo_util was conditionally calling open() with only two arguments,
which can invoke a new __open2() function in some systems. Don't care,
and the docs specifically state that the mode argument is "ignored" when
O_CREAT is absent, so it's not necessary to omit it.
2. The calls to sscanf/fscanf in pseudo_client.c were getting translated
into a special new iso_c99 sscanf/fscanf, and we don't care because we're
not using those features; #define _GNU_SOURCE suppresses the extra-compliant
behavior.
Signed-off-by: seebs <peter.seebach@windriver.com>
|
|
The _plain thing was added because of clashes between Linux
("struct stat64 for 64-bit file sizes") and Darwin ("struct stat
is already 64 bits"). But it turns out not to be enough,
because stat will *fail* if it cannot represent a file size,
so when something like unlinkat() calls a non-64-bit stat in
order to determine whether a file exists, it gets the wrong
answer if the file is over 2GB in size.
Solution: Continue using PSEUDO_STATBUF, and also provide
defines for base_stat() which can be either real_stat() or
real_stat64(), etcetera.
This eliminates any reason to need the _plain functions. It
also suggests that the other real___fxstatat() calls should
someday go away because that is an ugly, ugly, implementation
detail.
As part of testing this, fix up some bitrot which affected
Darwin (such as the continue outside of a loop, but inside
an #ifdef; that was left over from the conversion of
init_one_wrapper to a separate function).
|
|
1. Fix *at() where dirfd is obtained through dirfd(DIR *).
The dirfd(DIR *) interface allows you to get the fd for a DIR *,
meaning you can use it with openat(), meaning you can need its
path. This causes a segfault. Also fixed the base_path
code not to segfault in that case, but first fix the
underlying problem.
2. Implement renameat()
After three long years, someone tried to use this. This was impossibly
hard back when pseudo was written, because there was only one dirfd
provided for. Thing is, now, the canonicalization happens in wrapfuncs,
so a small tweak to makewrappers to recognize that oldpath should use
olddirfd if it exists is enough to get us fully canonicalized paths
when needed.
|
|
2011-11-01:
* (mhatle) Stop valgrind from reporting use of uninitialized
memory from pseudo_client:client_ping()
Signed-off-by: Mark Hatle <mark.hatle@windriver.com>
|
|
Change from internal PSEUDO_RELOADED to external PSEUDO_UNLOAD environment
variable. Enable external programs to have a safe and reliable way to unload
pseudo on the next exec*. PSEUDO_UNLOAD also will disable pseudo if we're in a
fork/clone situation in the same way PSEUDO_DISABLED=1 would.
Rename the PSEUDO_DISABLED tests, and create a similar set for the new
PSEUDO_UNLOAD.
Signed-off-by: Mark Hatle <mark.hatle@windriver.com>
|
|
|
|
|
|
This is a spiffied-up rebase of a bunch of intermediate changes, presented
as a whole because it is, surprisingly, less confusing that way. The basic
idea is to separate the guts code into categories ranging from generic
stuff that can be the same everywhere to specific variants. The big scary
one is the Darwin support, which actually seems to run okay on 64-bit OS X
10.6. (No other variants were tested.) The other example given is support
for the old clone() syscall on RHEL 4, which affects some wrlinux use cases.
There are a few minor cleanup bits here, such as a function with inconsistent
calling conventions, but nothing really exciting.
|
|
|
|
directly rather than via an on-demand spawn from the client, the
directory is never created.
|
|
This is fussy, because we have to actually do the path search ourselves
as best we can to handle unqualified paths. The result, though, is
more meaningful logs.
Along the way, fix some bitrot in the comments in pseudo_fix_path and
friends.
|
|
It'd be handy for the WR build system if new state directories could
be created as needed. It is made so. And to answer the first
question everyone, including me, has on reading this: You can't
do system("mkdir -p ...") because the invoked shell would need to
run under pseudo, so it'd have to check for a server, and...
|
|
When pseudo is disabled, we skip a bunch of the prefix, localstate, etc
processing. This allows pseudo to run with a directory that does not yet
exist.
Signed-off-by: Mark Hatle <mark.hatle@windriver.com>
|
|
2010-12-09:
* (mhatle) Add doc/program_flow to attempt to explain startup/running
* (mhatle) guts/* minor cleanup
* (mhatle) Reorganize into a new constructor for libpseudo ONLY
pseudo main() now manually calls the util init
new / revised init for client, wrappers and utils
* (mhatle) Add central "reinit" function
* (mhatle) Add manual execv* functions
* (mhatle) rename pseudo_populate_wrappers to pseudo_check_wrappers
Signed-off-by: Mark Hatle <mark.hatle@windriver.com>
|
|
* (mhatle) Add guts/clone.c to cleanup the clone support
* (mhatle) guts/clone.c only run setupenv and reinit when NOT PSEUDO_RELOADED
* (mhatle) guts/execve.c whitespace fix
* (mhatle) guts/fork.c similar to guts/clone.c change
* (mhatle) pseudo_client.c add reinit function
* (mhatle) pseudo_client.c revise client reset, include code from pseudo_wrappers.c
* (mhatle) pseudo_server.c move the pid writing to the parent
* (mhatle) pseudo_wrappers.c clone cleanup and populate cleanup
Signed-off-by: Mark Hatle <mark.hatle@windriver.com>
|