Age | Commit message (Collapse) | Author |
|
The previous commit had the logic reversed, fix it.
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
In rare cases we see failures, often in linux-libc-headers for things like:
| INSTALL /XXX/linux-libc-headers/6.1-r0/image/usr/include
| abort()ing pseudo client by server request. See https://wiki.yoctoproject.org/wiki/Pseudo_Abort for more details on this.
Pseudo log:
path mismatch [2 links]: ino 46662476 db 'NAMELESS FILE' req '/XXX/linux-libc-headers/6.1-r0/image/usr'.
Setup complete, sending SIGUSR1 to pid 3630890.
Whilst this doesn't easily reproduce, the issue is that multiple different processes are
likely working on the directory and the creation in pseudo might not match accesses
made by other processes.
Ultimately, the "NAMELESS FILE" is harmless and pseudo will reconcile things
so rather than error out, we should ignore this case.
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
We're seeing failures where a file is renamed under pseudo but an
access appears to be made to the old filename before the OP_RENAME has
hit the database but after the real_rename has applied in the kernel.
This is effectively the MAY_UNLINK problem for the original filename. There
were protections for the newpath but not the oldpath.
To try and avoid these aborts(), mark the original path as MAY_UNLINK however
we need to be careful not to trigger the DID_UNLINK but instead update
the deleting entry in the database as the rename completes. To do this,
we no clear the deleting flag during the database rename operation in SQL.
Also, we have to stop removing the by_ino matches where by_ino.deleting is
set, else we may lose the renamed file's attributes that way.
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
pseudo_get_value() returns newly allocated memory that the caller must free,
so add some free() calls.
Signed-off-by: Ross Burton <ross.burton@arm.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
Slightly alter a fallthrough comment so that GCC recognises it, and
add a default: case to a switch which explicitly only handles a few
values.
Signed-off-by: Ross Burton <ross.burton@arm.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
Some operations may call unlink() on an open fd, then call fchown/fchmod/fstat
on that fd. This would currently readd its entry to the database, which
is necessary to preserve its permissions information however since that
file will be lost when it is closed, we don't want the DB entry to persist.
Marking it as may_unlink means the code will know its likely been deleted
and ignore the entry later, giving improved behaviour that simple path
mismatch warnings. We can use an nlink of zero to detect this.
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
Rather than mapping mismatched inode entries to paths, thrown an abort()
instead. Add a new result type to allow the server to pass back
this instruction to the client.
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
When we see cases where the inode no longer matches the file path, pseudo
notices but currently reuses the database entry. This can happen where for
example, a file is deleted and a new file created outside of pseudo where
the inode number is reused.
Change this to ignore the likely stale database entry instead. We're
seeing bugs where inode reuse for deleted files causes permission corruption.
(See bug #14057 for example). We don't want to delete the database entry
as the permissions may need to be applied to that file (and testing shows
we do need the path matching code which handles that).
I appreciate this should never happen under the original design of pseudo
where all file accesses are monitored by pseudo. The reality is to do that,
we'd have to run pseudo:
a) for all tasks
b) as one pseudo database for all of TMPDIR
Neither of these is realistically possible for performance reasons.
I believe pseudo to be much better at catching all accesses than it
might once have been. As such, these "fixups" are in the cases I've
seen in the logs, always incorrect.
It therefore makes more sense to ignore the database data rather than
corrupt the file permissions or worse. Looking at the pseudo logs
in my heavily reused build directories, the number of these
errors is staggering. This issue would explain many weird bugs we've
seen over the years.
There is a risk that we could not map permissions in some case where
we currently would. I have not seen evidence of this in any logs I've
read though. This change, whilst going against the original design,
is in my view the safer option for the project at this point given we
don't use pseudo as originally designed and never will be able to.
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
If the user decides to fix a database, remove the files that do not
exist anymore.
If only DB test is selected do not change the behaviour (return error).
Signed-off-by: Ricardo Ribalda <ricardo@ribalda.com>
Upstream-Status: Submitted [https://lists.openembedded.org/g/openembedded-core/message/137045]
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
This adds SPDX license headers to all source files in pseudo so license
identification models current best practise.
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
We've had really weird sporadic things, but it looks like the
underlying problem is that something's getting symlinked over
(which is clearly impossible), but we're nuking everything with
that inode, rather than just the entry for that path, and this
is causing us to drop ownership info for an existing thing.
|
|
|
|
Stop producing the path mismatch messages for things with multiple
links. We could probably make this smarter, but for now this is
sufficient to reduce the spam.
pseudo_debug(0, ...) now produces a message, as a special case.
Signed-off-by: Seebs <seebs@seebs.net>
|
|
When OP_CREAT comes in, if there's an existing entry with the
same inode, we nuke that entry because CREAT implies a new file
got that inode, and if there's an existing entry with the same
path, we don't create a link, because there's already something
there.
So if there's a single entry with the same path and inode, it
gets deleted and not recreated. Oops.
Now distinguishing between a single exact match of both path
and inode, and distinct matches by path and by inode.
This addresses a use case where parallel creations could
in theory cause pseudo to mistakenly drop database entries,
potentially. (I could only trigger it by doing this to files
which had existing entries, and then deleting the files outside
of pseudo, though.)
Signed-off-by: seebs <seebs@seebs.net>
|
|
If you're running pseudo in docker, a script that creates a pseudo
daemon can exit, causing docker to kill pseudo before it's done writing
the database.
Since the client sending the shutdown request doesn't have its socket
closed explicitly by the server, we can just read from the socket in
the client to create a delay until the actual exit, which can take
a while if there's an in-memory DB.
Signed-off-by: Seebs <seebs@seebs.net>
|
|
This reverts commit d5a6b5e2d23288452b374a1ffe32b6b3e52c13fc.
(Because it turns out I was wrong.)
|
|
For some reason, extended attributes were getting garbage
appended in some cases. Attempt to fix this.
Signed-off-by: Seebs <seebs@seebs.net>
|
|
When deleting files, we *do* know the inode and attribute, most of the
time, so we pass those in whenever possible. The full purge of unmatched
xattrs should not happen when the correct dev/ino are believed to be known.
Signed-off-by: Seebs <seebs@seebs.net>
|
|
extended attributes are a property of inodes, not paths. There can be
multiple file database entries for a single inode, so switch to using
inodes rather than paths.
Still to-do: Delete them when deleting the last file with a given device
and inode.
|
|
The variable name is required but wasn't being extracted from the client's
message, resulting in xattr removal never working. This does not fully
address some deeper problems with the xattr implementation, but at least
the common removal case works.
|
|
Server process now waits for its forked child when daemonizing, allowing
us to yield meaningful exit status. Lock is now taken by the child, since
it has a way to tell the parent about the exit status. (We send SIGUSR1 to
the server to cause the wait loop to stop when the client is ready to go.)
This allows us to switch to fcntl locking, which should in theory allow us
to run with the pseudo directory NFS-mounted. Woot!
Also mark a couple of overly spammy messages as PDBGF_VERBOSE to reduce the
volume of uninteresting dup spam when looking at client behaviors.
Client now uses execve to spawn server to work around a very strange behavior
of unsetenv.
Signed-off-by: Peter Seebach <peter.seebach@windriver.com>
|
|
This is the big overhaul to have the server provide meaningful exit status
to clients.
In the process, I discovered that the server was running with signals blocked
if launched by a client, which is not a good thing, and prevented this from
working as intended.
Still looking to see why more than one server spawn seems to happen.
|
|
Improve event logging a little bit more, increase default event log size,
reduce retries (we shouldn't need that many if nothing's wrong), and make
the server log timestamps during database cleanup, since I'm suspicious of
that as a possible source of delays. Also cause server to emit a useful exit
status if it can't get a lock, and client to check server exit status when
spawning server.
|
|
First, if aborting, display message even when no debugging is set, because
that's probably a big deal.
Second, if you use "pseudo <cmd>", try to die with the same signal that killed
the child process, if it died from a signal rather than exiting cleanly. (You
can't just pass the exit status out in that case, because exit(N) doesn't work
for N outside the range of non-signal exit statuses.)
|
|
The automatic shutdown immediately after running a command seems to
be causing more problems than it's worth, so now it requires an explicit
-S.
|
|
mknod(2) automatically defaults to S_IFREG if not given an explicit
file type, so pseudo should too. Otherwise, GNU tar can (for some
reason, it mostly does this when extracting xattrs?) invoke mknod
instead of open with O_CREAT to create a file, and just provide the
permission bits, and pseudo creates a "weird file" with no type bits
in the database, which is unhelpful.
Signed-off-by: Peter Seebach <peter.seebach@windriver.com>
|
|
gcc is better about warnings and spotted variables being assigned but
not used. Clever gcc. Cleaned up the old bits.
Signed-off-by: Peter Seebach <peter.seebach@windriver.com>
|
|
Add some debug messages useful for tracking down xattr behaviors.
Signed-off-by: Peter Seebach <peter.seebach@windriver.com>
|
|
When setting an extended attribute using the database, we create a
dummy entry for the file (so there will be a file row corresponding
to that path name for later lookups). But this entry was coming in
with host UID/GID values in some cases. Instead, use -1 uid/gid,
and have STAT report those as failures rather than as existing
values. (Other cases should not be copying them. I think.)
Signed-off-by: Peter Seebach <peter.seebach@windriver.com>
|
|
OP_OPEN and OP_EXEC are used only when logging. The server can now
tell the client (in response to initial ping) whether or not it is
logging, and if it isn't, the client doesn't send those messages.
Signed-off-by: Peter Seebach <peter.seebach@windriver.com>
|
|
Instead of allocating (and then freeing) these paths all the time,
use a rotating selection of buffers of fixed but probably large enough
size (the same size that would have been the maximum anyway in
general). With the exception of fts_open, there's no likely way to
end up needing more than two or three such paths at a time. fts_open
dups the paths since it could have a large number and need them for
a while. This dramatically reduces (in principle) the amount of allocation
and especially reallocation going on.
Signed-off-by: Peter Seebach <peter.seebach@windriver.com>
|
|
In the case where an "oldpath" is actually the data for an xattr
operation, don't truncate it. Trailing slashes should only be removed
from things which are actually filenames.
|
|
There was supposed to be a check for filenames showing up
with a trailing slash when the file was not a directory. What
actually made it in was a check for a mismatch between "is
a directory" and "has trailing slash", which produced spurious
messages saying the path had a trailing slash whenever a
directory path did *not* have a trailing slash. But that's
valid and should not produce diagnostics. Let alone thousands
of diagnostics.
Signed-off-by: Peter Seebach <peter.seebach@windriver.com>
|
|
It turns out that "a/" is equivalent to "a/.", and that in particular
it should fail when a is not a directory. Pseudo's been silently stripping
them and this breaks things. Attempt to fix that, lightly tested.
|
|
Trying to track down problems which sometimes result in files showing
up as nameless files, producing clashes later. Looks like there were two
issues; one is we were creating links for files that we'd already
found by inode. The other is that rename was sending bogus LINK messages
in some cases. Also simplified the find_file_dev path to extract the
path as part of the initial operation, since there wasn't any case where
that wasn't being done immediately afterwards.
|
|
So it turns out that the sanity checks should be skipped on did_unlink,
because otherwise if an inode gets reused for a different file type,
it'll get nuked. This is pretty rare, but appears to bite us occasionally
during debug stripping.
|
|
It turns out that, in the fairly common case where the did-unlink stuff
has saved us from worse problems, pseudo produces probably-spurious
error messages about the path mismatch when the did-unlink shows up.
Change that into a debug message. Also fix a typo in a comment.
|
|
More complicated, because we actually need to make com.apple stuff work
probably.
|
|
Issue #1: If an operation came in for an item with no path
provided by the wrapper, the client would not construct the
combined "path" value. Fixed, and missing paths are now
consistently handled as 0-byte paths.
Issue #2: The database code was assuming the values were
strings, and ignoring a specified length.
Issue #3: The computation of the length of the stored value
was off by one, because it was including the extra terminating
null the client added in case the value was a path.
With this in place, "cp -a" on CentOS is consistently
duplicating the system.posix_acl_access fields as expected,
but unfortunately not handling their permissions too.
(Intent is to translate a system.posix_acl_access setxattr
into corresponding permissions whenever possible.)
Signed-off-by: Peter Seebach <peter.seebach@windriver.com>
|
|
Initial, incomplete, support for extended attributes. Extended
attributes are implemented fairly naively, using a second table
in the file database using the primary file table's id as a
foreign key. The ON DELETE CASCADE behavior requires sqlite 3.6.19
or later with foreign key and trigger support compiled in.
To reduce round-trips, the client does not check for existing
attributes, but rather, sends three distinct set messages;
OP_SET_XATTR, OP_CREATE_XATTR, OP_REPLACE_XATTR. A SET message
always succeeds, a CREATE fails if the attribute already
exists, and a REPLACE fails if the attribute does not already
exist.
The /* flags */ feature of makewrappers is used to correct
path names appropriately, so all functions are already working
with complete paths, and can always use functions that work
on links; if they were supposed to dereference, the path
fixup code got that.
The xattr support is enabled, for now, conditional on
whether getfattr --help succeeds.
Not yet implemented: Translation for system.posix_acl_access,
which is used by "cp -a" (or "cp --preserve-all") on some
systems to try to copy modes.
Signed-off-by: Peter Seebach <peter.seebach@windriver.com>
|
|
When invoked as a non-daemon, either with a command or to get a
shell, shut the server down automatically. This is currently
implemented by sending a shutdown request; I also considered
adding an environment flag for "shut down when there's no clients",
but this was simpler.
Main reason this is useful: In development, it's often the case
that I want to run a single command under pseudo, then check
the database directly with sqlite.
Signed-off-by: Peter Seebach <peter.seebach@windriver.com>
|
|
This is a moderately intrusive change. The basic overall effect:
Debugging messages are now controlled, not by a numeric "level",
but by a series of flags, which are expressed as a string of
letters. Each flag has a single-letter form used for string
specifications, a name, a description, a numeric value (1 through N),
and a flag value (which is 1 << the numeric value). (This does mean
that no flag has the value 1, so we only have 31 bits available.
Tiny violins play.)
The other significant change is that the pseudo_debug calls
are now implemented with a do/while macro containing a conditional,
so that computationally-expensive arguments are never evaluated
if the corresponding debug flags weren't set. The assumption is
that in the vast majority of cases (specifically, all of them
so far) the debug flags for a given call are a compile-time constant,
so the nested conditional will never actually show up in code
when compiled with optimization; we'll just see the appropriate
conditional test.
The VERBOSE flag is magical, in that if the VERBOSE flag is
used in a message, the debug flags have to have both VERBOSE and
at least one other flag for the call to be made.
This should dramatically improve performance for a lot of cases
without as much need for PSEUDO_NDEBUG, and improve the ability of
users to get coherent debugging output that means something and is
relevant to a given case.
It's also intended to set the stage for future development work
involving improving the clarity and legibility of pseudo's diagnostic
messages in general.
Old things which used numeric values for PSEUDO_DEBUG will sort
of continue to work, though they will almost always be less verbose
than they used to. There should probably be a pass through adding
"| PDBGF_CONSISTENCY" to a lot of the messages that are specific
to some other type.
|
|
Most pseudo operations don't actually USE the server's response. So
why wait for a response?
This patch introduces a new message type, PSEUDO_MSG_FASTOP. It
also tags pseudo operation types with whether or not they need to
give a response. This requires updates to maketables to allow non-string
types for additional columns, and the addition of some quotes to the
SQL query enums/query_type.in table.
A few routines are altered to change their behavior and whether or not
they perform a stat operation. The only operations that do wait are
OP_FSTAT and OP_STAT, OP_MKNOD, and OP_MAY_UNLINK. Rationale:
You can't query the server for replacement information and not wait for
it. Makes no sense.
There's extra checking in mknod, because we really do want to fail out
if we couldn't do that -- that implies that we haven't created a thing
that will look like a node.
The result from OP_MAY_UNLINK is checked because it's used to determine
whether we need to send a DID_UNLINK or CANCEL_UNLINK. It might be cheaper
to send two messages without waiting than to send one, wait, and maybe
send another, but I don't want to send invalid messages.
This is highly experimental.
|
|
Change from internal PSEUDO_RELOADED to external PSEUDO_UNLOAD environment
variable. Enable external programs to have a safe and reliable way to unload
pseudo on the next exec*. PSEUDO_UNLOAD also will disable pseudo if we're in a
fork/clone situation in the same way PSEUDO_DISABLED=1 would.
Rename the PSEUDO_DISABLED tests, and create a similar set for the new
PSEUDO_UNLOAD.
Signed-off-by: Mark Hatle <mark.hatle@windriver.com>
|
|
This is a spiffied-up rebase of a bunch of intermediate changes, presented
as a whole because it is, surprisingly, less confusing that way. The basic
idea is to separate the guts code into categories ranging from generic
stuff that can be the same everywhere and specific variants. The big scary
one is the Darwin support, which actually seems to run okay on 64-bit OS X
10.6. (No other variants were tested.) The other example given is support
for the old clone() syscall on RHEL 4, which affects some wrlinux use cases.
There's a few minor cleanup bits here, such as a function with inconsistent
calling conventions, but nothing really exciting.
|
|
directly rather than via an on-demand spawn from the client, the
directory is never created.
|
|
This reverts commit 49d4d35918d457b0e9206679ecad3b9c84f11e66.
|
|
The cached data values were being collected when an OP_EXEC call was made.
This is incorrect as the values are only for logging purposes. It's believed
this caused an occasional crash in certain instances.
Signed-off-by: Mark Hatle <mark.hatle@windriver.com>
|
|
The problem is that path_by_ino could end up being the same pointer
as cache_path, after which, if cache_path were freed (or kept around
for later), there would be malloc arena problems.
Also, fix the calculation for pathlen to increase cache hits. The
IPC messages use length of path *plus one* as the length, because
the buffer is defined to include its terminating null byte.
|
|
The pathlen that is cached could be wrong in certain operations (RENAME).
Fix this by resetting it to the proper path length.
Signed-off-by: Mark Hatle <mark.hatle@windriver.com>
|