Age | Commit message (Collapse) | Author |
|
commit d35d3660e065b69fdb8bf512f3d899f350afce52 upstream.
The binder driver makes the assumption proc->context pointer is invariant after
initialization (as documented in the kerneldoc header for struct proc).
However, in commit f0fe2c0f050d ("binder: prevent UAF for binderfs devices II")
proc->context is set to NULL during binder_deferred_release().
Another proc was in the middle of setting up a transaction to the dying
process and crashed on a NULL pointer deref on "context" which is a local
set to &proc->context:
new_ref->data.desc = (node == context->binder_context_mgr_node) ? 0 : 1;
Here's the stack:
[ 5237.855435] Call trace:
[ 5237.855441] binder_get_ref_for_node_olocked+0x100/0x2ec
[ 5237.855446] binder_inc_ref_for_node+0x140/0x280
[ 5237.855451] binder_translate_binder+0x1d0/0x388
[ 5237.855456] binder_transaction+0x2228/0x3730
[ 5237.855461] binder_thread_write+0x640/0x25bc
[ 5237.855466] binder_ioctl_write_read+0xb0/0x464
[ 5237.855471] binder_ioctl+0x30c/0x96c
[ 5237.855477] do_vfs_ioctl+0x3e0/0x700
[ 5237.855482] __arm64_sys_ioctl+0x78/0xa4
[ 5237.855488] el0_svc_common+0xb4/0x194
[ 5237.855493] el0_svc_handler+0x74/0x98
[ 5237.855497] el0_svc+0x8/0xc
The fix is to move the kfree of the binder_device to binder_free_proc()
so the binder_device is freed when we know there are no references
remaining on the binder_proc.
Fixes: f0fe2c0f050d ("binder: prevent UAF for binderfs devices II")
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200622200715.114382-1-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
|
commit 211b64e4b5b6bd5fdc19cd525c2cc9a90e6b0ec9 upstream.
Binderfs binder-control devices are cleaned up via binderfs_evict_inode
too() which will use refcount_dec_and_test(). However, we missed to set
the refcount for binderfs binder-control devices and so we underflowed
when the binderfs instance got unmounted. Pretty obvious oversight and
should have been part of the more general UAF fix. The good news is that
having test cases (suprisingly) helps.
Technically, we could detect that we're about to cleanup the
binder-control dentry in binderfs_evict_inode() and then simply clean it
up. But that makes the assumption that the binder driver itself will
never make use of a binderfs binder-control device after the binderfs
instance it belongs to has been unmounted and the superblock for it been
destroyed. While it is unlikely to ever come to this let's be on the
safe side. Performance-wise this also really doesn't matter since the
binder-control device is only every really when creating the binderfs
filesystem or creating additional binder devices. Both operations are
pretty rare.
Fixes: f0fe2c0f050d ("binder: prevent UAF for binderfs devices II")
Link: https://lore.kernel.org/r/CA+G9fYusdfg7PMfC9Xce-xLT7NiyKSbgojpK35GOm=Pf9jXXrA@mail.gmail.com
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20200311105309.1742827-1-christian.brauner@ubuntu.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
|
commit f0fe2c0f050d31babcad7d65f1d550d462a40064 upstream.
This is a necessary follow up to the first fix I proposed and we merged
in 2669b8b0c79 ("binder: prevent UAF for binderfs devices"). I have been
overly optimistic that the simple fix I proposed would work. But alas,
ihold() + iput() won't work since the inodes won't survive the
destruction of the superblock.
So all we get with my prior fix is a different race with a tinier
race-window but it doesn't solve the issue. Fwiw, the problem lies with
generic_shutdown_super(). It even has this cozy Al-style comment:
if (!list_empty(&sb->s_inodes)) {
printk("VFS: Busy inodes after unmount of %s. "
"Self-destruct in 5 seconds. Have a nice day...\n",
sb->s_id);
}
On binder_release(), binder_defer_work(proc, BINDER_DEFERRED_RELEASE) is
called which punts the actual cleanup operation to a workqueue. At some
point, binder_deferred_func() will be called which will end up calling
binder_deferred_release() which will retrieve and cleanup the
binder_context attach to this struct binder_proc.
If we trace back where this binder_context is attached to binder_proc we
see that it is set in binder_open() and is taken from the struct
binder_device it is associated with. This obviously assumes that the
struct binder_device that context is attached to is _never_ freed. While
that might be true for devtmpfs binder devices it is most certainly
wrong for binderfs binder devices.
So, assume binder_open() is called on a binderfs binder devices. We now
stash away the struct binder_context associated with that struct
binder_devices:
proc->context = &binder_dev->context;
/* binderfs stashes devices in i_private */
if (is_binderfs_device(nodp)) {
binder_dev = nodp->i_private;
info = nodp->i_sb->s_fs_info;
binder_binderfs_dir_entry_proc = info->proc_log_dir;
} else {
.
.
.
proc->context = &binder_dev->context;
Now let's assume that the binderfs instance for that binder devices is
shutdown via umount() and/or the mount namespace associated with it goes
away. As long as there is still an fd open for that binderfs binder
device things are fine. But let's assume we now close the last fd for
that binderfs binder device. Now binder_release() is called and punts to
the workqueue. Assume that the workqueue has quite a bit of stuff to do
and doesn't get to cleaning up the struct binder_proc and the associated
struct binder_context with it for that binderfs binder device right
away. In the meantime, the VFS is killing the super block and is
ultimately calling sb->evict_inode() which means it will call
binderfs_evict_inode() which does:
static void binderfs_evict_inode(struct inode *inode)
{
struct binder_device *device = inode->i_private;
struct binderfs_info *info = BINDERFS_I(inode);
clear_inode(inode);
if (!S_ISCHR(inode->i_mode) || !device)
return;
mutex_lock(&binderfs_minors_mutex);
--info->device_count;
ida_free(&binderfs_minors, device->miscdev.minor);
mutex_unlock(&binderfs_minors_mutex);
kfree(device->context.name);
kfree(device);
}
thereby freeing the struct binder_device including struct
binder_context.
Now the workqueue finally has time to get around to cleaning up struct
binder_proc and is now trying to access the associate struct
binder_context. Since it's already freed it will OOPs.
Fix this by introducing a refounct on binder devices.
This is an alternative fix to 51d8a7eca677 ("binder: prevent UAF read in
print_binder_transaction_log_entry()").
Fixes: 3ad20fe393b3 ("binder: implement binderfs")
Fixes: 2669b8b0c798 ("binder: prevent UAF for binderfs devices")
Fixes: 03e2e07e3814 ("binder: Make transaction_log available in binderfs")
Related : 51d8a7eca677 ("binder: prevent UAF read in print_binder_transaction_log_entry()")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20200303164340.670054-1-christian.brauner@ubuntu.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
|
commit 2669b8b0c798fbe1a31d49e07aa33233d469ad9b upstream.
On binder_release(), binder_defer_work(proc, BINDER_DEFERRED_RELEASE) is
called which punts the actual cleanup operation to a workqueue. At some
point, binder_deferred_func() will be called which will end up calling
binder_deferred_release() which will retrieve and cleanup the
binder_context attach to this struct binder_proc.
If we trace back where this binder_context is attached to binder_proc we
see that it is set in binder_open() and is taken from the struct
binder_device it is associated with. This obviously assumes that the
struct binder_device that context is attached to is _never_ freed. While
that might be true for devtmpfs binder devices it is most certainly
wrong for binderfs binder devices.
So, assume binder_open() is called on a binderfs binder devices. We now
stash away the struct binder_context associated with that struct
binder_devices:
proc->context = &binder_dev->context;
/* binderfs stashes devices in i_private */
if (is_binderfs_device(nodp)) {
binder_dev = nodp->i_private;
info = nodp->i_sb->s_fs_info;
binder_binderfs_dir_entry_proc = info->proc_log_dir;
} else {
.
.
.
proc->context = &binder_dev->context;
Now let's assume that the binderfs instance for that binder devices is
shutdown via umount() and/or the mount namespace associated with it goes
away. As long as there is still an fd open for that binderfs binder
device things are fine. But let's assume we now close the last fd for
that binderfs binder device. Now binder_release() is called and punts to
the workqueue. Assume that the workqueue has quite a bit of stuff to do
and doesn't get to cleaning up the struct binder_proc and the associated
struct binder_context with it for that binderfs binder device right
away. In the meantime, the VFS is killing the super block and is
ultimately calling sb->evict_inode() which means it will call
binderfs_evict_inode() which does:
static void binderfs_evict_inode(struct inode *inode)
{
struct binder_device *device = inode->i_private;
struct binderfs_info *info = BINDERFS_I(inode);
clear_inode(inode);
if (!S_ISCHR(inode->i_mode) || !device)
return;
mutex_lock(&binderfs_minors_mutex);
--info->device_count;
ida_free(&binderfs_minors, device->miscdev.minor);
mutex_unlock(&binderfs_minors_mutex);
kfree(device->context.name);
kfree(device);
}
thereby freeing the struct binder_device including struct
binder_context.
Now the workqueue finally has time to get around to cleaning up struct
binder_proc and is now trying to access the associate struct
binder_context. Since it's already freed it will OOPs.
Fix this by holding an additional reference to the inode that is only
released once the workqueue is done cleaning up struct binder_proc. This
is an easy alternative to introducing separate refcounting on struct
binder_device which we can always do later if it becomes necessary.
This is an alternative fix to 51d8a7eca677 ("binder: prevent UAF read in
print_binder_transaction_log_entry()").
Fixes: 3ad20fe393b3 ("binder: implement binderfs")
Fixes: 03e2e07e3814 ("binder: Make transaction_log available in binderfs")
Related : 51d8a7eca677 ("binder: prevent UAF read in print_binder_transaction_log_entry()")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
|
commit 16981742717b04644a41052570fb502682a315d2 upstream.
For BINDER_TYPE_PTR and BINDER_TYPE_FDA transactions, the
num_valid local was calculated incorrectly causing the
range check in binder_validate_ptr() to miss out-of-bounds
offsets.
Fixes: bde4a19fc04f ("binder: use userspace pointer as base of buffer space")
Signed-off-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20191213202531.55010-1-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
|
commit 2a9edd056ed4fbf9d2e797c3fc06335af35bccc4 upstream.
The old loop wouldn't stop when reaching `start` if `start==NULL`, instead
continuing backwards to index -1 and crashing.
Luckily you need to be highly privileged to map things at NULL, so it's not
a big problem.
Fix it by adjusting the loop so that the loop variable is always in bounds.
This patch is deliberately minimal to simplify backporting, but IMO this
function could use a refactor. The jump labels in the second loop body are
horrible (the error gotos should be jumping to free_range instead), and
both loops would look nicer if they just iterated upwards through indices.
And the up_read()+mmput() shouldn't be duplicated like that.
Cc: stable@vger.kernel.org
Fixes: 457b9a6f09f0 ("Staging: android: add binder driver")
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20191018205631.248274-3-jannh@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
|
commit a7a74d7ff55a0c657bc46238b050460b9eacea95 upstream.
binder_alloc_mmap_handler() attempts to detect the use of ->mmap() on a
binder_proc whose binder_alloc has already been initialized by checking
whether alloc->buffer is non-zero.
Before commit 880211667b20 ("binder: remove kernel vm_area for buffer
space"), alloc->buffer was a kernel mapping address, which is always
non-zero, but since that commit, it is a userspace mapping address.
A sufficiently privileged user can map /dev/binder at NULL, tricking
binder_alloc_mmap_handler() into assuming that the binder_proc has not been
mapped yet. This leads to memory unsafety.
Luckily, no context on Android has such privileges, and on a typical Linux
desktop system, you need to be root to do that.
Fix it by using the mapping size instead of the mapping address to
distinguish the mapped case. A valid VMA can't have size zero.
Fixes: 880211667b20 ("binder: remove kernel vm_area for buffer space")
Cc: stable@vger.kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20191018205631.248274-2-jannh@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
|
commit 8eb52a1ee37aafd9b796713aa0b3ab9cbc455be3 upstream.
binder_alloc_print_pages() iterates over
alloc->pages[0..alloc->buffer_size-1] under alloc->mutex.
binder_alloc_mmap_handler() writes alloc->pages and alloc->buffer_size
without holding that lock, and even writes them before the last bailout
point.
Unfortunately we can't take the alloc->mutex in the ->mmap() handler
because mmap_sem can be taken while alloc->mutex is held.
So instead, we have to locklessly check whether the binder_alloc has been
fully initialized with binder_alloc_get_vma(), like in
binder_alloc_new_buf_locked().
Fixes: 8ef4665aa129 ("android: binder: Add page usage in binder stats")
Cc: stable@vger.kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20191018205631.248274-1-jannh@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
|
commit 45d02f79b539073b76077836871de6b674e36eb4 upstream.
binder_mmap() tries to prevent the creation of overly big binder mappings
by silently truncating the size of the VMA to 4MiB. However, this violates
the API contract of mmap(). If userspace attempts to create a large binder
VMA, and later attempts to unmap that VMA, it will call munmap() on a range
beyond the end of the VMA, which may have been allocated to another VMA in
the meantime. This can lead to userspace memory corruption.
The following sequence of calls leads to a segfault without this commit:
int main(void) {
int binder_fd = open("/dev/binder", O_RDWR);
if (binder_fd == -1) err(1, "open binder");
void *binder_mapping = mmap(NULL, 0x800000UL, PROT_READ, MAP_SHARED,
binder_fd, 0);
if (binder_mapping == MAP_FAILED) err(1, "mmap binder");
void *data_mapping = mmap(NULL, 0x400000UL, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
if (data_mapping == MAP_FAILED) err(1, "mmap data");
munmap(binder_mapping, 0x800000UL);
*(char*)data_mapping = 1;
return 0;
}
Cc: stable@vger.kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Todd Kjos <tkjos@google.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20191016150119.154756-1-jannh@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
|
commit 49ed96943a8e0c62cc5a9b0a6cfc88be87d1fcec upstream.
Currently, a transaction to context manager from its own process
is prevented by checking if its binder_proc struct is the same as
that of the sender. However, this would not catch cases where the
process opens the binder device again and uses the new fd to send
a transaction to the context manager.
Reported-by: syzbot+8b3c354d33c4ac78bfad@syzkaller.appspotmail.com
Signed-off-by: Hridya Valsaraju <hridya@google.com>
Acked-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20190715191804.112933-1-hridya@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a56587065094fd96eb4c2b5ad65571daad32156d upstream.
In case the target node requests a security context, the
extra_buffers_size is increased with the size of the security context.
But, that size is not available for use by regular scatter-gather
buffers; make sure the ending of that buffer is marked correctly.
Acked-by: Todd Kjos <tkjos@google.com>
Fixes: ec74136ded79 ("binder: create node flag to request sender's security context")
Signed-off-by: Martijn Coenen <maco@android.com>
Cc: stable@vger.kernel.org # 5.1+
Link: https://lore.kernel.org/r/20190709110923.220736-1-maco@android.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit bb4a2e48d5100ed3ff614df158a636bca3c6bf9f upstream.
The buffer copy functions assumed the caller would ensure
correct alignment and that the memory to be copied was
completely within the binder buffer. There have been
a few cases discovered by syzkallar where a malformed
transaction created by a user could violated the
assumptions and resulted in a BUG_ON.
The fix is to remove the BUG_ON and always return the
error to be handled appropriately by the caller.
Acked-by: Martijn Coenen <maco@android.com>
Reported-by: syzbot+3ae18325f96190606754@syzkaller.appspotmail.com
Fixes: bde4a19fc04f ("binder: use userspace pointer as base of buffer space")
Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 1909a671dbc3606685b1daf8b22a16f65ea7edda upstream.
syzkallar found a 32-byte memory leak in a rarely executed error
case. The transaction complete work item was not freed if put_user()
failed when writing the BR_TRANSACTION_COMPLETE to the user command
buffer. Fixed by freeing it before put_user() is called.
Reported-by: syzbot+182ce46596c3f2e1eb24@syzkaller.appspotmail.com
Signed-off-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
There is a race between the binder driver cleaning
up a completed transaction via binder_free_transaction()
and a user calling binder_ioctl(BC_FREE_BUFFER) to
release a buffer. It doesn't matter which is first but
they need to be protected against running concurrently
which can result in a UAF.
Signed-off-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Based on 1 normalized pattern(s):
this software is licensed under the terms of the gnu general public
license version 2 as published by the free software foundation and
may be copied distributed and modified under those terms this
program is distributed in the hope that it will be useful but
without any warranty without even the implied warranty of
merchantability or fitness for a particular purpose see the gnu
general public license for more details
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 285 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141900.642774971@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add SPDX license identifiers to all Make/Kconfig files which:
- Have no license information of any form
These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:
GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc update part 2 from Greg KH:
"Here is the "real" big set of char/misc driver patches for 5.2-rc1
Loads of different driver subsystem stuff in here, all over the places:
- thunderbolt driver updates
- habanalabs driver updates
- nvmem driver updates
- extcon driver updates
- intel_th driver updates
- mei driver updates
- coresight driver updates
- soundwire driver cleanups and updates
- fastrpc driver updates
- other minor driver updates
- chardev minor fixups
Feels like this tree is getting to be a dumping ground of "small
driver subsystems" these days. Which is fine with me, if it makes
things easier for those subsystem maintainers.
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-5.2-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (255 commits)
intel_th: msu: Add current window tracking
intel_th: msu: Add a sysfs attribute to trigger window switch
intel_th: msu: Correct the block wrap detection
intel_th: Add switch triggering support
intel_th: gth: Factor out trace start/stop
intel_th: msu: Factor out pipeline draining
intel_th: msu: Switch over to scatterlist
intel_th: msu: Replace open-coded list_{first,last,next}_entry variants
intel_th: Only report useful IRQs to subdevices
intel_th: msu: Start handling IRQs
intel_th: pci: Use MSI interrupt signalling
intel_th: Communicate IRQ via resource
intel_th: Add "rtit" source device
intel_th: Skip subdevices if their MMIO is missing
intel_th: Rework resource passing between glue layers and core
intel_th: SPDX-ify the documentation
intel_th: msu: Fix single mode with IOMMU
coresight: funnel: Support static funnel
dt-bindings: arm: coresight: Unify funnel DT binding
coresight: replicator: Add new device id for static replicator
...
|
|
When allocating space in the target buffer for the security context,
make sure the extra_buffers_size doesn't overflow. This can only
happen if the given size is invalid, but an overflow can turn it
into a valid size. Fail the transaction if an overflow is detected.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Restore the behavior of locking mmap_sem for reading in
binder_alloc_free_page(), as was first done in commit 3013bf62b67a
("binder: reduce mmap_sem write-side lock"). That change was
inadvertently reverted by commit 5cec2d2e5839 ("binder: fix race between
munmap() and direct reclaim").
In addition, change the name of the label for the error path to
accurately reflect that we're taking the lock for reading.
Backporting note: This fix is only needed when *both* of the commits
mentioned above are applied. That's an unlikely situation since they
both landed during the development of v5.1 but only one of them is
targeted for stable.
Fixes: 5cec2d2e5839 ("binder: fix race between munmap() and direct reclaim")
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Todd Kjos <tkjos@android.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
An munmap() on a binder device causes binder_vma_close() to be called
which clears the alloc->vma pointer.
If direct reclaim causes binder_alloc_free_page() to be called, there
is a race where alloc->vma is read into a local vma pointer and then
used later after the mm->mmap_sem is acquired. This can result in
calling zap_page_range() with an invalid vma which manifests as a
use-after-free in zap_page_range().
The fix is to check alloc->vma after acquiring the mmap_sem (which we
were acquiring anyway) and skip zap_page_range() if it has changed
to NULL.
Signed-off-by: Todd Kjos <tkjos@google.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The selinux-testsuite found an issue resulting in a BUG_ON()
where a conditional relied on a size_t going negative when
checking the validity of a buffer offset.
Fixes: 7a67a39320df ("binder: add function to copy binder object from buffer")
Reported-by: Paul Moore <paul@paul-moore.com>
Tested-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
binder has used write-side mmap_sem semaphore to release memory
mapped at address space of the process. However, right lock to
release pages is down_read, not down_write because page table lock
already protects the race for parallel freeing.
Please do not use mmap_sem write-side lock which is well known
contented lock.
Cc: Todd Kjos <tkjos@google.com>
Cc: Martijn Coenen <maco@android.com>
Cc: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Fixes crash found by syzbot:
kernel BUG at drivers/android/binder_alloc.c:LINE! (2)
Reported-and-tested-by: syzbot+55de1eb4975dec156d8f@syzkaller.appspotmail.com
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Fixes sparse issues reported by the kbuild test robot running
on https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
char-misc-testing: bde4a19fc04f5 ("binder: use userspace pointer as base
of buffer space")
Error output (drivers/android/binder_alloc_selftest.c):
sparse: warning: incorrect type in assignment (different address spaces)
sparse: expected void *page_addr
sparse: got void [noderef] <asn:1> *user_data
sparse: error: subtraction of different types can't work
Fixed by adding necessary "__user" tags.
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Now that alloc->buffer points to the userspace vm_area
rename buffer->data to buffer->user_data and rename
local pointers that hold user addresses. Also use the
"__user" tag to annotate all user pointers so sparse
can flag cases where user pointer vaues are copied to
kernel pointers. Refactor code to use offsets instead
of user pointers.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Remove user_buffer_offset since there is no kernel
buffer pointer anymore.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Remove the kernel's vm_area and the code that maps
buffer pages into it.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Refactor the functions to validate and fixup struct
binder_buffer pointer objects to avoid using vm_area
pointers. Instead copy to/from kernel space using
binder_alloc_copy_to_buffer() and
binder_alloc_copy_from_buffer(). The following
functions were refactored:
refactor binder_validate_ptr()
binder_validate_fixup()
binder_fixup_parent()
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
When creating or tearing down a transaction, the binder driver
examines objects in the buffer and takes appropriate action.
To do this without needing to dereference pointers into the
buffer, the local copies of the objects are needed. This patch
introduces a function to validate and copy binder objects
from the buffer to a local structure.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Avoid vm_area when copying to or from binder buffers.
Instead, new copy functions are added that copy from
kernel space to binder buffer space. These use
kmap_atomic() and kunmap_atomic() to create temporary
mappings and then memcpy() is used to copy within
that page.
Also, kmap_atomic() / kunmap_atomic() use the appropriate
cache flushing to support VIVT cache architectures.
Allow binder to build if CPU_CACHE_VIVT is defined.
Several uses of the new functions are added here. More
to follow in subsequent patches.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The binder driver uses a vm_area to map the per-process
binder buffer space. For 32-bit android devices, this is
now taking too much vmalloc space. This patch removes
the use of vm_area when copying the transaction data
from the sender to the buffer space. Instead of using
copy_from_user() for multi-page copies, it now uses
binder_alloc_copy_user_to_buffer() which uses kmap()
and kunmap() to map each page, and uses copy_from_user()
for copying to that page.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We need the char-misc fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
binderfs should not have a separate device_initcall(). When a kernel is
compiled with CONFIG_ANDROID_BINDERFS register the filesystem alongside
CONFIG_ANDROID_IPC. This use-case is especially sensible when users specify
CONFIG_ANDROID_IPC=y, CONFIG_ANDROID_BINDERFS=y and
ANDROID_BINDER_DEVICES="".
When CONFIG_ANDROID_BINDERFS=n then this always succeeds so there's no
regression potential for legacy workloads.
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We currently adhere to the reserved devices limit when creating new
binderfs devices in binderfs instances not located in the inital ipc
namespace. But it is still possible to rob the host instances of their 4
reserved devices by creating the maximum allowed number of devices in a
single binderfs instance located in a non-initial ipc namespace and then
mounting 4 separate binderfs instances in non-initial ipc namespaces. That
happens because the limit is currently not respected for the creation of
the initial binder-control device node. Block this nonsense by performing
the same check in binderfs_binder_ctl_create() that we perform in
binderfs_binder_device_create().
Fixes: 36bdf3cae09d ("binderfs: reserve devices for initial mount")
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Several users have tried to only rely on binderfs to provide binder devices
and set CONFIG_ANDROID_BINDER_DEVICES="" empty. This is a great use-case of
binderfs and one that was always intended to work. However, this is
currently not possible since setting CONFIG_ANDROID_BINDER_DEVICES="" emtpy
will simply panic the kernel:
kobject: (00000000028c2f79): attempted to be registered with empty name!
WARNING: CPU: 7 PID: 1703 at lib/kobject.c:228 kobject_add_internal+0x288/0x2b0
Modules linked in: binder_linux(+) bridge stp llc ipmi_ssif gpio_ich dcdbas coretemp kvm_intel kvm irqbypass serio_raw input_leds lpc_ich i5100_edac mac_hid ipmi_si ipmi_devintf ipmi_msghandler sch_fq_codel ib_i
CPU: 7 PID: 1703 Comm: modprobe Not tainted 5.0.0-rc2-brauner-binderfs #263
Hardware name: Dell DCS XS24-SC2 /XS24-SC2 , BIOS S59_3C20 04/07/2011
RIP: 0010:kobject_add_internal+0x288/0x2b0
Code: 12 95 48 c7 c7 78 63 3b 95 e8 77 35 71 ff e9 91 fe ff ff 0f 0b eb a7 0f 0b eb 9a 48 89 de 48 c7 c7 00 63 3b 95 e8 f8 95 6a ff <0f> 0b 41 bc ea ff ff ff e9 6d fe ff ff 41 bc fe ff ff ff e9 62 fe
RSP: 0018:ffff973f84237a30 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff8b53e2472010 RCX: 0000000000000006
RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffff8b53edbd63a0
RBP: ffff973f84237a60 R08: 0000000000000342 R09: 0000000000000004
R10: ffff973f84237af0 R11: 0000000000000001 R12: 0000000000000000
R13: ffff8b53e9f1a1e0 R14: 00000000e9f1a1e0 R15: 0000000000a00037
FS: 00007fbac36f7540(0000) GS:ffff8b53edbc0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fbac364cfa7 CR3: 00000004a6d48000 CR4: 00000000000406e0
Call Trace:
kobject_add+0x71/0xd0
? _cond_resched+0x19/0x40
? mutex_lock+0x12/0x40
device_add+0x12e/0x6b0
device_create_groups_vargs+0xe4/0xf0
device_create_with_groups+0x3f/0x60
? _cond_resched+0x19/0x40
misc_register+0x140/0x180
binder_init+0x1ed/0x2d4 [binder_linux]
? trace_event_define_fields_binder_transaction_fd_send+0x8e/0x8e [binder_linux]
do_one_initcall+0x4a/0x1c9
? _cond_resched+0x19/0x40
? kmem_cache_alloc_trace+0x151/0x1c0
do_init_module+0x5f/0x216
load_module+0x223d/0x2b20
__do_sys_finit_module+0xfc/0x120
? __do_sys_finit_module+0xfc/0x120
__x64_sys_finit_module+0x1a/0x20
do_syscall_64+0x5a/0x120
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fbac3202839
Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffd1494a908 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
RAX: ffffffffffffffda RBX: 000055b629ebec60 RCX: 00007fbac3202839
RDX: 0000000000000000 RSI: 000055b629c20d2e RDI: 0000000000000003
RBP: 000055b629c20d2e R08: 0000000000000000 R09: 000055b629ec2310
R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000000
R13: 000055b629ebed70 R14: 0000000000040000 R15: 000055b629ebec60
So check for the empty string since strsep() will otherwise return the
emtpy string which will cause kobject_add_internal() to panic when trying
to add a kobject with an emtpy name.
Fixes: ac4812c5ffbb ("binder: Support multiple /dev instances")
Cc: Martijn Coenen <maco@google.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We need the char-misc fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
To allow servers to verify client identity, allow a node
flag to be set that causes the sender's security context
to be delivered with the transaction. The BR_TRANSACTION
command is extended in BR_TRANSACTION_SEC_CTX to
contain a pointer to the security context string.
Signed-off-by: Todd Kjos <tkjos@google.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
In a previous commit we switched from a d_alloc_name() + d_lookup()
combination to setup a new dentry and find potential duplicates to the more
idiomatic lookup_one_len(). As far as I understand, this also means we need
to switch from d_add() to d_instantiate() since lookup_one_len() will
create a new dentry when it doesn't find an existing one and add the new
dentry to the hash queues. So we only need to call d_instantiate() to
connect the dentry to the inode and turn it into a positive dentry.
If we were to use d_add() we sure see stack traces like the following
indicating that adding the same dentry twice over the same inode:
[ 744.441889] CPU: 4 PID: 2849 Comm: landscape-sysin Not tainted 5.0.0-rc1-brauner-binderfs #243
[ 744.441889] Hardware name: Dell DCS XS24-SC2 /XS24-SC2 , BIOS S59_3C20 04/07/2011
[ 744.441889] RIP: 0010:__d_lookup_rcu+0x76/0x190
[ 744.441889] Code: 89 75 c0 49 c1 e9 20 49 89 fd 45 89 ce 41 83 e6 07 42 8d 04 f5 00 00 00 00 89 45 c8 eb 0c 48 8b 1b 48 85 db 0f 84 81 00 00 00 <44> 8b 63 fc 4c 3b 6b 10 75 ea 48 83 7b 08 00 74 e3 41 83 e4 fe 41
[ 744.441889] RSP: 0018:ffffb8c984e27ad0 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
[ 744.441889] RAX: 0000000000000038 RBX: ffff9407ef770c08 RCX: ffffb8c980011000
[ 744.441889] RDX: ffffb8c984e27b54 RSI: ffffb8c984e27ce0 RDI: ffff9407e6689600
[ 744.441889] RBP: ffffb8c984e27b28 R08: ffffb8c984e27ba4 R09: 0000000000000007
[ 744.441889] R10: ffff9407e5c4f05c R11: 973f3eb9d84a94e5 R12: 0000000000000002
[ 744.441889] R13: ffff9407e6689600 R14: 0000000000000007 R15: 00000007bfef7a13
[ 744.441889] FS: 00007f0db13bb740(0000) GS:ffff9407f3b00000(0000) knlGS:0000000000000000
[ 744.441889] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 744.441889] CR2: 00007f0dacc51024 CR3: 000000032961a000 CR4: 00000000000006e0
[ 744.441889] Call Trace:
[ 744.441889] lookup_fast+0x53/0x300
[ 744.441889] walk_component+0x49/0x350
[ 744.441889] ? inode_permission+0x63/0x1a0
[ 744.441889] link_path_walk.part.33+0x1bc/0x5a0
[ 744.441889] ? path_init+0x190/0x310
[ 744.441889] path_lookupat+0x95/0x210
[ 744.441889] filename_lookup+0xb6/0x190
[ 744.441889] ? __check_object_size+0xb8/0x1b0
[ 744.441889] ? strncpy_from_user+0x50/0x1a0
[ 744.441889] user_path_at_empty+0x36/0x40
[ 744.441889] ? user_path_at_empty+0x36/0x40
[ 744.441889] vfs_statx+0x76/0xe0
[ 744.441889] __do_sys_newstat+0x3d/0x70
[ 744.441889] __x64_sys_newstat+0x16/0x20
[ 744.441889] do_syscall_64+0x5a/0x120
[ 744.441889] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 744.441889] RIP: 0033:0x7f0db0ec2775
[ 744.441889] Code: 00 00 00 75 05 48 83 c4 18 c3 e8 26 55 02 00 66 0f 1f 44 00 00 83 ff 01 48 89 f0 77 30 48 89 c7 48 89 d6 b8 04 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 03 f3 c3 90 48 8b 15 e1 b6 2d 00 f7 d8 64 89
[ 744.441889] RSP: 002b:00007ffc36bc9388 EFLAGS: 00000246 ORIG_RAX: 0000000000000004
[ 744.441889] RAX: ffffffffffffffda RBX: 00007ffc36bc9300 RCX: 00007f0db0ec2775
[ 744.441889] RDX: 00007ffc36bc9400 RSI: 00007ffc36bc9400 RDI: 00007f0dad26f050
[ 744.441889] RBP: 0000000000c0bc60 R08: 0000000000000000 R09: 0000000000000001
[ 744.441889] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffc36bc9400
[ 744.441889] R13: 0000000000000001 R14: 00000000ffffff9c R15: 0000000000c0bc60
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The binderfs_binder_ctl_create() call is a no-op on subsequent calls and
the first call is done before we unlock the suberblock. Hence, there is no
need to take inode_lock() in there. Let's remove it.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Al pointed out that first calling kill_litter_super() before cleaning up
info is more correct since destroying info doesn't depend on the state of
the dentries and inodes. That the opposite remains true is not guaranteed.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
- switch from d_alloc_name() + d_lookup() to lookup_one_len():
Instead of using d_alloc_name() and then doing a d_lookup() with the
allocated dentry to find whether a device with the name we're trying to
create already exists switch to using lookup_one_len(). The latter will
either return the existing dentry or a new one.
- switch from kmalloc() + strscpy() to kmemdup():
Use a more idiomatic way to copy the name for the new dentry that
userspace gave us.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Al pointed out that on binderfs_fill_super() error
deactivate_locked_super() will call binderfs_kill_super() so all of the
freeing and putting we currently do in binderfs_fill_super() is unnecessary
and buggy. Let's simply return errors and let binderfs_fill_super() take
care of cleaning up on error.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
- make binderfs control dentry immutable:
We don't allow to unlink it since it is crucial for binderfs to be
useable but if we allow to rename it we make the unlink trivial to
bypass. So prevent renaming too and simply treat the control dentry as
immutable.
- add is_binderfs_control_device() helper:
Take the opportunity and turn the check for the control dentry into a
separate helper is_binderfs_control_device() since it's now used in two
places.
- simplify binderfs_rename():
Instead of hand-rolling our custom version of simple_rename() just dumb
the whole function down to first check whether we're trying to rename the
control dentry. If we do EPERM the caller and if not call simple_rename().
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The comment stems from an early version of that patchset and is just
confusing now.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Fix to return a negative error code -ENOMEM from the new_inode() and
d_make_root() error handling cases instead of 0, as done elsewhere in
this function.
Fixes: 849d540ddfcd ("binderfs: implement "max" mount option")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
kbuild reported a build faile in [1]. This is triggered when CONFIG_IPC_NS
is not set. So let's make the use of init_ipc_ns conditional on
CONFIG_IPC_NS being set.
[1]: https://lists.01.org/pipermail/kbuild-all/2019-January/056903.html
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The binderfs instance in the initial ipc namespace will always have a
reserve of 4 binder devices unless explicitly capped by specifying a lower
value via the "max" mount option.
This ensures when binder devices are removed (on accident or on purpose)
they can always be recreated without risking that all minor numbers have
already been used up.
Cc: Todd Kjos <tkjos@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
It doesn't make sense to call the header binder_ctl.h when its sole
existence is tied to binderfs. So give it a sensible name. Users will far
more easily remember binderfs.h than binder_ctl.h.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Since binderfs can be mounted by userns root in non-initial user namespaces
some precautions are in order. First, a way to set a maximum on the number
of binder devices that can be allocated per binderfs instance and second, a
way to reserve a reasonable chunk of binderfs devices for the initial ipc
namespace.
A first approach as seen in [1] used sysctls similiar to devpts but was
shown to be flawed (cf. [2] and [3]) since some aspects were unneeded. This
is an alternative approach which avoids sysctls completely and instead
switches to a single mount option.
Starting with this commit binderfs instances can be mounted with a limit on
the number of binder devices that can be allocated. The max=<count> mount
option serves as a per-instance limit. If max=<count> is set then only
<count> number of binder devices can be allocated in this binderfs
instance.
This allows to safely bind-mount binderfs instances into unprivileged user
namespaces since userns root in a non-initial user namespace cannot change
the mount option as long as it does not own the mount namespace the
binderfs mount was created in and hence cannot drain the host of minor
device numbers
[1]: https://lore.kernel.org/lkml/20181221133909.18794-1-christian@brauner.io/
[2]; https://lore.kernel.org/lkml/20181221163316.GA8517@kroah.com/
[3]: https://lore.kernel.org/lkml/CAHRSSEx+gDVW4fKKK8oZNAir9G5icJLyodO8hykv3O0O1jt2FQ@mail.gmail.com/
[4]: https://lore.kernel.org/lkml/20181221192044.5yvfnuri7gdop4rs@brauner.io/
Cc: Todd Kjos <tkjos@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
When currently mounting binderfs in the same ipc namespace twice:
mount -t binder binder /A
mount -t binder binder /B
then the binderfs instances mounted on /A and /B will be the same, i.e.
they will have the same superblock. This was the first approach that seemed
reasonable. However, this leads to some problems and inconsistencies:
/* private binderfs instance in same ipc namespace */
There is no way for a user to request a private binderfs instance in the
same ipc namespace.
This request has been made in a private mail to me by two independent
people.
/* bind-mounts */
If users want the same binderfs instance to appear in multiple places they
can use bind mounts. So there is no value in having a request for a new
binderfs mount giving them the same instance.
/* unexpected behavior */
It's surprising that request to mount binderfs is not giving the user a new
instance like tmpfs, devpts, ramfs, and others do.
/* past mistakes */
Other pseudo-filesystems once made the same mistakes of giving back the
same superblock when actually requesting a new mount (cf. devpts's
deprecated "newinstance" option).
We should not make the same mistake. Once we've committed to always giving
back the same superblock in the same IPC namespace with the next kernel
release we will not be able to make that change so better to do it now.
/* kdbusfs */
It was pointed out to me that kdbusfs - which is conceptually closely
related to binderfs - also allowed users to get a private kdbusfs instance
in the same IPC namespace by making each mount of kdbusfs a separate
instance. I think that makes a lot of sense.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|