aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/xen
AgeCommit message (Collapse)Author
2019-09-12xen/pciback: Don't disable PCI_COMMAND on PCI device reset.Konrad Rzeszutek Wilk
commit 7681f31ec9cdacab4fd10570be924f2cef6669ba upstream. There is no need for this at all. Worst it means that if the guest tries to write to BARs it could lead (on certain platforms) to PCI SERR errors. Please note that with af6fc858a35b90e89ea7a7ee58e66628c55c776b "xen-pciback: limit guest control of command register" a guest is still allowed to enable those control bits (safely), but is not allowed to disable them and that therefore a well behaved frontend which enables things before using them will still function correctly. This is done via an write to the configuration register 0x4 which triggers on the backend side: command_write \- pci_enable_device \- pci_enable_device_flags \- do_pci_enable_device \- pcibios_enable_device \-pci_enable_resourcess [which enables the PCI_COMMAND_MEMORY|PCI_COMMAND_IO] However guests (and drivers) which don't do this could cause problems, including the security issues which XSA-120 sought to address. Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-23fs: stream_open - opener for stream-like files so that read and write can ↵Kirill Smelkov
run simultaneously without deadlock commit 10dce8af34226d90fa56746a934f8da5dcdba3df upstream. Commit 9c225f2655e3 ("vfs: atomic f_pos accesses as per POSIX") added locking for file.f_pos access and in particular made concurrent read and write not possible - now both those functions take f_pos lock for the whole run, and so if e.g. a read is blocked waiting for data, write will deadlock waiting for that read to complete. This caused regression for stream-like files where previously read and write could run simultaneously, but after that patch could not do so anymore. See e.g. commit 581d21a2d02a ("xenbus: fix deadlock on writes to /proc/xen/xenbus") which fixes such regression for particular case of /proc/xen/xenbus. The patch that added f_pos lock in 2014 did so to guarantee POSIX thread safety for read/write/lseek and added the locking to file descriptors of all regular files. In 2014 that thread-safety problem was not new as it was already discussed earlier in 2006. However even though 2006'th version of Linus's patch was adding f_pos locking "only for files that are marked seekable with FMODE_LSEEK (thus avoiding the stream-like objects like pipes and sockets)", the 2014 version - the one that actually made it into the tree as 9c225f2655e3 - is doing so irregardless of whether a file is seekable or not. See https://lore.kernel.org/lkml/53022DB1.4070805@gmail.com/ https://lwn.net/Articles/180387 https://lwn.net/Articles/180396 for historic context. The reason that it did so is, probably, that there are many files that are marked non-seekable, but e.g. their read implementation actually depends on knowing current position to correctly handle the read. Some examples: kernel/power/user.c snapshot_read fs/debugfs/file.c u32_array_read fs/fuse/control.c fuse_conn_waiting_read + ... drivers/hwmon/asus_atk0110.c atk_debugfs_ggrp_read arch/s390/hypfs/inode.c hypfs_read_iter ... Despite that, many nonseekable_open users implement read and write with pure stream semantics - they don't depend on passed ppos at all. And for those cases where read could wait for something inside, it creates a situation similar to xenbus - the write could be never made to go until read is done, and read is waiting for some, potentially external, event, for potentially unbounded time -> deadlock. Besides xenbus, there are 14 such places in the kernel that I've found with semantic patch (see below): drivers/xen/evtchn.c:667:8-24: ERROR: evtchn_fops: .read() can deadlock .write() drivers/isdn/capi/capi.c:963:8-24: ERROR: capi_fops: .read() can deadlock .write() drivers/input/evdev.c:527:1-17: ERROR: evdev_fops: .read() can deadlock .write() drivers/char/pcmcia/cm4000_cs.c:1685:7-23: ERROR: cm4000_fops: .read() can deadlock .write() net/rfkill/core.c:1146:8-24: ERROR: rfkill_fops: .read() can deadlock .write() drivers/s390/char/fs3270.c:488:1-17: ERROR: fs3270_fops: .read() can deadlock .write() drivers/usb/misc/ldusb.c:310:1-17: ERROR: ld_usb_fops: .read() can deadlock .write() drivers/hid/uhid.c:635:1-17: ERROR: uhid_fops: .read() can deadlock .write() net/batman-adv/icmp_socket.c:80:1-17: ERROR: batadv_fops: .read() can deadlock .write() drivers/media/rc/lirc_dev.c:198:1-17: ERROR: lirc_fops: .read() can deadlock .write() drivers/leds/uleds.c:77:1-17: ERROR: uleds_fops: .read() can deadlock .write() drivers/input/misc/uinput.c:400:1-17: ERROR: uinput_fops: .read() can deadlock .write() drivers/infiniband/core/user_mad.c:985:7-23: ERROR: umad_fops: .read() can deadlock .write() drivers/gnss/core.c:45:1-17: ERROR: gnss_fops: .read() can deadlock .write() In addition to the cases above another regression caused by f_pos locking is that now FUSE filesystems that implement open with FOPEN_NONSEEKABLE flag, can no longer implement bidirectional stream-like files - for the same reason as above e.g. read can deadlock write locking on file.f_pos in the kernel. FUSE's FOPEN_NONSEEKABLE was added in 2008 in a7c1b990f715 ("fuse: implement nonseekable open") to support OSSPD. OSSPD implements /dev/dsp in userspace with FOPEN_NONSEEKABLE flag, with corresponding read and write routines not depending on current position at all, and with both read and write being potentially blocking operations: See https://github.com/libfuse/osspd https://lwn.net/Articles/308445 https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1406 https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1438-L1477 https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1479-L1510 Corresponding libfuse example/test also describes FOPEN_NONSEEKABLE as "somewhat pipe-like files ..." with read handler not using offset. However that test implements only read without write and cannot exercise the deadlock scenario: https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L124-L131 https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L146-L163 https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L209-L216 I've actually hit the read vs write deadlock for real while implementing my FUSE filesystem where there is /head/watch file, for which open creates separate bidirectional socket-like stream in between filesystem and its user with both read and write being later performed simultaneously. And there it is semantically not easy to split the stream into two separate read-only and write-only channels: https://lab.nexedi.com/kirr/wendelin.core/blob/f13aa600/wcfs/wcfs.go#L88-169 Let's fix this regression. The plan is: 1. We can't change nonseekable_open to include &~FMODE_ATOMIC_POS - doing so would break many in-kernel nonseekable_open users which actually use ppos in read/write handlers. 2. Add stream_open() to kernel to open stream-like non-seekable file descriptors. Read and write on such file descriptors would never use nor change ppos. And with that property on stream-like files read and write will be running without taking f_pos lock - i.e. read and write could be running simultaneously. 3. With semantic patch search and convert to stream_open all in-kernel nonseekable_open users for which read and write actually do not depend on ppos and where there is no other methods in file_operations which assume @offset access. 4. Add FOPEN_STREAM to fs/fuse/ and open in-kernel file-descriptors via steam_open if that bit is present in filesystem open reply. It was tempting to change fs/fuse/ open handler to use stream_open instead of nonseekable_open on just FOPEN_NONSEEKABLE flags, but grepping through Debian codesearch shows users of FOPEN_NONSEEKABLE, and in particular GVFS which actually uses offset in its read and write handlers https://codesearch.debian.net/search?q=-%3Enonseekable+%3D https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1080 https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1247-1346 https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1399-1481 so if we would do such a change it will break a real user. 5. Add stream_open and FOPEN_STREAM handling to stable kernels starting from v3.14+ (the kernel where 9c225f2655 first appeared). This will allow to patch OSSPD and other FUSE filesystems that provide stream-like files to return FOPEN_STREAM | FOPEN_NONSEEKABLE in their open handler and this way avoid the deadlock on all kernel versions. This should work because fs/fuse/ ignores unknown open flags returned from a filesystem and so passing FOPEN_STREAM to a kernel that is not aware of this flag cannot hurt. In turn the kernel that is not aware of FOPEN_STREAM will be < v3.14 where just FOPEN_NONSEEKABLE is sufficient to implement streams without read vs write deadlock. This patch adds stream_open, converts /proc/xen/xenbus to it and adds semantic patch to automatically locate in-kernel places that are either required to be converted due to read vs write deadlock, or that are just safe to be converted because read and write do not use ppos and there are no other funky methods in file_operations. Regarding semantic patch I've verified each generated change manually - that it is correct to convert - and each other nonseekable_open instance left - that it is either not correct to convert there, or that it is not converted due to current stream_open.cocci limitations. The script also does not convert files that should be valid to convert, but that currently have .llseek = noop_llseek or generic_file_llseek for unknown reason despite file being opened with nonseekable_open (e.g. drivers/input/mousedev.c) Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Yongzhi Pan <panyongzhi@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Juergen Gross <jgross@suse.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Tejun Heo <tj@kernel.org> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Nikolaus Rath <Nikolaus@rath.org> Cc: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Kirill Smelkov <kirr@nexedi.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-03-08pvcalls-front: fix potential null dereferenceWen Yang
commit b4711098066f1cf808d4dc11a1a842860a3292fe upstream. static checker warning: drivers/xen/pvcalls-front.c:373 alloc_active_ring() error: we previously assumed 'map->active.ring' could be null (see line 357) drivers/xen/pvcalls-front.c 351 static int alloc_active_ring(struct sock_mapping *map) 352 { 353 void *bytes; 354 355 map->active.ring = (struct pvcalls_data_intf *) 356 get_zeroed_page(GFP_KERNEL); 357 if (!map->active.ring) ^^^^^^^^^^^^^^^^^ Check 358 goto out; 359 360 map->active.ring->ring_order = PVCALLS_RING_ORDER; 361 bytes = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 362 PVCALLS_RING_ORDER); 363 if (!bytes) 364 goto out; 365 366 map->active.data.in = bytes; 367 map->active.data.out = bytes + 368 XEN_FLEX_RING_SIZE(PVCALLS_RING_ORDER); 369 370 return 0; 371 372 out: --> 373 free_active_ring(map); ^^^ Add null check on map->active.ring before dereferencing it to avoid any NULL pointer dereferences. Fixes: 9f51c05dc41a ("pvcalls-front: Avoid get_free_pages(GFP_KERNEL) under spinlock") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Suggested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Wen Yang <wen.yang99@zte.com.cn> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> CC: Boris Ostrovsky <boris.ostrovsky@oracle.com> CC: Juergen Gross <jgross@suse.com> CC: Stefano Stabellini <sstabellini@kernel.org> CC: Dan Carpenter <dan.carpenter@oracle.com> CC: xen-devel@lists.xenproject.org CC: linux-kernel@vger.kernel.org Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-03-08pvcalls-front: Avoid get_free_pages(GFP_KERNEL) under spinlockWen Yang
commit 9f51c05dc41a6d69423e3d03d18eb7ab22f9ec19 upstream. The problem is that we call this with a spin lock held. The call tree is: pvcalls_front_accept() holds bedata->socket_lock. -> create_active() -> __get_free_pages() uses GFP_KERNEL The create_active() function is only called from pvcalls_front_accept() with a spin_lock held, The allocation is not allowed to sleep and GFP_KERNEL is not sufficient. This issue was detected by using the Coccinelle software. v2: Add a function doing the allocations which is called outside the lock and passing the allocated data to create_active(). v3: Use the matching deallocators i.e., free_page() and free_pages(), respectively. v4: It would be better to pre-populate map (struct sock_mapping), rather than introducing one more new struct. v5: Since allocating the data outside of this call it should also be freed outside, when create_active() fails. Move kzalloc(sizeof(*map2), GFP_ATOMIC) outside spinlock and use GFP_KERNEL instead. v6: Drop the superfluous calls. Suggested-by: Juergen Gross <jgross@suse.com> Suggested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Suggested-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Wen Yang <wen.yang99@zte.com.cn> Acked-by: Stefano Stabellini <sstabellini@kernel.org> CC: Julia Lawall <julia.lawall@lip6.fr> CC: Boris Ostrovsky <boris.ostrovsky@oracle.com> CC: Juergen Gross <jgross@suse.com> CC: Stefano Stabellini <sstabellini@kernel.org> CC: xen-devel@lists.xenproject.org CC: linux-kernel@vger.kernel.org Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-03-08xen/pvcalls: remove set but not used variable 'intf'YueHaibing
commit 1f8ce09b36c41a026a37a24b20efa32000892a64 upstream. Fixes gcc '-Wunused-but-set-variable' warning: drivers/xen/pvcalls-back.c: In function 'pvcalls_sk_state_change': drivers/xen/pvcalls-back.c:286:28: warning: variable 'intf' set but not used [-Wunused-but-set-variable] It not used since e6587cdbd732 ("pvcalls-back: set -ENOTCONN in pvcalls_conn_back_read") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-03-08pvcalls-back: set -ENOTCONN in pvcalls_conn_back_readStefano Stabellini
commit e6587cdbd732eacb4c7ce592ed46f7bbcefb655f upstream. When a connection is closing we receive on pvcalls_sk_state_change notification. Instead of setting the connection as closed immediately (-ENOTCONN), let's read one more time from it: pvcalls_conn_back_read will set the connection as closed when necessary. That way, we avoid races between pvcalls_sk_state_change and pvcalls_back_ioworker. Signed-off-by: Stefano Stabellini <stefanos@xilinx.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-03-08pvcalls-front: properly allocate skStefano Stabellini
commit beee1fbe8f7d57d6ebaa5188f9f4db89c2077196 upstream. Don't use kzalloc: it ends up leaving sk->sk_prot not properly initialized. Use sk_alloc instead and define our own trivial struct proto. Signed-off-by: Stefano Stabellini <stefanos@xilinx.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-03-08pvcalls-front: don't try to free unallocated ringsStefano Stabellini
commit 96283f9a084e23d7cda2d3c5d1ffa6df6cf1ecec upstream. inflight_req_id is 0 when initialized. If inflight_req_id is 0, there is no accept_map to free. Fix the check in pvcalls_front_release. Signed-off-by: Stefano Stabellini <stefanos@xilinx.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-03-08pvcalls-front: read all data before closing the connectionStefano Stabellini
commit b79470b64fa9266948d1ce8d825ced94c4f63293 upstream. When a connection is closing in_error is set to ENOTCONN. There could still be outstanding data on the ring left by the backend. Before closing the connection on the frontend side, drain the ring. Signed-off-by: Stefano Stabellini <stefanos@xilinx.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-02-05xen: Fix x86 sched_clock() interface for xenJuergen Gross
commit 867cefb4cb1012f42cada1c7d1f35ac8dd276071 upstream. Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface") broke Xen guest time handling across migration: [ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done. [ 187.251137] OOM killer disabled. [ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. [ 187.252299] suspending xenstore... [ 187.266987] xen:grant_table: Grant tables using version 1 layout [18446743811.706476] OOM killer enabled. [18446743811.706478] Restarting tasks ... done. [18446743811.720505] Setting capacity to 16777216 Fix that by setting xen_sched_clock_offset at resume time to ensure a monotonic clock value. [boris: replaced pr_info() with pr_info_once() in xen_callback_vector() to avoid printing with incorrect timestamp during resume (as we haven't re-adjusted the clock yet)] Fixes: f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface") Cc: <stable@vger.kernel.org> # 4.11 Reported-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Tested-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2018-12-30pvcalls-front: fixes incorrect error handlingPan Bian
commit 975ef94a0284648fb0137bd5e949b18cef604e33 upstream. kfree() is incorrectly used to release the pages allocated by __get_free_page() and __get_free_pages(). Use the matching deallocators i.e., free_page() and free_pages(), respectively. Signed-off-by: Pan Bian <bianpan2016@163.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2018-12-30Revert "xen/balloon: Mark unallocated host memory as UNUSABLE"Igor Druzhinin
commit 123664101aa2156d05251704fc63f9bcbf77741a upstream. This reverts commit b3cf8528bb21febb650a7ecbf080d0647be40b9f. That commit unintentionally broke Xen balloon memory hotplug with "hotplug_unpopulated" set to 1. As long as "System RAM" resource got assigned under a new "Unusable memory" resource in IO/Mem tree any attempt to online this memory would fail due to general kernel restrictions on having "System RAM" resources as 1st level only. The original issue that commit has tried to workaround fa564ad96366 ("x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f, 60-7f)") also got amended by the following 03a551734 ("x86/PCI: Move and shrink AMD 64-bit window to avoid conflict") which made the original fix to Xen ballooning unnecessary. Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2018-12-30xen: xlate_mmu: add missing header to fix 'W=1' warningSrikanth Boddepalli
commit 72791ac854fea36034fa7976b748fde585008e78 upstream. Add a missing header otherwise compiler warns about missed prototype: drivers/xen/xlate_mmu.c:183:5: warning: no previous prototype for 'xen_xlate_unmap_gfn_range?' [-Wmissing-prototypes] int xen_xlate_unmap_gfn_range(struct vm_area_struct *vma, ^~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Srikanth Boddepalli <boddepalli.srikanth@gmail.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Joey Pabalinas <joeypabalinas@gmail.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2018-11-13xen: remove size limit of privcmd-buf mapping interfaceJuergen Gross
commit 3941552aec1e04d63999988a057ae09a1c56ebeb upstream. Currently the size of hypercall buffers allocated via /dev/xen/hypercall is limited to a default of 64 memory pages. For live migration of guests this might be too small as the page dirty bitmask needs to be sized according to the size of the guest. This means migrating a 8GB sized guest is already exhausting the default buffer size for the dirty bitmap. There is no sensible way to set a sane limit, so just remove it completely. The device node's usage is limited to root anyway, so there is no additional DOS scenario added by allowing unlimited buffers. While at it make the error path for the -ENOMEM case a little bit cleaner by setting n_pages to the number of successfully allocated pages instead of the target size. Fixes: c51b3c639e01f2 ("xen: add new hypercall buffer mapping device") Cc: <stable@vger.kernel.org> #4.18 Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-13xen/balloon: Support xend-based toolstackBoris Ostrovsky
commit 3aa6c19d2f38be9c6e9a8ad5fa8e3c9d29ee3c35 upstream. Xend-based toolstacks don't have static-max entry in xenstore. The equivalent node for those toolstacks is memory_static_max. Fixes: 5266b8e4445c (xen: fix booting ballooned down hvm guest) Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: <stable@vger.kernel.org> # 4.13 Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-13xen-swiotlb: use actually allocated size on check physical continuousJoe Jin
commit 7250f422da0480d8512b756640f131b9b893ccda upstream. xen_swiotlb_{alloc,free}_coherent() allocate/free memory based on the order of the pages and not size argument (bytes). This is inconsistent with range_straddles_page_boundary and memset which use the 'size' value, which may lead to not exchanging memory with Xen (range_straddles_page_boundary() returned true). And then the call to xen_swiotlb_free_coherent() would actually try to exchange the memory with Xen, leading to the kernel hitting an BUG (as the hypercall returned an error). This patch fixes it by making the 'size' variable be of the same size as the amount of memory allocated. CC: stable@vger.kernel.org Signed-off-by: Joe Jin <joe.jin@oracle.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Christoph Helwig <hch@lst.de> Cc: Dongli Zhang <dongli.zhang@oracle.com> Cc: John Sobecki <john.sobecki@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-10xen: fix GCC warning and remove duplicate EVTCHN_ROW/EVTCHN_COL usageJosh Abraham
[ Upstream commit 4dca864b59dd150a221730775e2f21f49779c135 ] This patch removes duplicate macro useage in events_base.c. It also fixes gcc warning: variable ‘col’ set but not used [-Wunused-but-set-variable] Signed-off-by: Joshua Abraham <j.abraham1776@gmail.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-10xen: avoid crash in disable_hotplug_cpuOlaf Hering
[ Upstream commit 3366cdb6d350d95466ee430ac50f3c8415ca8f46 ] The command 'xl vcpu-set 0 0', issued in dom0, will crash dom0: BUG: unable to handle kernel NULL pointer dereference at 00000000000002d8 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 7 PID: 65 Comm: xenwatch Not tainted 4.19.0-rc2-1.ga9462db-default #1 openSUSE Tumbleweed (unreleased) Hardware name: Intel Corporation S5520UR/S5520UR, BIOS S5500.86B.01.00.0050.050620101605 05/06/2010 RIP: e030:device_offline+0x9/0xb0 Code: 77 24 00 e9 ce fe ff ff 48 8b 13 e9 68 ff ff ff 48 8b 13 e9 29 ff ff ff 48 8b 13 e9 ea fe ff ff 90 66 66 66 66 90 41 54 55 53 <f6> 87 d8 02 00 00 01 0f 85 88 00 00 00 48 c7 c2 20 09 60 81 31 f6 RSP: e02b:ffffc90040f27e80 EFLAGS: 00010203 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff8801f3800000 RSI: ffffc90040f27e70 RDI: 0000000000000000 RBP: 0000000000000000 R08: ffffffff820e47b3 R09: 0000000000000000 R10: 0000000000007ff0 R11: 0000000000000000 R12: ffffffff822e6d30 R13: dead000000000200 R14: dead000000000100 R15: ffffffff8158b4e0 FS: 00007ffa595158c0(0000) GS:ffff8801f39c0000(0000) knlGS:0000000000000000 CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000002d8 CR3: 00000001d9602000 CR4: 0000000000002660 Call Trace: handle_vcpu_hotplug_event+0xb5/0xc0 xenwatch_thread+0x80/0x140 ? wait_woken+0x80/0x80 kthread+0x112/0x130 ? kthread_create_worker_on_cpu+0x40/0x40 ret_from_fork+0x3a/0x50 This happens because handle_vcpu_hotplug_event is called twice. In the first iteration cpu_present is still true, in the second iteration cpu_present is false which causes get_cpu_device to return NULL. In case of cpu#0, cpu_online is apparently always true. Fix this crash by checking if the cpu can be hotplugged, which is false for a cpu that was just removed. Also check if the cpu was actually offlined by device_remove, otherwise leave the cpu_present state as it is. Rearrange to code to do all work with device_hotplug_lock held. Signed-off-by: Olaf Hering <olaf@aepfle.de> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-10xen/manage: don't complain about an empty value in control/sysrq nodeVitaly Kuznetsov
[ Upstream commit 87dffe86d406bee8782cac2db035acb9a28620a7 ] When guest receives a sysrq request from the host it acknowledges it by writing '\0' to control/sysrq xenstore node. This, however, make xenstore watch fire again but xenbus_scanf() fails to parse empty value with "%c" format string: sysrq: SysRq : Emergency Sync Emergency Sync complete xen:manage: Error -34 reading sysrq code in control/sysrq Ignore -ERANGE the same way we already ignore -ENOENT, empty value in control/sysrq is totally legal. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15xen/balloon: fix balloon initialization for PVH Dom0Roger Pau Monne
[ Upstream commit 3596924a233e45aa918c961a902170fc4916461b ] The current balloon code tries to calculate a delta factor for the balloon target when running in HVM mode in order to account for memory used by the firmware. This workaround for memory accounting doesn't work properly on a PVH Dom0, that has a static-max value different from the target value even at startup. Note that this is not a problem for DomUs because guests are started with a static-max value that matches the amount of RAM in the memory map. Fix this by forcefully setting target_diff for Dom0, regardless of it's mode. Reported-by: Gabriel Bercarug <bercarug@amazon.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-23Merge tag 'for-linus-4.18-rc2-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "This contains the following fixes/cleanups: - the removal of a BUG_ON() which wasn't necessary and which could trigger now due to a recent change - a correction of a long standing bug happening very rarely in Xen dom0 when a hypercall buffer from user land was not accessible by the hypervisor for very short periods of time due to e.g. page migration or compaction - usage of EXPORT_SYMBOL_GPL() instead of EXPORT_SYMBOL() in a Xen-related driver (no breakage possible as using those symbols without others already exported via EXPORT-SYMBOL_GPL() wouldn't make any sense) - a simplification for Xen PVH or Xen ARM guests - some additional error handling for callers of xenbus_printf()" * tag 'for-linus-4.18-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: Remove unnecessary BUG_ON from __unbind_from_irq() xen: add new hypercall buffer mapping device xen/scsiback: add error handling for xenbus_printf scsi: xen-scsifront: add error handling for xenbus_printf xen/grant-table: Export gnttab_{alloc|free}_pages as GPL xen: add error handling for xenbus_printf xen: share start flags between PV and PVH
2018-06-22xen: Remove unnecessary BUG_ON from __unbind_from_irq()Boris Ostrovsky
Commit 910f8befdf5b ("xen/pirq: fix error path cleanup when binding MSIs") fixed a couple of errors in error cleanup path of xen_bind_pirq_msi_to_irq(). This cleanup allowed a call to __unbind_from_irq() with an unbound irq, which would result in triggering the BUG_ON there. Since there is really no reason for the BUG_ON (xen_free_irq() can operate on unbound irqs) we can remove it. Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: stable@vger.kernel.org Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2018-06-22xen: add new hypercall buffer mapping deviceJuergen Gross
For passing arbitrary data from user land to the Xen hypervisor the Xen tools today are using mlock()ed buffers. Unfortunately the kernel might change access rights of such buffers for brief periods of time e.g. for page migration or compaction, leading to access faults in the hypervisor, as the hypervisor can't use the locks of the kernel. In order to solve this problem add a new device node to the Xen privcmd driver to easily allocate hypercall buffers via mmap(). The memory is allocated in the kernel and just mapped into user space. Marked as VM_IO the user mapping will not be subject to page migration et al. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2018-06-19xen/scsiback: add error handling for xenbus_printfZhouyang Jia
When xenbus_printf fails, the lack of error-handling code may cause unexpected results. This patch adds error-handling code after calling xenbus_printf. Signed-off-by: Zhouyang Jia <jiazhouyang09@gmail.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2018-06-19xen/grant-table: Export gnttab_{alloc|free}_pages as GPLOleksandr Andrushchenko
Only gnttab_{alloc|free}_pages are exported as EXPORT_SYMBOL while all the rest are exported as EXPORT_SYMBOL_GPL, thus effectively making it not possible for non-GPL driver modules to use grant table module. Export gnttab_{alloc|free}_pages as EXPORT_SYMBOL_GPL so all the exports are aligned. Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2018-06-19xen: add error handling for xenbus_printfZhouyang Jia
When xenbus_printf fails, the lack of error-handling code may cause unexpected results. This patch adds error-handling code after calling xenbus_printf. Signed-off-by: Zhouyang Jia <jiazhouyang09@gmail.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2018-06-12treewide: kvmalloc() -> kvmalloc_array()Kees Cook
The kvmalloc() function has a 2-factor argument form, kvmalloc_array(). This patch replaces cases of: kvmalloc(a * b, gfp) with: kvmalloc_array(a * b, gfp) as well as handling cases of: kvmalloc(a * b * c, gfp) with: kvmalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kvmalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kvmalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kvmalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | kvmalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kvmalloc( - sizeof(u8) * (COUNT) + COUNT , ...) | kvmalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) | kvmalloc( - sizeof(char) * (COUNT) + COUNT , ...) | kvmalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) | kvmalloc( - sizeof(u8) * COUNT + COUNT , ...) | kvmalloc( - sizeof(__u8) * COUNT + COUNT , ...) | kvmalloc( - sizeof(char) * COUNT + COUNT , ...) | kvmalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kvmalloc + kvmalloc_array ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) | - kvmalloc + kvmalloc_array ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) | - kvmalloc + kvmalloc_array ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) | - kvmalloc + kvmalloc_array ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) | - kvmalloc + kvmalloc_array ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) | - kvmalloc + kvmalloc_array ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) | - kvmalloc + kvmalloc_array ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) | - kvmalloc + kvmalloc_array ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kvmalloc + kvmalloc_array ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kvmalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kvmalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kvmalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kvmalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kvmalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kvmalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kvmalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kvmalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kvmalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kvmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kvmalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kvmalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kvmalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) | kvmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kvmalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kvmalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kvmalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kvmalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kvmalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kvmalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kvmalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kvmalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kvmalloc(C1 * C2 * C3, ...) | kvmalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) | kvmalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) | kvmalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) | kvmalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kvmalloc(sizeof(THING) * C2, ...) | kvmalloc(sizeof(TYPE) * C2, ...) | kvmalloc(C1 * C2 * C3, ...) | kvmalloc(C1 * C2, ...) | - kvmalloc + kvmalloc_array ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) | - kvmalloc + kvmalloc_array ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) | - kvmalloc + kvmalloc_array ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) | - kvmalloc + kvmalloc_array ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) | - kvmalloc + kvmalloc_array ( - (E1) * E2 + E1, E2 , ...) | - kvmalloc + kvmalloc_array ( - (E1) * (E2) + E1, E2 , ...) | - kvmalloc + kvmalloc_array ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-12treewide: kzalloc() -> kcalloc()Kees Cook
The kzalloc() function has a 2-factor argument form, kcalloc(). This patch replaces cases of: kzalloc(a * b, gfp) with: kcalloc(a * b, gfp) as well as handling cases of: kzalloc(a * b * c, gfp) with: kzalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kzalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kzalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kzalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | kzalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kzalloc( - sizeof(u8) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(char) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(u8) * COUNT + COUNT , ...) | kzalloc( - sizeof(__u8) * COUNT + COUNT , ...) | kzalloc( - sizeof(char) * COUNT + COUNT , ...) | kzalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kzalloc + kcalloc ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kzalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kzalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kzalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kzalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kzalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kzalloc(C1 * C2 * C3, ...) | kzalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) | kzalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) | kzalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) | kzalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kzalloc(sizeof(THING) * C2, ...) | kzalloc(sizeof(TYPE) * C2, ...) | kzalloc(C1 * C2 * C3, ...) | kzalloc(C1 * C2, ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) | - kzalloc + kcalloc ( - (E1) * E2 + E1, E2 , ...) | - kzalloc + kcalloc ( - (E1) * (E2) + E1, E2 , ...) | - kzalloc + kcalloc ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-12treewide: kmalloc() -> kmalloc_array()Kees Cook
The kmalloc() function has a 2-factor argument form, kmalloc_array(). This patch replaces cases of: kmalloc(a * b, gfp) with: kmalloc_array(a * b, gfp) as well as handling cases of: kmalloc(a * b * c, gfp) with: kmalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kmalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kmalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The tools/ directory was manually excluded, since it has its own implementation of kmalloc(). The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kmalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | kmalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kmalloc( - sizeof(u8) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(char) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(u8) * COUNT + COUNT , ...) | kmalloc( - sizeof(__u8) * COUNT + COUNT , ...) | kmalloc( - sizeof(char) * COUNT + COUNT , ...) | kmalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kmalloc + kmalloc_array ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kmalloc + kmalloc_array ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kmalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kmalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kmalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kmalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kmalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kmalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kmalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kmalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) | kmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kmalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kmalloc(C1 * C2 * C3, ...) | kmalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) | kmalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) | kmalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) | kmalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kmalloc(sizeof(THING) * C2, ...) | kmalloc(sizeof(TYPE) * C2, ...) | kmalloc(C1 * C2 * C3, ...) | kmalloc(C1 * C2, ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - (E1) * E2 + E1, E2 , ...) | - kmalloc + kmalloc_array ( - (E1) * (E2) + E1, E2 , ...) | - kmalloc + kmalloc_array ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-08Merge tag 'for-linus-4.18-rc1-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: "This contains some minor code cleanups (fixing return types of functions), some fixes for Linux running as Xen PVH guest, and adding of a new guest resource mapping feature for Xen tools" * tag 'for-linus-4.18-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/PVH: Make GDT selectors PVH-specific xen/PVH: Set up GS segment for stack canary xen/store: do not store local values in xen_start_info xen-netfront: fix xennet_start_xmit()'s return type xen/privcmd: add IOCTL_PRIVCMD_MMAP_RESOURCE xen: Change return type to vm_fault_t
2018-05-18xen-swiotlb: fix the check condition for xen_swiotlb_free_coherentJoe Jin
When run raidconfig from Dom0 we found that the Xen DMA heap is reduced, but Dom Heap is increased by the same size. Tracing raidconfig we found that the related ioctl() in megaraid_sas will call dma_alloc_coherent() to apply memory. If the memory allocated by Dom0 is not in the DMA area, it will exchange memory with Xen to meet the requiment. Later drivers call dma_free_coherent() to free the memory, on xen_swiotlb_free_coherent() the check condition (dev_addr + size - 1 <= dma_mask) is always false, it prevents calling xen_destroy_contiguous_region() to return the memory to the Xen DMA heap. This issue introduced by commit 6810df88dcfc2 "xen-swiotlb: When doing coherent alloc/dealloc check before swizzling the MFNs.". Signed-off-by: Joe Jin <joe.jin@oracle.com> Tested-by: John Sobecki <john.sobecki@oracle.com> Reviewed-by: Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2018-05-17xen/store: do not store local values in xen_start_infoRoger Pau Monne
There's no need to store the xenstore page or event channel in xen_start_info if they are locally initialized. This also fixes PVH local xenstore initialization due to the lack of xen_start_info in that case. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2018-05-14xen/privcmd: add IOCTL_PRIVCMD_MMAP_RESOURCEPaul Durrant
My recent Xen patch series introduces a new HYPERVISOR_memory_op to support direct priv-mapping of certain guest resources (such as ioreq pages, used by emulators) by a tools domain, rather than having to access such resources via the guest P2M. This patch adds the necessary infrastructure to the privcmd driver and Xen MMU code to support direct resource mapping. NOTE: The adjustment in the MMU code is partially cosmetic. Xen will now allow a PV tools domain to map guest pages either by GFN or MFN, thus the term 'mfn' has been swapped for 'pfn' in the lower layers of the remap code. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2018-05-14xen: Change return type to vm_fault_tSouptick Joarder
Use new return type vm_fault_t for fault handler in struct vm_operations_struct. Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2018-04-20Merge tag 'for-linus-4.17-rc2-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - some fixes of kmalloc() flags - one fix of the xenbus driver - an update of the pv sound driver interface needed for a driver which will go through the sound tree * tag 'for-linus-4.17-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: xenbus_dev_frontend: Really return response string xen/sndif: Sync up with the canonical definition in Xen xen: xen-pciback: Replace GFP_ATOMIC with GFP_KERNEL in pcistub_reg_add xen: xen-pciback: Replace GFP_ATOMIC with GFP_KERNEL in xen_pcibk_config_quirks_init xen: xen-pciback: Replace GFP_ATOMIC with GFP_KERNEL in pcistub_device_alloc xen: xen-pciback: Replace GFP_ATOMIC with GFP_KERNEL in pcistub_init_device xen: xen-pciback: Replace GFP_ATOMIC with GFP_KERNEL in pcistub_probe
2018-04-17xen: xenbus_dev_frontend: Really return response stringSimon Gaiser
xenbus_command_reply() did not actually copy the response string and leaked stack content instead. Fixes: 9a6161fe73bd ("xen: return xenstore command failures via response instead of rc") Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-04-16xen: xen-pciback: Replace GFP_ATOMIC with GFP_KERNEL in pcistub_reg_addJia-Ju Bai
pcistub_reg_add() is never called in atomic context. pcistub_reg_add() is only called by pcistub_quirk_add, which is only set in DRIVER_ATTR(). Despite never getting called from atomic context, pcistub_reg_add() calls kzalloc() with GFP_ATOMIC, which does not sleep for allocation. GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL, which can sleep and improve the possibility of sucessful allocation. This is found by a static analysis tool named DCNS written by myself. And I also manually check it. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-04-16xen: xen-pciback: Replace GFP_ATOMIC with GFP_KERNEL in ↵Jia-Ju Bai
xen_pcibk_config_quirks_init xen_pcibk_config_quirks_init() is never called in atomic context. The call chains ending up at xen_pcibk_config_quirks_init() are: [1] xen_pcibk_config_quirks_init() <- xen_pcibk_config_init_dev() <- pcistub_init_device() <- pcistub_seize() <- pcistub_probe() [2] xen_pcibk_config_quirks_init() <- xen_pcibk_config_init_dev() <- pcistub_init_device() <- pcistub_init_devices_late() <- xen_pcibk_init() pcistub_probe() is only set as ".probe" in struct pci_driver. xen_pcibk_init() is is only set as a parameter of module_init(). These functions are not called in atomic context. Despite never getting called from atomic context, xen_pcibk_config_quirks_init() calls kzalloc() with GFP_ATOMIC, which does not sleep for allocation. GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL, which can sleep and improve the possibility of sucessful allocation. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-04-16xen: xen-pciback: Replace GFP_ATOMIC with GFP_KERNEL in pcistub_device_allocJia-Ju Bai
pcistub_device_alloc() is never called in atomic context. The call chain ending up at pcistub_device_alloc() is: [1] pcistub_device_alloc() <- pcistub_seize() <- pcistub_probe() pcistub_probe() is only set as ".probe" in struct pci_driver. This function is not called in atomic context. Despite never getting called from atomic context, pcistub_device_alloc() calls kzalloc() with GFP_ATOMIC, which does not sleep for allocation. GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL, which can sleep and improve the possibility of sucessful allocation. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-04-16xen: xen-pciback: Replace GFP_ATOMIC with GFP_KERNEL in pcistub_init_deviceJia-Ju Bai
pcistub_init_device() is never called in atomic context. The call chain ending up at pcistub_init_device() is: [1] pcistub_init_device() <- pcistub_seize() <- pcistub_probe() [2] pcistub_init_device() <- pcistub_init_devices_late() <- xen_pcibk_init() pcistub_probe() is only set as ".probe" in struct pci_driver. xen_pcibk_init() is is only set as a parameter of module_init(). These functions are not called in atomic context. Despite never getting called from atomic context, pcistub_init_device() calls kzalloc() with GFP_ATOMIC, which does not sleep for allocation. GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL, which can sleep and improve the possibility of sucessful allocation. This is found by a static analysis tool named DCNS written by myself. And I also manually check it. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-04-16xen: xen-pciback: Replace GFP_ATOMIC with GFP_KERNEL in pcistub_probeJia-Ju Bai
pcistub_probe() is never called in atomic context. This function is only set as ".probe" in struct pci_driver. Despite never getting called from atomic context, pcistub_probe() calls kmalloc() with GFP_ATOMIC, which does not sleep for allocation. GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL, which can sleep and improve the possibility of sucessful allocation. This is found by a static analysis tool named DCNS written by myself. And I also manually check it. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-04-12Merge tag 'for-linus-4.17-rc1-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "A few fixes of Xen related core code and drivers" * tag 'for-linus-4.17-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/pvh: Indicate XENFEAT_linux_rsdp_unrestricted to Xen xen/acpi: off by one in read_acpi_id() xen/acpi: upload _PSD info for non Dom0 CPUs too x86/xen: Delay get_cpu_cap until stack canary is established xen: xenbus_dev_frontend: Verify body of XS_TRANSACTION_END xen: xenbus: Catch closing of non existent transactions xen: xenbus_dev_frontend: Fix XS_TRANSACTION_END handling
2018-03-30xen/acpi: off by one in read_acpi_id()Dan Carpenter
If acpi_id is == nr_acpi_bits, then we access one element beyond the end of the acpi_psd[] array or we set one bit beyond the end of the bit map when we do __set_bit(acpi_id, acpi_id_present); Fixes: 59a568029181 ("xen/acpi-processor: C and P-state driver that uploads said data to hypervisor.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-03-21xen/acpi: upload _PSD info for non Dom0 CPUs tooJoao Martins
All uploaded PM data from non-dom0 CPUs takes the info from vCPU 0 and changing only the acpi_id. For processors which P-state coordination type is HW_ALL (0xFD) it is OK to upload bogus P-state dependency information (_PSD), because Xen will ignore any cpufreq domains created for past CPUs. Albeit for platforms which expose coordination types as SW_ANY or SW_ALL, this will have some unintended side effects. Effectively, it will look at the P-state domain existence and *if it already exists* it will skip the acpi-cpufreq initialization and thus inherit the policy from the first CPU in the cpufreq domain. This will finally lead to the original cpu not changing target freq to P0 other than the first in the domain. Which will make turbo boost not getting enabled (e.g. for 'performance' governor) for all cpus. This patch fixes that, by also evaluating _PSD when we enumerate all ACPI processors and thus always uploading the correct info to Xen. We export acpi_processor_get_psd() for that this purpose, but change signature to not assume an existent of acpi_processor given that ACPI isn't creating an acpi_processor for non-dom0 CPUs. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-03-21xen: xenbus_dev_frontend: Verify body of XS_TRANSACTION_ENDSimon Gaiser
By guaranteeing that the argument of XS_TRANSACTION_END is valid we can assume that the transaction has been closed when we get an XS_ERROR response from xenstore (Note that we already verify that it's a valid transaction id). Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-03-21xen: xenbus: Catch closing of non existent transactionsSimon Gaiser
Users of the xenbus functions should never close a non existent transaction (for example by trying to closing the same transaction twice) but better catch it in xs_request_exit() than to corrupt the reference counter. Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-03-21xen: xenbus_dev_frontend: Fix XS_TRANSACTION_END handlingSimon Gaiser
Commit fd8aa9095a95 ("xen: optimize xenbus driver for multiple concurrent xenstore accesses") made a subtle change to the semantic of xenbus_dev_request_and_reply() and xenbus_transaction_end(). Before on an error response to XS_TRANSACTION_END xenbus_dev_request_and_reply() would not decrement the active transaction counter. But xenbus_transaction_end() has always counted the transaction as finished regardless of the response. The new behavior is that xenbus_dev_request_and_reply() and xenbus_transaction_end() will always count the transaction as finished regardless the response code (handled in xs_request_exit()). But xenbus_dev_frontend tries to end a transaction on closing of the device if the XS_TRANSACTION_END failed before. Trying to close the transaction twice corrupts the reference count. So fix this by also considering a transaction closed if we have sent XS_TRANSACTION_END once regardless of the return code. Cc: <stable@vger.kernel.org> # 4.11 Fixes: fd8aa9095a95 ("xen: optimize xenbus driver for multiple concurrent xenstore accesses") Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-03-20x86/dma: Remove dma_alloc_coherent_mask()Christoph Hellwig
These days all devices (including the ISA fallback device) have a coherent DMA mask set, so remove the workaround. Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jon Mason <jdmason@kudzu.us> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Muli Ben-Yehuda <mulix@mulix.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20180319103826.12853-3-hch@lst.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-09Merge tag 'for-linus-4.16a-rc5-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fix from Juergen Gross: "Just one fix for the correct error handling after a failed device_register()" * tag 'for-linus-4.16a-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: xenbus: use put_device() instead of kfree()
2018-03-08xen: xenbus: use put_device() instead of kfree()Arvind Yadav
Never directly free @dev after calling device_register(), even if it returned an error! Always use put_device() to give up the reference initialized. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>