summaryrefslogtreecommitdiffstats
path: root/init
AgeCommit message (Collapse)Author
2015-08-03ACPI / init: Switch over platform to the ACPI mode laterRafael J. Wysocki
commit b064a8fa77dfead647564c46ac8fc5b13bd1ab73 upstream. Commit 73f7d1ca3263 "ACPI / init: Run acpi_early_init() before timekeeping_init()" moved the ACPI subsystem initialization, including the ACPI mode enabling, to an earlier point in the initialization sequence, to allow the timekeeping subsystem use ACPI early. Unfortunately, that resulted in boot regressions on some systems and the early ACPI initialization was moved toward its original position in the kernel initialization code by commit c4e1acbb35e4 "ACPI / init: Invoke early ACPI initialization later". However, that turns out to be insufficient, as boot is still broken on the Tyan S8812 mainboard. To fix that issue, split the ACPI early initialization code into two pieces so the majority of it still located in acpi_early_init() and the part switching over the platform into the ACPI mode goes into a new function, acpi_subsystem_init(), executed at the original early ACPI initialization spot. That fixes the Tyan S8812 boot problem, but still allows ACPI tables to be loaded earlier which is useful to the EFI code in efi_enter_virtual_mode(). Link: https://bugzilla.kernel.org/show_bug.cgi?id=97141 Fixes: 73f7d1ca3263 "ACPI / init: Run acpi_early_init() before timekeeping_init()" Reported-and-tested-by: Marius Tolzmann <tolzmann@molgen.mpg.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Toshi Kani <toshi.kani@hp.com> Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org> Reviewed-by: Lee, Chun-Yi <jlee@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-09init/Kconfig: Fix HAVE_FUTEX_CMPXCHG to not break up the EXPERT menuJosh Triplett
commit 62b4d2041117f35ab2409c9f5c4b8d3dc8e59d0f upstream. commit 03b8c7b623c80af264c4c8d6111e5c6289933666 ("futex: Allow architectures to skip futex_atomic_cmpxchg_inatomic() test") added the HAVE_FUTEX_CMPXCHG symbol right below FUTEX. This placed it right in the middle of the options for the EXPERT menu. However, HAVE_FUTEX_CMPXCHG does not depend on EXPERT or FUTEX, so Kconfig stops placing items in the EXPERT menu, and displays the remaining several EXPERT items (starting with EPOLL) directly in the General Setup menu. Since both users of HAVE_FUTEX_CMPXCHG only select it "if FUTEX", make HAVE_FUTEX_CMPXCHG itself depend on FUTEX. With this change, the subsequent items display as part of the EXPERT menu again; the EMBEDDED menu now appears as the next top-level item in the General Setup menu, which makes General Setup much shorter and more usable. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-08-07x86, espfix: Make espfix64 a Kconfig option, fix UMLH. Peter Anvin
commit 197725de65477bc8509b41388157c1a2283542bb upstream. Make espfix64 a hidden Kconfig option. This fixes the x86-64 UML build which had broken due to the non-existence of init_espfix_bsp() in UML: since UML uses its own Kconfig, this option does not appear in the UML build. This also makes it possible to make support for 16-bit segments a configuration option, for the people who want to minimize the size of the kernel. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Richard Weinberger <richard@nod.at> Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-08-07x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stackH. Peter Anvin
commit 3891a04aafd668686239349ea58f3314ea2af86b upstream. The IRET instruction, when returning to a 16-bit segment, only restores the bottom 16 bits of the user space stack pointer. This causes some 16-bit software to break, but it also leaks kernel state to user space. We have a software workaround for that ("espfix") for the 32-bit kernel, but it relies on a nonzero stack segment base which is not available in 64-bit mode. In checkin: b3b42ac2cbae x86-64, modify_ldt: Ban 16-bit segments on 64-bit kernels we "solved" this by forbidding 16-bit segments on 64-bit kernels, with the logic that 16-bit support is crippled on 64-bit kernels anyway (no V86 support), but it turns out that people are doing stuff like running old Win16 binaries under Wine and expect it to work. This works around this by creating percpu "ministacks", each of which is mapped 2^16 times 64K apart. When we detect that the return SS is on the LDT, we copy the IRET frame to the ministack and use the relevant alias to return to userspace. The ministacks are mapped readonly, so if IRET faults we promote #GP to #DF which is an IST vector and thus has its own stack; we then do the fixup in the #DF handler. (Making #GP an IST exception would make the msr_safe functions unsafe in NMI/MC context, and quite possibly have other effects.) Special thanks to: - Andy Lutomirski, for the suggestion of using very small stack slots and copy (as opposed to map) the IRET frame there, and for the suggestion to mark them readonly and let the fault promote to #DF. - Konrad Wilk for paravirt fixup and testing. - Borislav Petkov for testing help and useful comments. Reported-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Andrew Lutomriski <amluto@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Dirk Hohndel <dirk@hohndel.org> Cc: Arjan van de Ven <arjan.van.de.ven@intel.com> Cc: comex <comexk@gmail.com> Cc: Alexander van Heukelum <heukelum@fastmail.fm> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: <stable@vger.kernel.org> # consider after upstream merge Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-31init/Kconfig: move the trusted keyring config option to general setupPeter Foley
commit 82c04ff89eba09d0e46e3f3649c6d3aa18e764a0 upstream. The SYSTEM_TRUSTED_KEYRING config option is not in any menu, causing it to show up in the toplevel of the kernel configuration. Fix this by moving it under the General Setup menu. Signed-off-by: Peter Foley <pefoley2@pefoley.com> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-14futex: Allow architectures to skip futex_atomic_cmpxchg_inatomic() testHeiko Carstens
commit 03b8c7b623c80af264c4c8d6111e5c6289933666 upstream. If an architecture has futex_atomic_cmpxchg_inatomic() implemented and there is no runtime check necessary, allow to skip the test within futex_init(). This allows to get rid of some code which would always give the same result, and also allows the compiler to optimize a couple of if statements away. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Finn Thain <fthain@telegraphics.com.au> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Link: http://lkml.kernel.org/r/20140302120947.GA3641@osiris Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-13ACPI / init: Invoke early ACPI initialization laterRafael J. Wysocki
Commit 73f7d1ca3263 (ACPI / init: Run acpi_early_init() before timekeeping_init()) optimistically moved the early ACPI initialization before timekeeping_init(), but that didn't work, because it broke fast TSC calibration for Julian Wollrath on Thinkpad x121e (and most likely for others too). The reason is that acpi_early_init() enables the SCI and that interferes with the fast TSC calibration mechanism. Thus follow the original idea to execute acpi_early_init() before efi_enter_virtual_mode() to help the EFI people for now and we can revisit the other problem that commit 73f7d1ca3263 attempted to address in the future (if really necessary). Fixes: 73f7d1ca3263 (ACPI / init: Run acpi_early_init() before timekeeping_init()) Reported-by: Julian Wollrath <jwollrath@web.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-02-05execve: use 'struct filename *' for executable name passingLinus Torvalds
This changes 'do_execve()' to get the executable name as a 'struct filename', and to free it when it is done. This is what the normal users want, and it simplifies and streamlines their error handling. The controlled lifetime of the executable name also fixes a use-after-free problem with the trace_sched_process_exec tracepoint: the lifetime of the passed-in string for kernel users was not at all obvious, and the user-mode helper code used UMH_WAIT_EXEC to serialize the pathname allocation lifetime with the execve() having finished, which in turn meant that the trace point that happened after mm_release() of the old process VM ended up using already free'd memory. To solve the kernel string lifetime issue, this simply introduces "getname_kernel()" that works like the normal user-space getname() function, except with the source coming from kernel memory. As Oleg points out, this also means that we could drop the tcomm[] array from 'struct linux_binprm', since the pathname lifetime now covers setup_new_exec(). That would be a separate cleanup. Reported-by: Igor Zhbanov <i.zhbanov@samsung.com> Tested-by: Steven Rostedt <rostedt@goodmis.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-31alpha: Enable system-call auditing support.蔡正龙
Signed-off-by: Zhenglong.cai <zhenglong.cai@cs2c.com.cn> Signed-off-by: Matt Turner <mattst88@gmail.com>
2014-01-28Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "Assorted stuff; the biggest pile here is Christoph's ACL series. Plus assorted cleanups and fixes all over the place... There will be another pile later this week" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (43 commits) __dentry_path() fixes vfs: Remove second variable named error in __dentry_path vfs: Is mounted should be testing mnt_ns for NULL or error. Fix race when checking i_size on direct i/o read hfsplus: remove can_set_xattr nfsd: use get_acl and ->set_acl fs: remove generic_acl nfs: use generic posix ACL infrastructure for v3 Posix ACLs gfs2: use generic posix ACL infrastructure jfs: use generic posix ACL infrastructure xfs: use generic posix ACL infrastructure reiserfs: use generic posix ACL infrastructure ocfs2: use generic posix ACL infrastructure jffs2: use generic posix ACL infrastructure hfsplus: use generic posix ACL infrastructure f2fs: use generic posix ACL infrastructure ext2/3/4: use generic posix ACL infrastructure btrfs: use generic posix ACL infrastructure fs: make posix_acl_create more useful fs: make posix_acl_chmod more useful ...
2014-01-27init/main.c: remove unused declaractions of mca_init() and sbus_init()Kang Hu
mca_init() no longer exists. sbus_init() is defined in arch/sparc/kernel/sbus.c and is a subsys_initcall. both are not needed in main.c any more. Signed-off-by: Kang Hu <hukangustc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-25Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespaces work from Eric Biederman: "The work to convert the kernel to use kuid_t and kgid_t has been finished since 3.12 so it is time to remove the scaffolding that allowed the work to progress incrementally. The first patch on this branch just removes the scaffolding, ensuring we will always get compile errors if people accidentally try the userspace and the kernel uid and gid types. The second patch an overlooked and unused chunk of mips code that that fails to build after the first patch. The code hasn't been in linux-next for long (as I was out of it and could not sheppared the cold properly) but the patch has been around for a long time just waiting for the day when I had finished the uid/gid conversions. Putting the code in linux-next did find the compile failure on mips so I took the time to get that fix reviewed and included. Beyond that I am not too worried about errors because all these two patches do is delete a modest amount of code" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: MIPS: VPE: Remove vpe_getuid and vpe_getgid userns: userns: Remove UIDGID_STRICT_TYPE_CHECKS
2014-01-25cramfs: take headers to fs/cramfsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-01-24Merge tag 'pm+acpi-3.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management updates from Rafael Wysocki: "As far as the number of commits goes, the top spot belongs to ACPI this time with cpufreq in the second position and a handful of PM core, PNP and cpuidle updates. They are fixes and cleanups mostly, as usual, with a couple of new features in the mix. The most visible change is probably that we will create struct acpi_device objects (visible in sysfs) for all devices represented in the ACPI tables regardless of their status and there will be a new sysfs attribute under those objects allowing user space to check that status via _STA. Consequently, ACPI device eject or generally hot-removal will not delete those objects, unless the table containing the corresponding namespace nodes is unloaded, which is extremely rare. Also ACPI container hotplug will be handled quite a bit differently and cpufreq will support CPU boost ("turbo") generically and not only in the acpi-cpufreq driver. Specifics: - ACPI core changes to make it create a struct acpi_device object for every device represented in the ACPI tables during all namespace scans regardless of the current status of that device. In accordance with this, ACPI hotplug operations will not delete those objects, unless the underlying ACPI tables go away. - On top of the above, new sysfs attribute for ACPI device objects allowing user space to check device status by triggering the execution of _STA for its ACPI object. From Srinivas Pandruvada. - ACPI core hotplug changes reducing code duplication, integrating the PCI root hotplug with the core and reworking container hotplug. - ACPI core simplifications making it use ACPI_COMPANION() in the code "glueing" ACPI device objects to "physical" devices. - ACPICA update to upstream version 20131218. This adds support for the DBG2 and PCCT tables to ACPICA, fixes some bugs and improves debug facilities. From Bob Moore, Lv Zheng and Betty Dall. - Init code change to carry out the early ACPI initialization earlier. That should allow us to use ACPI during the timekeeping initialization and possibly to simplify the EFI initialization too. From Chun-Yi Lee. - Clenups of the inclusions of ACPI headers in many places all over from Lv Zheng and Rashika Kheria (work in progress). - New helper for ACPI _DSM execution and rework of the code in drivers that uses _DSM to execute it via the new helper. From Jiang Liu. - New Win8 OSI blacklist entries from Takashi Iwai. - Assorted ACPI fixes and cleanups from Al Stone, Emil Goode, Hanjun Guo, Lan Tianyu, Masanari Iida, Oliver Neukum, Prarit Bhargava, Rashika Kheria, Tang Chen, Zhang Rui. - intel_pstate driver updates, including proper Baytrail support, from Dirk Brandewie and intel_pstate documentation from Ramkumar Ramachandra. - Generic CPU boost ("turbo") support for cpufreq from Lukasz Majewski. - powernow-k6 cpufreq driver fixes from Mikulas Patocka. - cpufreq core fixes and cleanups from Viresh Kumar, Jane Li, Mark Brown. - Assorted cpufreq drivers fixes and cleanups from Anson Huang, John Tobias, Paul Bolle, Paul Walmsley, Sachin Kamat, Shawn Guo, Viresh Kumar. - cpuidle cleanups from Bartlomiej Zolnierkiewicz. - Support for hibernation APM events from Bin Shi. - Hibernation fix to avoid bringing up nonboot CPUs with ACPI EC disabled during thaw transitions from Bjørn Mork. - PM core fixes and cleanups from Ben Dooks, Leonardo Potenza, Ulf Hansson. - PNP subsystem fixes and cleanups from Dmitry Torokhov, Levente Kurusa, Rashika Kheria. - New tool for profiling system suspend from Todd E Brandt and a cpupower tool cleanup from One Thousand Gnomes" * tag 'pm+acpi-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (153 commits) thermal: exynos: boost: Automatic enable/disable of BOOST feature (at Exynos4412) cpufreq: exynos4x12: Change L0 driver data to CPUFREQ_BOOST_FREQ Documentation: cpufreq / boost: Update BOOST documentation cpufreq: exynos: Extend Exynos cpufreq driver to support boost cpufreq / boost: Kconfig: Support for software-managed BOOST acpi-cpufreq: Adjust the code to use the common boost attribute cpufreq: Add boost frequency support in core intel_pstate: Add trace point to report internal state. cpufreq: introduce cpufreq_generic_get() routine ARM: SA1100: Create dummy clk_get_rate() to avoid build failures cpufreq: stats: create sysfs entries when cpufreq_stats is a module cpufreq: stats: free table and remove sysfs entry in a single routine cpufreq: stats: remove hotplug notifiers cpufreq: stats: handle cpufreq_unregister_driver() and suspend/resume properly cpufreq: speedstep: remove unused speedstep_get_state platform: introduce OF style 'modalias' support for platform bus PM / tools: new tool for suspend/resume performance optimization ACPI: fix module autoloading for ACPI enumerated devices ACPI: add module autoloading support for ACPI enumerated devices ACPI: fix create_modalias() return value handling ...
2014-01-23init: fix possible format string bugTetsuo Handa
Use constant format string in case message changes. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23init/main.c: remove unused declaration of tc_init()Geert Uytterhoeven
Its user was removed in v2.5.2.4. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21Merge branch 'akpm' (incoming from Andrew)Linus Torvalds
Merge first patch-bomb from Andrew Morton: - a couple of misc things - inotify/fsnotify work from Jan - ocfs2 updates (partial) - about half of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (117 commits) mm/migrate: remove unused function, fail_migrate_page() mm/migrate: remove putback_lru_pages, fix comment on putback_movable_pages mm/migrate: correct failure handling if !hugepage_migration_support() mm/migrate: add comment about permanent failure path mm, page_alloc: warn for non-blockable __GFP_NOFAIL allocation failure mm: compaction: reset scanner positions immediately when they meet mm: compaction: do not mark unmovable pageblocks as skipped in async compaction mm: compaction: detect when scanners meet in isolate_freepages mm: compaction: reset cached scanner pfn's before reading them mm: compaction: encapsulate defer reset logic mm: compaction: trace compaction begin and end memcg, oom: lock mem_cgroup_print_oom_info sched: add tracepoints related to NUMA task migration mm: numa: do not automatically migrate KSM pages mm: numa: trace tasks that fail migration due to rate limiting mm: numa: limit scope of lock for NUMA migrate rate limiting mm: numa: make NUMA-migrate related functions static lib/show_mem.c: show num_poisoned_pages when oom mm/hwpoison: add '#' to hwpoison_inject mm/memblock: use WARN_ONCE when MAX_NUMNODES passed as input parameter ...
2014-01-21Merge branch 'for-3.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: "The bulk of changes are cleanups and preparations for the upcoming kernfs conversion. - cgroup_event mechanism which is and will be used only by memcg is moved to memcg. - pidlist handling is updated so that it can be served by seq_file. Also, the list is not sorted if sane_behavior. cgroup documentation explicitly states that the file is not sorted but it has been for quite some time. - All cgroup file handling now happens on top of seq_file. This is to prepare for kernfs conversion. In addition, all operations are restructured so that they map 1-1 to kernfs operations. - Other cleanups and low-pri fixes" * 'for-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (40 commits) cgroup: trivial style updates cgroup: remove stray references to css_id doc: cgroups: Fix typo in doc/cgroups cgroup: fix fail path in cgroup_load_subsys() cgroup: fix missing unlock on error in cgroup_load_subsys() cgroup: remove for_each_root_subsys() cgroup: implement for_each_css() cgroup: factor out cgroup_subsys_state creation into create_css() cgroup: combine css handling loops in cgroup_create() cgroup: reorder operations in cgroup_create() cgroup: make for_each_subsys() useable under cgroup_root_mutex cgroup: css iterations and css_from_dir() are safe under cgroup_mutex cgroup: unify pidlist and other file handling cgroup: replace cftype->read_seq_string() with cftype->seq_show() cgroup: attach cgroup_open_file to all cgroup files cgroup: generalize cgroup_pidlist_open_file cgroup: unify read path so that seq_file is always used cgroup: unify cgroup_write_X64() and cgroup_write_string() cgroup: remove cftype->read(), ->read_map() and ->write() hugetlb_cgroup: convert away from cftype->read() ...
2014-01-21init/main.c: use memblock apis for early memory allocationsSantosh Shilimkar
Switch to memblock interfaces for early memory allocator instead of bootmem allocator. No functional change in beahvior than what it is in current code from bootmem users points of view. Archs already converted to NO_BOOTMEM now directly use memblock interfaces instead of bootmem wrappers build on top of memblock. And the archs which still uses bootmem, these new apis just fall back to exiting bootmem APIs. Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Paul Walmsley <paul@pwsan.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Russell King <linux@arm.linux.org.uk> Cc: Tony Lindgren <tony@atomide.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21mm: create a separate slab for page->ptl allocationKirill A. Shutemov
If DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC are enabled spinlock_t on x86_64 is 72 bytes. For page->ptl they will be allocated from kmalloc-96 slab, so we loose 24 on each. An average system can easily allocate few tens thousands of page->ptl and overhead is significant. Let's create a separate slab for page->ptl allocation to solve this. To make sure that it really works this time, some numbers from my test machine (just booted, no load): Before: # grep '^\(kmalloc-96\|page->ptl\)' /proc/slabinfo kmalloc-96 31987 32190 128 30 1 : tunables 120 60 8 : slabdata 1073 1073 92 After: # grep '^\(kmalloc-96\|page->ptl\)' /proc/slabinfo page->ptl 27516 28143 72 53 1 : tunables 120 60 8 : slabdata 531 531 9 kmalloc-96 3853 5280 128 30 1 : tunables 120 60 8 : slabdata 176 176 0 Note that the patch is useful not only for debug case, but also for PREEMPT_RT, where spinlock_t is always bloated. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-16ACPI / init: Run acpi_early_init() before timekeeping_init()Lee, Chun-Yi
This is a variant patch from Rafael J. Wysocki's ACPI / init: Run acpi_early_init() before efi_enter_virtual_mode() According to Matt Fleming, if acpi_early_init() was executed before efi_enter_virtual_mode(), the EFI initialization could benefit from it, so Rafael's patch makes that happen. And, we want accessing ACPI TAD device to set system clock, so move acpi_early_init() before timekeeping_init(). This final position is also before efi_enter_virtual_mode(). Tested-by: Toshi Kani <toshi.kani@hp.com> Signed-off-by: Lee, Chun-Yi <jlee@suse.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-12Merge branch 'linus' into timers/coreIngo Molnar
Pick up the latest fixes and refresh the branch. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-12-11math64: Add mul_u64_u32_shr()Peter Zijlstra
Introduce mul_u64_u32_shr() as proposed by Andy a while back; it allows using 64x64->128 muls on 64bit archs and recent GCC which defines __SIZEOF_INT128__ and __int128. (This new method will be used by the scheduler.) Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: fweisbec@gmail.com Cc: Andy Lutomirski <luto@amacapital.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-hxjoeuzmrcaumR0uZwjpe2pv@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-12-02trivial: fix spelling in CONTEXT_TRACKING_FORCE help textPaul Gortmaker
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org>
2013-11-26userns: userns: Remove UIDGID_STRICT_TYPE_CHECKSEric W. Biederman
Removing UIDGID_STRICT_TYPE_CHECKS simplifies the code and always generates a compile error if the uids and kuids or gids and kgids are mixed by accident. Now that the appropriate conversions have been placed throughout the kernel there is no longer a need for a mode where we don't detect them as compile errors. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2013-11-22cgroup: Merge branch 'memcg_event' into for-3.14Tejun Heo
Merge v3.12 based patch series to move cgroup_event implementation to memcg into for-3.14. The following two commits cause a conflict in kernel/cgroup.c 2ff2a7d03bbe4 ("cgroup: kill css_id") 79bd9814e5ec9 ("cgroup, memcg: move cgroup_event implementation to memcg") Each patch removes a struct definition from kernel/cgroup.c. As the two are adjacent, they cause a context conflict. Easily resolved by removing both structs. Signed-off-by: Tejun Heo <tj@kernel.org>
2013-11-22cgroup, memcg: move cgroup_event implementation to memcgTejun Heo
cgroup_event is way over-designed and tries to build a generic flexible event mechanism into cgroup - fully customizable event specification for each user of the interface. This is utterly unnecessary and overboard especially in the light of the planned unified hierarchy as there's gonna be single agent. Simply generating events at fixed points, or if that's too restrictive, configureable cadence or single set of configureable points should be enough. Thankfully, memcg is the only user and gets to keep it. Replacing it with something simpler on sane_behavior is strongly recommended. This patch moves cgroup_event and "cgroup.event_control" implementation to mm/memcontrol.c. Clearing of events on cgroup destruction is moved from cgroup_destroy_locked() to mem_cgroup_css_offline(), which shouldn't make any noticeable difference. cgroup_css() and __file_cft() are exported to enable the move; however, this will soon be reverted once the event code is updated to be memcg specific. Note that "cgroup.event_control" will now exist only on the hierarchy with memcg attached to it. While this change is visible to userland, it is unlikely to be noticeable as the file has never been meaningful outside memcg. Aside from the above change, this is pure code relocation. v2: Per Li Zefan's comments, init/Kconfig updated accordingly and poll.h inclusion moved from cgroup.c to memcontrol.c. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com>
2013-11-21Merge branch 'for-linus2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "In this patchset, we finally get an SELinux update, with Paul Moore taking over as maintainer of that code. Also a significant update for the Keys subsystem, as well as maintenance updates to Smack, IMA, TPM, and Apparmor" and since I wanted to know more about the updates to key handling, here's the explanation from David Howells on that: "Okay. There are a number of separate bits. I'll go over the big bits and the odd important other bit, most of the smaller bits are just fixes and cleanups. If you want the small bits accounting for, I can do that too. (1) Keyring capacity expansion. KEYS: Consolidate the concept of an 'index key' for key access KEYS: Introduce a search context structure KEYS: Search for auth-key by name rather than target key ID Add a generic associative array implementation. KEYS: Expand the capacity of a keyring Several of the patches are providing an expansion of the capacity of a keyring. Currently, the maximum size of a keyring payload is one page. Subtract a small header and then divide up into pointers, that only gives you ~500 pointers on an x86_64 box. However, since the NFS idmapper uses a keyring to store ID mapping data, that has proven to be insufficient to the cause. Whatever data structure I use to handle the keyring payload, it can only store pointers to keys, not the keys themselves because several keyrings may point to a single key. This precludes inserting, say, and rb_node struct into the key struct for this purpose. I could make an rbtree of records such that each record has an rb_node and a key pointer, but that would use four words of space per key stored in the keyring. It would, however, be able to use much existing code. I selected instead a non-rebalancing radix-tree type approach as that could have a better space-used/key-pointer ratio. I could have used the radix tree implementation that we already have and insert keys into it by their serial numbers, but that means any sort of search must iterate over the whole radix tree. Further, its nodes are a bit on the capacious side for what I want - especially given that key serial numbers are randomly allocated, thus leaving a lot of empty space in the tree. So what I have is an associative array that internally is a radix-tree with 16 pointers per node where the index key is constructed from the key type pointer and the key description. This means that an exact lookup by type+description is very fast as this tells us how to navigate directly to the target key. I made the data structure general in lib/assoc_array.c as far as it is concerned, its index key is just a sequence of bits that leads to a pointer. It's possible that someone else will be able to make use of it also. FS-Cache might, for example. (2) Mark keys as 'trusted' and keyrings as 'trusted only'. KEYS: verify a certificate is signed by a 'trusted' key KEYS: Make the system 'trusted' keyring viewable by userspace KEYS: Add a 'trusted' flag and a 'trusted only' flag KEYS: Separate the kernel signature checking keyring from module signing These patches allow keys carrying asymmetric public keys to be marked as being 'trusted' and allow keyrings to be marked as only permitting the addition or linkage of trusted keys. Keys loaded from hardware during kernel boot or compiled into the kernel during build are marked as being trusted automatically. New keys can be loaded at runtime with add_key(). They are checked against the system keyring contents and if their signatures can be validated with keys that are already marked trusted, then they are marked trusted also and can thus be added into the master keyring. Patches from Mimi Zohar make this usable with the IMA keyrings also. (3) Remove the date checks on the key used to validate a module signature. X.509: Remove certificate date checks It's not reasonable to reject a signature just because the key that it was generated with is no longer valid datewise - especially if the kernel hasn't yet managed to set the system clock when the first module is loaded - so just remove those checks. (4) Make it simpler to deal with additional X.509 being loaded into the kernel. KEYS: Load *.x509 files into kernel keyring KEYS: Have make canonicalise the paths of the X.509 certs better to deduplicate The builder of the kernel now just places files with the extension ".x509" into the kernel source or build trees and they're concatenated by the kernel build and stuffed into the appropriate section. (5) Add support for userspace kerberos to use keyrings. KEYS: Add per-user_namespace registers for persistent per-UID kerberos caches KEYS: Implement a big key type that can save to tmpfs Fedora went to, by default, storing kerberos tickets and tokens in tmpfs. We looked at storing it in keyrings instead as that confers certain advantages such as tickets being automatically deleted after a certain amount of time and the ability for the kernel to get at these tokens more easily. To make this work, two things were needed: (a) A way for the tickets to persist beyond the lifetime of all a user's sessions so that cron-driven processes can still use them. The problem is that a user's session keyrings are deleted when the session that spawned them logs out and the user's user keyring is deleted when the UID is deleted (typically when the last log out happens), so neither of these places is suitable. I've added a system keyring into which a 'persistent' keyring is created for each UID on request. Each time a user requests their persistent keyring, the expiry time on it is set anew. If the user doesn't ask for it for, say, three days, the keyring is automatically expired and garbage collected using the existing gc. All the kerberos tokens it held are then also gc'd. (b) A key type that can hold really big tickets (up to 1MB in size). The problem is that Active Directory can return huge tickets with lots of auxiliary data attached. We don't, however, want to eat up huge tracts of unswappable kernel space for this, so if the ticket is greater than a certain size, we create a swappable shmem file and dump the contents in there and just live with the fact we then have an inode and a dentry overhead. If the ticket is smaller than that, we slap it in a kmalloc()'d buffer" * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (121 commits) KEYS: Fix keyring content gc scanner KEYS: Fix error handling in big_key instantiation KEYS: Fix UID check in keyctl_get_persistent() KEYS: The RSA public key algorithm needs to select MPILIB ima: define '_ima' as a builtin 'trusted' keyring ima: extend the measurement list to include the file signature kernel/system_certificate.S: use real contents instead of macro GLOBAL() KEYS: fix error return code in big_key_instantiate() KEYS: Fix keyring quota misaccounting on key replacement and unlink KEYS: Fix a race between negating a key and reading the error set KEYS: Make BIG_KEYS boolean apparmor: remove the "task" arg from may_change_ptraced_domain() apparmor: remove parent task info from audit logging apparmor: remove tsk field from the apparmor_audit_struct apparmor: fix capability to not use the current task, during reporting Smack: Ptrace access check mode ima: provide hash algo info in the xattr ima: enable support for larger default filedata hash algorithms ima: define kernel parameter 'ima_template=' to change configured default ima: add Kconfig default measurement list template ...
2013-11-21Merge git://git.infradead.org/users/eparis/auditLinus Torvalds
Pull audit updates from Eric Paris: "Nothing amazing. Formatting, small bug fixes, couple of fixes where we didn't get records due to some old VFS changes, and a change to how we collect execve info..." Fixed conflict in fs/exec.c as per Eric and linux-next. * git://git.infradead.org/users/eparis/audit: (28 commits) audit: fix type of sessionid in audit_set_loginuid() audit: call audit_bprm() only once to add AUDIT_EXECVE information audit: move audit_aux_data_execve contents into audit_context union audit: remove unused envc member of audit_aux_data_execve audit: Kill the unused struct audit_aux_data_capset audit: do not reject all AUDIT_INODE filter types audit: suppress stock memalloc failure warnings since already managed audit: log the audit_names record type audit: add child record before the create to handle case where create fails audit: use given values in tty_audit enable api audit: use nlmsg_len() to get message payload length audit: use memset instead of trying to initialize field by field audit: fix info leak in AUDIT_GET requests audit: update AUDIT_INODE filter rule to comparator function audit: audit feature to set loginuid immutable audit: audit feature to only allow unsetting the loginuid audit: allow unsetting the loginuid (with priv) audit: remove CONFIG_AUDIT_LOGINUID_IMMUTABLE audit: loginuid functions coding style selinux: apply selinux checks on new audit message types ...
2013-11-20Revert "mm: create a separate slab for page->ptl allocation"Linus Torvalds
This reverts commit ea1e7ed33708c7a760419ff9ded0a6cb90586a50. Al points out that while the commit *does* actually create a separate slab for the page->ptl allocation, that slab is never actually used, and the code continues to use kmalloc/kfree. Damien Wyart points out that the original patch did have the conversion to use kmem_cache_alloc/free, so it got lost somewhere on its way to me. Revert the half-arsed attempt that didn't do anything. If we really do want the special slab (remember: this is all relevant just for debug builds, so it's not necessarily all that critical) we might as well redo the patch fully. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Andrew Morton <akpm@linux-foundation.org> Cc: Kirill A Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-17Revert "init/Kconfig: add option to disable kernel compression"H. Peter Anvin
This reverts commit 69f0554ec261fd686ac7fa1c598cc9eb27b83a80. This patch breaks randconfig on at least the x86-64 architecture, and most likely on others. There is work underway to support uncompressed kernels in a generic way, but it looks like it will amount to rewriting the support from scratch; see the LKML thread in the Link: for info. Therefore, revert this change and wait for the fix. Reported-by: Pavel Roskin <proski@gnu.org> Cc: Christian Ruppert <christian.ruppert@abilis.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20131113113418.167b8ffd@IRBT4585 Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree updates from Jiri Kosina: "Usual earth-shaking, news-breaking, rocket science pile from trivial.git" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (23 commits) doc: usb: Fix typo in Documentation/usb/gadget_configs.txt doc: add missing files to timers/00-INDEX timekeeping: Fix some trivial typos in comments mm: Fix some trivial typos in comments irq: Fix some trivial typos in comments NUMA: fix typos in Kconfig help text mm: update 00-INDEX doc: Documentation/DMA-attributes.txt fix typo DRM: comment: `halve' -> `half' Docs: Kconfig: `devlopers' -> `developers' doc: typo on word accounting in kprobes.c in mutliple architectures treewide: fix "usefull" typo treewide: fix "distingush" typo mm/Kconfig: Grammar s/an/a/ kexec: Typo s/the/then/ Documentation/kvm: Update cpuid documentation for steal time and pv eoi treewide: Fix common typo in "identify" __page_to_pfn: Fix typo in comment Correct some typos for word frequency clk: fixed-factor: Fix a trivial typo ...
2013-11-15Merge tag 'modules-next-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module updates from Rusty Russell: "Mainly boring here, too. rmmod --wait finally removed, though" * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: modpost: fix bogus 'exported twice' warnings. init: fix in-place parameter modification regression asmlinkage, module: Make ksymtab and kcrctab symbols and __this_module __visible kernel: add support for init_array constructors modpost: Optionally ignore secondary errors seen if a single module build fails module: remove rmmod --wait option.
2013-11-15mm: create a separate slab for page->ptl allocationKirill A. Shutemov
If DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC are enabled spinlock_t on x86_64 is 72 bytes. For page->ptl they will be allocated from kmalloc-96 slab, so we loose 24 on each. An average system can easily allocate few tens thousands of page->ptl and overhead is significant. Let's create a separate slab for page->ptl allocation to solve this. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: 1) The addition of nftables. No longer will we need protocol aware firewall filtering modules, it can all live in userspace. At the core of nftables is a, for lack of a better term, virtual machine that executes byte codes to inspect packet or metadata (arriving interface index, etc.) and make verdict decisions. Besides support for loading packet contents and comparing them, the interpreter supports lookups in various datastructures as fundamental operations. For example sets are supports, and therefore one could create a set of whitelist IP address entries which have ACCEPT verdicts attached to them, and use the appropriate byte codes to do such lookups. Since the interpreted code is composed in userspace, userspace can do things like optimize things before giving it to the kernel. Another major improvement is the capability of atomically updating portions of the ruleset. In the existing netfilter implementation, one has to update the entire rule set in order to make a change and this is very expensive. Userspace tools exist to create nftables rules using existing netfilter rule sets, but both kernel implementations will need to co-exist for quite some time as we transition from the old to the new stuff. Kudos to Patrick McHardy, Pablo Neira Ayuso, and others who have worked so hard on this. 2) Daniel Borkmann and Hannes Frederic Sowa made several improvements to our pseudo-random number generator, mostly used for things like UDP port randomization and netfitler, amongst other things. In particular the taus88 generater is updated to taus113, and test cases are added. 3) Support 64-bit rates in HTB and TBF schedulers, from Eric Dumazet and Yang Yingliang. 4) Add support for new 577xx tigon3 chips to tg3 driver, from Nithin Sujir. 5) Fix two fatal flaws in TCP dynamic right sizing, from Eric Dumazet, Neal Cardwell, and Yuchung Cheng. 6) Allow IP_TOS and IP_TTL to be specified in sendmsg() ancillary control message data, much like other socket option attributes. From Francesco Fusco. 7) Allow applications to specify a cap on the rate computed automatically by the kernel for pacing flows, via a new SO_MAX_PACING_RATE socket option. From Eric Dumazet. 8) Make the initial autotuned send buffer sizing in TCP more closely reflect actual needs, from Eric Dumazet. 9) Currently early socket demux only happens for TCP sockets, but we can do it for connected UDP sockets too. Implementation from Shawn Bohrer. 10) Refactor inet socket demux with the goal of improving hash demux performance for listening sockets. With the main goals being able to use RCU lookups on even request sockets, and eliminating the listening lock contention. From Eric Dumazet. 11) The bonding layer has many demuxes in it's fast path, and an RCU conversion was started back in 3.11, several changes here extend the RCU usage to even more locations. From Ding Tianhong and Wang Yufen, based upon suggestions by Nikolay Aleksandrov and Veaceslav Falico. 12) Allow stackability of segmentation offloads to, in particular, allow segmentation offloading over tunnels. From Eric Dumazet. 13) Significantly improve the handling of secret keys we input into the various hash functions in the inet hashtables, TCP fast open, as well as syncookies. From Hannes Frederic Sowa. The key fundamental operation is "net_get_random_once()" which uses static keys. Hannes even extended this to ipv4/ipv6 fragmentation handling and our generic flow dissector. 14) The generic driver layer takes care now to set the driver data to NULL on device removal, so it's no longer necessary for drivers to explicitly set it to NULL any more. Many drivers have been cleaned up in this way, from Jingoo Han. 15) Add a BPF based packet scheduler classifier, from Daniel Borkmann. 16) Improve CRC32 interfaces and generic SKB checksum iterators so that SCTP's checksumming can more cleanly be handled. Also from Daniel Borkmann. 17) Add a new PMTU discovery mode, IP_PMTUDISC_INTERFACE, which forces using the interface MTU value. This helps avoid PMTU attacks, particularly on DNS servers. From Hannes Frederic Sowa. 18) Use generic XPS for transmit queue steering rather than internal (re-)implementation in virtio-net. From Jason Wang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1622 commits) random32: add test cases for taus113 implementation random32: upgrade taus88 generator to taus113 from errata paper random32: move rnd_state to linux/random.h random32: add prandom_reseed_late() and call when nonblocking pool becomes initialized random32: add periodic reseeding random32: fix off-by-one in seeding requirement PHY: Add RTL8201CP phy_driver to realtek xtsonic: add missing platform_set_drvdata() in xtsonic_probe() macmace: add missing platform_set_drvdata() in mace_probe() ethernet/arc/arc_emac: add missing platform_set_drvdata() in arc_emac_probe() ipv6: protect for_each_sk_fl_rcu in mem_check with rcu_read_lock_bh vlan: Implement vlan_dev_get_egress_qos_mask as an inline. ixgbe: add warning when max_vfs is out of range. igb: Update link modes display in ethtool netfilter: push reasm skb through instead of original frag skbs ip6_output: fragment outgoing reassembled skb properly MAINTAINERS: mv643xx_eth: take over maintainership from Lennart net_sched: tbf: support of 64bit rates ixgbe: deleting dfwd stations out of order can cause null ptr deref ixgbe: fix build err, num_rx_queues is only available with CONFIG_RPS ...
2013-11-13./Makefile: export initial ramdisk compression config optionP J P
Make menuconfig allows one to choose compression format of an initial ramdisk image. But this choice does not result in duly compressed ramdisk image. Because - $ make install - does not pass on the selected compression choice to the dracut(8) tool, which creates the initramfs file. dracut(8) generates the image with the default compression, ie. gzip(1). This patch exports the selected compression option to a sub-shell environment, so that it could be used by dracut(8) tool to generate appropriately compressed initramfs images. There isn't a straightforward way to pass on options to dracut(8) via positional parameters. Because it is indirectly invoked at the end of a $ make install sequence. # make install -> arch/$arch/boot/Makefile -> arch/$arch/boot/install.sh -> /sbing/installkernel ... -> /sbin/new-kernel-pkg ... -> /sbin/dracut ... Signed-off-by: P J P <ppandit@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-13init/Kconfig: add option to disable kernel compressionChristian Ruppert
Some ARC users say they can boot faster with without kernel compression. This probably depends on things like the FLASH chip they use etc. Until now, kernel compression can only be disabled by removing "select HAVE_<compression>" lines from the architecture Kconfig. So add the Kconfig logic to permit disabling of kernel compression. Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-13init: make init failures more explicitMichael Opdenacker
This patch proposes to make init failures more explicit. Before this, the "No init found" message didn't help much. It could sometimes be misleading and actually mean "No *working* init found". This message could hide many different issues: - no init program candidates found at all - some init program candidates exist but can't be executed (missing execute permissions, failed to load shared libraries, executable compiled for an unknown architecture...) This patch notifies the kernel user when a candidate init program is found but can't be executed. In each failure situation, the error code is displayed, to quickly find the root cause. "No init found" is also replaced by "No working init found", which is more correct. This will help embedded Linux developers (especially the newcomers), regularly making and debugging new root filesystems. Credits to Geert Uytterhoeven and Janne Karhunen for their improvement suggestions. Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Janne Karhunen <Janne.Karhunen@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-13init/do_mounts_rd.c: fix NULL pointer dereference while loading initramfsP J P
Make menuconfig allows one to choose compression format of an initial ramdisk image. But this choice does not result in duly compressed initial ramdisk image. Because - $ make install - does not pass on the selected compression choice to the dracut(8) tool, which creates the initramfs file. dracut(8) generates the image with the default compression, ie. gzip(1). If a user chose any other compression instead of gzip(1), it leads to a crash due to NULL pointer dereference in crd_load(), caused by a NULL function pointer returned by the 'decompress_method()' routine. Because the initramfs image is gzip(1) compressed, whereas the kernel knows only to decompress the chosen format and not gzip(1). This patch replaces the crash by an explicit panic() call with an appropriate error message. This shall prevent the kernel from eventually panicking in: init/do_mounts.c: mount_block_root() with -> panic("VFS: Unable to mount root fs on %s", b); [akpm@linux-foundation.org: mention that the problem is with the ramdisk, don't print known-to-be-NULL value] Signed-off-by: P J P <prasad@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-13init/main.c: remove prototype for softirq_init()Geert Uytterhoeven
It's already available in <linux/interrupt.h> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-13init/do_mounts.c: add maj:min syntax commentSebastian Capella
The name_to_dev_t function has a comment block which lists the supported syntaxes for the device name. Add a bullet for the <major>:<minor> syntax, which is already supported in the code Signed-off-by: Sebastian Capella <sebastian.capella@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-12Merge branch 'timers-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer changes from Ingo Molnar: "Main changes in this cycle were: - Updated full dynticks support. - Event stream support for architected (ARM) timers. - ARM clocksource driver updates. - Move arm64 to using the generic sched_clock framework & resulting cleanup in the generic sched_clock code. - Misc fixes and cleanups" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits) x86/time: Honor ACPI FADT flag indicating absence of a CMOS RTC clocksource: sun4i: remove IRQF_DISABLED clocksource: sun4i: Report the minimum tick that we can program clocksource: sun4i: Select CLKSRC_MMIO clocksource: Provide timekeeping for efm32 SoCs clocksource: em_sti: convert to clk_prepare/unprepare time: Fix signedness bug in sysfs_get_uname() and its callers timekeeping: Fix some trivial typos in comments alarmtimer: return EINVAL instead of ENOTSUPP if rtcdev doesn't exist clocksource: arch_timer: Do not register arch_sys_counter twice timer stats: Add a 'Collection: active/inactive' line to timer usage statistics sched_clock: Remove sched_clock_func() hook arch_timer: Move to generic sched_clock framework clocksource: tcb_clksrc: Remove IRQF_DISABLED clocksource: tcb_clksrc: Improve driver robustness clocksource: tcb_clksrc: Replace clk_enable/disable with clk_prepare_enable/disable_unprepare clocksource: arm_arch_timer: Use clocksource for suspend timekeeping clocksource: dw_apb_timer_of: Mark a few more functions as __init clocksource: Put nodes passed to CLOCKSOURCE_OF_DECLARE callbacks centrally arm: zynq: Enable arm_global_timer ...
2013-11-12Merge branch 'sched-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler changes from Ingo Molnar: "The main changes in this cycle are: - (much) improved CONFIG_NUMA_BALANCING support from Mel Gorman, Rik van Riel, Peter Zijlstra et al. Yay! - optimize preemption counter handling: merge the NEED_RESCHED flag into the preempt_count variable, by Peter Zijlstra. - wait.h fixes and code reorganization from Peter Zijlstra - cfs_bandwidth fixes from Ben Segall - SMP load-balancer cleanups from Peter Zijstra - idle balancer improvements from Jason Low - other fixes and cleanups" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (129 commits) ftrace, sched: Add TRACE_FLAG_PREEMPT_RESCHED stop_machine: Fix race between stop_two_cpus() and stop_cpus() sched: Remove unnecessary iteration over sched domains to update nr_busy_cpus sched: Fix asymmetric scheduling for POWER7 sched: Move completion code from core.c to completion.c sched: Move wait code from core.c to wait.c sched: Move wait.c into kernel/sched/ sched/wait: Fix __wait_event_interruptible_lock_irq_timeout() sched: Avoid throttle_cfs_rq() racing with period_timer stopping sched: Guarantee new group-entities always have weight sched: Fix hrtimer_cancel()/rq->lock deadlock sched: Fix cfs_bandwidth misuse of hrtimer_expires_remaining sched: Fix race on toggling cfs_bandwidth_used sched: Remove extra put_online_cpus() inside sched_setaffinity() sched/rt: Fix task_tick_rt() comment sched/wait: Fix build breakage sched/wait: Introduce prepare_to_wait_event() sched/wait: Add ___wait_cond_timeout() to wait_event*_timeout() too sched: Remove get_online_cpus() usage sched: Fix race in migrate_swap_stop() ...
2013-11-07parisc: add kernel audit featureHelge Deller
Implement missing functions for parisc to provide kernel audit feature. Signed-off-by: Helge Deller <deller@gmx.de>
2013-11-05audit: remove CONFIG_AUDIT_LOGINUID_IMMUTABLEEric Paris
After trying to use this feature in Fedora we found the hard coding policy like this into the kernel was a bad idea. Surprise surprise. We ran into these problems because it was impossible to launch a container as a logged in user and run a login daemon inside that container. This reverts back to the old behavior before this option was added. The option will be re-added in a userspace selectable manor such that userspace can choose when it is and when it is not appropriate. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2013-10-31init: fix in-place parameter modification regressionKrzysztof Mazur
Before commit 026cee0086fe1df4cf74691cf273062cc769617d ("params: <level>_initcall-like kernel parameters") the __setup parameter parsing code could modify parameter in the static_command_line buffer and such modifications were kept. After that commit such modifications are destroyed during per-initcall level parameter parsing because the same static_command_line buffer is used and only parameters for appropriate initcall level are parsed. That change broke at least parsing "ubd" parameter in the ubd driver when the COW file is used. Now the separate buffer is used for per-initcall parameter parsing. Signed-off-by: Krzysztof Mazur <krzysiek@podlesie.net> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-10-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/usb/qmi_wwan.c include/net/dst.h Trivial merge conflicts, both were overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-19static_key: WARN on usage before jump_label_init was calledHannes Frederic Sowa
Usage of the static key primitives to toggle a branch must not be used before jump_label_init() is called from init/main.c. jump_label_init reorganizes and wires up the jump_entries so usage before that could have unforeseen consequences. Following primitives are now checked for correct use: * static_key_slow_inc * static_key_slow_dec * static_key_slow_dec_deferred * jump_label_rate_limit The x86 architecture already checks this by testing if the default_nop was already replaced with an optimal nop or with a branch instruction. It will panic then. Other architectures don't check for this. Because we need to relax this check for the x86 arch to allow code to transition from default_nop to the enabled state and other architectures did not check for this at all this patch introduces checking on the static_key primitives in a non-arch dependent manner. All checked functions are considered slow-path so the additional check does no harm to performance. The warnings are best observed with earlyprintk. Based on a patch from Andi Kleen. Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-14NUMA: fix typos in Kconfig help textPaul Gortmaker
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-10-11Merge branch 'core/urgent' into sched/coreIngo Molnar
Merge in asm goto fix, to be able to apply the asm/rmwcc.h fix. Signed-off-by: Ingo Molnar <mingo@kernel.org>