aboutsummaryrefslogtreecommitdiffstats
path: root/arch/arm64/include/asm/atomic_lse.h
AgeCommit message (Collapse)Author
2019-02-11Merge branch 'locking/atomics' into locking/core, to pick up WIP commitsIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-07arm64: Avoid masking "old" for LSE cmpxchg() implementationWill Deacon
The CAS instructions implicitly access only the relevant bits of the "old" argument, so there is no need for explicit masking via type-casting as there is in the LL/SC implementation. Move the casting into the LL/SC code and remove it altogether for the LSE implementation. Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-07arm64: Avoid redundant type conversions in xchg() and cmpxchg()Will Deacon
Our atomic instructions (either LSE atomics of LDXR/STXR sequences) natively support byte, half-word, word and double-word memory accesses so there is no need to mask the data register prior to being stored. Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-01arm64, locking/atomics: Use instrumented atomicsMark Rutland
Now that the generic atomic headers provide instrumented wrappers of all the atomics implemented by arm64, let's migrate arm64 over to these. The additional instrumentation will help to find bugs (e.g. when fuzzing with Syzkaller). Mostly this change involves adding an arch_ prefix to a number of function names and macro definitions. When LSE atomics are used, the out-of-line LL/SC atomics will be named __ll_sc_arch_atomic_${OP}. Adding the arch_ prefix requires some whitespace fixups to keep things aligned. Some other unusual whitespace is fixed up at the same time (e.g. in the cmpxchg wrappers). Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: linuxdrivers@attotech.com Cc: dvyukov@google.com Cc: boqun.feng@gmail.com Cc: arnd@arndb.de Cc: aryabinin@virtuozzo.com Cc: glider@google.com Link: http://lkml.kernel.org/r/20180904104830.2975-7-mark.rutland@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-05-21arm64: lse: Add early clobbers to some input/output asm operandsWill Deacon
For LSE atomics that read and write a register operand, we need to ensure that these operands are annotated as "early clobber" if the register is written before all of the input operands have been consumed. Failure to do so can result in the compiler allocating the same register to both operands, leading to splats such as: Unable to handle kernel paging request at virtual address 11111122222221 [...] x1 : 1111111122222222 x0 : 1111111122222221 Process swapper/0 (pid: 1, stack limit = 0x000000008209f908) Call trace: test_atomic64+0x1360/0x155c where x0 has been allocated as both the value to be stored and also the atomic_t pointer. This patch adds the missing clobbers. Cc: <stable@vger.kernel.org> Cc: Dave Martin <dave.martin@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Reported-by: Mark Salter <msalter@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-07-20arm64: atomics: Remove '&' from '+&' asm constraint in lse atomicsWill Deacon
The lse implementation of atomic64_dec_if_positive uses the '+&' constraint, but the '&' is redundant and confusing in this case, since early clobber on a read/write operand is a strange concept. Replace the constraint with '+'. Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-05-09arm64: atomic_lse: match asm register sizesMark Rutland
The LSE atomic code uses asm register variables to ensure that parameters are allocated in specific registers. In the majority of cases we specifically ask for an x register when using 64-bit values, but in a couple of cases we use a w regsiter for a 64-bit value. For asm register variables, the compiler only cares about the register index, with wN and xN having the same meaning. The compiler determines the register size to use based on the type of the variable. Thus, this inconsistency is merely confusing, and not harmful to code generation. For consistency, this patch updates those cases to use the x register alias. There should be no functional change as a result of this patch. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-09-09arm64: lse: convert lse alternatives NOP padding to use __nopsWill Deacon
The LSE atomics are implemented using alternative code sequences of different lengths, and explicit NOP padding is used to ensure the patching works correctly. This patch converts the bulk of the LSE code over to using the __nops macro, which makes it slightly clearer as to what is going on and also consolidates all of the padding at the end of the various sequences. Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-06-16locking/atomic, arch/arm64: Implement ↵Will Deacon
atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}() for LSE instructions Implement FETCH-OP atomic primitives, these are very similar to the existing OP-RETURN primitives we already have, except they return the value of the atomic variable _before_ modification. This is especially useful for irreversible operations -- such as bitops (because it becomes impossible to reconstruct the state prior to modification). This patch implements the LSE variants. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steve Capper <steve.capper@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/1461344493-8262-2-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-16locking/atomic, arch/arm64: Generate LSE non-return cases using common macrosWill Deacon
atomic[64]_{add,and,andnot,or,xor} all follow the same patterns, so generate them using macros, like we do for the LL/SC case already. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steve Capper <steve.capper@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/1461344493-8262-1-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-26arm64: lse: deal with clobbered IP registers after branch via PLTArd Biesheuvel
The LSE atomics implementation uses runtime patching to patch in calls to out of line non-LSE atomics implementations on cores that lack hardware support for LSE. To avoid paying the overhead cost of a function call even if no call ends up being made, the bl instruction is kept invisible to the compiler, and the out of line implementations preserve all registers, not just the ones that they are required to preserve as per the AAPCS64. However, commit fd045f6cd98e ("arm64: add support for module PLTs") added support for routing branch instructions via veneers if the branch target offset exceeds the range of the ordinary relative branch instructions. Since this deals with jump and call instructions that are exposed to ELF relocations, the PLT code uses x16 to hold the address of the branch target when it performs an indirect branch-to-register, something which is explicitly allowed by the AAPCS64 (and ordinary compiler generated code does not expect register x16 or x17 to retain their values across a bl instruction). Since the lse runtime patched bl instructions don't adhere to the AAPCS64, they don't deal with this clobbering of registers x16 and x17. So add them to the clobber list of the asm() statements that perform the call instructions, and drop x16 and x17 from the list of registers that are callee saved in the out of line non-LSE implementations. In addition, since we have given these functions two scratch registers, they no longer need to stack/unstack temp registers. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [will: factored clobber list into #define, updated Makefile comment] Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-11-05arm64: cmpxchg_dbl: fix return value typeLorenzo Pieralisi
The current arm64 __cmpxchg_double{_mb} implementations carry out the compare exchange by first comparing the old values passed in to the values read from the pointer provided and by stashing the cumulative bitwise difference in a 64-bit register. By comparing the register content against 0, it is possible to detect if the values read differ from the old values passed in, so that the compare exchange detects whether it has to bail out or carry on completing the operation with the exchange. Given the current implementation, to detect the cmpxchg operation status, the __cmpxchg_double{_mb} functions should return the 64-bit stashed bitwise difference so that the caller can detect cmpxchg failure by comparing the return value content against 0. The current implementation declares the return value as an int, which means that the 64-bit value stashing the bitwise difference is truncated before being returned to the __cmpxchg_double{_mb} callers, which means that any bitwise difference present in the top 32 bits goes undetected, triggering false positives and subsequent kernel failures. This patch fixes the issue by declaring the arm64 __cmpxchg_double{_mb} return values as a long, so that the bitwise difference is properly propagated on failure, restoring the expected behaviour. Fixes: e9a4b795652f ("arm64: cmpxchg_dbl: patch in lse instructions when supported by the CPU") Cc: <stable@vger.kernel.org> # 4.3+ Cc: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-10-12arm64: atomics: implement native {relaxed, acquire, release} atomicsWill Deacon
Commit 654672d4ba1a ("locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operation") introduced a relaxed atomic API to Linux that maps nicely onto the arm64 memory model, including the new ARMv8.1 atomic instructions. This patch hooks up the API to our relaxed atomic instructions, rather than have them all expand to the full-barrier variants as they do currently. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-07-29arm64: lse: fix lse cmpxchg code indentationWill Deacon
For some reason, the ll/sc cmpxchg asm is all off to the left and awkward to read in conjunction with the following (correctly indented) LSE version. This patch shifts the ll/sc code back to where it should be. Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-07-27arm64: atomic64_dec_if_positive: fix incorrect branch conditionWill Deacon
If we attempt to atomic64_dec_if_positive on INT_MIN, we will underflow and incorrectly decide that the original parameter was positive. This patches fixes the broken condition code so that we handle this corner case correctly. Reviewed-by: Steve Capper <steve.capper@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-07-27arm64: atomics: implement atomic{,64}_cmpxchg using cmpxchgWill Deacon
We don't need duplicate cmpxchg implementations, so use cmpxchg to implement atomic{,64}_cmpxchg, like we do for xchg already. Reviewed-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-07-27arm64: cmpxchg: avoid "cc" clobber in ll/sc routinesWill Deacon
We can perform the cmpxchg comparison using eor and cbnz which avoids the "cc" clobber for the ll/sc case and consequently for the LSE case where we may have to fall-back on the ll/sc code at runtime. Reviewed-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-07-27arm64: cmpxchg_dbl: patch in lse instructions when supported by the CPUWill Deacon
On CPUs which support the LSE atomic instructions introduced in ARMv8.1, it makes sense to use them in preference to ll/sc sequences. This patch introduces runtime patching of our cmpxchg_double primitives so that the LSE casp instruction is used instead. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-07-27arm64: cmpxchg: patch in lse instructions when supported by the CPUWill Deacon
On CPUs which support the LSE atomic instructions introduced in ARMv8.1, it makes sense to use them in preference to ll/sc sequences. This patch introduces runtime patching of our cmpxchg primitives so that the LSE cas instruction is used instead. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-07-27arm64: atomics: patch in lse instructions when supported by the CPUWill Deacon
On CPUs which support the LSE atomic instructions introduced in ARMv8.1, it makes sense to use them in preference to ll/sc sequences. This patch introduces runtime patching of atomic_t and atomic64_t routines so that the call-site for the out-of-line ll/sc sequences is patched with an LSE atomic instruction when we detect that the CPU supports it. If binutils is not recent enough to assemble the LSE instructions, then the ll/sc sequences are inlined as though CONFIG_ARM64_LSE_ATOMICS=n. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-07-27arm64: introduce CONFIG_ARM64_LSE_ATOMICS as fallback to ll/sc atomicsWill Deacon
In order to patch in the new atomic instructions at runtime, we need to generate wrappers around the out-of-line exclusive load/store atomics. This patch adds a new Kconfig option, CONFIG_ARM64_LSE_ATOMICS. which causes our atomic functions to branch to the out-of-line ll/sc implementations. To avoid the register spill overhead of the PCS, the out-of-line functions are compiled with specific compiler flags to force out-of-line save/restore of any registers that are usually caller-saved. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>