Diffstat (limited to 'Documentation/livepatch/livepatch.txt')
-rw-r--r-- Documentation/livepatch/livepatch.txt 467
1 files changed, 0 insertions, 467 deletions
diff --git a/Documentation/livepatch/livepatch.txt b/Documentation/livepatch/livepatch.txt
deleted file mode 100644
index 2d7ed09dbd59..000000000000
--- a/Documentation/livepatch/livepatch.txt
+++ /dev/null
@@ -1,467 +0,0 @@
-=========
-Livepatch
-=========
-
-This document outlines basic information about kernel livepatching.
-
-Table of Contents:
-
-1. Motivation
-2. Kprobes, Ftrace, Livepatching
-3. Consistency model
-4. Livepatch module
- 4.1. New functions
- 4.2. Metadata
- 4.3. Livepatch module handling
-5. Livepatch life-cycle
- 5.1. Registration
- 5.2. Enabling
- 5.3. Disabling
- 5.4. Unregistration
-6. Sysfs
-7. Limitations
-
-
-1. Motivation
-=============
-
-There are many situations where users are reluctant to reboot a system. It may
-be because their system is performing complex scientific computations or under
-heavy load during peak usage. In addition to keeping systems up and running,
-users also want a stable and secure system. Livepatching gives users both
-by allowing function calls to be redirected, thus fixing critical functions
-without a system reboot.
-
-
-2. Kprobes, Ftrace, Livepatching
-================================
-
-There are multiple mechanisms in the Linux kernel that are directly related
-to redirection of code execution; namely: kernel probes, function tracing,
-and livepatching:
-
- + The kernel probes are the most generic. The code can be redirected by
- replacing any instruction with a breakpoint instruction.
-
- + The function tracer calls the code from a predefined location that is
- close to the function entry point. This location is generated by the
- compiler using the '-pg' gcc option.
-
- + Livepatching typically needs to redirect the code at the very beginning
- of the function entry before the function parameters or the stack
- are in any way modified.
-
-All three approaches need to modify the existing code at runtime. Therefore
-they need to be aware of each other and not step on each other's toes.
-Most of these problems are solved by using the dynamic ftrace framework as
-a base. A Kprobe is registered as an ftrace handler when the function entry
-is probed, see CONFIG_KPROBES_ON_FTRACE. Similarly, an alternative function
-from a live patch is called with the help of a custom ftrace handler. There
-are some limitations, though; see the "Limitations" section below.
-
-
-3. Consistency model
-====================
-
-Functions are there for a reason. They take some input parameters, acquire or
-release locks, read, process, and even write some data in a defined way, and
-have return values. In other words, each function has defined semantics.
-
-Many fixes do not change the semantics of the modified functions. For
-example, they add a NULL pointer or a boundary check, fix a race by adding
-a missing memory barrier, or add some locking around a critical section.
-Most of these changes are self-contained and the function presents itself
-the same way to the rest of the system. In this case, the functions can
-be updated independently, one by one.
-
-But there are more complex fixes. For example, a patch might change
-the lock ordering in multiple functions at the same time. Or a patch
-might change the meaning of some temporary structures and update
-all the relevant functions. In this case, the affected unit
-(thread, whole kernel) needs to start using all new versions of
-the functions at the same time. Also the switch must happen only
-when it is safe to do so, e.g. when the affected locks are released
-or no data are stored in the modified structures at the moment.
-
-The theory about how to apply functions in a safe way is rather complex.
-The aim is to define a so-called consistency model. It attempts to define
-conditions when the new implementation could be used so that the system
-stays consistent.
-
-Livepatch has a consistency model which is a hybrid of kGraft and
-kpatch: it uses kGraft's per-task consistency and syscall barrier
-switching combined with kpatch's stack trace switching. There are also
-a number of fallback options which make it quite flexible.
-
-Patches are applied on a per-task basis, when the task is deemed safe to
-switch over. When a patch is enabled, livepatch enters into a
-transition state where tasks are converging to the patched state.
-Usually this transition state can complete in a few seconds. The same
-sequence occurs when a patch is disabled, except the tasks converge from
-the patched state to the unpatched state.
-
-An interrupt handler inherits the patched state of the task it
-interrupts. The same is true for forked tasks: the child inherits the
-patched state of the parent.
-
-Livepatch uses several complementary approaches to determine when it's
-safe to patch tasks:
-
-1. The first and most effective approach is stack checking of sleeping
- tasks. If no affected functions are on the stack of a given task,
- the task is patched. In most cases this will patch most or all of
- the tasks on the first try. Otherwise it'll keep trying
- periodically. This option is only available if the architecture has
- reliable stacks (HAVE_RELIABLE_STACKTRACE).
-
-2. The second approach, if needed, is kernel exit switching. A
- task is switched when it returns to user space from a system call, a
- user space IRQ, or a signal. It's useful in the following cases:
-
- a) Patching I/O-bound user tasks which are sleeping on an affected
- function. In this case you have to send SIGSTOP and SIGCONT to
- force it to exit the kernel and be patched.
- b) Patching CPU-bound user tasks. If the task is highly CPU-bound
- then it will get patched the next time it gets interrupted by an
- IRQ.
-
-3. For idle "swapper" tasks, since they don't ever exit the kernel, they
- instead have a klp_update_patch_state() call in the idle loop which
- allows them to be patched before the CPU enters the idle state.
-
- (Note there's not yet such an approach for kthreads.)
-
-Architectures which don't have HAVE_RELIABLE_STACKTRACE solely rely on
-the second approach. It's highly likely that some tasks may still be
-running with an old version of the function, until that function
-returns. In this case you would have to signal the tasks. This
-especially applies to kthreads. They may not be woken up and would need
-to be forced. See below for more information.
-
-Unless we can come up with another way to patch kthreads, architectures
-without HAVE_RELIABLE_STACKTRACE are not considered fully supported by
-the kernel livepatching.
-
-The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
-is in transition. Only a single patch (the topmost patch on the stack)
-can be in transition at a given time. A patch can remain in transition
-indefinitely, if any of the tasks are stuck in the initial patch state.
-
-A transition can be reversed and effectively canceled by writing the
-opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
-the transition is in progress. Then all the tasks will attempt to
-converge back to the original patch state.
-
-There's also a /proc/<pid>/patch_state file which can be used to
-determine which tasks are blocking completion of a patching operation.
-If a patch is in transition, this file shows 0 to indicate the task is
-unpatched and 1 to indicate it's patched. Otherwise, if no patch is in
-transition, it shows -1. Any tasks which are blocking the transition
-can be signaled with SIGSTOP and SIGCONT to force them to change their
-patched state. This may be harmful to the system though.
-The /sys/kernel/livepatch/<patch>/signal attribute provides a better
-alternative. Writing 1 to the attribute sends a fake signal to all remaining
-blocking tasks. No proper signal is actually delivered (there is no data in
-the signal pending structures). Tasks are interrupted or woken up, and forced
-to change their patched state.
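-
The three values documented above for /proc/<pid>/patch_state can be
interpreted from user space. The following is only a minimal sketch: the enum
names and helper functions are hypothetical and are not part of any kernel or
libc API, and the rule used in blocks_transition() (a task blocks when a
transition is running and its state differs from the target state) is an
inference from the text above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical names for the three values the text documents. */
enum patch_state {
	STATE_NO_TRANSITION = -1,	/* no patch is in transition */
	STATE_UNPATCHED = 0,
	STATE_PATCHED = 1
};

/* Parse the contents of a patch_state file; returns 0 on success. */
int parse_patch_state(const char *text, enum patch_state *out)
{
	int value;

	assert(text != NULL && out != NULL);
	if (sscanf(text, "%d", &value) != 1 || value < -1 || value > 1)
		return -1;
	*out = (enum patch_state)value;
	return 0;
}

/* Read /proc/<pid>/patch_state for one task; returns 0 on success. */
int read_patch_state(int pid, enum patch_state *out)
{
	char path[64], buf[16];
	FILE *f;

	snprintf(path, sizeof(path), "/proc/%d/patch_state", pid);
	f = fopen(path, "r");
	if (!f)
		return -1;	/* task is gone or the file is not available */
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return parse_patch_state(buf, out);
}

/* Assumed rule: a task blocks a transition toward target (0 or 1)
 * when a transition is in progress and the task has not converged. */
int blocks_transition(enum patch_state state, int target)
{
	return state != STATE_NO_TRANSITION && (int)state != target;
}
```

A monitoring script could call read_patch_state() for every PID listed in
/proc and report the tasks for which blocks_transition() is nonzero.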
-
-The administrator can also affect a transition through the
-/sys/kernel/livepatch/<patch>/force attribute. Writing 1 there clears the
-TIF_PATCH_PENDING flag of all tasks and thus forces the tasks to the patched
-state. Important note! The force attribute is intended for cases when the
-transition gets stuck for a long time because of a blocking task. The
-administrator is expected to collect all necessary data (namely stack traces
-of such blocking tasks) and request clearance from a patch distributor to
-force the transition. Unauthorized usage may cause harm to the system. It
-depends on the nature of the patch, which functions are (un)patched, and
-which functions the blocking tasks are sleeping in (/proc/<pid>/stack may
-help here). Removal (rmmod) of patch modules is permanently disabled when the
-force feature is used, because it cannot be guaranteed that no task is
-sleeping in such a module. This implies an unbounded reference count if a
-patch module is disabled and enabled in a loop.
-
-Moreover, the usage of force may also affect future applications of live
-patches and cause even more harm to the system. The administrator should
-first consider simply canceling the transition (see above). If force is
-used, a reboot should be planned and no more live patches applied.
-
-3.1 Adding consistency model support to new architectures
----------------------------------------------------------
-
-For adding consistency model support to new architectures, there are a
-few options:
-
-1) Add CONFIG_HAVE_RELIABLE_STACKTRACE. This means porting objtool, and
- for non-DWARF unwinders, also making sure there's a way for the stack
- tracing code to detect interrupts on the stack.
-
-2) Alternatively, ensure that every kthread has a call to
- klp_update_patch_state() in a safe location. Kthreads are typically
- in an infinite loop which does some action repeatedly. The safe
- location to switch the kthread's patch state would be at a designated
- point in the loop where there are no locks taken and all data
- structures are in a well-defined state.
-
- The location is clear when using workqueues or the kthread worker
- API. These kthreads process independent actions in a generic loop.
-
- It's much more complicated with kthreads which have a custom loop.
- There the safe location must be carefully selected on a case-by-case
- basis.
-
- In that case, arches without HAVE_RELIABLE_STACKTRACE would still be
- able to use the non-stack-checking parts of the consistency model:
-
- a) patching user tasks when they cross the kernel/user space
- boundary; and
-
- b) patching kthreads and idle tasks at their designated patch points.
-
- This option isn't as good as option 1 because it requires signaling
- user tasks and waking kthreads to patch them. But it could still be
- a good backup option for those architectures which don't have
- reliable stack traces yet.
-
-
-4. Livepatch module
-===================
-
-Livepatches are distributed using kernel modules, see
-samples/livepatch/livepatch-sample.c.
-
-The module includes a new implementation of functions that we want
-to replace. In addition, it defines some structures describing the
-relation between the original and the new implementation. Then there
-is code that makes the kernel start using the new code when the livepatch
-module is loaded. Also there is code that cleans up before the
-livepatch module is removed. All this is explained in more detail in
-the next sections.
-
-
-4.1. New functions
-------------------
-
-New versions of functions are typically just copied from the original
-sources. A good practice is to add a prefix to the names so that they
-can be distinguished from the original ones, e.g. in a backtrace. Also
-they can be declared as static because they are not called directly
-and do not need the global visibility.
-
-The patch contains only functions that are really modified. But they
-might want to access functions or data from the original source file
-that may only be locally accessible. This can be solved by a special
-relocation section in the generated livepatch module, see
-Documentation/livepatch/module-elf-format.txt for more details.
-
-
-4.2. Metadata
--------------
-
-The patch is described by several structures that split the information
-into three levels:
-
- + struct klp_func is defined for each patched function. It describes
- the relation between the original and the new implementation of a
- particular function.
-
- The structure includes the name, as a string, of the original function.
- The function address is found via kallsyms at runtime.
-
- Then it includes the address of the new function. It is defined
- directly by assigning the function pointer. Note that the new
- function is typically defined in the same source file.
-
- As an optional parameter, the symbol position in the kallsyms database can
- be used to disambiguate functions of the same name. This is not the
- absolute position in the database, but rather the order in which the symbol
- is found within a particular object (vmlinux or a kernel module). Note that
- kallsyms allows for searching symbols according to the object name.
-
- + struct klp_object defines an array of patched functions (struct
- klp_func) in the same object. The object is either vmlinux
- (NULL) or a module name.
-
- The structure helps to group and handle functions for each object
- together. Note that patched modules might be loaded later than
- the patch itself and the relevant functions might be patched
- only when they are available.
-
-
- + struct klp_patch defines an array of patched objects (struct
- klp_object).
-
- This structure handles all patched functions consistently and, eventually,
- synchronously. The whole patch is applied only when all patched
- symbols are found. The only exceptions are symbols from objects
- (kernel modules) that have not been loaded yet.
-
- For more details on how the patch is applied on a per-task basis,
- see the "Consistency model" section.
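-
The three metadata levels described above can be pictured with a small
user-space model. This is only a sketch: the model_* structures are simplified
stand-ins and not the kernel's definitions, the patched function names and the
module name "demo_mod" are hypothetical, and the convention of ending each
array with an empty entry is an assumption modeled on the sample module.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified user-space stand-ins, NOT the kernel's definitions. */
struct model_func {
	const char *old_name;		/* original function, looked up by name */
	void (*new_func)(void);		/* replacement, assigned directly */
	unsigned long old_sympos;	/* optional position among same-named symbols */
};

struct model_object {
	const char *name;		/* module name, or NULL for vmlinux */
	struct model_func *funcs;	/* ends with an empty entry */
};

struct model_patch {
	struct model_object *objs;	/* ends with an empty entry */
};

/* Hypothetical replacement functions. */
static void new_foo(void) { }
static void new_bar(void) { }
static void new_baz(void) { }

static struct model_func vmlinux_funcs[] = {
	{ .old_name = "foo", .new_func = new_foo },
	{ .old_name = "bar", .new_func = new_bar, .old_sympos = 2 },
	{ .old_name = NULL }
};

static struct model_func module_funcs[] = {
	{ .old_name = "baz", .new_func = new_baz },
	{ .old_name = NULL }
};

static struct model_object demo_objs[] = {
	{ .name = NULL, .funcs = vmlinux_funcs },	/* vmlinux */
	{ .name = "demo_mod", .funcs = module_funcs },
	{ .name = NULL, .funcs = NULL }
};

struct model_patch demo_patch = { .objs = demo_objs };

/* Walk all three levels and count the patched functions. */
size_t count_funcs(const struct model_patch *patch)
{
	const struct model_object *obj;
	const struct model_func *func;
	size_t n = 0;

	assert(patch != NULL);
	for (obj = patch->objs; obj->funcs; obj++)
		for (func = obj->funcs; func->old_name; func++)
			n++;
	return n;
}
```

Grouping functions per object is what allows the functions of a module that
is loaded later to be handled together once the module becomes available.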
-
-
-4.3. Livepatch module handling
-------------------------------
-
-The usual behavior is that the new functions will get used when
-the livepatch module is loaded. For this, the module init() function
-has to register the patch (struct klp_patch) and enable it. See the
-section "Livepatch life-cycle" below for more details about these
-two operations.
-
-Module removal is only safe when there are no users of the underlying
-functions. This is the reason why the force feature permanently disables
-the removal. The forced tasks entered the functions but we cannot say
-that they have returned. Therefore it cannot be decided when the
-livepatch module can be safely removed. When the system is successfully
-transitioned to a new patch state (patched/unpatched) without being
-forced, it is guaranteed that no task sleeps or runs in the old code.
-
-
-5. Livepatch life-cycle
-=======================
-
-Livepatching has four basic operations that define the life cycle of each
-live patch: registration, enabling, disabling and unregistration. There are
-several reasons why it is done this way.
-
-First, the patch is applied only when all patched symbols for already
-loaded objects are found. The error handling is much easier if this
-check is done before particular functions get redirected.
-
-Second, with the hybrid consistency model being used, it might take some
-time until the entire system is migrated. The patch revert might block
-the livepatch module removal for too long. Therefore it is useful to
-revert the patch using a separate operation that might be called
-explicitly. But it does not make sense to remove all information until
-the livepatch module is really removed.
-
-
-5.1. Registration
------------------
-
-Each patch first has to be registered using klp_register_patch(). This makes
-the patch known to the livepatch framework. Also it does some preliminary
-computing and checks.
-
-In particular, the patch is added into the list of known patches. The
-addresses of the patched functions are found according to their names.
-The special relocations, mentioned in the section "New functions", are
-applied. The relevant entries are created under
-/sys/kernel/livepatch/<name>. The patch is rejected when any operation
-fails.
-
-
-5.2. Enabling
--------------
-
-Registered patches might be enabled either by calling klp_enable_patch() or
-by writing '1' to /sys/kernel/livepatch/<name>/enabled. The system will
-start using the new implementation of the patched functions at this stage.
-
-When a patch is enabled, livepatch enters into a transition state where
-tasks are converging to the patched state. This is indicated by a value
-of '1' in /sys/kernel/livepatch/<name>/transition. Once all tasks have
-been patched, the 'transition' value changes to '0'. For more
-information about this process, see the "Consistency model" section.
-
-If an original function is patched for the first time, a function-specific
-struct klp_ops is created and a universal ftrace handler is
-registered.
-
-Functions might be patched multiple times. The ftrace handler is registered
-only once for the given function. Further patches just add an entry to the
-list (see field `func_stack`) of the struct klp_ops. The last added
-entry is chosen by the ftrace handler and becomes the active function
-replacement.
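-
The stacking behavior described above (the last added entry wins, and removing
it exposes the previous one) can be illustrated with a user-space model. This
is a sketch under stated assumptions: model_ops and its helpers are
hypothetical and only mimic the idea of func_stack; they are not the kernel's
struct klp_ops, and the model ignores the per-task consistency logic.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_STACK 8

typedef int (*impl_fn)(int);

/* Hypothetical stand-in for the per-function bookkeeping. */
struct model_ops {
	impl_fn original;			/* unpatched implementation */
	impl_fn stack[MODEL_MAX_STACK];		/* replacements, oldest first */
	size_t depth;
};

/* Enabling a patch adds its replacement on top of the stack. */
int model_push(struct model_ops *ops, impl_fn fn)
{
	assert(ops != NULL && fn != NULL);
	if (ops->depth == MODEL_MAX_STACK)
		return -1;
	ops->stack[ops->depth++] = fn;
	return 0;
}

/* Disabling the most recently enabled patch removes the top entry. */
int model_pop(struct model_ops *ops)
{
	assert(ops != NULL);
	if (ops->depth == 0)
		return -1;
	ops->depth--;
	return 0;
}

/* The "handler": use the last added entry, else the original code. */
int model_call(const struct model_ops *ops, int arg)
{
	impl_fn fn;

	assert(ops != NULL && ops->original != NULL);
	fn = ops->depth ? ops->stack[ops->depth - 1] : ops->original;
	return fn(arg);
}

/* Hypothetical implementations used for demonstration. */
static int orig_impl(int x) { return x; }
static int fix_v1(int x) { return x + 1; }
static int fix_v2(int x) { return x + 2; }
```

Pushing fix_v1 and then fix_v2 makes model_call() use fix_v2; popping once
falls back to fix_v1, and popping again returns to orig_impl.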
-
-Note that the patches might be enabled in a different order than they were
-registered.
-
-
-5.3. Disabling
---------------
-
-Enabled patches might get disabled either by calling klp_disable_patch() or
-by writing '0' to /sys/kernel/livepatch/<name>/enabled. At this stage
-either the code from the previously enabled patch or even the original
-code gets used.
-
-When a patch is disabled, livepatch enters into a transition state where
-tasks are converging to the unpatched state. This is indicated by a
-value of '1' in /sys/kernel/livepatch/<name>/transition. Once all tasks
-have been unpatched, the 'transition' value changes to '0'. For more
-information about this process, see the "Consistency model" section.
-
-Here all the functions (struct klp_func) associated with the to-be-disabled
-patch are removed from the corresponding struct klp_ops. The ftrace handler
-is unregistered and the struct klp_ops is freed when the func_stack list
-becomes empty.
-
-Patches must be disabled in exactly the reverse order in which they were
-enabled. This makes the problem and the implementation much simpler.
-
-
-5.4. Unregistration
--------------------
-
-Disabled patches might be unregistered by calling klp_unregister_patch().
-This can be done only when the patch is disabled and the code is no longer
-used. It must be called before the livepatch module gets unloaded.
-
-At this stage, all the relevant sysfs entries are removed and the patch
-is removed from the list of known patches.
-
-
-6. Sysfs
-========
-
-Information about the registered patches can be found under
-/sys/kernel/livepatch. The patches can be enabled and disabled
-by writing there.
-
-The /sys/kernel/livepatch/<patch>/signal and
-/sys/kernel/livepatch/<patch>/force attributes allow the administrator to
-affect a patching operation.
-
-See Documentation/ABI/testing/sysfs-kernel-livepatch for more details.
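-
As an illustration of how a user-space tool might inspect these attributes,
the sketch below builds an attribute path and reads a 0/1 value such as
"enabled" or "transition". The helper names and the patch name used in the
usage note are hypothetical, and the code assumes the attribute content starts
with the character '0' or '1'.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "/sys/kernel/livepatch/<patch>/<attr>" in buf.
 * Returns 0 on success, -1 if the path does not fit. */
int livepatch_attr_path(char *buf, size_t size,
			const char *patch, const char *attr)
{
	int n;

	assert(buf != NULL && patch != NULL && attr != NULL);
	n = snprintf(buf, size, "/sys/kernel/livepatch/%s/%s", patch, attr);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Read a 0/1 attribute file. Returns 0 or 1, or -1 on any error. */
int read_flag_file(const char *path)
{
	FILE *f;
	int c;

	assert(path != NULL);
	f = fopen(path, "r");
	if (!f)
		return -1;
	c = fgetc(f);
	fclose(f);
	if (c == '0')
		return 0;
	if (c == '1')
		return 1;
	return -1;
}
```

For example, a tool could build the path for a patch called "demo_patch" and
the attribute "transition", then poll read_flag_file() until it returns 0.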
-
-
-7. Limitations
-==============
-
-The current Livepatch implementation has several limitations:
-
- + Only functions that can be traced can be patched.
-
- Livepatch is based on the dynamic ftrace. In particular, functions
- implementing ftrace or the livepatch ftrace handler cannot be
- patched. Otherwise, the code would end up in an infinite loop. A
- potential mistake is prevented by marking the problematic functions
- with "notrace".
-
-
-
- + Livepatch works reliably only when the dynamic ftrace is located at
- the very beginning of the function.
-
- The function needs to be redirected before the stack or the function
- parameters are modified in any way. For example, livepatch requires
- using the -fentry gcc compiler option on x86_64.
-
- One exception is the PPC port. It uses relative addressing and TOC.
- Each function has to handle TOC and save LR before it can call
- the ftrace handler. This operation has to be reverted on return.
- Fortunately, the generic ftrace code has the same problem and all
- this is handled on the ftrace level.
-
-
- + Kretprobes using the ftrace framework conflict with the patched
- functions.
-
- Both kretprobes and livepatches use an ftrace handler that modifies
- the return address. The first user wins. Either the probe or the patch
- is rejected when the handler is already in use by the other.
-
-
- + Kprobes in the original function are ignored when the code is
- redirected to the new implementation.
-
- Work is in progress to add warnings about this situation.