aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation/vm
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/vm')
-rw-r--r--Documentation/vm/hwpoison.txt5
-rw-r--r--Documentation/vm/remap_file_pages.txt28
-rw-r--r--Documentation/vm/transhuge.txt4
3 files changed, 35 insertions, 2 deletions
diff --git a/Documentation/vm/hwpoison.txt b/Documentation/vm/hwpoison.txt
index 550068466605..6ae89a9edf2a 100644
--- a/Documentation/vm/hwpoison.txt
+++ b/Documentation/vm/hwpoison.txt
@@ -84,6 +84,11 @@ PR_MCE_KILL
PR_MCE_KILL_EARLY: Early kill
PR_MCE_KILL_LATE: Late kill
PR_MCE_KILL_DEFAULT: Use system global default
+ Note that if you want to have a dedicated thread which handles
+ the SIGBUS(BUS_MCEERR_AO) on behalf of the process, you should
+ call prctl(PR_MCE_KILL_EARLY) on the designated thread. Otherwise,
+ the SIGBUS is sent to the main thread.
+
PR_MCE_KILL_GET
return current mode
diff --git a/Documentation/vm/remap_file_pages.txt b/Documentation/vm/remap_file_pages.txt
new file mode 100644
index 000000000000..560e4363a55d
--- /dev/null
+++ b/Documentation/vm/remap_file_pages.txt
@@ -0,0 +1,28 @@
+The remap_file_pages() system call is used to create a nonlinear mapping,
+that is, a mapping in which the pages of the file are mapped into a
+nonsequential order in memory. The advantage of using remap_file_pages()
+over using repeated calls to mmap(2) is that the former approach does not
+require the kernel to create additional VMA (Virtual Memory Area) data
+structures.
+
+Supporting of nonlinear mapping requires significant amount of non-trivial
+code in kernel virtual memory subsystem including hot paths. Also to get
+nonlinear mapping work kernel need a way to distinguish normal page table
+entries from entries with file offset (pte_file). Kernel reserves flag in
+PTE for this purpose. PTE flags are scarce resource especially on some CPU
+architectures. It would be nice to free up the flag for other usage.
+
+Fortunately, there are not many users of remap_file_pages() in the wild.
+It's only known that one enterprise RDBMS implementation uses the syscall
+on 32-bit systems to map files bigger than can linearly fit into 32-bit
+virtual address space. This use-case is not critical anymore since 64-bit
+systems are widely available.
+
+The plan is to deprecate the syscall and replace it with an emulation.
+The emulation will create new VMAs instead of nonlinear mappings. It's
+going to work slower for rare users of remap_file_pages() but ABI is
+preserved.
+
+One side effect of emulation (apart from performance) is that user can hit
+vm.max_map_count limit more easily due to additional VMAs. See comment for
+DEFAULT_MAX_MAP_COUNT for more details on the limit.
diff --git a/Documentation/vm/transhuge.txt b/Documentation/vm/transhuge.txt
index 4a63953a41f1..6b31cfbe2a9a 100644
--- a/Documentation/vm/transhuge.txt
+++ b/Documentation/vm/transhuge.txt
@@ -360,13 +360,13 @@ on any tail page, would mean having to split all hugepages upfront in
get_user_pages which is unacceptable as too many gup users are
performance critical and they must work natively on hugepages like
they work natively on hugetlbfs already (hugetlbfs is simpler because
-hugetlbfs pages cannot be splitted so there wouldn't be requirement of
+hugetlbfs pages cannot be split so there wouldn't be requirement of
accounting the pins on the tail pages for hugetlbfs). If we wouldn't
account the gup refcounts on the tail pages during gup, we won't know
anymore which tail page is pinned by gup and which is not while we run
split_huge_page. But we still have to add the gup pin to the head page
too, to know when we can free the compound page in case it's never
-splitted during its lifetime. That requires changing not just
+split during its lifetime. That requires changing not just
get_page, but put_page as well so that when put_page runs on a tail
page (and only on a tail page) it will find its respective head page,
and then it will decrease the head page refcount in addition to the