diff options
author | Miklós Tóth | 2023-07-07 12:02:15 +0000 |
---|---|---|
committer | Miklós Tóth | 2023-07-07 12:02:15 +0000 |
commit | 67b143e99f28cf3e4264bd870f3474a5032178dd (patch) | |
tree | 42a3ea2ce88c2400f3ad2b2150243706c21bdca3 /disable-per-VMA_v4_2.patch | |
parent | 7e7c3118e9bd5729433335adadec4f80fc8601ac (diff) | |
download | aur-67b143e99f28cf3e4264bd870f3474a5032178dd.tar.gz |
automatic update
Diffstat (limited to 'disable-per-VMA_v4_2.patch')
-rw-r--r-- | disable-per-VMA_v4_2.patch | 75 |
1 files changed, 75 insertions, 0 deletions
diff --git a/disable-per-VMA_v4_2.patch b/disable-per-VMA_v4_2.patch new file mode 100644 index 000000000000..a86c966a4110 --- /dev/null +++ b/disable-per-VMA_v4_2.patch @@ -0,0 +1,75 @@ +From: Suren Baghdasaryan <surenb@google.com> +To: akpm@linux-foundation.org +Cc: jirislaby@kernel.org, jacobly.alt@gmail.com, + holger@applied-asynchrony.com, hdegoede@redhat.com, + michel@lespinasse.org, jglisse@google.com, mhocko@suse.com, + vbabka@suse.cz, hannes@cmpxchg.org, mgorman@techsingularity.net, + dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, + peterz@infradead.org, ldufour@linux.ibm.com, paulmck@kernel.org, + mingo@redhat.com, will@kernel.org, luto@kernel.org, + songliubraving@fb.com, peterx@redhat.com, david@redhat.com, + dhowells@redhat.com, hughd@google.com, bigeasy@linutronix.de, + kent.overstreet@linux.dev, punit.agrawal@bytedance.com, + lstoakes@gmail.com, peterjung1337@gmail.com, rientjes@google.com, + chriscli@google.com, axelrasmussen@google.com, joelaf@google.com, + minchan@google.com, rppt@kernel.org, jannh@google.com, + shakeelb@google.com, tatashin@google.com, edumazet@google.com, + gthelen@google.com, linux-mm@kvack.org, + linux-kernel@vger.kernel.org, stable@vger.kernel.org, + Suren Baghdasaryan <surenb@google.com> +Subject: [PATCH v4 1/2] fork: lock VMAs of the parent process when forking +Date: Wed, 5 Jul 2023 18:13:59 -0700 [thread overview] +Message-ID: <20230706011400.2949242-2-surenb@google.com> (raw) +In-Reply-To: <20230706011400.2949242-1-surenb@google.com> + +When forking a child process, parent write-protects an anonymous page +and COW-shares it with the child being forked using copy_present_pte(). +Parent's TLB is flushed right before we drop the parent's mmap_lock in +dup_mmap(). If we get a write-fault before that TLB flush in the parent, +and we end up replacing that anonymous page in the parent process in +do_wp_page() (because, COW-shared with the child), this might lead to +some stale writable TLB entries targeting the wrong (old) page. +Similar issue happened in the past with userfaultfd (see flush_tlb_page() +call inside do_wp_page()). +Lock VMAs of the parent process when forking a child, which prevents +concurrent page faults during fork operation and avoids this issue. +This fix can potentially regress some fork-heavy workloads. Kernel build +time did not show noticeable regression on a 56-core machine while a +stress test mapping 10000 VMAs and forking 5000 times in a tight loop +shows ~7% regression. If such fork time regression is unacceptable, +disabling CONFIG_PER_VMA_LOCK should restore its performance. Further +optimizations are possible if this regression proves to be problematic. + +Suggested-by: David Hildenbrand <david@redhat.com> +Reported-by: Jiri Slaby <jirislaby@kernel.org> +Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/ +Reported-by: Holger Hoffstätte <holger@applied-asynchrony.com> +Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-asynchrony.com/ +Reported-by: Jacob Young <jacobly.alt@gmail.com> +Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624 +Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first") +Cc: stable@vger.kernel.org +Signed-off-by: Suren Baghdasaryan <surenb@google.com> +--- + kernel/fork.c | 6 ++++++ + 1 file changed, 6 insertions(+) + +diff --git a/kernel/fork.c b/kernel/fork.c +index b85814e614a5..2ba918f83bde 100644 +--- a/kernel/fork.c ++++ b/kernel/fork.c +@@ -658,6 +658,12 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm, + retval = -EINTR; + goto fail_uprobe_end; + } ++#ifdef CONFIG_PER_VMA_LOCK ++ /* Disallow any page faults before calling flush_cache_dup_mm */ ++ for_each_vma(old_vmi, mpnt) ++ vma_start_write(mpnt); ++ vma_iter_set(&old_vmi, 0); ++#endif + flush_cache_dup_mm(oldmm); + uprobe_dup_mmap(oldmm, mm); + /* +-- +2.41.0.255.g8b1d071c50-goog |