Re: [PATCH v3] mm: Avoid unnecessary page fault retires on shared memory types
On Wed, May 25, 2022 at 1:45 AM Peter Xu wrote:
> I observed that for each of the shared file-backed page faults, we're
> very likely to retry one more time for the 1st write fault upon no page.
> It's because we need to release the mmap lock for dirty rate limit
> purposes with balance_dirty_pages_ratelimited() (in
> fault_dirty_shared_page()).
>
> Then after that throttling we return VM_FAULT_RETRY.
>
> We did that probably because VM_FAULT_RETRY is the only way we can
> return to the fault handler at that time telling it we've released the
> mmap lock.
>
> However that's not ideal, because it's very likely the fault does not
> need to be retried at all since the pgtable was already installed before
> the throttling, so the next immediate fault (including taking the mmap
> read lock, walking the pgtable, etc.) is in most cases unnecessary.
>
> This not only slows down page faults for shared file-backed memory, but
> also adds mmap lock contention which in most cases is not needed at all.
>
> To observe this, one could try to write to some shmem page and look at
> the "pgfault" value in /proc/vmstat; we should expect 2 counts for each
> shmem write simply because we retried, and the vm event "pgfault" will
> capture that.
>
> To make it more efficient, add a new VM_FAULT_COMPLETED return code just
> to show that we've completed the whole fault and released the lock.
> It's also a hint that we should very possibly not need another fault
> immediately on this page because we've just completed it.
>
> This patch provides a ~12% perf boost on my aarch64 test VM with a
> simple program sequentially dirtying a 400MB mmap()ed shmem file; these
> are the times it needs:
>
>   Before: 650.980 ms (+-1.94%)
>   After:  569.396 ms (+-1.38%)
>
> I believe it could help more than that.
>
> We need some special care on GUP and the s390 pgfault handler (for gmap
> code before returning from pgfault); the rest of the changes in the
> page fault handlers should be relatively straightforward.
>
> Another thing to mention is that mm_account_fault() does take this new
> fault as a generic fault to be accounted, unlike VM_FAULT_RETRY.
>
> I explicitly didn't touch hmm_vma_fault() and break_ksm() because they
> do not handle VM_FAULT_RETRY even with the existing code, so I'm
> literally keeping them as-is.
>
> Signed-off-by: Peter Xu
>
>  arch/m68k/mm/fault.c | 4

Acked-by: Geert Uytterhoeven

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker.
But when I'm talking to journalists I just say "programmer" or something
like that.
                                -- Linus Torvalds
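For readers following along, the per-architecture handling the patch adds boils down to checking the new return code right after handle_mm_fault(). A minimal sketch of that pattern (a fragment, assuming the generic shape given to every architecture rather than any single arch's exact hunk):

	fault = handle_mm_fault(vma, address, flags, regs);

	/*
	 * New case: the fault completed in full and handle_mm_fault()
	 * already released the mmap lock -- nothing left to do, and no
	 * second fault needs to be taken for this address.
	 */
	if (fault & VM_FAULT_COMPLETED)
		return;

	if (fault & VM_FAULT_RETRY) {
		flags |= FAULT_FLAG_TRIED;
		goto retry;		/* lock dropped; retake and refault */
	}

	mmap_read_unlock(mm);		/* normal completion still holds it */

The win is the early return: before this patch, the throttled first write fault could only report VM_FAULT_RETRY and force a full second trip through the same path.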
Re: [PATCH v3] mm: Avoid unnecessary page fault retires on shared memory types
On Tue, May 24, 2022 at 07:45:31PM -0400, Peter Xu wrote:
> I observed that for each of the shared file-backed page faults, we're
> very likely to retry one more time for the 1st write fault upon no page.
> [...]
> Signed-off-by: Peter Xu

Acked-by: Peter Zijlstra (Intel)
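The 400MB test program is not part of the posting; a hypothetical reconstruction of that kind of microbenchmark could look like the following (the memfd_create() shmem backing and the 4KB page step are my assumptions, not details from the thread):

	#define _GNU_SOURCE
	#include <stdio.h>
	#include <sys/mman.h>
	#include <time.h>
	#include <unistd.h>

	int main(void)
	{
		const size_t len = 400UL << 20;		/* 400MB */
		int fd = memfd_create("dirty-test", 0);	/* shmem-backed file */

		if (fd < 0 || ftruncate(fd, len) < 0)
			return 1;

		char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			       MAP_SHARED, fd, 0);
		if (p == MAP_FAILED)
			return 1;

		struct timespec t0, t1;
		clock_gettime(CLOCK_MONOTONIC, &t0);

		/* Sequentially dirty every page: each write is a 1st
		 * shared write fault of the kind the patch optimizes. */
		for (size_t off = 0; off < len; off += 4096)
			p[off] = 1;

		clock_gettime(CLOCK_MONOTONIC, &t1);
		printf("%.3f ms\n", (t1.tv_sec - t0.tv_sec) * 1e3 +
				    (t1.tv_nsec - t0.tv_nsec) / 1e6);
		return 0;
	}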
Re: [PATCH v3] mm: Avoid unnecessary page fault retires on shared memory types
On Tue, May 24, 2022 at 07:45:31PM -0400, Peter Xu wrote:
> I observed that for each of the shared file-backed page faults, we're
> very likely to retry one more time for the 1st write fault upon no page.
> [...]
> Signed-off-by: Peter Xu

Acked-by: Johannes Weiner
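The /proc/vmstat observation described in the patch is easy to reproduce from userspace. One way to read the counter from C (the helper name is mine, not from the thread):

	#include <stdio.h>
	#include <string.h>

	/* Read the "pgfault" event counter from /proc/vmstat (0 on failure). */
	static unsigned long read_pgfault(void)
	{
		FILE *f = fopen("/proc/vmstat", "r");
		char line[128];
		unsigned long val = 0;

		if (!f)
			return 0;
		while (fgets(line, sizeof(line), f)) {
			if (!strncmp(line, "pgfault ", 8)) {
				sscanf(line + 8, "%lu", &val);
				break;
			}
		}
		fclose(f);
		return val;
	}

Sampling this before and after a single write to a fresh shmem page should show the counter advancing by 2 on an unpatched kernel (fault plus retry) and by 1 once VM_FAULT_COMPLETED is in place.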
[PATCH 0/2] bitops: __fls adjustments
Apparently on a few architectures __fls is defined incorrectly. Fix this by adjusting the declarations to match the asm-generic ones. As far as I can tell there should be no functional changes, but I don't have devices to test on, so the series was only compile tested.

Amadeusz Sławiński (2):
  ARC: bitops: Change __fls to return unsigned long
  m68k: bitops: Change __fls to return and accept unsigned long

 arch/arc/include/asm/bitops.h  | 2 +-
 arch/m68k/include/asm/bitops.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

--
2.25.1
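For reference, the asm-generic header both patches align with declares __fls() as taking and returning unsigned long. A minimal portable sketch of those zero-based semantics (not the kernel's optimized implementation) is:

	/* Returns the zero-based position of the most significant set bit;
	 * like the kernel helper, the result is undefined for word == 0. */
	static inline unsigned long demo___fls(unsigned long word)
	{
		unsigned long bit = 8 * sizeof(word) - 1;

		while (!(word & (1UL << bit)))
			bit--;	/* scan down from the top bit */
		return bit;
	}

So demo___fls(0x10) is 4; the unsigned long signature matters for type consistency with generic callers rather than range — even the 64-bit maximum of 63 fits in an int — which is why the series expects no functional change.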
[PATCH 2/2] m68k: bitops: Change __fls to return and accept unsigned long
As per the asm-generic definition and other architectures, __fls should accept and return unsigned long.

Signed-off-by: Amadeusz Sławiński
---
 arch/m68k/include/asm/bitops.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/m68k/include/asm/bitops.h b/arch/m68k/include/asm/bitops.h
index 51283db53667..87c2cd66a9ce 100644
--- a/arch/m68k/include/asm/bitops.h
+++ b/arch/m68k/include/asm/bitops.h
@@ -510,7 +510,7 @@ static inline int fls(unsigned int x)
 	return 32 - cnt;
 }
 
-static inline int __fls(int x)
+static inline unsigned long __fls(unsigned long x)
 {
 	return fls(x) - 1;
 }
--
2.25.1
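A quick illustration (not part of the patch) of the one-based fls() / zero-based __fls() relationship the m68k implementation relies on; my_fls() below is a hypothetical userspace stand-in for the kernel's fls():

	#include <stdio.h>

	/* Stand-in for the kernel's one-based fls():
	 * fls(0) == 0, fls(1) == 1, fls(0x80000000) == 32. */
	static int my_fls(unsigned int x)
	{
		return x ? 32 - __builtin_clz(x) : 0;
	}

	/* Mirrors the patched m68k __fls(): zero-based, undefined for
	 * x == 0. On 32-bit m68k, unsigned long and unsigned int have the
	 * same width, so handing the argument on to fls() loses nothing. */
	static unsigned long my___fls(unsigned long x)
	{
		return my_fls(x) - 1;
	}

	int main(void)
	{
		/* 8 == 0b1000: fls() reports bit count 4, __fls() index 3. */
		printf("fls(8) = %d, __fls(8) = %lu\n", my_fls(8), my___fls(8));
		return 0;
	}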
[PATCH 1/2] ARC: bitops: Change __fls to return unsigned long
As per the asm-generic definition and other architectures, __fls should return unsigned long. No functional change is expected, as the return value always fits in unsigned long.

Signed-off-by: Amadeusz Sławiński
---
 arch/arc/include/asm/bitops.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arc/include/asm/bitops.h b/arch/arc/include/asm/bitops.h
index bdb7e190a294..4076607e7168 100644
--- a/arch/arc/include/asm/bitops.h
+++ b/arch/arc/include/asm/bitops.h
@@ -82,7 +82,7 @@ static inline __attribute__ ((const)) int fls(unsigned int x)
 /*
  * __fls: Similar to fls, but zero based (0-31)
  */
-static inline __attribute__ ((const)) int __fls(unsigned long x)
+static inline __attribute__ ((const)) unsigned long __fls(unsigned long x)
 {
 	if (!x)
 		return 0;
--
2.25.1
[GIT PULL] ARC changes for 5.19-rc1
Hi Linus,

Please pull.

Thx,
-Vineet

--->

The following changes since commit af2d861d4cd2a4da5137f795ee3509e6f944a25b:

  Linux 5.18-rc4 (2022-04-24 14:51:22 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc.git/ tags/arc-5.19-rc1

for you to fetch changes up to 6aa98f6217861889523e38b0141c8c71b2ef8a83:

  ARC: bpf: define uapi for BPF_PROG_TYPE_PERF_EVENT program type (2022-04-26 09:35:28 -0700)

----------------------------------------------------------------
ARC changes for 5.19-rc1

- Basic eBPF support (Sergey)

----------------------------------------------------------------
Sergey Matyukevich (4):
      ARC: enable HAVE_REGS_AND_STACK_ACCESS_API feature
      ARC: implement syscall tracepoints
      ARC: disasm: handle ARCv2 case in kprobe get/set functions
      ARC: bpf: define uapi for BPF_PROG_TYPE_PERF_EVENT program type

 arch/arc/Kconfig                           |   2 +
 arch/arc/include/asm/perf_event.h          |   4 +
 arch/arc/include/asm/ptrace.h              |  27 ++
 arch/arc/include/asm/syscall.h             |   2 +
 arch/arc/include/asm/thread_info.h         |   5 +-
 arch/arc/include/uapi/asm/bpf_perf_event.h |   9 ++
 arch/arc/kernel/disasm.c                   |  64 -
 arch/arc/kernel/entry.S                    |  12 +--
 arch/arc/kernel/ptrace.c                   | 140 -
 9 files changed, 253 insertions(+), 12 deletions(-)
 create mode 100644 arch/arc/include/uapi/asm/bpf_perf_event.h
Re: [PATCH v3] mm: Avoid unnecessary page fault retires on shared memory types
On 5/24/22 16:45, Peter Xu wrote:
> I observed that for each of the shared file-backed page faults, we're
> very likely to retry one more time for the 1st write fault upon no page.
> [...]
> Signed-off-by: Peter Xu
> ---
> v3:
> - Rebase to akpm/mm-unstable
> - Copy arch maintainers
> ---
>  arch/arc/mm/fault.c | 4

Acked-by: Vineet Gupta

Thx,
-Vineet
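On the GUP side, the "special care" the cover letter mentions amounts to teaching faultin_page() that VM_FAULT_COMPLETED, like VM_FAULT_RETRY, means the mmap lock has been dropped. A sketch of that shape (a fragment; the real hunk lives in mm/gup.c and may differ in detail):

	ret = handle_mm_fault(vma, address, fault_flags, NULL);

	if (ret & VM_FAULT_COMPLETED) {
		/* Fault succeeded, but handle_mm_fault() dropped mmap_lock. */
		if (locked)
			*locked = 0;
		/* Ask the caller to retake mmap_lock and continue the walk. */
		return -EAGAIN;
	}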
Re: [PATCH 1/2] ARC: bitops: Change __fls to return unsigned long
Hi,

On 5/25/22 07:48, Amadeusz Sławiński wrote:
> As per the asm-generic definition and other architectures, __fls should
> return unsigned long. No functional change is expected, as the return
> value always fits in unsigned long.
>
> Signed-off-by: Amadeusz Sławiński

Applied to for-curr.

Thx,
-Vineet