Re: [PATCH 00/21] dma-mapping: unify support for cache flushes
Hi Arnd,

On Mon, Mar 27, 2023 at 1:14 PM Arnd Bergmann wrote:
>
> From: Arnd Bergmann
>
> After a long discussion about adding SoC specific semantics for when
> to flush caches in drivers/soc/ drivers that we determined to be
> fundamentally flawed[1], I volunteered to try to move that logic into
> architecture-independent code and make all existing architectures do
> the same thing.
>
> As we had determined earlier, the behavior is wildly different across
> architectures, but most of the differences come down to either bugs
> (when required flushes are missing) or extra flushes that are harmless
> but might hurt performance.
>
> I finally found the time to come up with an implementation of this, which
> starts by replacing every outlier with one of the three common options:
>
>  1. architectures without speculative prefetching (hexagon, m68k,
>     openrisc, sh, sparc, and certain armv4 and xtensa implementations)
>     only flush their caches before a DMA, by cleaning write-back caches
>     (if any) before a DMA to the device, and by invalidating the caches
>     before a DMA from a device
>
>  2. arc, microblaze, mips, nios2, sh and later xtensa now follow the
>     normal 32-bit arm model and invalidate their writeback caches
>     again after a DMA from the device, to remove stale cache lines
>     that got prefetched during the DMA. arc, csky and mips used to
>     invalidate buffers also before the bidirectional DMA, but this
>     is now skipped whenever we know it gets invalidated again
>     after the DMA.
>
>  3. parisc, powerpc and riscv already flushed buffers before
>     a DMA_FROM_DEVICE, and these get moved to the arm64 behavior
>     that does the writeback before and invalidate after both
>     DMA_FROM_DEVICE and DMA_BIDIRECTIONAL in order to avoid the
>     problem of accidentally leaking stale data if the DMA does
>     not actually happen[2].
>
> The last patch in the series replaces the architecture specific code
> with a shared version that implements all three based on architecture
> specific parameters that are almost always determined at compile time.
>
> The difference between cases 1. and 2. is hardware specific, while between
> 2. and 3. we need to decide which semantics we want, but I explicitly
> avoid this question in my series and leave it to be decided later.
>
> Another difference that I do not address here is what cache invalidation
> does for partial cache lines. On arm32, arm64 and powerpc, a partial
> cache line always gets written back before invalidation in order to
> ensure that data before or after the buffer is not discarded. On all
> other architectures, the assumption is that cache lines are never shared
> between a DMA buffer and data that is accessed by the CPU. If we end up
> always writing back dirty cache lines before a DMA (option 3 above),
> then this point becomes moot, otherwise we should probably address this
> in a follow-up series to document one behavior or the other and implement
> it consistently.
>
> Please review!
>
>       Arnd
>
> [1] https://lore.kernel.org/all/20221212115505.36770-1-prabhakar.mahadev-lad...@bp.renesas.com/
> [2] https://lore.kernel.org/all/20220606152150.GA31568@willie-the-truck/
>
> Arnd Bergmann (21):
>   openrisc: dma-mapping: flush bidirectional mappings
>   xtensa: dma-mapping: use normal cache invalidation rules
>   sparc32: flush caches in dma_sync_*for_device
>   microblaze: dma-mapping: skip extra DMA flushes
>   powerpc: dma-mapping: split out cache operation logic
>   powerpc: dma-mapping: minimize for_cpu flushing
>   powerpc: dma-mapping: always clean cache in _for_device() op
>   riscv: dma-mapping: only invalidate after DMA, not flush
>   riscv: dma-mapping: skip invalidation before bidirectional DMA
>   csky: dma-mapping: skip invalidating before DMA from device
>   mips: dma-mapping: skip invalidating before bidirectional DMA
>   mips: dma-mapping: split out cache operation logic
>   arc: dma-mapping: skip invalidating before bidirectional DMA
>   parisc: dma-mapping: use regular flush/invalidate ops
>   ARM: dma-mapping: always invalidate WT caches before DMA
>   ARM: dma-mapping: bring back dmac_{clean,inv}_range
>   ARM: dma-mapping: use arch_sync_dma_for_{device,cpu}() internally
>   ARM: drop SMP support for ARM11MPCore
>   ARM: dma-mapping: use generic form of arch_sync_dma_* helpers
>   ARM: dma-mapping: split out arch_dma_mark_clean() helper
>   dma-mapping: replace custom code with generic implementation
>
Do you plan to send v2 for this series?

Cheers,
Prabhakar

>  arch/arc/mm/dma.c                  | 66 ++--
>  arch/arm/Kconfig                   |  4 +
>  arch/arm/include/asm/cacheflush.h  | 21 +++
>  arch/arm/include/asm/glue-cache.h  |  4 +
>  arch/arm/mach-oxnas/Kconfig        |  4 -
>  arch/arm/mach-oxnas/Makefile       |  1 -
>  arch/arm/mach-oxnas/headsmp.S      | 23 ---
>  arch/arm/mach-oxnas/platsmp.c      |
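For readers following the three models in the cover letter above, here is a rough, standalone sketch of how the per-direction cache maintenance could be parameterized. It is ordinary userspace C with printf stubs in place of the real cache primitives; the two ARCH_* knobs and all helper names are invented for illustration and are not the interfaces added by the actual series.

/*
 * Rough sketch only: a userspace model of the three cache maintenance
 * policies described in the cover letter. The two ARCH_* knobs and all
 * helper names are invented for illustration.
 */
#include <stdio.h>
#include <stddef.h>

enum dma_data_direction { DMA_BIDIRECTIONAL, DMA_TO_DEVICE, DMA_FROM_DEVICE };

#define ARCH_HAS_SPECULATIVE_PREFETCH	1	/* cases 2 and 3 */
#define ARCH_CLEAN_BEFORE_FROM_DEVICE	1	/* case 3 only */

/* Stand-ins for the real cache maintenance primitives. */
static void cache_clean(unsigned long paddr, size_t size)
{
	printf("clean (write back) %#lx + %zu\n", paddr, size);
}

static void cache_inv(unsigned long paddr, size_t size)
{
	printf("invalidate         %#lx + %zu\n", paddr, size);
}

/* Called before handing the buffer to the device. */
static void sync_for_device(unsigned long paddr, size_t size,
			    enum dma_data_direction dir)
{
	switch (dir) {
	case DMA_TO_DEVICE:
		cache_clean(paddr, size);	/* push CPU writes out to memory */
		break;
	case DMA_FROM_DEVICE:
		if (ARCH_CLEAN_BEFORE_FROM_DEVICE)
			cache_clean(paddr, size); /* case 3: never expose stale data */
		else
			cache_inv(paddr, size);	  /* cases 1 and 2 */
		break;
	case DMA_BIDIRECTIONAL:
		cache_clean(paddr, size);
		if (!ARCH_HAS_SPECULATIVE_PREFETCH)
			cache_inv(paddr, size);	/* case 1: no post-DMA invalidate, so do it now */
		break;
	}
}

/* Called after the device has finished with the buffer. */
static void sync_for_cpu(unsigned long paddr, size_t size,
			 enum dma_data_direction dir)
{
	/* case 1: no speculation, so nothing can have crept back into the cache */
	if (ARCH_HAS_SPECULATIVE_PREFETCH && dir != DMA_TO_DEVICE)
		cache_inv(paddr, size);	/* drop lines prefetched during the DMA */
}

int main(void)
{
	sync_for_device(0x1000, 256, DMA_FROM_DEVICE);
	sync_for_cpu(0x1000, 256, DMA_FROM_DEVICE);
	return 0;
}

Setting both knobs to 0 reproduces case 1, and setting only ARCH_CLEAN_BEFORE_FROM_DEVICE to 0 reproduces case 2, so the whole spectrum reduces to two compile-time parameters in this simplified view.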
[RESEND PATCH 1/2] ARC: fix incorrect THREAD_SHIFT definition
The current definition of THREAD_SHIFT is (PAGE_SHIFT << THREAD_SIZE_ORDER).
It should be (PAGE_SHIFT + THREAD_SIZE_ORDER) because the following
equation should hold:

Say PAGE_SHIFT == 13 (the default value on ARC)
    THREAD_SIZE_ORDER == 1 (as CONFIG_16KSTACKS=y)

THREAD_SIZE == (1 << THREAD_SHIFT)
            == (1 << (PAGE_SHIFT + THREAD_SIZE_ORDER))
            == (1 << 14)
            == 16KB

Signed-off-by: Min-Hua Chen
---
 arch/arc/include/asm/thread_info.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arc/include/asm/thread_info.h b/arch/arc/include/asm/thread_info.h
index 6ba7fe417095..9f9dd021501c 100644
--- a/arch/arc/include/asm/thread_info.h
+++ b/arch/arc/include/asm/thread_info.h
@@ -22,7 +22,7 @@
 #endif
 
 #define THREAD_SIZE     (PAGE_SIZE << THREAD_SIZE_ORDER)
-#define THREAD_SHIFT    (PAGE_SHIFT << THREAD_SIZE_ORDER)
+#define THREAD_SHIFT    (PAGE_SHIFT + THREAD_SIZE_ORDER)
 
 #ifndef __ASSEMBLY__
-- 
2.34.1
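A quick standalone check of the arithmetic in the commit message, with PAGE_SHIFT and THREAD_SIZE_ORDER hard-coded to the values from the example (13 and 1). The macro names mirror the kernel's, but this is illustration only, not kernel code.

/* Demonstrates that the old THREAD_SHIFT formula disagrees with
 * THREAD_SIZE, while the fixed one matches it. Values are hard-coded
 * purely for illustration. */
#include <stdio.h>

#define PAGE_SHIFT		13
#define PAGE_SIZE		(1UL << PAGE_SHIFT)
#define THREAD_SIZE_ORDER	1

#define THREAD_SIZE		(PAGE_SIZE << THREAD_SIZE_ORDER)

#define THREAD_SHIFT_OLD	(PAGE_SHIFT << THREAD_SIZE_ORDER)	/* buggy: 26 */
#define THREAD_SHIFT_NEW	(PAGE_SHIFT + THREAD_SIZE_ORDER)	/* fixed: 14 */

int main(void)
{
	printf("THREAD_SIZE           = %lu (16KB)\n", THREAD_SIZE);
	printf("1 << THREAD_SHIFT_OLD = %lu\n", 1UL << THREAD_SHIFT_OLD);
	printf("1 << THREAD_SHIFT_NEW = %lu\n", 1UL << THREAD_SHIFT_NEW);
	return 0;
}

Only the fixed formula satisfies THREAD_SIZE == (1 << THREAD_SHIFT).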
[RESEND PATCH 0/2] ARC: fix THREAD_SHIFT and rename 16KSTACKS
Hi,

While reading the arch/arc code I noticed that the definition of
THREAD_SHIFT looks incorrect, and that the description of 16KSTACKS is
confusing because there are multiple possible values of PAGE_SHIFT.
These patches address the issues I found.

Min-Hua Chen (2):
  ARC: fix incorrect THREAD_SHIFT definition
  ARC: rename 16KSTACKS to DEBUG_STACKS

 arch/arc/Kconfig.debug             | 7 ++++---
 arch/arc/include/asm/thread_info.h | 4 ++--
 2 files changed, 6 insertions(+), 5 deletions(-)

-- 
2.34.1
[RESEND PATCH 2/2] ARC: rename 16KSTACKS to DEBUG_STACKS
Rename 16KSTACKS to DEBUG_STACKS.

arch/arc/Kconfig.debug says that the default stack size is 8KB and that
the stack becomes 16KB if 16KSTACKS is set. However, the stack size is
based on PAGE_SIZE, which is configurable via CONFIG_ARC_PAGE_SIZE_16K
or CONFIG_ARC_PAGE_SIZE_4K.

When CONFIG_16KSTACKS=y:

PAGE_SHIFT==12 => THREAD_SIZE == (1 << (12 + 1)) = 8KB
PAGE_SHIFT==13 => THREAD_SIZE == (1 << (13 + 1)) = 16KB
PAGE_SHIFT==14 => THREAD_SIZE == (1 << (14 + 1)) = 32KB

We only get a 16KB stack when PAGE_SHIFT is 13.

See arch/arc/include/uapi/asm/page.h:

/* PAGE_SHIFT determines the page size */
#if defined(CONFIG_ARC_PAGE_SIZE_16K)
#define PAGE_SHIFT 14
#elif defined(CONFIG_ARC_PAGE_SIZE_4K)
#define PAGE_SHIFT 12
#else
#define PAGE_SHIFT 13
#endif

See arch/arc/include/asm/thread_info.h:

#ifdef CONFIG_DEBUG_STACKS
#define THREAD_SIZE_ORDER 1
#else
#define THREAD_SIZE_ORDER 0
#endif

#define THREAD_SIZE (PAGE_SIZE << THREAD_SIZE_ORDER)
#define THREAD_SHIFT (PAGE_SHIFT + THREAD_SIZE_ORDER)

To make CONFIG_16KSTACKS less confusing, rename it to DEBUG_STACKS (as it
is defined in Kconfig.debug) and update the Kconfig description.

No functional changes intended.

Signed-off-by: Min-Hua Chen
---
 arch/arc/Kconfig.debug             | 7 ++++---
 arch/arc/include/asm/thread_info.h | 2 +-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/arc/Kconfig.debug b/arch/arc/Kconfig.debug
index 45add86decd5..9a1e140605c4 100644
--- a/arch/arc/Kconfig.debug
+++ b/arch/arc/Kconfig.debug
@@ -1,10 +1,11 @@
 # SPDX-License-Identifier: GPL-2.0
 
-config 16KSTACKS
-	bool "Use 16Kb for kernel stacks instead of 8Kb"
+config DEBUG_STACKS
+	bool "Use double sized kernel stacks"
 	help
-	  If you say Y here the kernel will use a 16Kb stacksize for the
+	  If you say Y here the kernel will use a double sized stack for the
 	  kernel stack attached to each process/thread. The default is 8K.
+	  (depends on CONFIG_ARC_PAGE_SIZE_16K or CONFIG_ARC_PAGE_SIZE_4K)
 	  This increases the resident kernel footprint and will cause less
 	  threads to run on the system and also increase the pressure
 	  on the VM subsystem for higher order allocations.
diff --git a/arch/arc/include/asm/thread_info.h b/arch/arc/include/asm/thread_info.h
index 9f9dd021501c..a7358d1225a6 100644
--- a/arch/arc/include/asm/thread_info.h
+++ b/arch/arc/include/asm/thread_info.h
@@ -15,7 +15,7 @@
 
 #include <asm/page.h>
 
-#ifdef CONFIG_16KSTACKS
+#ifdef CONFIG_DEBUG_STACKS
 #define THREAD_SIZE_ORDER 1
 #else
 #define THREAD_SIZE_ORDER 0
-- 
2.34.1
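A small standalone program that reproduces the table from the commit message, showing why the doubled stack is 16KB only for the default PAGE_SHIFT of 13. The values are hard-coded for illustration and do not come from the kernel headers.

/* Reproduces the commit message table: with the doubled stack option
 * enabled (THREAD_SIZE_ORDER = 1), the resulting stack size depends on
 * PAGE_SHIFT, so "16KSTACKS" is only accurate for PAGE_SHIFT == 13. */
#include <stdio.h>

int main(void)
{
	const int thread_size_order = 1;		/* doubled stack enabled */
	const int page_shifts[] = { 12, 13, 14 };	/* 4K, 8K, 16K pages */

	for (size_t i = 0; i < sizeof(page_shifts) / sizeof(page_shifts[0]); i++) {
		unsigned long thread_size =
			1UL << (page_shifts[i] + thread_size_order);
		printf("PAGE_SHIFT=%d -> THREAD_SIZE=%luKB\n",
		       page_shifts[i], thread_size >> 10);
	}
	return 0;
}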
Re: [PATCH] mm/slab: rename CONFIG_SLAB to CONFIG_SLAB_DEPRECATED
On 5/24/23 02:29, David Rientjes wrote:
> On Tue, 23 May 2023, Vlastimil Babka wrote:
>
>> As discussed at LSF/MM [1] [2] and with no objections raised there,
>> deprecate the SLAB allocator. Rename the user-visible option so that
>> users with CONFIG_SLAB=y get a new prompt with an explanation during
>> make oldconfig, while make olddefconfig will just switch to SLUB.
>>
>> In all defconfigs with CONFIG_SLAB=y remove the line so those also
>> switch to SLUB. Regressions due to the switch should be reported to
>> linux-mm and the slab maintainers.
>>
>> [1] https://lore.kernel.org/all/4b9fc9c6-b48c-198f-5f80-811a44737...@suse.cz/
>> [2] https://lwn.net/Articles/932201/
>>
>> Signed-off-by: Vlastimil Babka
>
> Acked-by: David Rientjes

I tested SLUB on parisc with 32- and 64-bit kernels, so you may add:

Acked-by: Helge Deller  # parisc

Helge