Re: [PATCH RFC cmpxchg 3/8] ARC: Emulate one-byte and two-byte cmpxchg
On Mon, Apr 1, 2024, at 23:39, Paul E. McKenney wrote:
> Use the new cmpxchg_emu_u8() and cmpxchg_emu_u16() to emulate one-byte
> and two-byte cmpxchg() on arc.
>
> Signed-off-by: Paul E. McKenney

I'm missing the context here, is it now mandatory to have 16-bit
cmpxchg() everywhere? I think we've historically tried hard to
keep this out of common code since it's expensive on architectures
that don't have native 16-bit load/store instructions (alpha, armv3)
and/or sub-word atomics (armv5, riscv, mips).

Does the code that uses this rely on working concurrently with
non-atomic stores to part of the 32-bit word? If we want to
allow that, we need to merge my alpha ev4/45/5 removal series
first.

For the cmpxchg() interface, I would prefer to handle the
8-bit and 16-bit versions the same way as cmpxchg64() and
provide separate cmpxchg8()/cmpxchg16()/cmpxchg32() functions
by architectures that operate on fixed-size integer values
but not compounds or pointers, and a generic cmpxchg() wrapper
in common code that can handle the abstraction for pointers,
long and (if absolutely necessary) compounds by multiplexing
between cmpxchg32() and cmpxchg64() where needed.

I did a prototype a few years ago and found that there are
probably under a dozen users of the sub-word atomics in
the tree, so this mostly requires changes to architecture
code and less to drivers and core code.

     Arnd

___
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc
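As a rough illustration of the layering described in the message above
(fixed-size primitives underneath, one generic wrapper on top), here is a
minimal userspace sketch. It is not taken from any posted patch or from the
prototype mentioned above. The names sketch_cmpxchg32(), sketch_cmpxchg64(),
and sketch_cmpxchg() are hypothetical, the GCC/Clang __atomic builtins stand
in for architecture code, and the wrapper assumes that pointers and long fit
in uintptr_t and are either 32 or 64 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical fixed-size primitives, standing in for what an architecture
 * would provide. Each returns the value observed at *p; the store happened
 * only if that value equals old.
 */
static inline uint32_t sketch_cmpxchg32(uint32_t *p, uint32_t old, uint32_t new_val)
{
	/* On failure the builtin overwrites old with the current contents. */
	__atomic_compare_exchange_n(p, &old, new_val, 0,
				    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
	return old;
}

static inline uint64_t sketch_cmpxchg64(uint64_t *p, uint64_t old, uint64_t new_val)
{
	__atomic_compare_exchange_n(p, &old, new_val, 0,
				    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
	return old;
}

/*
 * Hypothetical generic wrapper for pointers and long: pick the 32-bit or
 * 64-bit primitive from the operand size and cast the result back to the
 * operand type. The pointer casts assume -fno-strict-aliasing semantics,
 * as used for kernel builds.
 */
#define sketch_cmpxchg(ptr, old, new_val) \
	((__typeof__(*(ptr)))(uintptr_t)(sizeof(*(ptr)) == sizeof(uint64_t) \
		? sketch_cmpxchg64((uint64_t *)(ptr), \
				   (uint64_t)(uintptr_t)(old), \
				   (uint64_t)(uintptr_t)(new_val)) \
		: sketch_cmpxchg32((uint32_t *)(ptr), \
				   (uint32_t)(uintptr_t)(old), \
				   (uint32_t)(uintptr_t)(new_val))))

/* Usage: the same wrapper serves a long and a pointer. */
void sketch_cmpxchg_demo(void)
{
	long counter = 5;
	int a = 0, b = 0;
	int *slot = &a;

	assert(sketch_cmpxchg(&counter, 5L, 6L) == 5);	/* match: 6 is stored */
	assert(counter == 6);
	assert(sketch_cmpxchg(&slot, &a, &b) == &a);	/* match: slot now points at b */
	assert(slot == &b);
}
```

In this shape only the two fixed-size functions would need per-architecture
work, while the type handling lives in one place. Sub-word variants along the
lines of cmpxchg8()/cmpxchg16() would sit beside them as further fixed-size
functions rather than as extra cases in the wrapper.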
Re: [PATCH RFC cmpxchg 3/8] ARC: Emulate one-byte and two-byte cmpxchg
On Tue, Apr 02, 2024 at 10:14:08AM +0200, Arnd Bergmann wrote:
> On Mon, Apr 1, 2024, at 23:39, Paul E. McKenney wrote:
> > Use the new cmpxchg_emu_u8() and cmpxchg_emu_u16() to emulate one-byte
> > and two-byte cmpxchg() on arc.
> >
> > Signed-off-by: Paul E. McKenney
>
> I'm missing the context here, is it now mandatory to have 16-bit
> cmpxchg() everywhere? I think we've historically tried hard to
> keep this out of common code since it's expensive on architectures
> that don't have native 16-bit load/store instructions (alpha, armv3)
> and/or sub-word atomics (armv5, riscv, mips).

I need 8-bit, and just added 16-bit because it was easy to do so.
I would be OK dropping the 16-bit portions of this series, assuming
that no-one needs it. And assuming that it is easier to drop it than
to explain why it is not available. ;-)

> Does the code that uses this rely on working concurrently with
> non-atomic stores to part of the 32-bit word? If we want to
> allow that, we need to merge my alpha ev4/45/5 removal series
> first.

For 8-bit cmpxchg(), yes. There are potentially concurrent
smp_load_acquire() and smp_store_release() operations to this same byte.

Or is your question specific to the 16-bit primitives? (Full disclosure:
I have no objection to removing Alpha ev4/45/5, having several times
suggested removing Alpha entirely. And having the scars to prove it.)

> For the cmpxchg() interface, I would prefer to handle the
> 8-bit and 16-bit versions the same way as cmpxchg64() and
> provide separate cmpxchg8()/cmpxchg16()/cmpxchg32() functions
> by architectures that operate on fixed-size integer values
> but not compounds or pointers, and a generic cmpxchg() wrapper
> in common code that can handle the abstraction for pointers,
> long and (if absolutely necessary) compounds by multiplexing
> between cmpxchg32() and cmpxchg64() where needed.

So as to support _acquire(), _relaxed(), and _release()?

If so, I don't have any use cases for other than full ordering.

> I did a prototype a few years ago and found that there are
> probably under a dozen users of the sub-word atomics in
> the tree, so this mostly requires changes to architecture
> code and less to drivers and core code.

Given this approach, the predominance of changes to architecture code
seems quite likely to me.

But do we really wish to invest that much work into architectures that
might not be all that long for the world? (Quickly donning my old
asbestos suit, the one with the tungsten pinstripes...)

							Thanx, Paul
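For readers unfamiliar with the technique under discussion, a one-byte
cmpxchg() can be emulated with a compare-and-swap on the aligned 32-bit word
that contains the byte. The following is a minimal userspace sketch of that
idea, not the cmpxchg_emu_u8() from this series: emu_cmpxchg_u8() and
union u8_32 are names chosen for illustration, and the GCC/Clang __atomic
builtins are assumed in place of the kernel's cmpxchg().

```c
#include <assert.h>
#include <stdint.h>

/*
 * One aligned 32-bit word viewed as four bytes. Indexing through the union
 * keeps the byte selection independent of endianness.
 */
union u8_32 {
	uint8_t b[4];
	uint32_t w;
};

/*
 * Illustrative one-byte compare-and-swap built on a 32-bit one. Returns the
 * byte observed at *p; the store happened only if that value equals old.
 */
uint8_t emu_cmpxchg_u8(volatile uint8_t *p, uint8_t old, uint8_t new_val)
{
	/* Round the address down to the containing aligned 32-bit word. */
	uint32_t *p32 = (uint32_t *)((uintptr_t)p & ~(uintptr_t)3);
	unsigned int i = (uintptr_t)p & 3;	/* byte index within that word */
	union u8_32 old32, new32;

	old32.w = __atomic_load_n(p32, __ATOMIC_RELAXED);
	for (;;) {
		if (old32.b[i] != old)
			return old32.b[i];	/* comparison failed, nothing stored */
		new32.w = old32.w;
		new32.b[i] = new_val;
		/*
		 * Retry if any byte of the word changed in the meantime; on
		 * failure the builtin refreshes old32.w with the current word.
		 */
		if (__atomic_compare_exchange_n(p32, &old32.w, new32.w, 0,
						__ATOMIC_SEQ_CST, __ATOMIC_RELAXED))
			return old;
	}
}

/* Usage: update one byte without disturbing its three neighbours. */
void emu_cmpxchg_u8_demo(void)
{
	union u8_32 v = { .b = { 1, 2, 3, 4 } };

	assert(emu_cmpxchg_u8(&v.b[2], 3, 9) == 3);	/* match: 9 is stored */
	assert(v.b[2] == 9);
	assert(emu_cmpxchg_u8(&v.b[2], 3, 7) == 9);	/* stale old value: no store */
	assert(v.b[0] == 1 && v.b[1] == 2 && v.b[3] == 4);
}
```

The sketch also shows why concurrent plain stores to the neighbouring bytes
matter. The 32-bit compare-and-swap rewrites all four bytes, so it stays
correct only if the hardware performs byte stores without a read-modify-write
of the surrounding word, which is the concern raised above about the older
Alpha variants.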
Re: [PATCH RFC cmpxchg 3/8] ARC: Emulate one-byte and two-byte cmpxchg
On Tue, Apr 02, 2024 at 10:06:14AM -0700, Paul E. McKenney wrote:
> On Tue, Apr 02, 2024 at 10:14:08AM +0200, Arnd Bergmann wrote:
> > On Mon, Apr 1, 2024, at 23:39, Paul E. McKenney wrote:
> > > Use the new cmpxchg_emu_u8() and cmpxchg_emu_u16() to emulate one-byte
> > > and two-byte cmpxchg() on arc.
> > >
> > > Signed-off-by: Paul E. McKenney
> >
> > I'm missing the context here, is it now mandatory to have 16-bit
> > cmpxchg() everywhere? I think we've historically tried hard to
> > keep this out of common code since it's expensive on architectures
> > that don't have native 16-bit load/store instructions (alpha, armv3)
> > and/or sub-word atomics (armv5, riscv, mips).
>
> I need 8-bit, and just added 16-bit because it was easy to do so.
> I would be OK dropping the 16-bit portions of this series, assuming
> that no-one needs it. And assuming that it is easier to drop it than
> to explain why it is not available. ;-)
>
> > Does the code that uses this rely on working concurrently with
> > non-atomic stores to part of the 32-bit word? If we want to
> > allow that, we need to merge my alpha ev4/45/5 removal series
> > first.
>
> For 8-bit cmpxchg(), yes. There are potentially concurrent
> smp_load_acquire() and smp_store_release() operations to this same byte.
>
> Or is your question specific to the 16-bit primitives? (Full disclosure:
> I have no objection to removing Alpha ev4/45/5, having several times
> suggested removing Alpha entirely. And having the scars to prove it.)
>
> > For the cmpxchg() interface, I would prefer to handle the
> > 8-bit and 16-bit versions the same way as cmpxchg64() and
> > provide separate cmpxchg8()/cmpxchg16()/cmpxchg32() functions
> > by architectures that operate on fixed-size integer values
> > but not compounds or pointers, and a generic cmpxchg() wrapper
> > in common code that can handle the abstraction for pointers,
> > long and (if absolutely necessary) compounds by multiplexing
> > between cmpxchg32() and cmpxchg64() where needed.
>
> So as to support _acquire(), _relaxed(), and _release()?
>
> If so, I don't have any use cases for other than full ordering.

Nor any use cases other than integers. (In case another thing you
are after here is good type-checking for non-integers combined with
allowing C-language implicit conversions for integers.)

							Thanx, Paul

> > I did a prototype a few years ago and found that there are
> > probably under a dozen users of the sub-word atomics in
> > the tree, so this mostly requires changes to architecture
> > code and less to drivers and core code.
>
> Given this approach, the predominance of changes to architecture code
> seems quite likely to me.
>
> But do we really wish to invest that much work into architectures that
> might not be all that long for the world? (Quickly donning my old
> asbestos suit, the one with the tungsten pinstripes...)
>
> 							Thanx, Paul