date:20240402

Re: [PATCH RFC cmpxchg 3/8] ARC: Emulate one-byte and two-byte cmpxchg

2024-04-02 Thread Arnd Bergmann

On Mon, Apr 1, 2024, at 23:39, Paul E. McKenney wrote:
> Use the new cmpxchg_emu_u8() and cmpxchg_emu_u16() to emulate one-byte
> and two-byte cmpxchg() on arc.
>
> Signed-off-by: Paul E. McKenney 

I'm missing the context here, is it now mandatory to have 16-bit
cmpxchg() everywhere? I think we've historically tried hard to
keep this out of common code since it's expensive on architectures
that don't have native 16-bit load/store instructions (alpha, armv3)
and or sub-word atomics (armv5, riscv, mips).

Does the code that uses this rely on working concurrently with
non-atomic stores to part of the 32-bit word? If we want to
allow that, we need to merge my alpha ev4/45/5 removal series
first.

For the cmpxchg() interface, I would prefer to handle the
8-bit and 16-bit versions the same way as cmpxchg64() and
provide separate cmpxchg8()/cmpxchg16()/cmpxchg32() functions
by architectures that operate on fixed-size integer values
but not compounds or pointers, and a generic cmpxchg() wrapper
in common code that can handle the abtraction for pointers,
long and (if absolutely necessary) compounds by multiplexing
between cmpxchg32() and cmpxchg64() where needed.

I did a prototype a few years ago and found that there is
probably under a dozen users of the sub-word atomics in
the tree, so this mostly requires changes to architecture
code and less to drivers and core code.

  Arnd

___
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc

Re: [PATCH RFC cmpxchg 3/8] ARC: Emulate one-byte and two-byte cmpxchg

2024-04-02 Thread Paul E. McKenney

On Tue, Apr 02, 2024 at 10:14:08AM +0200, Arnd Bergmann wrote:
> On Mon, Apr 1, 2024, at 23:39, Paul E. McKenney wrote:
> > Use the new cmpxchg_emu_u8() and cmpxchg_emu_u16() to emulate one-byte
> > and two-byte cmpxchg() on arc.
> >
> > Signed-off-by: Paul E. McKenney 
> 
> I'm missing the context here, is it now mandatory to have 16-bit
> cmpxchg() everywhere? I think we've historically tried hard to
> keep this out of common code since it's expensive on architectures
> that don't have native 16-bit load/store instructions (alpha, armv3)
> and or sub-word atomics (armv5, riscv, mips).

I need 8-bit, and just added 16-bit because it was easy to do so.
I would be OK dropping the 16-bit portions of this series, assuming
that no-one needs it.  And assuming that it is easier to drop it than
to explain why it is not available.  ;-)

> Does the code that uses this rely on working concurrently with
> non-atomic stores to part of the 32-bit word? If we want to
> allow that, we need to merge my alpha ev4/45/5 removal series
> first.

For 8-but cmpxchg(), yes.  There are potentially concurrent
smp_load_acquire() and smp_store_release() operations to this same byte.

Or is your question specific to the 16-bit primitives?  (Full disclosure:
I have no objection to removing Alpha ev4/45/5, having several times
suggested removing Alpha entirely.  And having the scars to prove it.)

> For the cmpxchg() interface, I would prefer to handle the
> 8-bit and 16-bit versions the same way as cmpxchg64() and
> provide separate cmpxchg8()/cmpxchg16()/cmpxchg32() functions
> by architectures that operate on fixed-size integer values
> but not compounds or pointers, and a generic cmpxchg() wrapper
> in common code that can handle the abtraction for pointers,
> long and (if absolutely necessary) compounds by multiplexing
> between cmpxchg32() and cmpxchg64() where needed.

So as to support _acquire(), _relaxed(), and _release()?

If so, I don't have any use cases for other than full ordering.

> I did a prototype a few years ago and found that there is
> probably under a dozen users of the sub-word atomics in
> the tree, so this mostly requires changes to architecture
> code and less to drivers and core code.

Given this approach, the predominance of changes to architecture code
seems quite likely to me.

But do we really wish to invest that much work into architectures that
might not be all that long for the world?  (Quickly donning my old
asbestos suit, the one with the tungsten pinstripes...)

Thanx, Paul

___
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc

Re: [PATCH RFC cmpxchg 3/8] ARC: Emulate one-byte and two-byte cmpxchg

2024-04-02 Thread Paul E. McKenney

On Tue, Apr 02, 2024 at 10:06:14AM -0700, Paul E. McKenney wrote:
> On Tue, Apr 02, 2024 at 10:14:08AM +0200, Arnd Bergmann wrote:
> > On Mon, Apr 1, 2024, at 23:39, Paul E. McKenney wrote:
> > > Use the new cmpxchg_emu_u8() and cmpxchg_emu_u16() to emulate one-byte
> > > and two-byte cmpxchg() on arc.
> > >
> > > Signed-off-by: Paul E. McKenney 
> > 
> > I'm missing the context here, is it now mandatory to have 16-bit
> > cmpxchg() everywhere? I think we've historically tried hard to
> > keep this out of common code since it's expensive on architectures
> > that don't have native 16-bit load/store instructions (alpha, armv3)
> > and or sub-word atomics (armv5, riscv, mips).
> 
> I need 8-bit, and just added 16-bit because it was easy to do so.
> I would be OK dropping the 16-bit portions of this series, assuming
> that no-one needs it.  And assuming that it is easier to drop it than
> to explain why it is not available.  ;-)
> 
> > Does the code that uses this rely on working concurrently with
> > non-atomic stores to part of the 32-bit word? If we want to
> > allow that, we need to merge my alpha ev4/45/5 removal series
> > first.
> 
> For 8-but cmpxchg(), yes.  There are potentially concurrent
> smp_load_acquire() and smp_store_release() operations to this same byte.
> 
> Or is your question specific to the 16-bit primitives?  (Full disclosure:
> I have no objection to removing Alpha ev4/45/5, having several times
> suggested removing Alpha entirely.  And having the scars to prove it.)
> 
> > For the cmpxchg() interface, I would prefer to handle the
> > 8-bit and 16-bit versions the same way as cmpxchg64() and
> > provide separate cmpxchg8()/cmpxchg16()/cmpxchg32() functions
> > by architectures that operate on fixed-size integer values
> > but not compounds or pointers, and a generic cmpxchg() wrapper
> > in common code that can handle the abtraction for pointers,
> > long and (if absolutely necessary) compounds by multiplexing
> > between cmpxchg32() and cmpxchg64() where needed.
> 
> So as to support _acquire(), _relaxed(), and _release()?
> 
> If so, I don't have any use cases for other than full ordering.

Nor any use cases other than integers.  (In case another thing you are
after here is good type-checking for non-integers combined with allowing
C-language implicit conversions for integers.)

Thanx, Paul

> > I did a prototype a few years ago and found that there is
> > probably under a dozen users of the sub-word atomics in
> > the tree, so this mostly requires changes to architecture
> > code and less to drivers and core code.
> 
> Given this approach, the predominance of changes to architecture code
> seems quite likely to me.
> 
> But do we really wish to invest that much work into architectures that
> might not be all that long for the world?  (Quickly donning my old
> asbestos suit, the one with the tungsten pinstripes...)
> 
>   Thanx, Paul

___
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc

Re: [PATCH RFC cmpxchg 3/8] ARC: Emulate one-byte and two-byte cmpxchg

Re: [PATCH RFC cmpxchg 3/8] ARC: Emulate one-byte and two-byte cmpxchg

Re: [PATCH RFC cmpxchg 3/8] ARC: Emulate one-byte and two-byte cmpxchg

3 matches

Site Navigation

Mail list logo

Footer information