Re: [PATCH RFC cmpxchg 3/8] ARC: Emulate one-byte and two-byte cmpxchg

2024-04-04 Thread Arnd Bergmann
On Tue, Apr 2, 2024, at 19:06, Paul E. McKenney wrote:
> On Tue, Apr 02, 2024 at 10:14:08AM +0200, Arnd Bergmann wrote:
>> On Mon, Apr 1, 2024, at 23:39, Paul E. McKenney wrote:
>> > Use the new cmpxchg_emu_u8() and cmpxchg_emu_u16() to emulate one-byte
>> > and two-byte cmpxchg() on arc.
>> >
>> > Signed-off-by: Paul E. McKenney 
>> 
>> I'm missing the context here, is it now mandatory to have 16-bit
>> cmpxchg() everywhere? I think we've historically tried hard to
>> keep this out of common code since it's expensive on architectures
>> that don't have native 16-bit load/store instructions (alpha, armv3)
>> and or sub-word atomics (armv5, riscv, mips).
>
> I need 8-bit, and just added 16-bit because it was easy to do so.
> I would be OK dropping the 16-bit portions of this series, assuming
> that no-one needs it.  And assuming that it is easier to drop it than
> to explain why it is not available.  ;-)

It certainly makes sense to handle both the same, whichever
way we do this.

>> Does the code that uses this rely on working concurrently with
>> non-atomic stores to part of the 32-bit word? If we want to
>> allow that, we need to merge my alpha ev4/45/5 removal series
>> first.
>
> For 8-but cmpxchg(), yes.  There are potentially concurrent
> smp_load_acquire() and smp_store_release() operations to this same byte.
>
> Or is your question specific to the 16-bit primitives?  (Full disclosure:
> I have no objection to removing Alpha ev4/45/5, having several times
> suggested removing Alpha entirely.  And having the scars to prove it.)

For the native sub-word access, alpha ev5 cannot do either of
them, while armv3 could do byte access but not 16-bit words.

It sounds like the old alphas are already broken then, which
is a good reason to finally drop support. It would appear that
the arm riscpc is not affected by this though.

>> For the cmpxchg() interface, I would prefer to handle the
>> 8-bit and 16-bit versions the same way as cmpxchg64() and
>> provide separate cmpxchg8()/cmpxchg16()/cmpxchg32() functions
>> by architectures that operate on fixed-size integer values
>> but not compounds or pointers, and a generic cmpxchg() wrapper
>> in common code that can handle the abtraction for pointers,
>> long and (if absolutely necessary) compounds by multiplexing
>> between cmpxchg32() and cmpxchg64() where needed.
>
> So as to support _acquire(), _relaxed(), and _release()?
>
> If so, I don't have any use cases for other than full ordering.

My main goal here would be to simplify the definition of
the very commonly used cmpxchg() macro so it doesn't have
to deal with so many corner cases, and make it work the
same way across all architectures. Without the type
agnostic wrapper, there would also be the benefit of
additional type checking that we get by replacing the
macros with inline functions.

We'd still need all the combinations of cmpxchg() and xchg()
with the four sizes and ordering variants, but at least the
latter should easily collapse on most architectures. At the
moment, most architectures only provide the full-ordering
version.

  Arnd

___
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc


Re: [PATCH RFC cmpxchg 3/8] ARC: Emulate one-byte and two-byte cmpxchg

2024-04-04 Thread Paul E. McKenney
On Thu, Apr 04, 2024 at 01:57:32PM +0200, Arnd Bergmann wrote:
> On Tue, Apr 2, 2024, at 19:06, Paul E. McKenney wrote:
> > On Tue, Apr 02, 2024 at 10:14:08AM +0200, Arnd Bergmann wrote:
> >> On Mon, Apr 1, 2024, at 23:39, Paul E. McKenney wrote:
> >> > Use the new cmpxchg_emu_u8() and cmpxchg_emu_u16() to emulate one-byte
> >> > and two-byte cmpxchg() on arc.
> >> >
> >> > Signed-off-by: Paul E. McKenney 
> >> 
> >> I'm missing the context here, is it now mandatory to have 16-bit
> >> cmpxchg() everywhere? I think we've historically tried hard to
> >> keep this out of common code since it's expensive on architectures
> >> that don't have native 16-bit load/store instructions (alpha, armv3)
> >> and or sub-word atomics (armv5, riscv, mips).
> >
> > I need 8-bit, and just added 16-bit because it was easy to do so.
> > I would be OK dropping the 16-bit portions of this series, assuming
> > that no-one needs it.  And assuming that it is easier to drop it than
> > to explain why it is not available.  ;-)
> 
> It certainly makes sense to handle both the same, whichever
> way we do this.

Agreed, at least as long as the properties of the relevant hardware are
consistent with doing so.

> >> Does the code that uses this rely on working concurrently with
> >> non-atomic stores to part of the 32-bit word? If we want to
> >> allow that, we need to merge my alpha ev4/45/5 removal series
> >> first.
> >
> > For 8-but cmpxchg(), yes.  There are potentially concurrent
> > smp_load_acquire() and smp_store_release() operations to this same byte.
> >
> > Or is your question specific to the 16-bit primitives?  (Full disclosure:
> > I have no objection to removing Alpha ev4/45/5, having several times
> > suggested removing Alpha entirely.  And having the scars to prove it.)
> 
> For the native sub-word access, alpha ev5 cannot do either of
> them, while armv3 could do byte access but not 16-bit words.
> 
> It sounds like the old alphas are already broken then, which
> is a good reason to finally drop support. It would appear that
> the arm riscpc is not affected by this though.

Good point, given that single-byte load/store and emulated single-byte
cmpxchg() has been in common code for some time.

So armv3 is OK with one-byte emulated cmpxchg(), but not with two-byte,
which is consistent with the current state of this series in the -rcu
tree.  (My plan is to wait until Monday to re-send the series in order
to allow the test robots to find yet more bugs.)

Or is the plan to also drop support for armv3 in the near term?

> >> For the cmpxchg() interface, I would prefer to handle the
> >> 8-bit and 16-bit versions the same way as cmpxchg64() and
> >> provide separate cmpxchg8()/cmpxchg16()/cmpxchg32() functions
> >> by architectures that operate on fixed-size integer values
> >> but not compounds or pointers, and a generic cmpxchg() wrapper
> >> in common code that can handle the abtraction for pointers,
> >> long and (if absolutely necessary) compounds by multiplexing
> >> between cmpxchg32() and cmpxchg64() where needed.
> >
> > So as to support _acquire(), _relaxed(), and _release()?
> >
> > If so, I don't have any use cases for other than full ordering.
> 
> My main goal here would be to simplify the definition of
> the very commonly used cmpxchg() macro so it doesn't have
> to deal with so many corner cases, and make it work the
> same way across all architectures. Without the type
> agnostic wrapper, there would also be the benefit of
> additional type checking that we get by replacing the
> macros with inline functions.
> 
> We'd still need all the combinations of cmpxchg() and xchg()
> with the four sizes and ordering variants, but at least the
> latter should easily collapse on most architectures. At the
> moment, most architectures only provide the full-ordering
> version.

That does make a lot of sense to me.  Though C-language inline functions
have some trouble with type-generic parameters.

However, my patch is down at architecture-specific level, so should not
affect the cmpxchg() macro, right?  Or am I missing some aspect of your
proposed refactoring?

Thanx, Paul

___
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc


Re: [PATCH RFC cmpxchg 3/8] ARC: Emulate one-byte and two-byte cmpxchg

2024-04-04 Thread Arnd Bergmann
On Thu, Apr 4, 2024, at 16:44, Paul E. McKenney wrote:
> On Thu, Apr 04, 2024 at 01:57:32PM +0200, Arnd Bergmann wrote:
>> On Tue, Apr 2, 2024, at 19:06, Paul E. McKenney wrote:
>> > Or is your question specific to the 16-bit primitives?  (Full disclosure:
>> > I have no objection to removing Alpha ev4/45/5, having several times
>> > suggested removing Alpha entirely.  And having the scars to prove it.)
>> 
>> For the native sub-word access, alpha ev5 cannot do either of
>> them, while armv3 could do byte access but not 16-bit words.
>> 
>> It sounds like the old alphas are already broken then, which
>> is a good reason to finally drop support. It would appear that
>> the arm riscpc is not affected by this though.
>
> Good point, given that single-byte load/store and emulated single-byte
> cmpxchg() has been in common code for some time.
>
> So armv3 is OK with one-byte emulated cmpxchg(), but not with two-byte,
> which is consistent with the current state of this series in the -rcu
> tree.  (My plan is to wait until Monday to re-send the series in order
> to allow the test robots to find yet more bugs.)
>
> Or is the plan to also drop support for armv3 in the near term?

Russell still has his RiscPC, which is probably the only one
using armv3 kernels (it's actually v4 but relies on -march=armv3
to work around hardware quirks). Since armv3 support was dropped
in gcc-9, it's a matter of time before we drop this as well, but
it's not now.

>> We'd still need all the combinations of cmpxchg() and xchg()
>> with the four sizes and ordering variants, but at least the
>> latter should easily collapse on most architectures. At the
>> moment, most architectures only provide the full-ordering
>> version.
>
> That does make a lot of sense to me.  Though C-language inline functions
> have some trouble with type-generic parameters.
>
> However, my patch is down at architecture-specific level, so should not
> affect the cmpxchg() macro, right?  Or am I missing some aspect of your
> proposed refactoring?

Today, arch_cmpxchg() and its variants are defined in each
architecture to handle some subset of integer sizes.

The way I'd like to do this in the future would be to remove that
macro in favor of arch_{cmp,}xchg{8,16,32,64}{,_relaxed,_release,
_acquire,_local} inline functions that each architecture needs
to provide either directly or through a generic helper, with
all the macros wrapping them moved to common code.

I've been wanting to change this for years now and it never
quite makes it to the top of my todo pile, so I guess it can wait
a little longer.

 Arnd

___
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc