On Sat, 23 Nov 2024, Maciej W. Rozycki wrote:
> Correct an issue with 8-bit/16-bit stores to `_Atomic' QI and HI data
> objects producing non-BWX assembly such as:
>
> mb
> insql $17,$16,$17
> ldq_u $1,0($16)
> mskwl $1,$16,$1
> bis $17,$1,$17
> stq_u $17,0($16)
> mb
>
> for:
>
> void
> atomic_store16 (_Atomic unsigned short *p, unsigned short v)
> {
> *p = v;
> }
>
> where an unprotected RMW sequence does not indeed guarantee atomicity of
> the access to the data object addressed. This is because Alpha machine
> description does not provide `atomic_storeMODE' patterns and in their
> absence the middle end assumes plain stores are atomic for data objects
> whose size is up to the target's word size, which does not stand for
> non-BWX Alpha hardware, where only SI and DI mode stores are atomic.
>
> Fix the issue by adding `atomic_storeQI' and `atomic_storeHI' patterns,
> derived from corresponding `atomic_exchangeQI' and `atomic_exchangeHI'
> ones, but with original data not retrieved from the object addressed.
>
> Issue reported by Arnd Bergmann.
>
> gcc/
> PR target/117759
> * config/alpha/alpha-protos.h (alpha_expand_atomic_store_12)
> (alpha_split_atomic_store_12): New prototypes.
> * config/alpha/alpha.cc (alpha_expand_atomic_store_12)
> (alpha_split_atomic_store_12): New functions.
> * config/alpha/sync.md (atomic_store<mode>): New expander.
> (atomic_store<mode>_1): New insn and splitter.
>
> gcc/testsuite/
> PR target/117759
> * gcc.target/alpha/atomic-clear-hi.c: New file.
> * gcc.target/alpha/atomic-clear-qi.c: New file.
> * gcc.target/alpha/atomic-store-di.c: New file.
> * gcc.target/alpha/atomic-store-hi.c: New file.
> * gcc.target/alpha/atomic-store-hi-bwx.c: New file.
> * gcc.target/alpha/atomic-store-qi.c: New file.
> * gcc.target/alpha/atomic-store-qi-bwx.c: New file.
> * gcc.target/alpha/atomic-store-si.c: New file.
> ---
I need to withdraw the fix for this PR, it is a red herring that has
misled me.
While there is a data race here it is not within access to the atomic
data object, it comes from the *outside*. Consider this code:
unsigned short x;
_Atomic unsigned short y;
unsigned short z;
void
write_x (unsigned short v)
{
x = v;
}
void
write_y (unsigned short v)
{
y = v;
}
void
write_z (unsigned short v)
{
z = v;
}
Data objects defined here compile to:
.globl z
.section .sbss,"aw"
.type z, @object
.size z, 2
.align 1
z:
.zero 2
.globl y
.type y, @object
.size y, 2
.align 1
y:
.zero 2
.globl x
.type x, @object
.size x, 2
.align 1
x:
.zero 2
There is no data race with parallel calls to `write_y' on different CPUs,
because the previous value of `y' is ignored. There is a race condition
between the calls, but it is fine and can be resolved by the code author
if required.
There is a data race with parallel calls to, depending on final alignment,
either `write_x' and `write_y', or `write_z' and `write_y', e.g:
time CPU0 CPU1 CPU2
|
| ... ldq_u ..., x ldq_u ..., z
| ldq_l ..., y . .
| ... . .
| stq_c ..., y . .
V ... stq_u ..., x stq_u ..., z
In this execution scenario `y' *will* get clobbered by either CPU1 or CPU2
despite the atomic sequence executed by CPU0.
This data race cannot be resolved with making the access atomic to `y'
alone, `atomic_storeMODE' won't help here. Accesses to `x', `y' and `z'
all need to be made atomic to prevent this data race from happening, as
implemented with the `-msafe-bwa' feature posted separately; cf.
<https://inbox.sourceware.org/gcc-patches/[email protected]/>.
Maciej