On Mon, 18 Nov 2024, Arnd Bergmann wrote: > > This patch series addresses these issues in the last two changes, having > > made generic test suite updates to improve coverage in the concerned area > > first and then having addressed a bunch of issues in the code affected I > > discovered in the course of this effort. There is a patch that includes > > pair of changes to the middle end (reload+LRA) required by this update as > > well, so it's not a purely backend-limited change, and hence no "Alpha:" > > prefix on the cover letter or the relevant patches. > > I don't know enough about gcc internals to understand exactly > which problems you are addressing, but I think there are still > two issues that I identifier that remain unchanged here: > > a) storing an '_Atomic' variable smaller than 8 bytes on non-bwx > targets should use ll/sc, but uses a plain rmw cycle.
I think there should be no problem with aligned 4-byte (longword) data quantities, given that there are both plain and locked/conditional load and store machine operations provided by the non-BWX architecture. Why do you think otherwise? As to byte and aligned 2-byte (word) data quantities given that with `-msafe-bwa' *all* stores of these quantities are made via an LL/SC sequence (with a non-BWX target) I believe this has been sorted with my proposal: $ cat atomic16_t.c typedef _Atomic unsigned short atomic16_t; void store_atomic16 (atomic16_t *p, unsigned short v) { *p = v; } $ alpha-linux-gnu-gcc -msafe-bwa -O2 -S atomic16_t.c -o atomic16_t-bwa.c $ sed -n '/\.ent/,/\.end/p' <atomic16_t-bwa.s .ent store_atomic16 $store_atomic16..ng: store_atomic16: .frame $30,0,$26,0 $LFB0: .cfi_startproc .prologue 0 mb bic $16,7,$2 insql $17,$16,$17 $L2: ldq_l $1,0($2) mskwl $1,$16,$1 bis $17,$1,$1 stq_c $1,0($2) beq $1,$L2 mb ret $31,($26),1 .cfi_endproc $LFE0: .end store_atomic16 $ This does use an LL/SC sequence. Original non-BWX sequence uses plain unaligned load/store operations instead: $ alpha-linux-gnu-gcc -O2 -S atomic16_t.c -o atomic16_t-bwa.c $ sed -n '/\.ent/,/\.end/p' <atomic16_t-bwa.s .ent store_atomic16 $store_atomic16..ng: store_atomic16: .frame $30,0,$26,0 $LFB0: .cfi_startproc .prologue 0 mb insql $17,$16,$17 ldq_u $1,0($16) mskwl $1,$16,$1 bis $17,$1,$17 stq_u $17,0($16) mb ret $31,($26),1 .cfi_endproc $LFE0: .end store_atomic16 $ > b) Accessing a 'volatile' variable may have conflicting > requirements, and I'm not sure if either an ll/sc or a rmw > is actually correct since neither is free of side-effects. > Linux currently assumes that a store through a 'volatile > short *' pointer does not clobber adjacent data which > would need an atomic, but most CPUs (not sure about Alpha) > trap if you ever try an atomic operation on an MMIO > register address. To avoid triggering side effects Alpha system chipsets define a sparse I/O space decoding window where data locations are spaced apart such that no BWX operations are required to read or write individual 8-bit, or 16-bit, or even 24-bit peripheral registers. With DEC's own peripheral bus solutions it may not have been always necessary, but surely it has been to support PCI with the original Alpha implementation (EV4). We do have support for sparse I/O space operations in Linux, we've always had. Does my reply address your concerns? Maciej