On Mon, Nov 18, 2024, at 13:22, Maciej W. Rozycki wrote:
> On Mon, 18 Nov 2024, Arnd Bergmann wrote:
>
>> > This patch series addresses these issues in the last two changes, having
>> > made generic test suite updates to improve coverage in the concerned area
>> > first and then having addressed a bunch of issues in the code affected I
>> > discovered in the course of this effort. There is a patch that includes
>> > pair of changes to the middle end (reload+LRA) required by this update as
>> > well, so it's not a purely backend-limited change, and hence no "Alpha:"
>> > prefix on the cover letter or the relevant patches.
>>
>> I don't know enough about gcc internals to understand exactly
>> which problems you are addressing, but I think there are still
>> two issues that I identifier that remain unchanged here:
>>
>> a) storing an '_Atomic' variable smaller than 8 bytes on non-bwx
>> targets should use ll/sc, but uses a plain rmw cycle.
>
> I think there should be no problem with aligned 4-byte (longword) data
> quantities, given that there are both plain and locked/conditional load
> and store machine operations provided by the non-BWX architecture. Why do
> you think otherwise?
It's quite possible that I misremembered this and it was only
a problem on 1-byte and 2-byte stores, even for _Atomic.
> As to byte and aligned 2-byte (word) data quantities given that with
> `-msafe-bwa' *all* stores of these quantities are made via an LL/SC
> sequence (with a non-BWX target) I believe this has been sorted with my
> proposal:
>
> $ cat atomic16_t.c
> typedef _Atomic unsigned short atomic16_t;
>
> void store_atomic16 (atomic16_t *p, unsigned short v)
> {
> *p = v;
> }
> $ alpha-linux-gnu-gcc -msafe-bwa -O2 -S atomic16_t.c -o atomic16_t-bwa.c
> $ sed -n '/\.ent/,/\.end/p' <atomic16_t-bwa.s
> This does use an LL/SC sequence. Original non-BWX sequence uses plain
> unaligned load/store operations instead:
Right, with -msafe-bwa this would be covered, but I'm not sure
if you'd expect to build all of userspace with that flag as well.
I can certainly see how one would argue that userspace doesn't
need to use LL/SC for non-atomic subword access, but at the same
time I think the RMW sequence on _Atomic variables is a clear
bug that you'd need to fix also for -mno-safe-bwa.
>> b) Accessing a 'volatile' variable may have conflicting
>> requirements, and I'm not sure if either an ll/sc or a rmw
>> is actually correct since neither is free of side-effects.
>> Linux currently assumes that a store through a 'volatile
>> short *' pointer does not clobber adjacent data which
>> would need an atomic, but most CPUs (not sure about Alpha)
>> trap if you ever try an atomic operation on an MMIO
>> register address.
>
> To avoid triggering side effects Alpha system chipsets define a sparse
> I/O space decoding window where data locations are spaced apart such that
> no BWX operations are required to read or write individual 8-bit, or
> 16-bit, or even 24-bit peripheral registers. With DEC's own peripheral
> bus solutions it may not have been always necessary, but surely it has
> been to support PCI with the original Alpha implementation (EV4). We do
> have support for sparse I/O space operations in Linux, we've always had.
>
> Does my reply address your concerns?
It does address the immediate concern about MMIO registers, but I
think there is still an open question regarding what the correct
behavior on volatile variables should be in the absence of -msafe-bwa.
Arnd