Hi,
This has come out of a discussion[1] around the removal of non-BWX Alpha
support from the Linux kernel due to data races affecting RCU algorithms.
As I have discovered in the course of this effort, these data races also
apply to BWX Alpha systems, although there they are owing to how the Alpha
backend of GCC has implemented block copy and clear operations rather than
to actual hardware limitations. For example GCC will happily produce code
such as:
	ldbu $1,0($3)
	stw $31,8($3)
	stq $1,0($3)
to zero a 9-byte member at byte offset 1 of a quadword-aligned struct,
clobbering a 1-byte member at the beginning of said struct if there is a
concurrent or parallel write to that member in the middle of the
unprotected RMW sequence.
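For illustration only, here is a minimal C sketch of the window described
above. It does not reproduce what GCC emits; it merely replays the three
instructions step by step in portable C, with a write to the 1-byte member
optionally landing between the load and the quadword store. The names
`struct rec' and `clear_data_racy' are hypothetical and chosen just for
this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: a 1-byte member at offset 0 of a quadword-aligned
   struct, followed by a 9-byte member at offset 1. */
struct rec {
    _Alignas(8) uint8_t flag;   /* byte 0 */
    uint8_t data[9];            /* bytes 1..9 */
};

static_assert(offsetof(struct rec, data) == 1, "data must sit at offset 1");

/* Replay the sequence used to zero `data'.  If `interfere' is nonzero,
   a write of `new_flag' to `flag' (standing in for another CPU) lands
   after the byte load but before the quadword store.  Returns the value
   of `flag' left in memory afterwards. */
static uint8_t clear_data_racy(struct rec *p, int interfere, uint8_t new_flag)
{
    /* ldbu $1,0($3): byte 0 zero-extended to a quadword. */
    uint8_t q[8] = { p->flag, 0, 0, 0, 0, 0, 0, 0 };

    if (interfere)
        p->flag = new_flag;     /* concurrent write inside the window */

    /* stw $31,8($3): zero bytes 8..9, i.e. data[7] and data[8]. */
    memset(&p->data[7], 0, 2);

    /* stq $1,0($3): rewrite bytes 0..7 with the stale copy of `flag'
       and zeros for data[0]..data[6]. */
    memcpy(p, q, sizeof q);

    return p->flag;
}
```

With `flag' initially 1, calling clear_data_racy(&r, 1, 42) leaves `flag'
at 1: the concurrent update to 42 is silently lost, although the code as
written never stores to `flag' at the source level.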
This patch series addresses these issues in the last two changes, having
made generic test suite updates to improve coverage in the concerned area
first and then having addressed a bunch of issues in the affected code
that I discovered in the course of this effort. There is a patch that
includes a pair of changes to the middle end (reload+LRA) required by this
update as well, so it's not a purely backend-limited change, and hence no
"Alpha:" prefix on the cover letter or the relevant patches.
The intent for these changes is to eventually bring Linux kernel support
back for non-BWX systems still in people's possession.
This has been verified with the `alpha-linux-gnu' (EV4) target using a
POWER9 system as the host and an AlphaServer 300 (EV45) system as the
target, with no regressions except where expected due to LDx_L (as always
the first in a sequence) executed with an unaligned address, exceedingly
rarely though (4 test cases across all the GCC frontends and libraries
covered). This will be addressed in due course via emulation on the Linux
kernel side.
No Rust frontend or libgrust library verification has been run due to a
recent version requirement increase for the `cargo' helper tool, which my
development system cannot currently satisfy, and I figured that getting
that sorted right now would not be the best use of my time.
I have attempted to verify a BWX configuration as well, using QEMU in the
user emulation mode. This has proved unreliable due to an exceedingly
large quantity of intermittent failures reported for no clear reason (i.e.
`qemu-alpha' just returning a nonzero exit status), regardless of the
presence of any patches from this series. Therefore BWX support has only
been smoke-tested by running the relevant subset of the tests repeatedly
until there was at least one clear run of each test. I will appreciate
assistance with BWX verification then, as I only have EV45 hardware
available and no prospect for this to change.
More details on testing have been included with the respective changes.
The patches in the series have been ordered so as to place what are
hopefully the easier if not obvious ones at the front, so that they can go
in right away even if ones coming later in the series turn out problematic.
A couple of Linux kernel people who were active in the discussion of an
outline of the solution proposed here have been cc-ed on this cover letter
and on the final two patches of the series, which actually implement the
solution, in case you'd like to make a comment, or otherwise just FYI so
that you are aware of the progress.
Comments, questions, voices of concern or appreciation, all very welcome.
References:
[1] "alpha: cleanups for 6.10",
<https://lore.kernel.org/r/[email protected]/>
Maciej