https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117759
--- Comment #4 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Maciej W. Rozycki <ma...@gcc.gnu.org>:

https://gcc.gnu.org/g:1b85c548e2480116c74a7f74b487e3787c770056

commit r15-9037-g1b85c548e2480116c74a7f74b487e3787c770056
Author: Maciej W. Rozycki <ma...@orcam.me.uk>
Date:   Sun Mar 30 15:24:51 2025 +0100

    Alpha: Add option to avoid data races for partial writes [PR117759]

    Similarly to data races with 8-bit byte or 16-bit word quantity memory
    writes on non-BWX Alpha implementations we have the same problem even on
    BWX implementations with partial memory writes produced for unaligned
    stores as well as block memory move and clear operations.  This happens
    at the boundaries of the area written where we produce unprotected RMW
    sequences, such as for example:

            ldbu $1,0($3)
            stw $31,8($3)
            stq $1,0($3)

    to zero a 9-byte member at the byte offset of 1 of a quadword-aligned
    struct, happily clobbering a 1-byte member at the beginning of said
    struct if concurrent write happens while executing on the same CPU such
    as in a signal handler or a parallel write happens while executing on
    another CPU such as in another thread or via a shared memory segment.

    To guard against these data races with partial memory write accesses
    introduce the `-msafe-partial' command-line option that instructs the
    compiler to protect boundaries of the data quantity accessed by instead
    using a longer code sequence composed of narrower memory writes where
    suitable machine instructions are available (i.e. with BWX targets) or
    atomic RMW access sequences where byte and word memory access machine
    instructions are not available (i.e. with non-BWX targets).

    Owing to the desire of branch avoidance there are redundant overlapping
    writes in unaligned cases where STQ_U operations are used in the middle
    of a block so as to make sure no part of data to be written has been
    lost regardless of run-time alignment.
    For the non-BWX case it means that with blocks whose size is not a
    multiple of 8 there are additional atomic RMW sequences issued towards
    the end of the block in addition to the always required pair enclosing
    the block from each end.

    Only one such additional atomic RMW sequence is actually required, but
    code currently issues two for the sake of simplicity.  An improvement
    might be added to `alpha_expand_unaligned_store_words_safe_partial' in
    the future, by folding `alpha_expand_unaligned_store_safe_partial' code
    for handling multi-word blocks whose size is not a multiple of 8 (i.e.
    with a trailing partial-word part).  It would improve performance a bit,
    but current code is correct regardless.

    Update test cases with `-mno-safe-partial' where required and add new
    ones accordingly.

    In some cases GCC chooses to open-code block memory write operations, so
    with non-BWX targets `-msafe-partial' will in the usual case have to be
    used together with `-msafe-bwa'.

    Credit to Magnus Lindholm <linm...@gmail.com> for sharing hardware for
    the purpose of verifying the BWX side of this change.

            gcc/
            PR target/117759
            * config/alpha/alpha-protos.h
            (alpha_expand_unaligned_store_safe_partial): New prototype.
            * config/alpha/alpha.cc (alpha_expand_movmisalign)
            (alpha_expand_block_move, alpha_expand_block_clear): Handle
            TARGET_SAFE_PARTIAL.
            (alpha_expand_unaligned_store_safe_partial)
            (alpha_expand_unaligned_store_words_safe_partial)
            (alpha_expand_clear_safe_partial_nobwx): New functions.
            * config/alpha/alpha.md (insvmisaligndi): Handle
            TARGET_SAFE_PARTIAL.
            * config/alpha/alpha.opt (msafe-partial): New option.
            * config/alpha/alpha.opt.urls: Regenerate.
            * doc/invoke.texi (Option Summary, DEC Alpha Options): Document
            the new option.

            gcc/testsuite/
            PR target/117759
            * gcc.target/alpha/memclr-a2-o1-c9-ptr.c: Add
            `-mno-safe-partial'.
            * gcc.target/alpha/memclr-a2-o1-c9-ptr-safe-partial.c: New file.
            * gcc.target/alpha/memcpy-di-unaligned-dst.c: New file.
            * gcc.target/alpha/memcpy-di-unaligned-dst-safe-partial.c: New
            file.
            * gcc.target/alpha/memcpy-di-unaligned-dst-safe-partial-bwx.c:
            New file.
            * gcc.target/alpha/memcpy-si-unaligned-dst.c: New file.
            * gcc.target/alpha/memcpy-si-unaligned-dst-safe-partial.c: New
            file.
            * gcc.target/alpha/memcpy-si-unaligned-dst-safe-partial-bwx.c:
            New file.
            * gcc.target/alpha/stlx0.c: Add `-mno-safe-partial'.
            * gcc.target/alpha/stlx0-safe-partial.c: New file.
            * gcc.target/alpha/stlx0-safe-partial-bwx.c: New file.
            * gcc.target/alpha/stqx0.c: Add `-mno-safe-partial'.
            * gcc.target/alpha/stqx0-safe-partial.c: New file.
            * gcc.target/alpha/stqx0-safe-partial-bwx.c: New file.
            * gcc.target/alpha/stwx0.c: Add `-mno-safe-partial'.
            * gcc.target/alpha/stwx0-bwx.c: Add `-mno-safe-partial'.  Refer
            to stwx0.c rather than copying its code and also verify no LDQ_U
            or STQ_U instructions have been produced.
            * gcc.target/alpha/stwx0-safe-partial.c: New file.
            * gcc.target/alpha/stwx0-safe-partial-bwx.c: New file.
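
For illustration only, the race described in the commit message can be modelled in portable C. This is a rough sketch, not code from the commit: the struct `example`, its member names and all function names below are hypothetical, the "concurrent" writer is simulated by a plain callback invoked inside the read-modify-write window, and the narrow-store variant only approximates the idea behind `-msafe-partial' on BWX targets (the exact instruction sequence GCC emits is not reproduced here).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout matching the commit message: a 1-byte member
   followed by a 9-byte member at byte offset 1 of a quadword-aligned
   struct.  All names here are illustrative assumptions.  */
struct example {
    _Alignas(8) uint8_t flag;   /* byte 0 */
    uint8_t payload[9];         /* bytes 1..9 */
};

/* Stand-in for a concurrent writer (signal handler or another thread)
   that hits the window in the middle of the clearing sequence.  */
typedef void (*interloper_fn)(struct example *);

static void set_flag(struct example *s) { s->flag = 0xAA; }

/* Models LDBU / STW / STQ: read byte 0, zero bytes 8..9, then write
   back a whole quadword holding the stale byte 0 and zeros in 1..7.  */
void clear_payload_rmw(struct example *s, interloper_fn concurrent)
{
    uint8_t *p = (uint8_t *)s;
    uint8_t quad[8] = {0};

    quad[0] = p[0];             /* ldbu $1,0($3) */
    memset(p + 8, 0, 2);        /* stw $31,8($3) */
    if (concurrent)
        concurrent(s);          /* write lands inside the RMW window */
    memcpy(p, quad, 8);         /* stq $1,0($3): rewrites byte 0 too */
}

/* Assumed shape of a safe variant: only bytes 1..9 are written, using
   narrower stores, so byte 0 is never rewritten.  */
void clear_payload_narrow(struct example *s, interloper_fn concurrent)
{
    uint8_t *p = (uint8_t *)s;

    p[1] = 0;                   /* byte store */
    memset(p + 2, 0, 2);        /* word store */
    if (concurrent)
        concurrent(s);
    memset(p + 4, 0, 4);        /* longword store */
    memset(p + 8, 0, 2);        /* word store */
}

/* Returns the final value of flag after clearing payload while the
   simulated concurrent writer sets flag to 0xAA.  */
uint8_t flag_after(void (*clear)(struct example *, interloper_fn))
{
    struct example s;

    memset(&s, 0x11, sizeof s);
    clear(&s, set_flag);
    for (size_t i = 0; i < sizeof s.payload; i++)
        assert(s.payload[i] == 0);  /* payload is cleared either way */
    return s.flag;
}
```

In this model `flag_after(clear_payload_rmw)` returns the stale 0x11, i.e. the simulated concurrent update to the 1-byte member is lost, whereas `flag_after(clear_payload_narrow)` returns 0xAA because byte 0 is never written by the clearing code.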