On Sun, 11 Aug 2024 14:00:27 PDT (-0700), Robin Dapp wrote:
> I figured it's easier to parse this as a series rather than one big
> patch, in particular since target-specific code is involved.
>
> This adds an else operand to masked-load operations in order to avoid
> implicit dependencies on zeroed masked-out elements. riscv does not
> mandate zeroing for those but rather leaves them unspecified.
>
> The general idea is to query the proper operand of the target's
> respective optab for a supported else value. If the supported value is
> non-zero emit a cond_expr after the load in order to make the dependency
> explicit and allow it to be optimized with the surrounding code.
>
> In order to keep the fallout manageable the patch is, for now, restricted to
> only emit cond_exprs during explicit masking in tree-ifcvt. I have a local
> version that emits a vec_cond_expr for each vector mask load but that would
> cause several ripple effects further down the line.
>
> Loop masking in vectorizer context is as before. Also, the patch series only
> considers element masking else values and no else value for length masking.
>
> The backend changes are supposed to be more proof-of-concept than anything
> and are surely not idiomatic. x86's and aarch64's test suite results
> are, however, unchanged.
>
> Robin Dapp (8):
>   docs: Document maskload else operand and behavior.
>   ifn: Add else-operand handling.
>   tree-ifcvt: Enforce zero else value after maskload.
>   vect: Add maskload else value support.
>   aarch64: Add masked-load else operands.
>   gcn: Add else operand to masked loads.
>   i386: Add else operand to masked loads.
>   RISC-V: Add else operand to masked loads [PR115536].

I think that's https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115336 (rv64gcv_zvl256b miscompile at -O3) not https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115536 (some modula2 thing).

>  .../aarch64/aarch64-sve-builtins-base.cc      |  58 +++--
>  gcc/config/aarch64/aarch64-sve-builtins.cc    |   5 +
>  gcc/config/aarch64/aarch64-sve-builtins.h     |   1 +
>  gcc/config/aarch64/aarch64-sve.md             |  47 +++-
>  gcc/config/aarch64/aarch64-sve2.md            |   3 +-
>  gcc/config/aarch64/predicates.md              |   4 +
>  gcc/config/gcn/gcn-valu.md                    |   6 +-
>  gcc/config/gcn/predicates.md                  |   3 +
>  gcc/config/i386/i386-expand.cc                |  59 ++++-
>  gcc/config/i386/predicates.md                 |  15 ++
>  gcc/config/i386/sse.md                        | 124 ++++++----
>  gcc/config/riscv/autovec.md                   |  45 ++--
>  gcc/config/riscv/predicates.md                |   3 +
>  gcc/config/riscv/riscv-v.cc                   |  26 ++-
>  gcc/doc/md.texi                               |  60 +++--
>  gcc/internal-fn.cc                            |  88 +++++--
>  gcc/internal-fn.h                             |  11 +-
>  gcc/optabs-query.cc                           |  83 +++++--
>  gcc/optabs-query.h                            |   3 +-
>  gcc/optabs-tree.cc                            |  43 ++--
>  gcc/optabs-tree.h                             |   8 +-
>  .../gcc.target/riscv/rvv/autovec/pr115336.c   |  20 ++
>  .../gcc.target/riscv/rvv/autovec/pr116059.c   |   9 +
>  gcc/tree-if-conv.cc                           |  78 +++++--
>  gcc/tree-vect-data-refs.cc                    |  39 +++-
>  gcc/tree-vect-patterns.cc                     |  17 +-
>  gcc/tree-vect-slp.cc                          |  22 +-
>  gcc/tree-vect-stmts.cc                        | 218 ++++++++++++++----
>  gcc/tree-vectorizer.h                         |  11 +-
>  29 files changed, 848 insertions(+), 261 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr115336.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr116059.c
>
> --
> 2.45.2
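
For anyone following along, the else-operand idea from the quoted cover letter can be modeled in plain scalar C. This is only a rough sketch under stated assumptions: `model_mask_load`, `select_zero`, `masked_sum`, and `MAX_LANES` are hypothetical names for illustration, not GCC internals, and `target_else` stands in for whatever else value the target's optab would report as supported. The point it tries to show is that a consumer relying on zeroed masked-out lanes only stays correct if a select is emitted after the load whenever that else value is not zero.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LANES 8 /* hypothetical vector length for this model */

/* Hypothetical model of a target masked load: active lanes read memory,
   inactive lanes receive else_val (a stand-in for "unspecified").  */
static void model_mask_load(int32_t *dst, const int32_t *src,
                            const bool *mask, size_t n, int32_t else_val)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = mask[i] ? src[i] : else_val;
}

/* Explicit select after the load, playing the role of the cond_expr:
   it forces masked-out lanes to zero so the dependency is visible.  */
static void select_zero(int32_t *v, const bool *mask, size_t n)
{
    for (size_t i = 0; i < n; i++)
        v[i] = mask[i] ? v[i] : 0;
}

/* Sums all lanes, which is only correct if inactive lanes are zero.
   The select is skipped when the target already supplies zero.  */
static int32_t masked_sum(const int32_t *src, const bool *mask,
                          size_t n, int32_t target_else)
{
    int32_t lanes[MAX_LANES];
    int32_t sum = 0;

    if (n > MAX_LANES)
        n = MAX_LANES; /* keep the model within its fixed buffer */

    model_mask_load(lanes, src, mask, n, target_else);
    if (target_else != 0)
        select_zero(lanes, mask, n);

    for (size_t i = 0; i < n; i++)
        sum += lanes[i];
    return sum;
}
```

Calling `masked_sum` on the same data with `target_else` set to 0 and to an arbitrary non-zero value should give the same sum. Dropping the `select_zero` call would make the result depend on the masked-out lanes, which is the kind of implicit dependency the series is meant to remove.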