On Tue, 24 Jun 2025, Tamar Christina wrote: > store_bit_field_1 has an optimization where if a target is not a memory > operand > and the entire value is being set from something larger we can just wrap a > subreg around the source and emit a move. > > For vector constructors this is however problematic because the subreg means > that the expansion of the constructor won't happen through vec_init anymore.
But the expansion of the constructor happened already? > Complicated constructors which aren't natively supported by targets then ICE > as > they wouldn't have been expanded so recog fails. > > This patch blocks the optimization on non-constant vector constructors. Or > non-uniform > non-constant vectors. +static bool +foldable_value_with_subreg (rtx value) +{ + if (GET_CODE (value) != CONST_VECTOR || const_vec_duplicate_p (value)) + return true; that seems to allow any non-CONST_VECTOR, thus doesn't block non-constant vector constructors? It sounds like the problem happens upthread of store_bit_field_1? I allowed constant vectors because if I read the code right > simplify-rtx should be able to perform the simplification of pulling out the > element > or merging the constant values. There are several testcases in > aarch64-sve-pcs.exp > that test this as well. I allowed uniform non-constant vectors because they > would be folded into a vec_select later on. > > Note that codegen is quite horrible, for what should only be an lsr. But I'll > address that separately so that this patch is backportable. > > Bootstrapped Regtested on aarch64-none-linux-gnu, > arm-none-linux-gnueabihf, x86_64-pc-linux-gnu > -m32, -m64 and no issues. > > Ok for master? and GCC 15, 14, 13? > > Thanks, > Tamar > > > gcc/ChangeLog: > > PR target/120718 > * expmed.cc (store_bit_field_1): Only push subreg over uniform vector > constructors. > (foldable_value_with_subreg): New. > > gcc/testsuite/ChangeLog: > > PR target/120718 > * gcc.target/aarch64/sve/pr120718.c: New test. > > --- > diff --git a/gcc/expmed.cc b/gcc/expmed.cc > index > be427dca5d9afeed2013954472dde3a5430169e0..a468aa5c0c3f20bd62a7afc1d245d64e87be5396 > 100644 > --- a/gcc/expmed.cc > +++ b/gcc/expmed.cc > @@ -740,6 +740,28 @@ store_bit_field_using_insv (const extraction_insn *insv, > rtx op0, > return false; > } > > +/* For non-constant vectors wrapping a subreg around the RTX will not make > + the expression expand properly through vec_init. For constant vectors > + we can because simplification can just extract the element out by > + by merging the values. This can be done by simplify-rtx and so the > + subreg will be eliminated. However poly constants require vec_init as > + they are a runtime value. So only allow the subreg for simple integer > + or floating point constants. */ > + > +static bool > +foldable_value_with_subreg (rtx value) > +{ > + if (GET_CODE (value) != CONST_VECTOR || const_vec_duplicate_p (value)) > + return true; > + > + for (unsigned i = 0; i < const_vector_encoded_nelts (value); i++) > + if (!CONST_INT_P (const_vector_elt (value, i)) > + && !CONST_DOUBLE_P (const_vector_elt (value, i))) > + return false; > + > + return true; > +} > + > /* A subroutine of store_bit_field, with the same arguments. Return true > if the operation could be implemented. > > @@ -795,7 +817,8 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, > poly_uint64 bitnum, > /* If the target is a register, overwriting the entire object, or storing > a full-word or multi-word field can be done with just a SUBREG. */ > if (!MEM_P (op0) > - && known_eq (bitsize, GET_MODE_BITSIZE (fieldmode))) > + && known_eq (bitsize, GET_MODE_BITSIZE (fieldmode)) > + && foldable_value_with_subreg (value)) > { > /* Use the subreg machinery either to narrow OP0 to the required > words or to cope with mode punning between equal-sized modes. > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr120718.c > b/gcc/testsuite/gcc.target/aarch64/sve/pr120718.c > new file mode 100644 > index > 0000000000000000000000000000000000000000..cb21d94792f0679a48cc20c3dcdf78c89c05a5c6 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr120718.c > @@ -0,0 +1,13 @@ > +/* { dg-do compile } */ > +/* { dg-additional-options "-O3" } */ > + > +#include <arm_sve.h> > +typedef int __attribute__((vector_size(8))) v2si; > +typedef struct { int x; int y; } A; > +void bar(A a); > +void foo() > +{ > + A a; > + *(v2si *)&a = (v2si){0, (int)svcntd_pat(SV_ALL)}; > + bar(a); > +} > > > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)