> > Pengxuan Zheng <quic_pzh...@quicinc.com> writes:
> > > We can still use SVE's INDEX instruction to construct vectors even
> > > if not all elements are constants. For example, { 0, x, 2, 3 } can
> > > be constructed by first using "INDEX #0, #1" to generate { 0, 1, 2,
> > > 3 }, and then set the elements which are non-constants separately.
> > >
> > >         PR target/113328
> > >
> > > gcc/ChangeLog:
> > >
> > >         * config/aarch64/aarch64.cc (aarch64_expand_vector_init_fallback):
> > >         Improve part-variable vector generation with SVE's INDEX if TARGET_SVE
> > >         is available.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >         * gcc.target/aarch64/sve/acle/general/dupq_1.c: Update test to use
> > >         check-function-bodies.
> > >         * gcc.target/aarch64/sve/acle/general/dupq_2.c: Likewise.
> > >         * gcc.target/aarch64/sve/acle/general/dupq_3.c: Likewise.
> > >         * gcc.target/aarch64/sve/acle/general/dupq_4.c: Likewise.
> > >         * gcc.target/aarch64/sve/vec_init_4.c: New test.
> > >         * gcc.target/aarch64/sve/vec_init_5.c: New test.
> > >
> > > Signed-off-by: Pengxuan Zheng <quic_pzh...@quicinc.com>
> > > ---
> > >  gcc/config/aarch64/aarch64.cc                 | 81 ++++++++++++++++++-
> > >  .../aarch64/sve/acle/general/dupq_1.c         | 18 ++++-
> > >  .../aarch64/sve/acle/general/dupq_2.c         | 18 ++++-
> > >  .../aarch64/sve/acle/general/dupq_3.c         | 18 ++++-
> > >  .../aarch64/sve/acle/general/dupq_4.c         | 18 ++++-
> > >  .../gcc.target/aarch64/sve/vec_init_4.c       | 47 +++++++++++
> > >  .../gcc.target/aarch64/sve/vec_init_5.c       | 12 +++
> > >  7 files changed, 199 insertions(+), 13 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/vec_init_4.c
> > >  create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/vec_init_5.c
> > >
> > > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> > > index 6b3ca57d0eb..7305a5c6375 100644
> > > --- a/gcc/config/aarch64/aarch64.cc
> > > +++ b/gcc/config/aarch64/aarch64.cc
> > > @@ -23942,12 +23942,91 @@ aarch64_expand_vector_init_fallback (rtx target, rtx vals)
> > >    if (n_var != n_elts)
> > >      {
> > >        rtx copy = copy_rtx (vals);
> > > +      bool is_index_seq = false;
> > > +
> > > +      /* If at least half of the elements of the vector are constants and all
> > > +         these constant elements form a linear sequence of the form { B, B + S,
> > > +         B + 2 * S, B + 3 * S, ... }, we can generate the vector with SVE's
> > > +         INDEX instruction if SVE is available and then set the elements which
> > > +         are not constant separately.  More precisely, each constant element I
> > > +         has to be B + I * S where B and S must be valid immediate operand for
> > > +         an SVE INDEX instruction.
> > > +
> > > +         For example, { X, 1, 2, 3} is a vector satisfying these conditions and
> > > +         we can generate a vector of all constants (i.e., { 0, 1, 2, 3 }) first
> > > +         and then set the first element of the vector to X.  */
> > > +
> > > +      if (TARGET_SVE && GET_MODE_CLASS (mode) == MODE_VECTOR_INT
> > > +          && n_var <= n_elts / 2)
> > > +        {
> > > +          int const_idx = -1;
> > > +          HOST_WIDE_INT const_val = 0;
> > > +          int base = 16;
> > > +          int step = 16;
> > > +
> > > +          for (int i = 0; i < n_elts; ++i)
> > > +            {
> > > +              rtx x = XVECEXP (vals, 0, i);
> > > +
> > > +              if (!CONST_INT_P (x))
> > > +                continue;
> > > +
> > > +              if (const_idx == -1)
> > > +                {
> > > +                  const_idx = i;
> > > +                  const_val = INTVAL (x);
> > > +                }
> > > +              else
> > > +                {
> > > +                  if ((INTVAL (x) - const_val) % (i - const_idx) == 0)
> > > +                    {
> > > +                      HOST_WIDE_INT s
> > > +                          = (INTVAL (x) - const_val) / (i - const_idx);
> > > +                      if (s >= -16 && s <= 15)
> > > +                        {
> > > +                          int b = const_val - s * const_idx;
> > > +                          if (b >= -16 && b <= 15)
> > > +                            {
> > > +                              base = b;
> > > +                              step = s;
> > > +                            }
> > > +                        }
> > > +                    }
> > > +                  break;
> > > +                }
> > > +            }
> > > +
> > > +          if (base != 16
> > > +              && (!CONST_INT_P (v0)
> > > +                  || (CONST_INT_P (v0) && INTVAL (v0) == base)))
> > > +            {
> > > +              if (!CONST_INT_P (v0))
> > > +                XVECEXP (copy, 0, 0) = GEN_INT (base);
> > > +
> > > +              is_index_seq = true;
> > > +              for (int i = 1; i < n_elts; ++i)
> > > +                {
> > > +                  rtx x = XVECEXP (copy, 0, i);
> > > +
> > > +                  if (CONST_INT_P (x))
> > > +                    {
> > > +                      if (INTVAL (x) != base + i * step)
> > > +                        {
> > > +                          is_index_seq = false;
> > > +                          break;
> > > +                        }
> > > +                    }
> > > +                  else
> > > +                    XVECEXP (copy, 0, i) = GEN_INT (base + i * step);
> > > +                }
> > > +            }
> > > +        }
> >
> > This seems a bit more complex than I was hoping for, although the
> > complexity is probably justified.
> >
> > Seeing how awkward it is to do this using current interfaces, I think
> > I'd instead prefer to do something that I'd been vaguely hoping to do
> > for a while: extend vector-builder.h to accept wildcard/don't care values.
> > finalize () could then replace the wildcards with whatever gives the
> > "nicest" encoding.
> >
> > That's also going to be relatively complex, but I think it'd be more
> > general, and might help with the existing vec_init code as well.
> > It would also be a step towards optimising -1 indices for
> > __builtin_shufflevector. It might be a few weeks before I can post
> > something though.
>
> No problem, Richard.
>
> I am also curious to see what this alternative implementation looks like.
> Please kindly keep me posted when your patch is ready. Thank you!
>
> >
> > Pushing 1/2 without 2/2 has meant that the dupq tests will fail in the
> > meantime, but that's ok. In general, though, it's better not to push
> > individual patches from a series unless they've been tested in
> > isolation and are known to give clean test results.
>
> In fact, the dupq tests were not affected. Patch 1/2 already adjusted the
> "scan-assembler" checks of the dupq tests based on the output of 1/2 alone.
> Patch 2/2 just replaces the "scan-assembler" checks with
> "check-function-bodies." So, the dupq tests still pass without 2/2.

I just realized that I was confused about what 1/2 does. You are right: the
dupq tests will fail for now. Again, sorry for the confusion. 😊

Thanks,
Pengxuan

>
> Thanks,
> Pengxuan
> >
> > Thanks,
> > Richard
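
For anyone following the thread, the linear-sequence check described in the quoted patch comment can be illustrated outside of GCC roughly as below. This is only a standalone sketch, not the patch itself and not the vector-builder.h approach Richard mentions: the names `IndexSeq`, `Elt`, and `find_index_seq` are hypothetical, and non-constant elements are modelled with `std::optional` rather than RTL. The [-16, 15] bounds are the ones the patch uses for the INDEX immediates.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical result type: the immediates for "INDEX #base, #step".
struct IndexSeq {
  int64_t base;
  int64_t step;
};

// Hypothetical element model: a known constant, or std::nullopt for a
// non-constant (variable) element that would be inserted separately.
using Elt = std::optional<int64_t>;

// Returns {base, step} if at least half of the elements are constants and
// every constant element i equals base + i * step, with base and step both
// in [-16, 15].  Returns std::nullopt otherwise.
std::optional<IndexSeq> find_index_seq(const std::vector<Elt>& elts) {
  const int64_t n = static_cast<int64_t>(elts.size());

  // Mirror the patch's "n_var <= n_elts / 2" condition.
  int64_t n_var = 0;
  for (const Elt& e : elts)
    if (!e) ++n_var;
  if (n_var > n / 2) return std::nullopt;

  // The first two constants determine the only possible step and base.
  int64_t first = -1, second = -1;
  for (int64_t i = 0; i < n; ++i) {
    if (!elts[i]) continue;
    if (first < 0) {
      first = i;
    } else {
      second = i;
      break;
    }
  }
  if (second < 0) return std::nullopt;  // fewer than two constants

  const int64_t diff = *elts[second] - *elts[first];
  const int64_t dist = second - first;
  if (diff % dist != 0) return std::nullopt;  // step would not be an integer

  const int64_t step = diff / dist;
  if (step < -16 || step > 15) return std::nullopt;
  const int64_t base = *elts[first] - step * first;
  if (base < -16 || base > 15) return std::nullopt;

  // Every constant element must lie on the sequence.
  for (int64_t i = 0; i < n; ++i)
    if (elts[i] && *elts[i] != base + i * step) return std::nullopt;

  return IndexSeq{base, step};
}
```

With this sketch, { X, 1, 2, 3 } and { 0, X, 2, 3 } both yield base 0 and step 1, matching the examples in the patch description, while { 0, X, 2, 5 } is rejected because the last constant is off the sequence.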