On November 30, 2020 7:18:37 PM GMT+01:00, Richard Sandiford
<[email protected]> wrote:
>Richard Biener <[email protected]> writes:
>> On November 30, 2020 4:29:41 PM GMT+01:00, Richard Sandiford via
>Gcc-patches <[email protected]> wrote:
>>>dse.c:find_shift_sequence tries to represent a store and load
>>>back as a shift right followed by a truncation. It therefore
>>>needs to find an integer mode in which to do the shift right.
>>>The loop it uses has the form:
>>>
>>> FOR_EACH_MODE_FROM (new_mode_iter,
>>> smallest_int_mode_for_size (GET_MODE_BITSIZE (read_mode)))
>>>
>>>which implicitly assumes that read_mode has an equivalent integer
>mode.
>>>As shown in the testcase, not all modes have such an integer mode.
>>
>> But if no such mode exists iterating won't help either? So why not
>simply fail when the mode does not exist?
>
>You mean test against the maximum integer mode before the loop,
>but keep the smallest_int_mode_for_size call as-is? I'm not sure
>that's going to be more efficient in practice, and it seems less
>obviously correct.
>
>Alternatively, we could have a version of smallest_int_mode_for_size
>that returns failure instead of asserting.
OH, didn't remember this one asserts...
But
>smallest_int_mode_for_size
>is itself a FOR_EACH_MODE_* iterator, so it would iterate just as much
>as
>the patch does.
>
>smallest_int_mode_for_size also has some __int20 etc. handling that I
>don't think we want here, and probably avoid by luck. (We already know
>at this point that any shift value would be nonzero, which probably
>weeds out most of the problematic cases for PSImode targets.)
Ok, I see.
>find_shift_sequence already iterates over the modes itself and it
>already has custom requirements in terms of when to break and when
>to continue. It just seems simpler and more obvious for the loop to
>iterate over all the modes directly rather than delegate part of the
>iteration to another function.
Fine. Jeff has already acked the patch.
Richard.
>Thanks,
>Richard
>
>
>
>>
>>>This patch just makes the code start from the smallest integer mode
>and
>>>skip modes that are too small. The loop already breaks at the first
>>>mode wider than word_mode.
>>>
>>>Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK for trunk and
>>>GCC 10 branch?
>>>
>>>Richard
>>>
>>>
>>>gcc/
>>> PR rtl-optimization/98037
>>> * dse.c (find_shift_sequence): Iterate over all integers and
>>> skip modes that are too small.
>>>
>>>gcc/testsuite/
>>> PR rtl-optimization/98037
>>> * gcc.target/aarch64/sve/acle/general/pr98037.c: New test.
>>>---
>>> gcc/dse.c | 5
>+++--
>>> gcc/testsuite/gcc.target/aarch64/sve/acle/general/pr98037.c | 6
>++++++
>>> 2 files changed, 9 insertions(+), 2 deletions(-)
>>>create mode 100644
>>>gcc/testsuite/gcc.target/aarch64/sve/acle/general/pr98037.c
>>>
>>>diff --git a/gcc/dse.c b/gcc/dse.c
>>>index d65266b5476..651e6e7e71e 100644
>>>--- a/gcc/dse.c
>>>+++ b/gcc/dse.c
>>>@@ -1757,8 +1757,7 @@ find_shift_sequence (poly_int64 access_size,
>>> the machine. */
>>>
>>> opt_scalar_int_mode new_mode_iter;
>>>- FOR_EACH_MODE_FROM (new_mode_iter,
>>>- smallest_int_mode_for_size (GET_MODE_BITSIZE (read_mode)))
>>>+ FOR_EACH_MODE_IN_CLASS (new_mode_iter, MODE_INT)
>>> {
>>> rtx target, new_reg, new_lhs;
>>> rtx_insn *shift_seq, *insn;
>>>@@ -1767,6 +1766,8 @@ find_shift_sequence (poly_int64 access_size,
>>> new_mode = new_mode_iter.require ();
>>> if (GET_MODE_BITSIZE (new_mode) > BITS_PER_WORD)
>>> break;
>>>+ if (maybe_lt (GET_MODE_SIZE (new_mode), GET_MODE_SIZE
>>>(read_mode)))
>>>+ continue;
>>>
>>> /* Try a wider mode if truncating the store mode to NEW_MODE
>>> requires a real instruction. */
>>>diff --git
>>>a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/pr98037.c
>>>b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/pr98037.c
>>>new file mode 100644
>>>index 00000000000..b91e940b18e
>>>--- /dev/null
>>>+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/pr98037.c
>>>@@ -0,0 +1,6 @@
>>>+/* { dg-options "-msve-vector-bits=1024 -O3" } */
>>>+
>>>+typedef __SVInt8_t vec __attribute__((arm_sve_vector_bits(1024)));
>>>+struct pair { vec v[2]; };
>>>+void use (struct pair *);
>>>+vec f (struct pair p) { vec v = p.v[1]; use (&p); return v; }