On Mon, 16 Aug 2021, Xiong Hu Luo wrote:
> It seems to me that ALWAYS_EXECUTED_IN is not computed correctly for
> nested loops. inn_loop is updated to inner loop, so it need be restored
> when exiting from innermost loop. With this patch, the store instruction
> in outer loop could also be moved out of outer loop by store motion.
> Any comments? Thanks.
> gcc/ChangeLog:
>
> * tree-ssa-loop-im.c (fill_always_executed_in_1): Restore
> inn_loop when exiting from innermost loop.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/ssa-lim-19.c: New test.
> ---
> gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-19.c | 24 ++++++++++++++++++++++
> gcc/tree-ssa-loop-im.c | 6 +++++-
> 2 files changed, 29 insertions(+), 1 deletion(-)
> create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-19.c
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-19.c
> b/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-19.c
> new file mode 100644
> index 00000000000..097a5ee4a4b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-19.c
> @@ -0,0 +1,24 @@
> +/* PR/101293 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-lim2-details" } */
> +
> +struct X { int i; int j; int k;};
> +
> +void foo(struct X *x, int n, int l)
> +{
> + for (int j = 0; j < l; j++)
> + {
> + for (int i = 0; i < n; ++i)
> + {
> + int *p = &x->j;
> + int tem = *p;
> + x->j += tem * i;
> + }
> + int *r = &x->k;
> + int tem2 = *r;
> + x->k += tem2 * j;
> + }
> +}
> +
> +/* { dg-final { scan-tree-dump-times "Executing store motion" 2 "lim2" } } */
> +
> diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
> index b24bc64f2a7..5ca4738b20e 100644
> --- a/gcc/tree-ssa-loop-im.c
> +++ b/gcc/tree-ssa-loop-im.c
> @@ -3211,6 +3211,10 @@ fill_always_executed_in_1 (class loop *loop, sbitmap
> contains_call)
> if (dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
> last = bb;
>
> + if (inn_loop != loop
> + && flow_loop_nested_p (bb->loop_father, inn_loop))
> + inn_loop = bb->loop_father;
> +
The comment says
/* In a loop that is always entered we may proceed anyway.
But record that we entered it and stop once we leave it.
*/
inn_loop = bb->loop_father;
and your change would defeat that early return, no?
> if (bitmap_bit_p (contains_call, bb->index))
> break;
>
> @@ -3238,7 +3242,7 @@ fill_always_executed_in_1 (class loop *loop, sbitmap
> contains_call)
>
> if (bb->loop_father->header == bb)
> {
> - if (!dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
> + if (!dominated_by_p (CDI_DOMINATORS, bb->loop_father->latch, bb))
> break;
That's now a always false condition - a loops latch is always dominated
by its header. The condition as written tries to verify whether the
loop is always entered - mind we visit all blocks, not only those
always executed.
In fact for your testcase the x->j ref is _not_ always executed
since the inner loop is conditional on n > 0.
Richard.