On 7/23/21 2:41 AM, Kewen.Lin wrote:
on 2021/7/22 下午8:56, Richard Biener wrote:
On Tue, Jul 20, 2021 at 4:37
PM Kewen.Lin <li...@linux.ibm.com> wrote:
Hi,
This v2 has addressed some review comments/suggestions:
- Use "!=" instead of "<" in function operator!= (const Iter &rhs)
- Add new CTOR loops_list (struct loops *loops, unsigned flags)
to support loop hierarchy tree rather than just a function,
and adjust to use loops* accordingly.
I actually meant struct loop *, not struct loops * ;) At the point
we pondered to make loop invariant motion work on single
loop nests we gave up not only but also because it iterates
over the loop nest but all the iterators only ever can process
all loops, not say, all loops inside a specific 'loop' (and
including that 'loop' if LI_INCLUDE_ROOT). So the
CTOR would take the 'root' of the loop tree as argument.
I see that doesn't trivially fit how loops_list works, at least
not for LI_ONLY_INNERMOST. But I guess FROM_INNERMOST
could be adjusted to do ONLY_INNERMOST as well?
Thanks for the clarification! I just realized that the previous
version with struct loops* is problematic, all traversal is
still bounded with outer_loop == NULL. I think what you expect
is to respect the given loop_p root boundary. Since we just
record the loops' nums, I think we still need the function* fn?
So I add one optional argument loop_p root and update the
visiting codes accordingly. Before this change, the previous
visiting uses the outer_loop == NULL as the termination condition,
it perfectly includes the root itself, but with this given root,
we have to use it as the termination condition to avoid to iterate
onto its possible existing next.
For LI_ONLY_INNERMOST, I was thinking whether we can use the
code like:
struct loops *fn_loops = loops_for_fn (fn)->larray;
for (i = 0; vec_safe_iterate (fn_loops, i, &aloop); i++)
if (aloop != NULL
&& aloop->inner == NULL
&& flow_loop_nested_p (tree_root, aloop))
this->to_visit.quick_push (aloop->num);
it has the stable bound, but if the given root only has several
child loops, it can be much worse if there are many loops in fn.
It seems impossible to predict the given root loop hierarchy size,
maybe we can still use the original linear searching for the case
loops_for_fn (fn) == root? But since this visiting seems not so
performance critical, I chose to share the code originally used
for FROM_INNERMOST, hope it can have better readability and
maintainability.
I might be mixing up the two patches (they both seem to touch
the same functions), but in this one the loops_list ctor looks
like a sizeable function with at least one loop. Since the ctor
is used in the initialization of each of the many range-for loops,
that could result in inlining of a lot of these calls and so quite
a bit code bloat. Unless this is necessary for efficiency (not
my area) I would recommend to consider defining the loops_list
ctor out-of-line in some .c or .cc file.
(Also, if you agree with the rationale, I'd replace loop_p with
loop * in the new code.)
Thanks
Martin
Bootstrapped and regtested on powerpc64le-linux-gnu P9,
x86_64-redhat-linux and aarch64-linux-gnu, also
bootstrapped on ppc64le P9 with bootstrap-O3 config.
Does the attached patch meet what you expect?
BR,
Kewen
-----
gcc/ChangeLog:
* cfgloop.h (loops_list::loops_list): Add one optional argument root
and adjust accordingly.