https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102383

Kewen Lin <linkw at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |linkw at gcc dot gnu.org

--- Comment #4 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #2)
> The issue is that we tame PRE because it tends to inhibit vectorization.
> 
>       /* Inhibit the use of an inserted PHI on a loop header when
>          the address of the memory reference is a simple induction
>          variable.  In other cases the vectorizer won't do anything
>          anyway (either it's loop invariant or a complicated
>          expression).  */
>       if (sprime
>           && TREE_CODE (sprime) == SSA_NAME
>           && do_pre
>           && (flag_tree_loop_vectorize || flag_tree_parallelize_loops > 1)
>           && loop_outer (b->loop_father)
>           && has_zero_uses (sprime)
>           && bitmap_bit_p (inserted_exprs, SSA_NAME_VERSION (sprime))
>           && gimple_assign_load_p (stmt))
> 
> the heuristic would either need to become much more elaborate (do more
> checks whether vectorization is likely) or we could make the behavior
> depend on the cost model as well, for example exclude very-cheap here.
> That might have an influence on the performance benefit seen from
> -O2 default vectorization though.
> 
> IIRC we suggested to enable predictive commoning at -O2 but avoid
> unroll factors > 1 when it was not explicitly enabled.
> 

Yeah, that's PR100794.  I also collected some data on the different approaches
at the time.  Recently I opened another issue, PR102054, which is likewise
about PRE being restricted for the sake of loop vectorization.
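
For illustration, a minimal sketch of the cost-model-based variant suggested
above, assuming the guard can simply consult flag_vect_cost_model (a real
GCC flag; note that VECT_COST_MODEL_DEFAULT would still need to be resolved
to the effective per-optimization-level model first):

      /* Sketch only: skip the PRE inhibition when vectorization runs
	 under the very-cheap cost model (the -O2 default), where taming
	 PRE is least likely to pay off.  Resolving VECT_COST_MODEL_DEFAULT
	 to the effective model is left out here.  */
      if (sprime
	  && TREE_CODE (sprime) == SSA_NAME
	  && do_pre
	  && (flag_tree_loop_vectorize || flag_tree_parallelize_loops > 1)
	  && flag_vect_cost_model != VECT_COST_MODEL_VERY_CHEAP
	  && loop_outer (b->loop_father)
	  && has_zero_uses (sprime)
	  && bitmap_bit_p (inserted_exprs, SSA_NAME_VERSION (sprime))
	  && gimple_assign_load_p (stmt))

As noted in comment #2, though, relaxing the heuristic this way could eat
into the performance benefit seen from -O2 default vectorization, so it
would need benchmarking either way.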
