[Bug tree-optimization/98138] BB vect fail to SLP one case

linkw at gcc dot gnu.org via Gcc-bugs Tue, 05 Jan 2021 19:29:53 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98138


--- Comment #5 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Kewen Lin from comment #4)
> One rough idea seems:
>   1) Relax this condition all_uniform_p somehow to get SLP instance building
> to go deeper and get those p1/p2 loads as SLP nodes.
>   2) Introduce one more vect_pattern recognizer to catch this kind of
> pattern, transform the slp instance as we expect. I assume we can know the
> whole slp instance then we can transform it as we want here. Probably need
> some costing condition to gate this pattern matching.
>   3) If 2) fail, trim the slp instance from those nodes which satisfy
> all_uniform_p condition to ensure it's same as before.
> 

For 2), rather than adding a vect_pattern recognizer with an IFN, the more
appropriate place seems to be vect_optimize_slp.

But after more thought, building the SLP instance starting from grouped loads
instead of grouped stores looks more straightforward.

  a0 = (p1[0] - p2[0]);
  a1 = (p1[1] - p2[1]);
  a2 = (p1[2] - p2[2]);
  a3 = (p1[3] - p2[3]);

Building the vector <a0, a1, a2, a3> first looks more natural. We can then
check the uses of all its lanes, along with any special patterns, to form the
vector <t0, t1, t2, t3>, and repeat similarly.
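
To make the load-first idea concrete, here is a minimal hand-written sketch of
the four statements above as one lane-wise vector subtraction, using the
GCC/Clang vector_size extension. It only illustrates the shape of the vector
<a0, a1, a2, a3>; it is not what the vectorizer emits. The element type
(unsigned char) and the names diff4_scalar, diff4_vector and diff4_check are
assumptions made for the example.

```c
#include <assert.h>

/* Four-lane int vector via the GCC/Clang vector extension. */
typedef int v4si __attribute__((vector_size(16)));

/* Scalar form, as in the snippet above.  The unsigned char element
   type is an assumption; the comment does not state it.  */
static void diff4_scalar(const unsigned char *p1, const unsigned char *p2,
                         int a[4])
{
    a[0] = p1[0] - p2[0];
    a[1] = p1[1] - p2[1];
    a[2] = p1[2] - p2[2];
    a[3] = p1[3] - p2[3];
}

/* Vector form: start from the two grouped loads p1[0..3] and p2[0..3],
   widen each lane to int, then subtract lane-wise to get <a0, a1, a2, a3>.  */
static v4si diff4_vector(const unsigned char *p1, const unsigned char *p2)
{
    v4si v1 = { p1[0], p1[1], p1[2], p1[3] };
    v4si v2 = { p2[0], p2[1], p2[2], p2[3] };
    return v1 - v2;
}

/* Hypothetical helper: assert that both forms agree on every lane.  */
static void diff4_check(const unsigned char *p1, const unsigned char *p2)
{
    int a[4];
    v4si v = diff4_vector(p1, p2);

    diff4_scalar(p1, p2, a);
    for (int i = 0; i < 4; i++)
        assert(a[i] == v[i]);
}
```

Starting from the loads means the vector operands exist before any of their
uses are examined, which is the order proposed here.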

Hi Richi,

Is this a good example to motivate building SLP instances starting from
grouped loads?
