https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69710

--- Comment #8 from amker at gcc dot gnu.org ---
Reproduced on arm with saxpy.c.  The dump for slp is as below:

  <bb 13>:
  _82 = prologue_after_cost_adjust.7_43 * 4;
  vectp_dy.13_81 = dy_9(D) + _82;
  _87 = prologue_after_cost_adjust.7_43 * 4;
  vectp_dx.16_86 = dx_13(D) + _87;
  vect_cst__91 = {da_6(D), da_6(D), da_6(D), da_6(D)};
  _95 = prologue_after_cost_adjust.7_43 * 4;
  vectp_dy.21_94 = dy_9(D) + _95;

  <bb 14>:
  # vectp_dy.12_83 = PHI <vectp_dy.13_81(13), vectp_dy.12_84(21)>
  # vectp_dx.15_88 = PHI <vectp_dx.16_86(13), vectp_dx.15_89(21)>
  # vectp_dy.20_96 = PHI <vectp_dy.21_94(13), vectp_dy.20_97(21)>
  # ivtmp_99 = PHI <0(13), ivtmp_100(21)>
  vect__12.14_85 = MEM[(float *)vectp_dy.12_83];
  vect__15.17_90 = MEM[(float *)vectp_dx.15_88];
  vect__16.18_92 = vect_cst__91 * vect__15.17_90;
  vect__17.19_93 = vect__12.14_85 + vect__16.18_92;
  MEM[(float *)vectp_dy.20_96] = vect__17.19_93;
  vectp_dy.12_84 = vectp_dy.12_83 + 16;
  vectp_dx.15_89 = vectp_dx.15_88 + 16;
  vectp_dy.20_97 = vectp_dy.20_96 + 16;
  ivtmp_100 = ivtmp_99 + 1;
  if (ivtmp_100 < bnd.9_53)
    goto <bb 21>;
  else
    goto <bb 16>;

  <bb 21>:
    goto <bb 14>;

IVO recognized the following uses:

use 0
  address
  in statement vect__12.14_85 = MEM[(float *)vectp_dy.12_83];

  at position MEM[(float *)vectp_dy.12_83]
  type vector(4) float *
  base vectp_dy.13_81
  step 16
  base object (void *) vectp_dy.13_81
  related candidates 

use 1
  generic
  in statement vectp_dx.15_88 = PHI <vectp_dx.16_86(13), vectp_dx.15_89(21)>

  at position 
  type vector(4) float *
  base vectp_dx.16_86
  step 16
  base object (void *) vectp_dx.16_86
  is a biv
  related candidates 

use 2
  address
  in statement MEM[(float *)vectp_dy.20_96] = vect__17.19_93;

  at position MEM[(float *)vectp_dy.20_96]
  type vector(4) float *
  base vectp_dy.21_94
  step 16
  base object (void *) vectp_dy.21_94
  related candidates 

use 3
  compare
  in statement if (ivtmp_100 < bnd.9_53)

  at position 
  type unsigned int
  base 1
  step 1
  is a biv
  related candidates 

There are two problems:
1) We fail to recognize that uses 0 and 2 are identical to each other.  This is
because the vectorizer generates redundant setup code in the loop pre-header:
vectp_dy.13_81 and vectp_dy.21_94 are both computed as
dy_9(D) + prologue_after_cost_adjust.7_43 * 4, but through separate SSA names.
There are two possible fixes here.  One is to make expand_simple_operations in
tree-ssa-loop-niter.c (used by ivopts) more aggressive in expanding.  But I
don't think this is a good idea in all cases, because complicated expanded
expressions make ivo transforms and niter analysis harder.  The other is to fix
the vectorizer to generate clean code.  Richard's suggestion is to use
gimple_build for that.

2) Use 1 is not recognized as an address iv because of the alignment of that
memory reference.
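
To illustrate problem 1, here is a rough hand-written C analogue of the
vectorized loop.  It is only a sketch: saxpy.c itself is not shown in this
comment, so the kernel dy[i] += da * dx[i], the function names, and the
parameters nvec and peel (standing in for bnd.9_53 and
prologue_after_cost_adjust.7_43) are assumptions made for illustration.  The
first function keeps separate load and store pointers for dy, as in the dump;
the second shares one pointer, which is what clean pre-header code would let
ivopts see directly.

```c
#include <stddef.h>

/* Analogue of the dump: the dy load pointer and the dy store pointer are
   set up separately, although both start at dy + peel and step by one
   4-float vector (16 bytes) per iteration.  */
void saxpy_three_ptrs(size_t nvec, size_t peel, float da,
                      const float *dx, float *dy)
{
    const float *dy_load = dy + peel;   /* like vectp_dy.13_81 */
    const float *dx_load = dx + peel;   /* like vectp_dx.16_86 */
    float *dy_store = dy + peel;        /* like vectp_dy.21_94 */

    for (size_t iv = 0; iv < nvec; iv++) {
        for (int k = 0; k < 4; k++)
            dy_store[k] = dy_load[k] + da * dx_load[k];
        dy_load += 4;
        dx_load += 4;
        dy_store += 4;
    }
}

/* Same computation with a single dy pointer used for both the load and
   the store, i.e. uses 0 and 2 visibly share one induction variable.  */
void saxpy_two_ptrs(size_t nvec, size_t peel, float da,
                    const float *dx, float *dy)
{
    const float *dx_p = dx + peel;
    float *dy_p = dy + peel;

    for (size_t iv = 0; iv < nvec; iv++) {
        for (int k = 0; k < 4; k++)
            dy_p[k] = dy_p[k] + da * dx_p[k];
        dx_p += 4;
        dy_p += 4;
    }
}
```

Both functions compute the same result; the difference is only in how many
pointer induction variables the loop carries.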
