http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50955

--- Comment #18 from rguenther at suse dot de <rguenther at suse dot de> ---
"amker.cheng at gmail dot com" <gcc-bugzi...@gcc.gnu.org> wrote:
>http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50955
>
>bin.cheng <amker.cheng at gmail dot com> changed:
>
>           What    |Removed                     |Added
>----------------------------------------------------------------------------
>           CC|                            |amker.cheng at gmail dot com
>
>--- Comment #17 from bin.cheng <amker.cheng at gmail dot com> ---
>Hi Richard,
>I am having difficulty in understanding cases if this PR.
>For the reported case with two loops:
>
>  for( y=0; y<4; y++, pDst += dstStep ) {
>    for( x=y+1; x<4; x++ ) {
>            s = ( p1[x-y-1] + p1[x-y] + p1[x-y] + p1[x-y+1] + 2 ) >> 2;
>                    pDst[x] = (unsigned char)s;
>    }
>
>    pDst[y] = p3;
>  }
>
>The dump for statement 'pDst[y] = p3;' before IVOPT is like:
>
><bb 4>:
>Invalid sum of incoming frequencies 1667, should be 278
>  y.2_64 = (sizetype) y_89;
>  D.6421_65 = pDst_88 + y.2_64;
>  *D.6421_65 = p3_37;
>  pDst_69 = pDst_88 + pretmp.21_118;
>  ivtmp.35_116 = ivtmp.35_87 - 1;
>  if (ivtmp.35_116 != 0)
>    goto <bb 18>;
>  else
>    goto <bb 19>;
>
>
>IVOPT chooses candidate 15:
>candidate 15
>  depends on 3
>  var_before ivtmp.154
>  var_after ivtmp.154
>  incremented before exit test
>  type unsigned int
>  base (unsigned int) pDst_39(D) - (unsigned int) &p1
>  step (unsigned int) (pretmp.21_118 + 1)
>for use 1:
>use 1
>  address
>  in statement *D.6421_65 = p3_37;
>
>  at position *D.6421_65
>  type unsigned char *
>  base pDst_39(D)
>  step pretmp.21_118 + 1
>  base object (void *) pDst_39(D)
>  related candidates 
>
>After rewriting, the dump is like:
>
><bb 4>:
>Invalid sum of incoming frequencies 1667, should be 278
>  MEM[symbol: p1, index: ivtmp.154_200, offset: 0B] = p3_37;
>  pDst_69 = pDst_88 + pretmp.21_118;
>  ivtmp.149_218 = ivtmp.149_249 - 1;
>  ivtmp.154_190 = ivtmp.154_200 + D.6617_250;
>  if (x_40 != 4)
>    goto <bb 18>;
>  else
>    goto <bb 19>;
>
>Eventually, the storing to TMR[p1,ivtmp,0] is considered local and
>deleted.
>
>BUT, for your reduced case:
>
>  p3 = (unsigned char)(((signed int)p1[1] + (signed int)p2[1]
>                   + (signed int)p1[0] +(signed int)p1[0] + 2 ) >> 2 );
>
>  for( x=y+1; x<4; x++ ) {
>      s = ( p1[x-y-1] + p1[x-y] + p1[x-y] + p1[x-y+1] + 2 ) >> 2;
>      pDst[x] = (unsigned char)s;
>  }
>
>  pDst[y] = p3;
>
>It is about the the TMR in below dump (before IVOPT):
>
><bb 6>:
>  # vect_pp1.30_166 = PHI <vect_pp1.30_167(16), vect_pp1.33_165(5)>
>  # vect_pp1.37_176 = PHI <vect_pp1.37_177(16), vect_pp1.40_175(5)>
>  # vect_pp1.46_194 = PHI <vect_pp1.46_195(16), vect_pp1.49_193(5)>
>  # vect_p.60_223 = PHI <vect_p.60_224(16), vect_p.63_222(5)>
>  # ivtmp.64_225 = PHI <ivtmp.64_226(16), 0(5)>
>  ...
>  MEM[(unsigned char *)vect_p.60_223] = vect_var_.58_219;
>  vect_pp1.30_167 = vect_pp1.30_166 + 8;
>  vect_pp1.37_177 = vect_pp1.37_176 + 8;
>  vect_pp1.46_195 = vect_pp1.46_194 + 8;
>  vect_p.60_224 = vect_p.60_223 + 8;
>  ivtmp.64_226 = ivtmp.64_225 + 1;
>  if (ivtmp.64_226 < bnd.27_128)
>    goto <bb 16>;
>  else
>    goto <bb 7>;
>
>Your patch prevents IVOPT from choosing cand 4:
>candidate 4 (important)
>  var_before ivtmp.110
>  var_after ivtmp.110
>  incremented before exit test
>  type unsigned int
>  base (unsigned int) (&p1 + 8)
>  step 8
>  base object (void *) &p1
>for use 3:
>use 3
>  generic
> in statement vect_p.60_223 = PHI <vect_p.60_224(16), vect_p.63_222(5)>
>
>  at position 
>  type vector(8) unsigned char *
>  base batmp.61_221 + 1
>  step 8
>  base object (void *) batmp.61_221
>  is a biv
>  related candidates 
>
>To prevent IVOPT from rewriting into:
>
><bb 6>:
>  # ivtmp.107_150 = PHI <ivtmp.107_256(16), 0(5)>
>  # ivtmp.110_241 = PHI <ivtmp.110_146(16), ivtmp.110_132(5)>
>  D.6585_133 = (unsigned int) batmp.61_221;
>  p1.131_277 = (unsigned int) &p1;
>  D.6587_278 = D.6585_133 - p1.131_277;
>  D.6588_279 = D.6587_278 + ivtmp.110_241;
>  D.6589_280 = D.6588_279 + 4294967289;
>  D.6590_281 = (vector(8) unsigned char *) D.6589_280;
>  vect_p.60_223 = D.6590_281;
>  ...
>  MEM[(unsigned char *)vect_p.60_223] = vect_var_.58_219;
>  ivtmp.107_256 = ivtmp.107_150 + 1;
>  ivtmp.110_146 = ivtmp.110_241 + 8;
>  if (ivtmp.107_256 < bnd.27_128)
>    goto <bb 16>;
>  else
>    goto <bb 7>;
>
>Thus prevents IVOPT from generating candidate 15 in outer loop. 
>(Expressing
>use 3 by cand 4 itself is good, right?)
>
>
>-------------------------------
>But,
>It seems because the check:
>
>  if (address_p)
>    {
>   /* Do not try to express address of an object with computation based
>      on address of a different object.  This may cause problems in rtl
>       level alias analysis (that does not expect this to be happening,
>         as this is illegal in C), and would be unlikely to be useful
>         anyway.  */
>      if (use->iv->base_object
>          && cand->iv->base_object
>  && !operand_equal_p (use->iv->base_object, cand->iv->base_object, 0))
>        return infinite_cost;
>
>failed because cand(15)->iv->base_object == NULL.  For the reported
>case, it's
>not about an iv use appearing in memory reference while not marked as
>address_p, and can be fixed by revise the existing check condition, is
>it true?

No, even expressing an address this way is broken as for example dependence
analysis via scev can get confused about the actual base object.

IIRC previously we already avoided the mem-use case and I had to generalize it
to also avoid addresses.

>PS, sorry for replying to a fixed PR, I found it's kind of impossible
>to fix
>PR52272 without fully understanding this one.

Reply via email to