http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55600



--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> 2012-12-05 10:33:06 UTC ---

GCC fully unrolls the vectorized loop.  ICC does not.



The loop rolls 16 times:



  <bb 3>:
  # vect_p.5_30 = PHI <vect_p.5_45(4), vect_p.8_31(2)>
  # vect_su.12_52 = PHI <vect_su.12_53(4), { 0, 0, 0, 0 }(2)>
  # ivtmp_61 = PHI <ivtmp_62(4), 0(2)>
  vect_var_.9_46 = MEM[(int *)vect_p.5_30];
  vect_p.5_47 = vect_p.5_30 + 16;
  vect_var_.10_48 = MEM[(int *)vect_p.5_47];
  vect_perm_even_49 = VEC_PERM_EXPR <vect_var_.9_46, vect_var_.10_48, { 0, 2, 4, 6 }>;
  vect_perm_odd_50 = VEC_PERM_EXPR <vect_var_.9_46, vect_var_.10_48, { 1, 3, 5, 7 }>;
  vect_var_.11_51 = vect_perm_even_49 * vect_perm_odd_50;
  vect_su.12_53 = vect_var_.11_51 + vect_su.12_52;
  vect_p.5_45 = vect_p.5_47 + 16;
  ivtmp_62 = ivtmp_61 + 1;
  if (ivtmp_62 < 16)
    goto <bb 4>;
  else
    goto <bb 5>;

  <bb 4>:
  goto <bb 3>;



but at -O3 we don't care too much about code size in this case.  So I'm not
sure you can call this a "bug".  Does it run slower?
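
For readers who do not read GIMPLE daily, here is a rough C sketch of what the
dump above computes.  It is not the reporter's test case: the scalar loop shape,
the function names and the constants N_PAIRS/N_INTS are assumptions inferred
from the dump (two 16-byte loads per iteration, an even/odd permute, a multiply
and a 4-lane accumulate, 16 iterations).  The final lane reduction happens
after the loop in <bb 5>, which is not shown, so that part is also assumed.

```c
#include <stddef.h>

/* Assumed sizes: 16 vector iterations * 8 ints each = 128 ints, 64 pairs. */
enum { N_PAIRS = 64, N_INTS = 2 * N_PAIRS };

/* Hypothetical scalar source: sum of products of adjacent pairs. */
int pair_dot_scalar(const int *a)
{
    int su = 0;
    for (size_t i = 0; i < N_PAIRS; i++)
        su += a[2 * i] * a[2 * i + 1];
    return su;
}

/* Plain-C emulation of the vectorized loop body in <bb 3>. */
int pair_dot_vector_emulated(const int *a)
{
    int acc[4] = { 0, 0, 0, 0 };      /* vect_su: starts as { 0, 0, 0, 0 } */
    const int *p = a;                 /* vect_p */

    for (int iv = 0; iv < 16; iv++) { /* ivtmp: 16 iterations */
        const int *v0 = p;            /* first 16-byte load */
        const int *v1 = p + 4;        /* second load, pointer + 16 bytes */

        /* VEC_PERM_EXPR selects from the concatenation of v0 and v1. */
        int even[4] = { v0[0], v0[2], v1[0], v1[2] };  /* { 0, 2, 4, 6 } */
        int odd[4]  = { v0[1], v0[3], v1[1], v1[3] };  /* { 1, 3, 5, 7 } */

        for (int lane = 0; lane < 4; lane++)
            acc[lane] += even[lane] * odd[lane];

        p += 8;                       /* two 16-byte steps per iteration */
    }

    /* Assumed epilogue: horizontal add of the four lanes. */
    return acc[0] + acc[1] + acc[2] + acc[3];
}
```

Both functions should return the same value for any 128-int input whose
products do not overflow int.  Fully unrolling the second loop means 16 copies
of that body, which is where the code size difference against ICC comes from.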
