https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95739

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|tree-optimization           |middle-end
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Hmm, the vectorizer emits

  vect_cst__41 = { 0.0, 0.0 };
  vect_cst__42 = { -1.0e+0, -1.0e+0 };
...
  vect__1.7_37 = MEM <vector(2) double> [(double *)vectp_s1.5_35];
  _1 = s1[i_23];
  vect__2.10_40 = MEM <vector(2) double> [(double *)vectp_s2.8_38];
  _2 = s2[i_23];
  _43 = vect__1.7_37 u<= vect__2.10_40;
  vect_iftmp.11_44 = VEC_COND_EXPR <_43, vect_cst__41, vect_cst__42>;
  iftmp.0_5 = _1 u<= _2 ? 0.0 : -1.0e+0;

but this is __builtin_isgreaterequal (s1[i], s2[i]) ? 0.0 : -1.0; it seems to
be this way already from if-conversion.  Before we have

  if (_1 u<= _2)
    goto <bb 10>; [50.00%]
  else
    goto <bb 4>; [50.00%]

  <bb 10> [local count: 429496729]:
  goto <bb 5>; [100.00%]

  <bb 4> [local count: 429496728]:

  <bb 5> [local count: 858993457]:
  # iftmp.0_5 = PHI <-1.0e+0(4), 0.0(10)>

And even .original:

    s3[i] = s1[i] u<= s2[i] ? 0.0 : -1.0e+0;

but that's the inverted condition plus swapped which should be u>=!?

Seemingly this is generated from

 /* !A ? B : C -> A ? C : B.  */
 (simplify
  (cnd (logical_inverted_value truth_valued_p@0) @1 @2)
  (cnd @0 @2 @1)))

fed by !(s1[i] u<= s2[i]) ? -1. : 0.  Hmm, which looks OK.
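
To sanity-check that the two forms really are equivalent, including for NaN
operands, here is a minimal sketch in plain C.  It is not GCC code: unle(),
before_simplify(), after_simplify() and check_all() are hypothetical helper
names, and the source-level form isgreater (a, b) ? -1.0 : 0.0 is an
assumption, since the test case itself is not quoted in this comment.

```c
#include <assert.h>
#include <math.h>

/* Models the GIMPLE "u<=" comparison: true when a <= b or when the
   operands are unordered (at least one is NaN).  */
static int unle (double a, double b)
{
  return isunordered (a, b) || islessequal (a, b);
}

/* Form fed into the simplification: !A ? B : C.  */
static double before_simplify (double a, double b)
{
  return !unle (a, b) ? -1.0 : 0.0;
}

/* Form produced by the simplification: A ? C : B.  */
static double after_simplify (double a, double b)
{
  return unle (a, b) ? 0.0 : -1.0;
}

/* Assumed source-level form; isgreater is the negation of u<=.  */
static double source_form (double a, double b)
{
  return isgreater (a, b) ? -1.0 : 0.0;
}

/* Compares all three forms on every pair of sample values and returns
   the number of pairs checked.  */
static int check_all (void)
{
  const double vals[] = { -1.0, 0.0, 1.0, NAN, INFINITY };
  const int n = (int) (sizeof vals / sizeof vals[0]);
  int checked = 0;

  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
      {
        double a = vals[i], b = vals[j];
        assert (before_simplify (a, b) == after_simplify (a, b));
        assert (after_simplify (a, b) == source_form (a, b));
        checked++;
      }
  return checked;
}
```

If the assertions hold for these samples, the select is fine at this level,
which points at a later stage rather than at the simplification.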

Later the backend, via ix86_prepare_sse_fp_compare_args, correctly (!?)
swaps the operands of the compare to s2[i] u>= s1[i].

Now somewhere things go wrong and the __builtin_isgreater vanishes completely,
leaving us with uninitialized stack slots:

main:
.LFB0:
        .cfi_startproc
        subq    $120, %rsp
        .cfi_def_cfa_offset 128
        movapd  .LC0(%rip), %xmm0
        movaps  %xmm0, s1(%rip)
        movapd  .LC1(%rip), %xmm0
        movaps  %xmm0, s1+16(%rip)
        movapd  .LC2(%rip), %xmm0
        movaps  %xmm0, s2(%rip)
        movaps  %xmm0, s2+16(%rip)
        movsd   56(%rsp), %xmm0
        movapd  48(%rsp), %xmm1
        movapd  96(%rsp), %xmm2
        movsd   %xmm0, 8(%rsp)
        movsd   .LC4(%rip), %xmm0
        ucomisd 8(%rsp), %xmm0
        movaps  %xmm1, s3(%rip)
        movaps  %xmm2, s3+16(%rip)

I guess the issue is

   31: r104:V2DF=unge(r103:V2DF,[r77:DI-0x60])
   32: r101:V2DF=~r104:V2DF&r102:V2DF
   33: r105:DI=`s3'
   34: r106:V2DF=[r77:DI-0x40]
   35: [r105:DI]=r106:V2DF

Look at how we compute the result into r101 but then use [r77:DI-0x40]
as the source for the store.  When I trace expand_vect_cond_optab_fn
I see 'target' is expanded to

(mem/c:V2DF (plus:DI (reg/f:DI 77 virtual-stack-vars)
        (const_int -64 [0xffffffffffffffc0])) [1 vect_iftmp.11+0 S16 A128])

but we don't check whether ops[0].value matches target after expand_insn
and fail to move it there.
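
The failure mode can be modelled outside the compiler.  The following is a toy
sketch in plain C, not GCC code and not the actual patch: vec2, toy_expand(),
toy_vcond_buggy() and toy_vcond_fixed() are hypothetical names.  It assumes an
expander that treats the requested target only as a hint and may leave the
result somewhere else, so the caller has to compare the two locations and copy
when they differ.

```c
#include <assert.h>

/* Stand-in for a V2DF value.  */
typedef struct { double v[2]; } vec2;

/* Hypothetical expander: computes mask ? a : b per lane.  The target is
   only a hint; this toy version always puts the result in scratch and
   returns the location that actually holds it.  */
static vec2 *toy_expand (vec2 *target_hint, vec2 *scratch,
                         const int mask[2], vec2 a, vec2 b)
{
  (void) target_hint;   /* the hint is ignored, as an expander may do */
  for (int i = 0; i < 2; i++)
    scratch->v[i] = mask[i] ? a.v[i] : b.v[i];
  return scratch;
}

/* Buggy caller: assumes the result landed in *target.  */
static void toy_vcond_buggy (vec2 *target, const int mask[2], vec2 a, vec2 b)
{
  vec2 scratch;
  toy_expand (target, &scratch, mask, a, b);
  /* *target is never written: later readers see stale contents.  */
}

/* Fixed caller: checks where the result is and moves it if needed.  */
static void toy_vcond_fixed (vec2 *target, const int mask[2], vec2 a, vec2 b)
{
  vec2 scratch;
  vec2 *result = toy_expand (target, &scratch, mask, a, b);
  if (result != target)
    *target = *result;
}
```

In the buggy variant the store to s3 would read whatever was in the target
slot beforehand, which matches the uninitialized stack loads in the assembly
above.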

Testing patch.
