https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95739
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|tree-optimization           |middle-end
           Assignee|unassigned at gcc dot gnu.org|rguenth at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Hmm, the vectorizer emits

  vect_cst__41 = { 0.0, 0.0 };
  vect_cst__42 = { -1.0e+0, -1.0e+0 };
...
  vect__1.7_37 = MEM <vector(2) double> [(double *)vectp_s1.5_35];
  _1 = s1[i_23];
  vect__2.10_40 = MEM <vector(2) double> [(double *)vectp_s2.8_38];
  _2 = s2[i_23];
  _43 = vect__1.7_37 u<= vect__2.10_40;
  vect_iftmp.11_44 = VEC_COND_EXPR <_43, vect_cst__41, vect_cst__42>;
  iftmp.0_5 = _1 u<= _2 ? 0.0 : -1.0e+0;

but this is __builtin_isgreaterequal (s1[i], s2[i]) ? 0.0 : -1.0;
already this way from if-conversion it seems.  Before we have

  if (_1 u<= _2)
    goto <bb 10>; [50.00%]
  else
    goto <bb 4>; [50.00%]

  <bb 10> [local count: 429496729]:
  goto <bb 5>; [100.00%]

  <bb 4> [local count: 429496728]:

  <bb 5> [local count: 858993457]:
  # iftmp.0_5 = PHI <-1.0e+0(4), 0.0(10)>

And even .original:

      s3[i] = s1[i] u<= s2[i] ? 0.0 : -1.0e+0;

but that's the inverted condition plus swapped which should be u>=!?
Seemingly this is generated from

/* !A ? B : C -> A ? C : B.  */
 (simplify
  (cnd (logical_inverted_value truth_valued_p@0) @1 @2)
  (cnd @0 @2 @1)))

fed by !(s1[i] u<= s2[i]) ? -1. : 0.  Hmm, which looks OK.

Later the backend via ix86_prepare_sse_fp_compare_args correctly (!?) swaps
operands of the compare to s2[i] u>= s1[i].
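The condition rewrites traced above can be checked on scalars. The sketch
below is only an illustration, not GCC code: it models GIMPLE's u<= and u>=
with the C99 comparison macros and assumes the source expression is
__builtin_isgreater (s1[i], s2[i]) ? -1.0 : 0.0 (inferred from the
"fed by" form, not quoted from the testcase).  The helper names unle, unge,
select_inverted, select_folded, select_swapped and count_mismatches are
made up for this example.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Model of "a u<= b": true if a <= b or if either operand is a NaN.  */
static bool unle(double a, double b)
{
  return isunordered(a, b) || islessequal(a, b);
}

/* Model of "a u>= b": true if a >= b or if either operand is a NaN.  */
static bool unge(double a, double b)
{
  return isunordered(a, b) || isgreaterequal(a, b);
}

/* The form feeding the match.pd pattern: !(a u<= b) ? -1. : 0.  */
static double select_inverted(double a, double b)
{
  return !unle(a, b) ? -1.0 : 0.0;
}

/* After "!A ? B : C -> A ? C : B": a u<= b ? 0. : -1.  */
static double select_folded(double a, double b)
{
  return unle(a, b) ? 0.0 : -1.0;
}

/* After swapping the compare operands: b u>= a ? 0. : -1.  */
static double select_swapped(double a, double b)
{
  return unge(b, a) ? 0.0 : -1.0;
}

/* Count the input pairs on which any form disagrees with the assumed
   source expression isgreater (a, b) ? -1.0 : 0.0.  */
int count_mismatches(void)
{
  const double v[] = { -1.0, 0.0, 1.0, INFINITY, NAN };
  const int n = (int) (sizeof v / sizeof v[0]);
  int bad = 0;

  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
      {
        double a = v[i], b = v[j];
        double want = isgreater(a, b) ? -1.0 : 0.0;
        if (select_inverted(a, b) != want) bad++;
        if (select_folded(a, b) != want) bad++;
        if (select_swapped(a, b) != want) bad++;
      }
  return bad;
}

/* Convenience wrapper: aborts if any form disagrees.  */
void check_equivalence(void)
{
  assert(count_mismatches() == 0);
}
```

Calling check_equivalence () from a test driver exercises all pairs,
including the NaN cases, where every form selects 0.0.  Under these
assumptions the fold and the operand swap both preserve the value, which
is consistent with the transformations looking OK up to this point.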
Now somewhere things go wrong and the __builtin_isgreater vanishes completely,
leaving us with uninitialized stack slots:

main:
.LFB0:
        .cfi_startproc
        subq    $120, %rsp
        .cfi_def_cfa_offset 128
        movapd  .LC0(%rip), %xmm0
        movaps  %xmm0, s1(%rip)
        movapd  .LC1(%rip), %xmm0
        movaps  %xmm0, s1+16(%rip)
        movapd  .LC2(%rip), %xmm0
        movaps  %xmm0, s2(%rip)
        movaps  %xmm0, s2+16(%rip)
        movsd   56(%rsp), %xmm0
        movapd  48(%rsp), %xmm1
        movapd  96(%rsp), %xmm2
        movsd   %xmm0, 8(%rsp)
        movsd   .LC4(%rip), %xmm0
        ucomisd 8(%rsp), %xmm0
        movaps  %xmm1, s3(%rip)
        movaps  %xmm2, s3+16(%rip)

I guess the issue is

   31: r104:V2DF=unge(r103:V2DF,[r77:DI-0x60])
   32: r101:V2DF=~r104:V2DF&r102:V2DF
   33: r105:DI=`s3'
   34: r106:V2DF=[r77:DI-0x40]
   35: [r105:DI]=r106:V2DF

Look at how we compute the result into r101 but then use [r77:DI-0x40] as
the source for the store.  When I trace expand_vect_cond_optab_fn I see
'target' is expanded to

(mem/c:V2DF (plus:DI (reg/f:DI 77 virtual-stack-vars)
        (const_int -64 [0xffffffffffffffc0])) [1 vect_iftmp.11+0 S16 A128])

but we don't check whether ops[0].value matches target after expand_insn
and fail to move it there.

Testing patch.
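The failure mode described above can be modelled outside of GCC.  The
following is a toy sketch, not the RTL API and not the patch under test:
slot, operand, toy_expand, emit_buggy and emit_checked are hypothetical
names, and the assumption is simply that the expander is free to put its
result somewhere other than the requested target and report that location
back through ops[0].value.

```c
#include <assert.h>

/* Hypothetical stand-in for a V2DF location (register or stack slot).  */
typedef struct { double lanes[2]; } slot;

/* Hypothetical stand-in for an expand operand; value mimics ops[0].value.  */
typedef struct { slot *value; } operand;

/* Stand-in for a fresh pseudo register chosen by the expander.  */
static slot scratch;

/* Toy expander: it always rejects the requested output location,
   retargets ops[0].value to scratch and writes the result there.  */
static void toy_expand(operand *ops, const slot *a, const slot *b)
{
  ops[0].value = &scratch;
  for (int i = 0; i < 2; i++)
    scratch.lanes[i] = a->lanes[i] > b->lanes[i] ? -1.0 : 0.0;
}

/* Caller that assumes the result landed in target: target keeps
   whatever it held before, like the uninitialized stack slot above.  */
void emit_buggy(slot *target, const slot *a, const slot *b)
{
  operand ops[1] = { { target } };
  toy_expand(ops, a, b);
}

/* Caller that compares ops[0].value with target after expansion and
   copies the result over when they differ.  */
void emit_checked(slot *target, const slot *a, const slot *b)
{
  operand ops[1] = { { target } };
  toy_expand(ops, a, b);
  if (ops[0].value != target)
    *target = *ops[0].value;
}
```

With a = { 2.0, 1.0 } and b = { 1.0, 2.0 }, emit_buggy leaves target
unchanged while emit_checked stores { -1.0, 0.0 } into it.  The design
point is only that the caller, not the expander, is responsible for
getting the value into target.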