On Tue, Apr 16, 2019 at 12:20:16PM +0200, Eric Botcazou wrote:
> > If I remember correctly and read the cfgexpand.c comment
> > 
> >   /* For a promoted variable, X will not be used directly but wrapped in a
> >      SUBREG with SUBREG_PROMOTED_VAR_P set, which means that the RTL land
> >      will assume that its upper bits can be inferred from its lower bits.
> >      Therefore, if X isn't initialized on every path from the entry, then
> >      we must do it manually in order to fulfill the above assumption.  */
> > 
> > you run into a similar situation but no, I don't think your patch is
> > really fixing this given SUBREG_PROMOTED_VAR_P is only true depending
> > on a condition (and not a data-dependence).
> 
> If the new pseudo is initialized with the above semantics on every possible 
> path where it is used, then I think it's OK.

The patch changed (expand dump with some linebreaks removed):
 ;; _12 = .MUL_OVERFLOW (_3, _6);
 (insn 7 6 8 (set (reg:DI 96) (zero_extend:DI (reg/v:SI 91 [ c ]))) 
"pr90095.c":14:32 -1 (nil))
 (insn 8 7 9 (parallel [ (set (reg:DI 97) (neg:DI (reg:DI 96))) (clobber 
(reg:CC 17 flags)) ]) "pr90095.c":14:7 -1 (nil))
 (insn 9 8 10 (parallel [ (set (reg:QI 98) (neg:QI (subreg:QI (reg/v:SI 91 [ c 
]) 0))) (clobber (reg:CC 17 flags)) ]) "pr90095.c":14:56 -1 (nil))
 (insn 10 9 11 (set (reg:SI 99) (zero_extend:SI (reg:QI 98))) "pr90095.c":14:7 
-1 (nil))
 (insn 11 10 12 (parallel [ (set (reg:DI 100) (sign_extend:DI (reg:SI 99))) 
(clobber (reg:CC 17 flags)) (clobber (scratch:SI)) ]) "pr90095.c":14:7 -1 (nil))
 (insn 12 11 13 (set (reg:DI 93 [ _12+8 ]) (const_int 0 [0])) "pr90095.c":14:7 
-1 (nil))
 (insn 13 12 14 (parallel [ (set (reg:DI 101) (lshiftrt:DI (reg:DI 97) 
(const_int 32 [0x20]))) (clobber (reg:CC 17 flags)) ]) "pr90095.c":14:7 -1 
(nil))
 (insn 14 13 15 (parallel [ (set (reg:DI 102) (lshiftrt:DI (reg:DI 100) 
(const_int 32 [0x20]))) (clobber (reg:CC 17 flags)) ]) "pr90095.c":14:7 -1 
(nil))
 (insn 15 14 16 (set (reg:CCZ 17 flags) (compare:CCZ (subreg:SI (reg:DI 101) 0) 
(const_int 0 [0]))) "pr90095.c":14:7 -1 (nil))
-(jump_insn 16 15 18 (set (pc) (if_then_else (ne (reg:CCZ 17 flags) (const_int 
0 [0])) (label_ref 28) (pc))) "pr90095.c":14:7 -1 (int_list:REG_BR_PROB 
214748364 (nil)))
-(insn 18 16 19 (parallel [ (set (reg:DI 105) (mult:DI (zero_extend:DI 
(subreg/s/v:SI (reg:DI 97) 0)) (zero_extend:DI (subreg/s/v:SI (reg:DI 100) 
0)))) (clobber (reg:CC 17 flags)) ]) "pr90095.c":14:7 -1 (nil))
+(jump_insn 16 15 17 (set (pc) (if_then_else (ne (reg:CCZ 17 flags) (const_int 
0 [0])) (label_ref 28) (pc))) "pr90095.c":14:7 -1 (int_list:REG_BR_PROB 
214748364 (nil)))
+(insn 17 16 18 (set (reg:DI 104) (reg:DI 97)) "pr90095.c":14:7 -1 (nil))
+(insn 18 17 19 (parallel [ (set (reg:DI 105) (mult:DI (zero_extend:DI 
(subreg/s/v:SI (reg:DI 104) 0)) (zero_extend:DI (subreg/s/v:SI (reg:DI 100) 
0)))) (clobber (reg:CC 17 flags)) ]) "pr90095.c":14:7 -1 (nil))
 (insn 19 18 20 (set (reg:DI 103) (reg:DI 105)) "pr90095.c":14:7 -1 (nil))
 (jump_insn 20 19 21 (set (pc) (label_ref 68)) "pr90095.c":14:7 -1 (nil))
 (barrier 21 20 22)

The pseudo 97 (op0) is initialized before the comparison of its high 32 bits
(insn 15), in the second basic block starting after jump_insn 16 previously
the first insn was the multiplication with (subreg/s/v:SI (reg:DI 97) 0)
(the only spot where such a subreg appears), with the patch we first copy
the pseudo 97 into another pseudo 104 and use (subreg/s/v:SI (reg:DI 104) 0)
in the multiplication instead.  Nothing after jump_insn 20 uses the pseudo
104 or that subreg/s/v.
The runtime check assures that at runtime, the upper 32 bits of pseudo 104
must be always 0 (in this case, in some other case could be sign bit
copies).

The question is if it would be valid say for forward propagation to first
propagate (or combine) the pseudo 97 into the (subreg/s/v:SI (reg:DI 104) 0),
then hoisting it before the jump_insn 16, have the subreg optimized away and
miscompile later on.

On the testcase that won't happen e.g. because of the subreg1 pass, which
lowers that insn 17 into:
(insn 99 86 100 3 (set (reg:SI 137)
        (subreg:SI (reg:DI 97) 0)) "pr90095.c":14:7 67 {*movsi_internal}
     (nil))
(insn 100 99 18 3 (set (reg:SI 138 [+4 ])
        (subreg:SI (reg:DI 97) 4)) "pr90095.c":14:7 67 {*movsi_internal}
     (nil))
(insn 18 100 101 3 (parallel [
            (set (reg:DI 105)
                (mult:DI (zero_extend:DI (reg:SI 137))
                    (zero_extend:DI (subreg/s/v:SI (reg:DI 100) 0))))
            (clobber (reg:CC 17 flags))
        ]) "pr90095.c":14:7 334 {*umulsidi3_1}
     (nil))
and so the problematic subreg/s/v is gone (the other is fine, op1_small_p is
true).  But if I add -fno-split-wide-types to the option set, the testcase
aborts again even with the patch.

That means either that the hoisting pass is buggy, or that SUBREG_PROMOTED_*
is only safe at the function boundary (function arguments and return value)
and not elsewhere.

        Jakub

Reply via email to