On Fri, Jul 14, 2023 at 10:31 AM Richard Biener <rguent...@suse.de> wrote:
>
> On Fri, 14 Jul 2023, Uros Bizjak wrote:
>
> > cprop1 pass does not consider paradoxical subreg and for (insn 22) claims
> > that it equals 8 elements of HImode by setting REG_EQUAL note:
> >
> > (insn 21 19 22 4 (set (reg:V4QI 98)
> >         (mem/u/c:V4QI (symbol_ref/u:DI ("*.LC1") [flags 0x2]) [0 S4
> > A32])) "pr110206.c":12:42 1530 {*movv4qi_internal}
> >      (expr_list:REG_EQUAL (const_vector:V4QI [
> >                 (const_int -52 [0xffffffffffffffcc]) repeated x4
> >             ])
> >         (nil)))
> > (insn 22 21 23 4 (set (reg:V8HI 100)
> >         (zero_extend:V8HI (vec_select:V8QI (subreg:V16QI (reg:V4QI 98) 0)
> >                 (parallel [
> >                         (const_int 0 [0])
> >                         (const_int 1 [0x1])
> >                         (const_int 2 [0x2])
> >                         (const_int 3 [0x3])
> >                         (const_int 4 [0x4])
> >                         (const_int 5 [0x5])
> >                         (const_int 6 [0x6])
> >                         (const_int 7 [0x7])
> >                     ])))) "pr110206.c":12:42 7471
> > {sse4_1_zero_extendv8qiv8hi2}
> >      (expr_list:REG_EQUAL (const_vector:V8HI [
> >                 (const_int 204 [0xcc]) repeated x8
> >             ])
> >         (expr_list:REG_DEAD (reg:V4QI 98)
> >             (nil))))
> >
> > We rely on the "undefined" vals to have a specific value (from the earlier
> > REG_EQUAL note) but actual code generation doesn't ensure this (it doesn't
> > need to). That said, the issue isn't the constant folding per-se but that
> > we do not actually constant fold but register an equality that doesn't hold.
> >
> >     PR target/110206
> >
> > gcc/ChangeLog:
> >
> >     * fwprop.cc (contains_paradoxical_subreg_p): Move to ...
> >     * rtlanal.cc (contains_paradoxical_subreg_p): ... here.
> >     * rtlanal.h (contains_paradoxical_subreg_p): Add prototype.
> >     * cprop.cc (try_replace_reg): Do not set REG_EQUAL note
> >     when the original source contains a paradoxical subreg.
> >
> > gcc/testsuite/ChangeLog:
> >
> >     * gcc.dg/torture/pr110206.c: New test.
> >
> > Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
> >
> > OK for mainline and backports?
>
> OK.
>
> I think the testcase can also run on other targets if you add
> dg-additional-options "-w -Wno-psabi", all generic vector ops
> should be lowered if not supported.

True, but with lowered vector ops the test would not even come close to
the problem. The problem is specific to generic vector ops, and can be
triggered only when paradoxical subregs are used to implement (partial)
vector modes. This is the case on x86, where partial vectors are now
heavily used, and even there we need the latest vector ISA enabled to
trip the condition.

That is the reason dg-torture is used, in the hope that the runtime
failure will trip when the testsuite is run with specific target options.

Uros.
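
For readers following the thread, here is a minimal plain-C model of why the
REG_EQUAL note on insn 22 does not hold. It is only a sketch and not GCC code:
the helper names (load_v4qi, zero_extend_low8, demo) and the "junk" byte that
stands in for the undefined upper part of the register are assumptions made
for illustration. The four defined lanes always widen to 204, but the four
lanes that come from the paradoxical part depend on whatever was left in the
register, so "204 repeated x8" cannot be claimed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model a 16-byte vector register that holds a V4QI value: only the low
   4 bytes are defined.  `junk` is a hypothetical stand-in for whatever
   the remaining bytes happen to contain.  */
void load_v4qi(uint8_t reg[16], const uint8_t val[4], uint8_t junk)
{
    memset(reg, junk, 16);
    memcpy(reg, val, 4);
}

/* Model zero_extend:V8HI of the low 8 byte lanes of the 16-byte view,
   i.e. the lanes selected through the paradoxical subreg.  */
void zero_extend_low8(uint16_t out[8], const uint8_t reg[16])
{
    for (int i = 0; i < 8; i++)
        out[i] = reg[i];
}

/* Same defined input, two different register contents above it.  */
void demo(void)
{
    const uint8_t val[4] = {0xcc, 0xcc, 0xcc, 0xcc};
    uint8_t reg[16];
    uint16_t a[8], b[8];

    load_v4qi(reg, val, 0x00);
    zero_extend_low8(a, reg);
    load_v4qi(reg, val, 0xff);
    zero_extend_low8(b, reg);

    /* Lanes 0..3 come from defined bytes and always agree.  */
    for (int i = 0; i < 4; i++)
        assert(a[i] == 204 && b[i] == 204);
    /* Lanes 4..7 come from the undefined part and differ here.  */
    for (int i = 4; i < 8; i++)
        assert(a[i] != b[i]);
}
```

Calling demo() passes both loops, which is the model's way of saying that
only the low four result lanes are determined by the V4QI constant.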