Am 20.11.24 um 11:33 schrieb Richard Biener:
On Wed, Nov 20, 2024 at 11:29 AM Georg-Johann Lay via Gcc <[email protected]> wrote:Consider the following RTL peephole from avr.md: (define_peephole2 ; avr.md:5387 [(match_scratch:QI 3 "d") (parallel [(set (match_operand:ALL4 0 "register_operand" "") (ashift:ALL4 (match_operand:ALL4 1 "register_operand" "") (match_operand:QI 2 "const_int_operand" ""))) (clobber (reg:CC REG_CC))])] "" [(parallel [(set (match_dup 0) (ashift:ALL4 (match_dup 1) (match_dup 2))) (clobber (match_dup 3)) (clobber (reg:CC REG_CC))])])
Finally, Andrew Pinski pointed out the correct solution: In order for the scratch from the peephole2 to not overlap with the destination of the matched insn, a top-level match_dup is required according to https://gcc.gnu.org/onlinedocs/gccint/define_005fpeephole2.html So the fixed peephole2 reads: (define_peephole2 [(match_scratch:QI 3 "d") (parallel [(set (match_operand:ALL4 0 "register_operand") (ashift:ALL4 (match_operand:ALL4 1 "register_operand") (match_operand:QI 2 "const_int_operand"))) (clobber (reg:CC REG_CC))]) (match_dup 3)] This is required due to my changes which turned the shift insn from a 2-operand insn to a 3-operand one. The 2-operand case was not an issue because AFAIU peephole2 makes sure that the scratch won't overlap with any input. Johann
As far as I understand, its purpose is to provide a QImode scratch register provided such a scratch is available. However, in the .peephole2 RTL dump with -da I see the following: Splitting with gen_peephole2_100 (avr.md:5387) ... (insn 24 8 15 2 (parallel [ (set (reg:SI 22 r22 [orig:47 _3 ] [47]) (ashift:SI (reg:SI 20 r20 [orig:48 x ] [48]) (const_int 7 [0x7]))) (clobber (reg:QI 24 r24)) (clobber (reg:CC 36 cc)) ]) (nil)) That is, the scratch r24:QI is overlapping the output in r22:SI. All hard registers are 8-bit regs and hence r22:SI extends from r22...r25. A scratch that overlaps the operands is pretty much useless or even plain wrong. recog.cc::peep2_find_free_register() has this comment: /* Don't use registers set or clobbered by the insn. */ from = peep2_buf_position (peep2_current + from); to = peep2_buf_position (peep2_current + to); gcc_assert (peep2_insn_data[from].insn != NULL_RTX); REG_SET_TO_HARD_REG_SET (live, peep2_insn_data[from].live_before); while (from != to) { gcc_assert (peep2_insn_data[from].insn != NULL_RTX); /* Don't use registers set or clobbered by the insn. */ FOR_EACH_INSN_DEF (def, peep2_insn_data[from].insn) SET_HARD_REG_BIT (live, DF_REF_REGNO (def)); from = peep2_buf_position (from + 1); } So it this bogus in that it assumes all registers extend only over one hard reg?Yes, looks like a bug to me.FYI, the purpose is to provide a scratch without increasing the register pressure (which "match_scratch" would do). Therefore, the RTL peephole is used instead of forcing reload to come up with a scratch. More specifically, I see this with $ avr-gcc bogus-peep2.c -S -Os -da long ashl32_7 (int i, long x) { return x << 7; } with the attached WIP patch atop trunk b222ee10045d. Johann Target: avr Configured with: ../../source/gcc-master/configure --target=avr --disable-nls --with-dwarf2 --with-gnu-as --with-gnu-ld --disable-shared --enable-languages=c,c++ Thread model: single Supported LTO compression algorithms: zlib gcc version 15.0.0 20241119 (experimental) (GCC)
