This patch tightens the predicate of the peephole2 from my recent
"Integer min/max improvements patch" so that it only hoists the clearing
of a register when that register is a general register. Calling
ix86_expand_clear with a register outside GENERAL_REGS is not supported.
This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap"
and "make -k check" with no new failures, and it fixes the new test case.
Committed as obvious to fix the immediate regression.
A follow-up patch with a supplementary fix is in preparation.
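
For readers less familiar with this peephole, the tightened guard can be
sketched as the toy C model below. This is only an illustration, not GCC
code: the toy_* types and functions are hypothetical names invented here,
and the SSE and mask classes merely stand in for "some non-general
register class". The model assumes that clearing via xor clobbers the
flags, which is why the flags must be dead before the pair, and that the
cleared register must not feed the flags-setting instruction.

```c
#include <stdbool.h>

/* Hypothetical toy register classes; NOT GCC's enum reg_class.  */
enum toy_reg_class { TOY_GENERAL_REGS, TOY_SSE_REGS, TOY_MASK_REGS };

/* Hypothetical summary of the two-insn window the peephole looks at:
   a flags-setting insn followed by "reg = 0".  */
struct toy_window
{
  bool flags_dead_before;         /* Flags hold no live value before the pair.  */
  bool clear_reg_in_compare;      /* Cleared reg is used by the flags-setting insn.  */
  enum toy_reg_class clear_class; /* Class of the register being cleared.  */
};

/* Guard before the fix: any register could be hoisted.  */
static bool
toy_can_hoist_old (const struct toy_window *w)
{
  return w->flags_dead_before && !w->clear_reg_in_compare;
}

/* Guard after the fix: additionally require a general register.  */
static bool
toy_can_hoist_new (const struct toy_window *w)
{
  return w->clear_class == TOY_GENERAL_REGS && toy_can_hoist_old (w);
}
```

In this model, a window that clears a non-general register passes the old
guard but is rejected by the new one, while windows that clear a general
register are treated exactly as before.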

2020-08-12  Roger Sayle  <[email protected]>
	    Uroš Bizjak  <[email protected]>

gcc/ChangeLog
	PR target/96558
	* config/i386/i386.md (peephole2): Only reorder register clearing
	instructions to allow use of xor for general registers.

gcc/testsuite/ChangeLog
	PR target/96558
	* gcc.dg/pr96558.c: New test.

Sorry for the breakage.
Roger
--
Roger Sayle
NextMove Software
Cambridge, UK
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index f3799ac..9d4e669 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -18938,7 +18938,7 @@
 ;; i.e. prefer "xorl %eax,%eax; test/cmp" over "test/cmp; movl $0, %eax".
 (define_peephole2
   [(set (reg FLAGS_REG) (match_operand 0))
-   (set (match_operand:SWI 1 "register_operand") (const_int 0))]
+   (set (match_operand:SWI 1 "general_reg_operand") (const_int 0))]
   "peep2_regno_dead_p (0, FLAGS_REG)
    && !reg_overlap_mentioned_p (operands[1], operands[0])"
   [(set (match_dup 2) (match_dup 0))]
diff --git a/gcc/testsuite/gcc.dg/pr96558.c b/gcc/testsuite/gcc.dg/pr96558.c
new file mode 100644
index 0000000..2f5739e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr96558.c
@@ -0,0 +1,32 @@
+/* PR target/96558 */
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O2 -fno-expensive-optimizations -fno-gcse" } */
+
+int ky;
+long int h1;
+__int128 f1;
+
+int
+sd (void);
+
+int __attribute__ ((simd))
+i8 (void)
+{
+  __int128 vh;
+
+  if (sd () == 0)
+    h1 = 0;
+
+  do
+    {
+      long int lf = (long int) f1 ? h1 : 0;
+
+      ky += lf;
+      vh = lf | f1;
+      f1 = 1;
+    }
+  while (vh < (f1 ^ 2));
+
+  return 0;
+}
+