On Thu, Apr 03, 2025 at 08:42:01AM -0400, Roger Sayle wrote:
> I've been bootstrapping and regression testing a variant of Jakub's
> previously proposed change:
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index f38e3db41fa..123948a2dea 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -21883,7 +21883,10 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
>      case SYMBOL_REF:
>        if (x86_64_immediate_operand (x, VOIDmode))
>         *total = 0;
> -      else
> +      else if (x86_64_zext_immediate_operand (x, VOIDmode))
> +       /* movl is as expensive as a simple instruction.  */
> +       *total = COSTS_N_INSNS (1);
> +      else
>         /* movabsq is slightly more expensive than a simple instruction.  */
>         *total = COSTS_N_INSNS (1) + 1;
>        return true;

This breaks the testcase again.

With my version of the patch it is
;; positive division: unsigned cost: 26; signed cost: 28
with your adjustment
;; positive division: unsigned cost: 29; signed cost: 28
and with vanilla trunk
;; positive division: unsigned cost: 30; signed cost: 28

So, we'd need to put more effort into tweaking this.
My preference would be to do that in stage1; the patch restores the GCC 14
cost for the unsigned 32-bit immediates for normal rtx_cost and fixes up the
weird behavior of pattern_cost everywhere where cost [1,3] is less expensive
than cost 0.
Also, compared to movl I'd then use something like COSTS_N_INSNS (1) + 3
for movabsq.

> p.s. It's odd this is a P1.  It's not wrong code, but a single-cycle
> instruction timing issue
> (where the current trunk implementation is typically more correct than GCC
> 14's).

It is not odd, it is a regression on a primary platform and clearly it
doesn't affect just that one testcase but most likely many of the unsigned
vs. signed division decisions and clearly other code as well (see PR119594
for another testcase affected by that).

	Jakub
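
For readers following the thread, here is a rough standalone sketch of the
three-tier constant costing that the quoted hunk describes. It is not the
i386.cc code: fits_sext32, fits_zext32 and const_cost are hypothetical names,
and the two range checks only approximate what x86_64_immediate_operand and
x86_64_zext_immediate_operand accept for plain integer constants (the real
predicates also handle symbolic operands). costs_n_insns mirrors GCC's
COSTS_N_INSNS (N), which rtl.h defines as N * 4. The movabsq_extra parameter
is an assumption added so that the "+ 1" from the hunk and the suggested
"+ 3" can be compared side by side.

```cpp
#include <cstdint>

// Mirrors GCC's COSTS_N_INSNS (N) macro from rtl.h: ((N) * 4).
constexpr int costs_n_insns (int n) { return n * 4; }

// Hypothetical stand-in: value fits a sign-extended 32-bit immediate.
constexpr bool fits_sext32 (std::int64_t v)
{
  return v >= INT64_C (-0x80000000) && v <= INT64_C (0x7fffffff);
}

// Hypothetical stand-in: value fits a zero-extended 32-bit immediate (movl).
constexpr bool fits_zext32 (std::int64_t v)
{
  return v >= 0 && v <= INT64_C (0xffffffff);
}

// Three-tier cost in the shape of the quoted hunk.  movabsq_extra is the
// amount added on top of COSTS_N_INSNS (1) for a full 64-bit immediate.
constexpr int const_cost (std::int64_t v, int movabsq_extra)
{
  if (fits_sext32 (v))
    return 0;                                /* usable directly as an immediate */
  else if (fits_zext32 (v))
    return costs_n_insns (1);                /* movl */
  else
    return costs_n_insns (1) + movabsq_extra; /* movabsq */
}
```

Under these assumptions, const_cost (INT64_C (0x80000000), 1) gives 4, while
const_cost (INT64_C (0x100000000), 1) gives 5 and the same value with a
movabsq_extra of 3 gives 7, so the larger increment widens the gap between
movl and movabsq from 1 to 3.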