On Thu, Sep 5, 2024 at 2:57 PM Jeff Law <jeffreya...@gmail.com> wrote:
>
> On 9/5/24 12:59 PM, Palmer Dabbelt wrote:
> > On Thu, 05 Sep 2024 11:52:57 PDT (-0700), Palmer Dabbelt wrote:
> >> We have cheap logical ops, so let's just move this back to the default
> >> to take advantage of the standard branch/op heuristics.
> >>
> >> gcc/ChangeLog:
> >>
> >>     PR target/116615
> >>     * config/riscv/riscv.h (LOGICAL_OP_NON_SHORT_CIRCUIT): Remove.
> >> ---
> >> There's a bunch more discussion in the bug, but it's starting to smell
> >> like this was just a holdover from MIPS (where maybe it also shouldn't
> >> be set). I haven't tested this, but I figured I'd send the patch to get
> >> a little more visibility.
> >>
> >> I guess we should also kick off something like a SPEC run to make sure
> >> there's no regressions?
> >
> > Sorry I missed it in the bug, but Ruoyao points to dddafe94823
> > ("LoongArch: Define LOGICAL_OP_NON_SHORT_CIRCUIT") where
> > short-circuiting the FP comparisons helps on LoongArch.
> >
> > Not sure if I'm also missing something here, but it kind of feels like
> > that should be handled by a more generic optimization decision that just
> > globally "should we short circuit logical ops" -- assuming it really is
> > the FP comparisons that are causing the cost, as opposed to the actual
> > logical ops themselves.
> >
> > Probably best to actually run the benchmarks, though...
> The #define essentially is overriding the generic heuristics which look
> at branch cost to determine how aggressively to try and combine several
> conditional branch conditions using logical ops so they can use a single
> conditional branch in the end.
>
> I don't remember all the history here, but in retrospect, the mere
> existence of that #define points to a failing in the costing models.

I provided the original history of LOGICAL_OP_NON_SHORT_CIRCUIT in the
RISCV bug report. And yes, there is a cost model failure here.
LOGICAL_OP_NON_SHORT_CIRCUIT is useful if you have a decent cset (or,
these days, a ccmp optab).

One cost model issue is that LOGICAL_OP_NON_SHORT_CIRCUIT does not
distinguish whether the comparison is FP or integer (handling that would
cover Loongson and MIPS and, to a lesser extent, RISCV).

The PowerPC backend does not implement the ccmp optab, nor does it have
a cset with a decent cost, so having it as 0 is correct there, even
though BRANCH_COST might be low for the target (it could implement the
ccmp optab now, but nobody has done that yet).

Note that RISCV's cset is cheap (in both size and speed) because, much
like MIPS, it just has instructions that set a GPR and then compares
against 0.

I don't have time until next year to start looking at improving the
situation with respect to LOGICAL_OP_NON_SHORT_CIRCUIT/BRANCH_COST; it
is on my radar since I want to improve how aarch64's ccmp is done and
move the use of LOGICAL_OP_NON_SHORT_CIRCUIT out of fold-const so that
it is only in the ifcombine (or maybe even just the isel) pass.

Thanks,
Andrew Pinski

>
> FWIW, my general sense is that the gimple phases shouldn't work *too*
> hard to try and combine logical ops, but the if-converters in the RTL
> phases should be fairly aggressive. The fact that we use BRANCH_COST
> to drive both is likely sub-optimal.
>
> jeff
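
For anyone following along without the bug open, here is a rough source-level sketch of the two shapes being traded off in this thread. It is only an illustration: the function names are made up, it is not GCC code, and what the compiler actually emits for either form depends on the target, its costs, and the optimization level.

```c
#include <stdbool.h>

/* Short-circuit form: semantically, "b > 0" is only evaluated when
   "a > 0" is true.  Translated naively, this is two conditional
   branches.  (Hypothetical name, for illustration only.)  */
bool both_positive_short_circuit(int a, int b)
{
    if (a > 0 && b > 0)
        return true;
    return false;
}

/* Non-short-circuit form: both comparisons are evaluated
   unconditionally into 0/1 values (the "cset" mentioned above),
   combined with a bitwise AND, and tested once.  This is only a valid
   rewrite because the comparisons have no side effects and cannot
   trap.  (Hypothetical name, for illustration only.)  */
bool both_positive_combined(int a, int b)
{
    int ca = a > 0;   /* 0 or 1 */
    int cb = b > 0;   /* 0 or 1 */
    return (ca & cb) != 0;
}
```

Both functions return the same result for every input; the question in the thread is only which shape is cheaper. The second form is attractive when materializing a 0/1 comparison result in a register is cheap relative to a branch, which is the integer case on RISCV, and less attractive when the comparison itself is expensive, as reported for FP comparisons on LoongArch.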