https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97786
--- Comment #8 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to Surya Kumari Jangala from comment #7)
> Hi Segher,
>
> Thanks for the pointers!
> We can optimize the code further and remove the branch completely.
>
> For P10:
>
>         xststdcdp 0,1,48
>         setnbc 9,32
>         setbc 3,34
>         isel 3,9,3,2
>         blr
>
>
> For P9:
>         xststdcdp 0,1,48
>         setb 9,0
>         mfcr 3,128
>         rlwinm 3,3,3,1
>         lr 4,1
>         isel 9,9,4,0
>         isel 3,9,3,2
>         blr

Ah right, the tdc insns are ISA 3.1, not 3.0 as I misremembered.  Bah.

But we can move the bit to field bit 1 (the GT bit) using some crmove or
similar, after which the setb will work fine?  Something like

        xststdcdp 0,1,48   # Set CR bit 0 to sign, and CR bit 2 to isinf
        crmove 2,1         # Set CR bit 0 to sign, and CR bit 1 to isinf
        setb 3,0

Hrm, that isn't quite it, heh.  We need bit 0 set for -inf and bit 1 for +inf
(or for +inf as well as -inf, also fine).  So

        xststdcdp 0,1,48   # Set CR bit 0 to sign, and CR bit 2 to isinf
        crand 0,0,2        # Set CR bit 0 for -inf
        crmove 1,2         # Set CR bit 0 to sign, and CR bit 1 to isinf
        setb 3,0

(And no doubt I messed up there as well, and we probably *can* do it in just
three insns anyway.  Note that both the crlogical insns can execute
concurrently though).

It is fine to make this most optimal only for p10 and later, of course.  "Gaze
aimed at the future" and such, and setb is a horrible insn :-)
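
For anyone wanting to check the four-insn sequence by hand, here is a rough
model of it in Python.  This is only a sketch under assumptions, not anything
from GCC: the helper names (xststdcdp_inf, setb, isinf_sign) are made up for
illustration, the CR field is modelled as a plain list of four bits, and setb
is assumed to give -1 if bit 0 is set, else 1 if bit 1 is set, else 0.

```python
import math

def xststdcdp_inf(x):
    """Hypothetical model of CR field 0 after `xststdcdp 0,x,48`:
    bit 0 = sign of x, bit 2 = x is +inf or -inf, bits 1 and 3 = 0."""
    sign = 1 if math.copysign(1.0, x) < 0 else 0
    is_inf = 1 if math.isinf(x) else 0
    return [sign, 0, is_inf, 0]

def setb(cr):
    """Assumed setb behaviour: -1 if bit 0 is set, else 1 if bit 1 is set, else 0."""
    if cr[0]:
        return -1
    if cr[1]:
        return 1
    return 0

def isinf_sign(x):
    cr = xststdcdp_inf(x)
    # The two CR-logical insns read only the original bits 0 and 2 and write
    # different bits, so they do not depend on each other.
    bit0 = cr[0] & cr[2]   # crand 0,0,2: bit 0 = sign AND isinf, i.e. -inf
    bit1 = cr[2]           # crmove 1,2:  bit 1 = isinf
    cr[0], cr[1] = bit0, bit1
    return setb(cr)

if __name__ == "__main__":
    for v in (float("-inf"), float("inf"), -1.0, 0.0, float("nan")):
        print(v, isinf_sign(v))
```

With this model -inf gives -1, +inf gives 1, and everything else (negative
finite numbers, zeros, NaNs) gives 0, because bit 0 is only left set when the
value is both negative and infinite.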