https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118360
Jeffrey A. Law <law at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |dbarboza at ventanamicro dot com
--- Comment #6 from Jeffrey A. Law <law at gcc dot gnu.org> ---
Daniel, this might be another good one for you.
Focus on the second testcase, the one with the inverted test:
long fun_not1 (int a, long b)
{
if (!(a & 1))
b ^= 8;
return b;
}
Which turns into this in the .optimized dump:
;; basic block 2, loop depth 0
;; pred: ENTRY
_1 = a_3(D) & 1;
if (_1 == 0)
goto <bb 3>; [50.00%]
else
goto <bb 4>; [50.00%]
;; succ: 3
;; 4
;; basic block 3, loop depth 0
;; pred: 2
b_5 = b_4(D) ^ 8;
;; succ: 4
;; basic block 4, loop depth 0
;; pred: 2
;; 3
# b_2 = PHI <b_4(D)(2), b_5(3)>
return b_2;
This looks like another case where phiopt should have turned the code into a
branchless sequence. For the original (non-inverted) test we get this:
_1 = a_3(D) & 1;
_7 = _1 * 8;
_8 = b_4(D) ^ _7;
I think we can get there for the fun_not1 test by flipping the low bit of _1
before the multiply. So something like:
_1 = a_3(D) & 1;
_temp = _1 ^ 1;
_7 = _temp * 8;
_8 = b_4(D) ^ _7;
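As a sanity check on the idea, here is a minimal C sketch of the two forms side
by side. It only illustrates the source-level equivalence and is not the phiopt
change itself. The names fun_not1_branchless and forms_agree are hypothetical
and exist only for this illustration.

```c
/* Branchy form, as in the testcase above. */
long fun_not1(int a, long b)
{
    if (!(a & 1))
        b ^= 8;
    return b;
}

/* Branchless form mirroring the proposed GIMPLE (hypothetical name):
     _1 = a & 1;  _temp = _1 ^ 1;  _7 = _temp * 8;  _8 = b ^ _7;  */
long fun_not1_branchless(int a, long b)
{
    long low = a & 1;         /* _1: 0 or 1 */
    long flipped = low ^ 1;   /* _temp: inverted low bit */
    long mask = flipped * 8;  /* _7: 8 when the low bit of a is clear, else 0 */
    return b ^ mask;          /* _8 */
}

/* Hypothetical helper: returns 1 if both forms agree on a small range of
   inputs, 0 otherwise.  */
int forms_agree(void)
{
    for (int a = -4; a <= 4; a++)
        for (long b = -16; b <= 16; b++)
            if (fun_not1(a, b) != fun_not1_branchless(a, b))
                return 0;
    return 1;
}
```

The multiply by 8 is what would be expected to become the shift by 3 in the
generated code.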
Note this might cause AVR to regress, so we need to check that carefully. But
the form above should be better than the branchy sequence we're currently
getting on RISC-V:
fun_not1:
andi a5,a0,1 # 9 [c=4 l=4] *anddi3/1
mv a0,a1 # 3 [c=4 l=4] *movdi_64bit/0
bne a5,zero,.L2 # 10 [c=16 l=4] *branchdi
xori a0,a1,8 # 12 [c=4 l=4] *xordi3/1
.L2:
ret # 44 [c=0 l=4] simple_return
I think the optimized code will look something like:
andi a0,a0,1 # 9 [c=4 l=4] *anddi3/1
xori a0,a0,1
slli a0,a0,3 # 10 [c=4 l=4] ashldi3
xor a0,a0,a1 # 16 [c=4 l=4] *xordi3/0
ret # 25 [c=0 l=4] simple_return
For AVR, a sequence using shifts is bad; we may need to expand on
Georg-Johann's patch to convert it back to bit testing and such.