https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81763
--- Comment #33 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #31)
> (In reply to Uroš Bizjak from comment #30)
> > So, I'll bootstrap:
>
> Maybe we can also allow &r <- (r,r) for BMI, to be safe (cf. comment #23):
>
> (define_insn "*andndi3_doubleword"
>   [(set (match_operand:DI 0 "register_operand" "=r,&r")
>         (and:DI
>           (not:DI (match_operand:DI 1 "register_operand" "0,r"))
>           (match_operand:DI 2 "nonimmediate_operand" "rm,rm")))
>    (clobber (reg:CC FLAGS_REG))]
>   "!TARGET_64BIT && TARGET_STV && TARGET_SSE2"
>   "#"
>   [(set_attr "isa" "*,bmi")])
>
> Manuel, can you please test this pattern?

At least with a smarter splitter we don't really need to rule out overlap
entirely for the r <- (r, r) bmi case: we can choose which of the two 32-bit
andn's we do first depending on the overlap. All we need to guarantee is that
the split is never impossible and ideally doesn't need any instructions but
the two.

Hard registers for DImode must be consecutive because we identify them by the
(lowest) register number and mode, and for r <- (r, r) there can't be any
overlap between the two input operands. So, even if DImode registers can start
at any GPR number other than the last, not just even ones, either there is no
overlap at all between output and inputs, or the output is the same as the
first or as the second input (all these cases are fine), or there is a partial
overlap with one or both of the operands.

For the partial overlap I can think of
  DI:N = DI:N+1 &~ DI:unrelated, or
  DI:N+1 = DI:N &~ DI:unrelated, or
  DI:N+1 = DI:N &~ DI:N+2 (or swapped operands);
the last case is partial overlap with both inputs.

So, right now we'd split those into:
  SI:N = SI:N+1 &~ SI:unrelated; SI:N+1 = SI:N+2 &~ SI:unrelated+1
which is fine, or
  SI:N+1 = SI:N &~ SI:unrelated; SI:N+2 = SI:N+1 &~ SI:unrelated+1
which is wrong, but we can swap those:
  SI:N+2 = SI:N+1 &~ SI:unrelated+1; SI:N+1 = SI:N &~ SI:unrelated
and it should work.
The last case would be right now:
  SI:N+1 = SI:N &~ SI:N+2; SI:N+2 = SI:N+1 &~ SI:N+3;
and is again wrong, but we could again swap:
  SI:N+2 = SI:N+1 &~ SI:N+3; SI:N+1 = SI:N &~ SI:N+2;
and all is fine.
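
The ordering argument above can be checked with a small Python model. This is only a sketch, not GCC's splitter code: it assumes a DImode hard register numbered b occupies SImode registers b (low half) and b+1 (high half), as described above, and the helper names (split_halves, is_safe, choose_order, run) are hypothetical and exist only for this illustration.

```python
MASK32 = 0xFFFFFFFF


def andn32(a, b):
    """32-bit andn as in the RTL pattern: (~a) & b."""
    return ~a & b & MASK32


def split_halves(out, in1, in2):
    """Split a DImode andn into (low, high) SImode steps.

    Each step is (dest, src1, src2) in hard register numbers; the
    high half of DI:b is assumed to live in register b + 1.
    """
    low = (out, in1, in2)
    high = (out + 1, in1 + 1, in2 + 1)
    return low, high


def is_safe(first, second):
    """An order is safe if the first step's destination is not a
    source of the second step (i.e. nothing is read after clobber)."""
    return first[0] not in second[1:]


def choose_order(out, in1, in2):
    """Return "low-first", "high-first", or None if neither order works."""
    low, high = split_halves(out, in1, in2)
    if is_safe(low, high):
        return "low-first"
    if is_safe(high, low):
        return "high-first"
    return None


def run(steps, regs):
    """Execute the steps on a copy of the register file (a dict)."""
    regs = dict(regs)
    for dest, src1, src2 in steps:
        regs[dest] = andn32(regs[src1], regs[src2])
    return regs


if __name__ == "__main__":
    N, UNRELATED = 1, 10  # arbitrary register numbers for the demo
    cases = {
        "DI:N   = DI:N+1 &~ DI:unrelated": (N, N + 1, UNRELATED),
        "DI:N+1 = DI:N   &~ DI:unrelated": (N + 1, N, UNRELATED),
        "DI:N+1 = DI:N   &~ DI:N+2": (N + 1, N, N + 2),
    }
    for name, (out, in1, in2) in cases.items():
        print(name, "->", choose_order(out, in1, in2))
```

In this model the first two partial-overlap cases come out as low-first and high-first respectively, matching the swaps described above. For the case that overlaps both inputs (DI:N+1 = DI:N &~ DI:N+2) the model finds no safe order: doing the high half first writes SI:N+2 before the low half reads it. Under these assumptions that case appears to need more than a plain swap, so it is worth double-checking before relying on it.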