On 6/28/23 06:39, Christoph Müllner wrote:

+;; XTheadMemIdx overview:
+;; All peephole passes attempt to improve the operand utilization of
+;; XTheadMemIdx instructions, where one sign or zero extended
+;; register-index-operand can be shifted left by a 2-bit immediate.
+;;
+;; The basic idea is the following optimization:
+;; (set (reg 0) (op (reg 1) (imm 2)))
+;; (set (reg 3) (mem (plus (reg 0) (reg 4)))
+;; ==>
+;; (set (reg 3) (mem (plus (reg 4) (op2 (reg 1) (imm 2))))
+;; This optimization only valid if (reg 0) has no further uses.
Couldn't this be done by combine if you created define_insn patterns
rather than define_peephole2 patterns?  Similarly for the other cases
handled here.

> I was inspired by XTheadMemPair, which merges two memory accesses
> into a mem-pair instruction (and which got inspiration from
> gcc/config/aarch64/aarch64-ldpstp.md).
Right. I'm pretty familiar with those. They cover a different case: the two insns being optimized don't have a true data dependency between them, i.e., the first insn does not produce a result used by the second insn.


In the case above there is a data dependency on (reg 0), i.e., the first instruction generates a result used by the second instruction. The combine pass is usually the best place to handle the data dependency case.
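
To make the distinction concrete, here is a small C sketch of the two shapes at the source level. The function names are made up for illustration, and what the compiler actually emits depends on the target, the flags, and the GCC version, so read the comments as the expected shape of the RTL rather than guaranteed output.

```c
/* Two loads from adjacent addresses.  Neither load consumes the
   other's result, so there is no true data dependency between them.
   This is the kind of pair the mem-pair peepholes look for.  */
long sum_pair(const long *p)
{
    return p[0] + p[1];
}

/* The index is extended and scaled first, and the load's address
   then consumes that result.  That is a true data dependency, like
   the one on (reg 0) in the overview comment.  */
long load_indexed(const long *base, int idx)
{
    return base[idx];
}
```

In the second function, the extension and shift of idx play the role of the first insn and the memory access plays the role of the second.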



> I don't see the benefit of using combine or peephole, but I can change
> if necessary. At least for the provided test cases, the implementation
> works quite well.
Peepholes require the instructions to be consecutive in the insn stream, while combine relies on data dependence links and can thus find these opportunities even when the two insns we care about are separated by other, unrelated insns.
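
As a rough illustration of that difference, here is a toy model in C. It is not GCC code: struct toy_insn, match_adjacent and match_by_dependence are hypothetical names invented for this sketch, and the real passes do far more checking. The first matcher only looks at neighbouring insns, as a peephole does. The second starts from the load and walks back to the insn that defines its address register, loosely in the way combine follows dependence links.

```c
#include <assert.h>
#include <stddef.h>

/* Toy insn: dest = op (src1, src2).  Registers are plain ints and
   -1 means "no register".  Purely illustrative, not GCC's RTL.  */
enum toy_op { TOY_SHIFT, TOY_LOAD, TOY_OTHER };

struct toy_insn {
    enum toy_op op;
    int dest;
    int src1;
    int src2;
};

static int reads(const struct toy_insn *insn, int reg)
{
    return insn->src1 == reg || insn->src2 == reg;
}

/* Count the insns after position def that read reg.  */
static int uses_after(const struct toy_insn *insns, size_t n,
                      size_t def, int reg)
{
    int count = 0;
    for (size_t i = def + 1; i < n; i++)
        count += reads(&insns[i], reg);
    return count;
}

/* Can the shift at position s be folded into the load at position l?  */
static int foldable(const struct toy_insn *insns, size_t n,
                    size_t s, size_t l)
{
    assert(s < l && l < n);
    if (insns[s].op != TOY_SHIFT || insns[l].op != TOY_LOAD)
        return 0;
    if (!reads(&insns[l], insns[s].dest))
        return 0;
    /* The shifted value must have no use other than the load.  */
    if (uses_after(insns, n, s, insns[s].dest) != 1)
        return 0;
    /* Nothing in between may overwrite the shift's result or input.  */
    for (size_t i = s + 1; i < l; i++)
        if (insns[i].dest == insns[s].dest
            || insns[i].dest == insns[s].src1)
            return 0;
    return 1;
}

/* Peephole-like: the two insns must be consecutive.
   Returns the position of the shift, or -1 if nothing matches.  */
int match_adjacent(const struct toy_insn *insns, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++)
        if (foldable(insns, n, i, i + 1))
            return (int)i;
    return -1;
}

/* Combine-like: start at each load and search backwards for the
   shift that feeds it, skipping over unrelated insns.  */
int match_by_dependence(const struct toy_insn *insns, size_t n)
{
    for (size_t l = 1; l < n; l++)
        for (size_t s = l; s-- > 0; )
            if (foldable(insns, n, s, l))
                return (int)s;
    return -1;
}
```

With a shift immediately followed by the load, both matchers find the pair. Once one unrelated insn sits between them, only match_by_dependence still finds it.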


Jeff
