https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83854
            Bug ID: 83854
           Summary: [performance] Improve cse optimization for insn with inout ops
           Product: gcc
           Version: 4.7.4
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zhongyunde at huawei dot com
  Target Milestone: ---

Our target defines insns with match_dup operands in its *.md file. In
cse_insn, the canonicalization loop only replaces the SRC of each set and the
address of a memory DEST; a register DEST is left untouched. See the code
below:

  /* Canonicalize sources and addresses of destinations.
     We do this in a separate pass to avoid problems when a MATCH_DUP is
     present in the insn pattern.  In that case, we want to ensure that
     we don't break the duplicate nature of the pattern.  So we will replace
     both operands at the same time.  Otherwise, we would fail to find an
     equivalent substitution in the loop calling validate_change below.

     We used to suppress canonicalization of DEST if it appears in SRC,
     but we don't do this any more.  */

  for (i = 0; i < n_sets; i++)
    {
      rtx dest = SET_DEST (sets[i].rtl);
      rtx src = SET_SRC (sets[i].rtl);
      rtx new_rtx = canon_reg (src, insn);

      validate_change (insn, &SET_SRC (sets[i].rtl), new_rtx, 1);

      if (GET_CODE (dest) == ZERO_EXTRACT)
        {
          validate_change (insn, &XEXP (dest, 1),
                           canon_reg (XEXP (dest, 1), insn), 1);
          validate_change (insn, &XEXP (dest, 2),
                           canon_reg (XEXP (dest, 2), insn), 1);
        }

      while (GET_CODE (dest) == SUBREG
             || GET_CODE (dest) == ZERO_EXTRACT
             || GET_CODE (dest) == STRICT_LOW_PART)
        dest = XEXP (dest, 0);

      if (MEM_P (dest))
        canon_reg (dest, insn);
    }

As a result, CSE does not optimize insns with inout operands, because we use
match_dup to force the input and the output operand into the same register,
and replacing only the SRC would break that constraint.

However, sometimes the register that is set, such as reg 147 in insn 34 below,
is not used afterwards. In that case CSE could try to replace both occurrences
of reg 147 in insn 34, not only the one in the SRC.

(insn 35 31 34 5 (set (reg:VALIGN 147 [ uu ])
        (reg/v:VALIGN 136 [ uu ])) test.c:53 218 {movvalign_internal}
     (expr_list:REG_DEAD (reg/v:VALIGN 136 [ uu ])
        (nil)))
(insn 34 35 38 5 (parallel [
            (set (reg:VALIGN 147 [ uu ])
                (const_int 0 [0]))
            (set (mem:VALIGN (plus:SI (reg/f:SI 126 [ D.2594 ])
                        (const_int 16 [0x10])) [0 MEM[(void *)D.2594_12 + 16B] S16 A8])
                (unspec:VALIGN [
                        (reg:VALIGN 147 [ uu ])
                        (mem:VALIGN (plus:SI (reg/f:SI 126 [ D.2594 ])
                                (const_int 16 [0x10])) [0 MEM[(void *)D.2594_12 + 16B] S16 A8])
                    ] UNSPEC_SVA))
        ]) test.c:53 453 {sva_f}
     (expr_list:REG_DEAD (reg/f:SI 126 [ D.2594 ])
        (expr_list:REG_UNUSED (reg:VALIGN 147 [ uu ])

resolution:

@@ -4386,6 +4386,12 @@ cse_insn (rtx insn)
             || GET_CODE (dest) == STRICT_LOW_PART)
        dest = XEXP (dest, 0);
 
+      /* Both input and output value will not be used after current insn,
+         so the dest can also be updated with regnote REG_UNUSED. */
+      if (REG_P (dest) && find_reg_note (insn, REG_UNUSED, dest))
+        validate_change (insn, &SET_DEST (sets[i].rtl),
+                         canon_reg (dest, insn), 1);
+
       if (MEM_P (dest))
        canon_reg (dest, insn);
     }
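
To make the intent of the change easier to follow, here is a minimal standalone sketch in C of the reasoning, not GCC code. It assumes a toy insn with one input and one output register tied together as match_dup would tie them. All names (toy_insn, toy_recog, toy_canon_reg, toy_canonicalize) are hypothetical and exist only for this illustration. The "queue the changes, then commit only if the insn still matches" step only loosely mirrors how grouped validate_change calls are accepted or rejected together.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an insn with an inout operand.  This is NOT GCC's
   rtx representation; it only models the two register slots.  */
struct toy_insn {
    int src_reg;       /* register read by the insn */
    int dest_reg;      /* register written; must equal src_reg (match_dup) */
    bool dest_unused;  /* models a REG_UNUSED note on dest_reg */
};

/* Hypothetical recognizer: the pattern matches only when both slots
   name the same register, as a match_dup would require.  */
static bool toy_recog(const struct toy_insn *insn)
{
    return insn->src_reg == insn->dest_reg;
}

/* Hypothetical canonicalization: reg 147 is known to be a copy of
   reg 136 (as after insn 35 in the dump), everything else maps to itself.  */
static int toy_canon_reg(int reg)
{
    return reg == 147 ? 136 : reg;
}

/* Apply the replacements to a trial copy and commit only if the result
   is still recognized.  With update_unused_dest set, a dest marked
   unused is canonicalized too, which is the idea of the proposed patch.
   Returns true if the changes were committed.  */
static bool toy_canonicalize(struct toy_insn *insn, bool update_unused_dest)
{
    assert(toy_recog(insn));  /* the insn must match before we start */

    struct toy_insn trial = *insn;
    trial.src_reg = toy_canon_reg(trial.src_reg);
    if (update_unused_dest && trial.dest_unused)
        trial.dest_reg = toy_canon_reg(trial.dest_reg);

    if (!toy_recog(&trial))
        return false;         /* whole group rejected, insn unchanged */

    *insn = trial;
    return true;
}
```

In this model, calling toy_canonicalize on an insn {147, 147, true} with update_unused_dest set to false rejects the change and leaves reg 147 in place, while passing true rewrites both slots to reg 136, after which the copy into reg 147 would no longer be needed.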