https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120067

            Bug ID: 120067
           Summary: RISC-V: x264 sub4x4_dct high icount
           Product: gcc
           Version: 15.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: parras at gcc dot gnu.org
  Target Milestone: ---

Created attachment 61273
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61273&action=edit
Detailed dump from the expand pass (RISC-V)

This is reduced from 525.x264_r's 4th hottest block:
https://godbolt.org/z/KdWv1er6f

The AArch64 assembly is clean and efficient (35 insns), while the RISC-V
assembly is long and messy (114 insns).

The most obvious issue is that the RISC-V code keeps spilling the same data
to the stack and reloading it.

Also, I do not understand why we need those vslidedown instructions. A quick
look at the expand dump (see attachment) shows that they come from
VIEW_CONVERT_EXPR<vector(4) unsigned short>.

I will keep looking into this.
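
For readers who cannot open the Godbolt link, the kernel has roughly the
following shape. This is only a sketch modeled on x264's scalar sub4x4_dct,
and it is not necessarily identical to the reduced test case. The 8-bit pixel
type (uint8_t), the 16-bit coefficient type (int16_t), and the stride values
16 and 32 are assumptions.

```c
#include <stdint.h>

/* Assumed row strides of the source and reconstructed pixel buffers. */
#define FENC_STRIDE 16
#define FDEC_STRIDE 32

/* Compute the 4x4 integer transform of the residual (pix1 - pix2). */
void sub4x4_dct(int16_t dct[16], const uint8_t *pix1, const uint8_t *pix2)
{
    int16_t d[16];
    int16_t tmp[16];

    /* Residual: element-wise difference of the two 4x4 blocks. */
    for (int y = 0; y < 4; y++) {
        for (int x = 0; x < 4; x++)
            d[y * 4 + x] = (int16_t)(pix1[x] - pix2[x]);
        pix1 += FENC_STRIDE;
        pix2 += FDEC_STRIDE;
    }

    /* First pass: butterfly over each row of d, stored transposed in tmp. */
    for (int i = 0; i < 4; i++) {
        int s03 = d[i * 4 + 0] + d[i * 4 + 3];
        int s12 = d[i * 4 + 1] + d[i * 4 + 2];
        int d03 = d[i * 4 + 0] - d[i * 4 + 3];
        int d12 = d[i * 4 + 1] - d[i * 4 + 2];

        tmp[0 * 4 + i] = (int16_t)(s03 + s12);
        tmp[1 * 4 + i] = (int16_t)(2 * d03 + d12);
        tmp[2 * 4 + i] = (int16_t)(s03 - s12);
        tmp[3 * 4 + i] = (int16_t)(d03 - 2 * d12);
    }

    /* Second pass: the same butterfly over each row of tmp. */
    for (int i = 0; i < 4; i++) {
        int s03 = tmp[i * 4 + 0] + tmp[i * 4 + 3];
        int s12 = tmp[i * 4 + 1] + tmp[i * 4 + 2];
        int d03 = tmp[i * 4 + 0] - tmp[i * 4 + 3];
        int d12 = tmp[i * 4 + 1] - tmp[i * 4 + 2];

        dct[i * 4 + 0] = (int16_t)(s03 + s12);
        dct[i * 4 + 1] = (int16_t)(2 * d03 + d12);
        dct[i * 4 + 2] = (int16_t)(s03 - s12);
        dct[i * 4 + 3] = (int16_t)(d03 - 2 * d12);
    }
}
```

Each row of four 16-bit values is a natural vector(4) unsigned short operand,
which is presumably where the VIEW_CONVERT_EXPR in the expand dump comes from.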