https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121947
--- Comment #2 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Hongtao Liu from comment #1)
> ---------------Quote from [1].---------------------------
> > > No, the approach is wrong. You have to solve output clearing on RTL
> > > level, please look at how e.g. tzcnt false dep is solved:
> >
> > Actually we have considered such approach before, but we found we need
> > to break original define_insn to remove the mask/rounding subst,
> > since define_split could not adopt subst, and that would add 6 more
> > define_insn_and_split and 4 define_insn for each instruction. We think
> > such approach would introduce too much redundant code.
> >
> > Do you think the code size increment is acceptable?
>
> Also that 100+ more patterns increases maintenance effort. If we split
> them at epilogue_complete stage,

100+ patterns do require quite a bit of work. But splitting them before RA
should remove the redundant vxorps. It is hard to tell, though, whether that
will improve performance.
