Wilco Dijkstra <wilco.dijks...@arm.com> writes:
> Hi Richard,
>
>> I was worried that reusing "dest" for intermediate results would
>> prevent CSE for cases like:
>>
>> void g (long long, long long);
>> void
>> f (long long *ptr)
>> {
>>   g (0xee11ee22ee11ee22LL, 0xdc23dc44ee11ee22LL);
>> }
>
> Note that aarch64_internal_mov_immediate may be called after reload,
> so it would end up even more complex.

The sequence I quoted was supposed to work before and after reload.  The:

  rtx tmp = aarch64_target_reg (dest, DImode);

would create a fresh temporary before reload and reuse dest otherwise.
So the sequence after reload would be the same as in your patch,
but the sequence before reload would use a temporary.

> This should be done as a
> dedicated mid-end optimization similar to TARGET_CONST_ANCHOR.
> However the number of 3/4-instruction immediates is so small that
> sharable cases would be very rare, so I don't believe it is worth it.

Yeah.  If, with a few tweaks, we could easily reuse the existing pass flow
to optimise the split forms, then it might have been worth it.  But I agree
it's not worth doing something special that only works for multi-insn
immediates.

I think there are other cases where CSE after split would help though.

Thanks,
Richard
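
To make the CSE concern concrete, here is a minimal sketch (not GCC code).
It assumes a plain MOVZ/MOVK expansion that builds each immediate from the
low 16-bit chunk upwards; the patch may well pick a different sequence.
imm_chunk and shared_prefix_steps are hypothetical helpers introduced only
for this illustration.

```c
#include <stdint.h>

/* Return 16-bit chunk I (0 = least significant) of the 64-bit
   immediate IMM.  */
static uint16_t
imm_chunk (uint64_t imm, int i)
{
  return (uint16_t) (imm >> (16 * i));
}

/* Return the number of leading steps that A and B have in common when
   each is built low chunk first: MOVZ for chunk 0, then MOVK for
   chunks 1 to 3.  This ordering is an assumption made for the
   illustration, not a description of what the backend emits.  */
static int
shared_prefix_steps (uint64_t a, uint64_t b)
{
  int n = 0;
  while (n < 4 && imm_chunk (a, n) == imm_chunk (b, n))
    n++;
  return n;
}
```

For the two constants in the quoted example, shared_prefix_steps returns 2:
both start with chunk 0xee22 followed by 0xee11.  Under the stated
assumption, those two steps are the kind of intermediate result that could
be shared if each step wrote a fresh temporary before reload, and that
cannot be shared when every step writes dest.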