https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114741
--- Comment #6 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
And the exact armv9-a cost model you quoted also does the right codegen:

https://godbolt.org/z/obafoT6cj

There is just an inexplicable penalty being applied to the r->r alternative.