https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116825
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
This all depends on the context in which the TBL and UZP1 appear. If this code were inside a loop, the load feeding the TBL could be hoisted out of the loop, and GCC's choice of one TBL over two UZP1 instructions might then be the better one.
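
To make the trade-off concrete, here is a minimal portable C sketch. It is not the test case from this report: the shuffle (every fourth byte, repeated), the scalar models `uzp1_model` and `tbl1_model`, and the names `shuffle_uzp1`, `shuffle_tbl` and `every_fourth` are all hypothetical and chosen only for illustration. It shows the same permutation done with two UZP1-style steps per iteration versus one TBL-style lookup whose index vector is read once before the loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { LANES = 16 }; /* one 128-bit vector viewed as 16 bytes */

/* Scalar model of UZP1 on bytes: the even-indexed elements of a:b. */
static void uzp1_model(uint8_t out[LANES], const uint8_t a[LANES],
                       const uint8_t b[LANES])
{
    uint8_t cat[2 * LANES];
    memcpy(cat, a, LANES);
    memcpy(cat + LANES, b, LANES);
    for (int i = 0; i < LANES; i++)
        out[i] = cat[2 * i];
}

/* Scalar model of a single-register TBL: out-of-range indices give 0. */
static void tbl1_model(uint8_t out[LANES], const uint8_t table[LANES],
                       const uint8_t idx[LANES])
{
    uint8_t tmp[LANES];
    for (int i = 0; i < LANES; i++)
        tmp[i] = idx[i] < LANES ? table[idx[i]] : 0;
    memcpy(out, tmp, LANES);
}

/* Hypothetical index vector: every fourth byte, repeated four times. */
static const uint8_t every_fourth[LANES] = {
    0, 4, 8, 12, 0, 4, 8, 12, 0, 4, 8, 12, 0, 4, 8, 12
};

/* Strategy A: two UZP1 steps per iteration, no constant to load. */
void shuffle_uzp1(uint8_t *dst, const uint8_t *src, size_t nvec)
{
    for (size_t v = 0; v < nvec; v++) {
        uint8_t t[LANES];
        uzp1_model(t, src + v * LANES, src + v * LANES);
        uzp1_model(dst + v * LANES, t, t);
    }
}

/* Strategy B: one TBL step per iteration. The index vector is copied
 * once before the loop, which models the hoisted load. */
void shuffle_tbl(uint8_t *dst, const uint8_t *src, size_t nvec)
{
    uint8_t idx[LANES];
    memcpy(idx, every_fourth, LANES);
    for (size_t v = 0; v < nvec; v++)
        tbl1_model(dst + v * LANES, src + v * LANES, idx);
}
```

Both functions write the same bytes for any input. In straight-line code, strategy B pays for the index load every time it runs; inside a loop that cost is paid once, leaving one shuffle per iteration against two for strategy A. Whether that is actually faster depends on the target core, which this sketch does not model.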