https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88070
--- Comment #6 from Uroš Bizjak <ubizjak at gmail dot com> ---
It looks like we need to relax mode-switching:
--cut here--
Index: mode-switching.c
===================================================================
--- mode-switching.c (revision 266278)
+++ mode-switching.c (working copy)
@@ -252,7 +252,21 @@ create_pre_exit (int n_entities, int *entity_map,
if (EDGE_COUNT (EXIT_BLOCK_PTR_FOR_FN (cfun)->preds) == 1
&& NONJUMP_INSN_P ((last_insn = BB_END (src_bb)))
&& GET_CODE (PATTERN (last_insn)) == USE
- && GET_CODE ((ret_reg = XEXP (PATTERN (last_insn), 0))) == REG)
+ && GET_CODE ((ret_reg = XEXP (PATTERN (last_insn), 0))) == REG
+
+ /* x86 targets use the mode-switching infrastructure to
+ conditionally insert a vzeroupper instruction at the exit
+ from the function, where there is no need to switch the
+ mode before the return value copy. The vzeroupper insertion
+ pass runs after reload, so use !reload_completed as a stand-in
+ for x86 to skip the search for the return value copy insn.
+
+ N.b.: the code below assumes that the return copy insn
+ immediately precedes its corresponding use insn. This
+ assumption does not hold after reload, since sched1 pass
+ can schedule the return copy insn away from its
+ corresponding use insn. */
+ && !reload_completed)
{
int ret_start = REGNO (ret_reg);
int nregs = REG_NREGS (ret_reg);
--cut here--
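For context, here is a minimal, self-contained sketch (not from the report; the function name and body are made up) of the kind of code the comment describes: a function compiled for AVX whose exit block gets a vzeroupper inserted by the mode-switching pass, near the return-value copy insn that create_pre_exit searches for.

```c
#include <immintrin.h>

/* Hypothetical example: with -mavx (here forced via a target attribute),
   GCC's vzeroupper pass, built on the mode-switching infrastructure,
   inserts vzeroupper before the return, in the vicinity of the copy of
   the return value into %xmm0.  After reload, sched1 may move that copy
   away from its USE insn, which is the assumption the patch addresses. */
__attribute__((target("avx")))
double sum_avx(const double *p)
{
    __m256d v  = _mm256_loadu_pd(p);            /* dirties upper YMM halves */
    __m128d lo = _mm256_castpd256_pd128(v);     /* lanes 0-1 */
    __m128d hi = _mm256_extractf128_pd(v, 1);   /* lanes 2-3 */
    __m128d s  = _mm_add_pd(lo, hi);            /* pairwise partial sums */
    s = _mm_hadd_pd(s, s);                      /* horizontal add */
    return _mm_cvtsd_f64(s);                    /* return-value copy */
}
```

Disassembling such a function compiled at -O2 shows the vzeroupper emitted just before the ret, after the upper YMM state is no longer needed.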