https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115024

Roger Sayle <roger at nextmovesoftware dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |roger at nextmovesoftware dot com

--- Comment #9 from Roger Sayle <roger at nextmovesoftware dot com> ---
Created attachment 60680
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60680&action=edit
Standalone reduction of libgcc's __udivti3.

The bugzilla title implies that the issue is with 128-bit division, which in
this testcase is performed by libgcc's __udivti3. Indeed, in Colin's
attachments we appear to be doing worse at argument passing/shuffling (as
observed by Jakub).  However, this appears to be fixed (or better) for me on
mainline and on godbolt's gcc14 (see attached code).  Confusingly, __udivti3
wouldn't be impacted by the caller's use of -mavx, and indeed none of the
attached code (caller and callee) actually uses AVX/SSE instructions or
registers, so perhaps Haochen's analysis is right that this is some strange DSB
scheduling issue?
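
For reference, unsigned 128-bit division on x86-64 is not a single
instruction, so GCC lowers it to a libcall.  A minimal illustrative caller
(not the attached reduction; the function name is made up) looks like:

unsigned __int128
div128 (unsigned __int128 n, unsigned __int128 d)
{
  return n / d;   /* compiled as a call to libgcc's __udivti3 */
}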

I've not yet managed to reproduce the problem, so if someone could try linking
the gcc-13 stress-cpu against the gcc-14 __udivti3, and likewise the gcc-14
stress-cpu against the gcc-13 __udivti3, we could narrow down which
combination actually triggers the regression.  Thanks in advance.
