https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70773
wilco at gcc dot gnu.org changed:
What    |Removed  |Added
----------------------------------------------------------------------------
Status  |NEW      |WAITING
CC      |         |wilco at gcc dot gnu.org
--- Comment #10 from wilco at gcc dot gnu.org ---
I can't reproduce any of this. GCC 6 and GCC 7 always use smull for the
divisions on ARM, even with -fprofile-use. I could only make GCC emit a library
call by using -Os on a CPU that doesn't have a hardware divide instruction, but
that is expected and correct.
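
For reference, here is a minimal C sketch of what a multiply-based division
computes, assuming the divisions in question are by compile-time constants.
The divisor 10, the function name div10_mulhi and the magic constant
0x66666667 are illustrative assumptions, not taken from the reporter's test
case. The 32x32->64 signed multiply is the step that smull performs on ARM.

```c
#include <stdint.h>

/* Hypothetical helper: signed 32-bit division by the constant 10,
 * written out as a widening multiply plus shifts instead of a divide.
 * Assumes >> on negative signed values is an arithmetic shift, which
 * is what GCC implements. */
int32_t div10_mulhi(int32_t x)
{
    /* 32x32->64 signed multiply by the reciprocal constant
     * ceil(2^34 / 10) = 0x66666667. */
    int64_t prod = (int64_t)x * 0x66666667LL;

    /* Keep the high 32 bits, then shift by 2 more: floor(x * M / 2^34). */
    int32_t q = (int32_t)(prod >> 32) >> 2;

    /* Add 1 when x is negative so the result rounds toward zero,
     * matching C's x / 10. */
    q += (int32_t)((uint32_t)x >> 31);
    return q;
}
```

With optimization enabled, a plain `x / 10` would normally be compiled to an
equivalent sequence on its own, so no library call is needed for this kind of
division.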
On AArch64 I get a >20% speedup with -fprofile-use vs plain -O3, so it works as
expected. With -mcpu=cortex-a53 there are more uses of sdiv, but the profiled
version is still faster.
So without more details I don't see any issue here.