Re: Suboptimal code generated for __builtin_ceil on AMD64 without SSE4.1

2021-08-05 Thread Hongtao Liu via Gcc
Could you file a bugzilla for that?
https://gcc.gnu.org/bugzilla/enter_bug.cgi?product=gcc

On Thu, Aug 5, 2021 at 3:34 PM Stefan Kanthak wrote:
>
> Hi,
>
> targeting AMD64 alias x86_64 with -O3, GCC 10.2.0 generates the
> following code (17 instructions using 78 bytes, plus 6 quadwords
> using 48

Suboptimal code generated for __builtin_ceil on AMD64 without SSE4.1

2021-08-05 Thread Stefan Kanthak
Hi,

targeting AMD64 alias x86_64 with -O3, GCC 10.2.0 generates the
following code (17 instructions using 78 bytes, plus 6 quadwords
using 48 bytes) for __builtin_ceil() when -msse4.1 is NOT given:

        .text
   0:   f2 0f 10 15 10 00 00 00  movsd  .LC1(%rip), %xmm2
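
For readers who want to reproduce the comparison, a minimal test case might look like the sketch below. The function name ceil_sample is hypothetical and does not come from the original message; only __builtin_ceil(), -O3 and -msse4.1 are taken from the report.

```c
/* Minimal sketch of a test case for the report above.
 * ceil_sample is a hypothetical name chosen for illustration.
 * Compile to assembly once with "gcc -O3 -S" and once with
 * "gcc -O3 -msse4.1 -S", then compare the two outputs. */
double ceil_sample(double x)
{
    /* GCC built-in equivalent of ceil() from <math.h> */
    return __builtin_ceil(x);
}
```

With -msse4.1 GCC can use the SSE4.1 ROUNDSD instruction for this function; without it, the compiler has to fall back to a longer SSE2 sequence such as the one quoted in the message.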