https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80861

            Bug ID: 80861
           Summary: ARM (VFPv3): Inefficient float-to-char conversion goes
                    through memory
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gergo.barany at inria dot fr
  Target Milestone: ---

Created attachment 41407
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41407&action=edit
Input C file for triggering the bug

Consider the attached code:

$ cat tst.c
char fn1(float p1) {
  return (char) p1;
}

GCC from trunk from two weeks ago generates this code on ARM:

$ gcc tst.c -O3 -S -o -
        .arch armv7-a
        .eabi_attribute 28, 1
        .eabi_attribute 20, 1
        .eabi_attribute 21, 1
        .eabi_attribute 23, 3
        .eabi_attribute 24, 1
        .eabi_attribute 25, 1
        .eabi_attribute 26, 1
        .eabi_attribute 30, 2
        .eabi_attribute 34, 1
        .eabi_attribute 18, 4
        .file   "tst.c"
        .text
        .align  2
        .global fn1
        .syntax unified
        .arm
        .fpu vfpv3-d16
        .type   fn1, %function
fn1:
        @ args = 0, pretend = 0, frame = 8
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        vcvt.u32.f32    s15, s0
        sub     sp, sp, #8
        vstr.32 s15, [sp, #4]   @ int
        ldrb    r0, [sp, #4]    @ zero_extendqisi2
        add     sp, sp, #8
        @ sp needed
        bx      lr
        .size   fn1, .-fn1
        .ident  "GCC: (GNU) 8.0.0 20170510 (experimental)"

Going through memory for the int-to-char truncation after the float-to-int
conversion (vcvt) is excessive.

For comparison, this is the entire code generated by Clang:

@ BB#0:
        vcvt.u32.f32    s0, s0
        vmov    r0, s0
        bx      lr

And this is what CompCert produces for the core of the function (stack
manipulation code omitted):

        vcvt.u32.f32    s12, s0
        vmov    r0, s12
        and     r0, r0, #255

My GCC version:

Target: armv7a-eabihf
Configured with: --target=armv7a-eabihf --with-arch=armv7-a
--with-fpu=vfpv3-d16 --with-float-abi=hard --with-float=hard
Thread model: single
gcc version 8.0.0 20170510 (experimental) (GCC)
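
To make the two steps hidden in the cast explicit, here is a minimal C sketch. It is not part of the attached test case: fn1_explicit and check_in_range_equivalence are hypothetical names introduced only for illustration. The sketch assumes that the float is first converted to a 32-bit unsigned integer (the vcvt.u32.f32 step) and then reduced to its low 8 bits, which is what the ldrb in the GCC output and the "and r0, r0, #255" in the CompCert output appear to do. The self-check only uses inputs in [0, 128), so it does not depend on whether plain char is signed or unsigned on the host; results for values that do not fit in the target type are not defined by the C standard and are left out.

```c
#include <assert.h>

/* The function from the report, unchanged. */
char fn1(float p1) {
  return (char) p1;
}

/* Hypothetical helper (not in the report): the same conversion written
   as two explicit steps. */
unsigned char fn1_explicit(float p1) {
  /* Step 1: float -> 32-bit unsigned, truncating toward zero
     (corresponds to vcvt.u32.f32). */
  unsigned int wide = (unsigned int) p1;
  /* Step 2: keep only the low 8 bits (corresponds to the byte load in
     the GCC output, or to "and r0, r0, #255" in the CompCert output). */
  return (unsigned char) (wide & 0xFFu);
}

/* Hypothetical self-check: both versions agree for inputs in [0, 128). */
void check_in_range_equivalence(void) {
  for (int i = 0; i < 128; i++) {
    float v = (float) i + 0.5f;   /* fractional part is truncated away */
    assert((unsigned char) fn1(v) == fn1_explicit(v));
    assert(fn1_explicit(v) == (unsigned char) i);
  }
}
```

Written this way, the truncation in step 2 is plain integer arithmetic on a value that can stay in a core register once it has been moved out of the VFP register, which is the point of the comparison with the Clang and CompCert output.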