https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80861
Bug ID: 80861
Summary: ARM (VFPv3): Inefficient float-to-char conversion goes through memory
Product: gcc
Version: 8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: gergo.barany at inria dot fr
Target Milestone: ---
Created attachment 41407
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41407&action=edit
Input C file for triggering the bug
Consider the attached code:
$ cat tst.c
char fn1(float p1) {
    return (char) p1;
}
GCC built from trunk two weeks ago (8.0.0 20170510) generates this code on ARM:
$ gcc tst.c -O3 -S -o -
        .arch armv7-a
        .eabi_attribute 28, 1
        .eabi_attribute 20, 1
        .eabi_attribute 21, 1
        .eabi_attribute 23, 3
        .eabi_attribute 24, 1
        .eabi_attribute 25, 1
        .eabi_attribute 26, 1
        .eabi_attribute 30, 2
        .eabi_attribute 34, 1
        .eabi_attribute 18, 4
        .file "tst.c"
        .text
        .align 2
        .global fn1
        .syntax unified
        .arm
        .fpu vfpv3-d16
        .type fn1, %function
fn1:
        @ args = 0, pretend = 0, frame = 8
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        vcvt.u32.f32 s15, s0
        sub sp, sp, #8
        vstr.32 s15, [sp, #4] @ int
        ldrb r0, [sp, #4] @ zero_extendqisi2
        add sp, sp, #8
        @ sp needed
        bx lr
        .size fn1, .-fn1
        .ident "GCC: (GNU) 8.0.0 20170510 (experimental)"
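After the float-to-int conversion (vcvt), the result is stored to the stack
(vstr) and its low byte is loaded back (ldrb) just to perform the int-to-char
truncation. This round trip through memory is excessive. For comparison, this
is the entire code generated by Clang: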
@ BB#0:
        vcvt.u32.f32 s0, s0
        vmov r0, s0
        bx lr
And this is what CompCert produces for the core of the function (stack
manipulation code omitted):
        vcvt.u32.f32 s12, s0
        vmov r0, s12
        and r0, r0, #255
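
For reference, here is a minimal C sketch of what the register-only sequences
compute. It assumes that plain char is unsigned (the default on ARM EABI) and
that the input lies within the range of a 32-bit unsigned integer. The name
fn1_via_register is hypothetical and not part of the attached test case:

```c
#include <stdint.h>

/* Hypothetical helper for illustration only; not from the attached tst.c.
 * Models the sequence vcvt.u32.f32 / vmov / and #255 in plain C.
 * Assumption: p1 is in range for uint32_t, so the cast is well defined. */
unsigned char fn1_via_register(float p1) {
    /* Float-to-unsigned conversion, truncating toward zero (vcvt.u32.f32). */
    uint32_t wide = (uint32_t) p1;

    /* Keep only the low 8 bits (and r0, r0, #255); no memory access needed. */
    return (unsigned char) (wide & 0xFFu);
}
```

For an input such as 65.9f this returns 65. The mask only changes the result
for values above 255, where a direct float-to-char conversion is undefined
behavior in C; that may be why Clang's output above omits it.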
My GCC version:
Target: armv7a-eabihf
Configured with: --target=armv7a-eabihf --with-arch=armv7-a
--with-fpu=vfpv3-d16 --with-float-abi=hard --with-float=hard
Thread model: single
gcc version 8.0.0 20170510 (experimental) (GCC)