[Bug target/80861] New: ARM (VFPv3): Inefficient float-to-char conversion goes through memory

gergo.barany at inria dot fr Mon, 22 May 2017 14:02:42 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80861


            Bug ID: 80861
           Summary: ARM (VFPv3): Inefficient float-to-char conversion goes
                    through memory
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gergo.barany at inria dot fr
  Target Milestone: ---

Created attachment 41407
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41407&action=edit
Input C file for triggering the bug

Consider the attached code:

$ cat tst.c
char fn1(float p1) {
  return (char) p1;
}

A GCC trunk build from about two weeks ago (8.0.0 20170510) generates this
code for ARM:

$ gcc tst.c -O3 -S -o -
        .arch armv7-a
        .eabi_attribute 28, 1
        .eabi_attribute 20, 1
        .eabi_attribute 21, 1
        .eabi_attribute 23, 3
        .eabi_attribute 24, 1
        .eabi_attribute 25, 1
        .eabi_attribute 26, 1
        .eabi_attribute 30, 2
        .eabi_attribute 34, 1
        .eabi_attribute 18, 4
        .file   "tst.c"
        .text
        .align  2
        .global fn1
        .syntax unified
        .arm
        .fpu vfpv3-d16
        .type   fn1, %function
fn1:
        @ args = 0, pretend = 0, frame = 8
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        vcvt.u32.f32    s15, s0
        sub     sp, sp, #8
        vstr.32 s15, [sp, #4]   @ int
        ldrb    r0, [sp, #4]    @ zero_extendqisi2
        add     sp, sp, #8
        @ sp needed
        bx      lr
        .size   fn1, .-fn1
        .ident  "GCC: (GNU) 8.0.0 20170510 (experimental)"


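After the float-to-int conversion (vcvt), GCC performs the int-to-char
truncation by storing the result to the stack (vstr.32) and loading a single
byte back (ldrb). This round trip through memory is excessive, since the
value could be moved directly into a core register. For comparison, this is
the entire code generated by Clang: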

@ BB#0:
        vcvt.u32.f32    s0, s0
        vmov    r0, s0
        bx      lr

And this is what CompCert produces for the core of the function (stack
manipulation code omitted):

        vcvt.u32.f32 s12, s0
        vmov    r0, s12
        and     r0, r0, #255
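
For reference, the register-only sequence above can be written out at the C
level. This is only a sketch for illustration: fn1_masked is a hypothetical
helper that is not part of the attached test case, and unsigned char is
spelled out on the assumption that plain char is unsigned on this target (as
the vcvt.u32 and the zero-extending ldrb suggest). For inputs in the range
0 to 255 both functions should return the same value; inputs outside that
range are undefined behavior in the direct cast, so they are not considered
here.

```c
/* Sketch only. fn1_direct mirrors fn1 from the report, with the
   return type written as unsigned char (assumed signedness of plain
   char on this ARM target). */
unsigned char fn1_direct(float p1) {
  return (unsigned char) p1;
}

/* Hypothetical helper, not from the report: the same conversion done
   in two explicit steps that never need a stack slot. */
unsigned char fn1_masked(float p1) {
  unsigned int wide = (unsigned int) p1;   /* corresponds to vcvt.u32.f32 + vmov */
  return (unsigned char) (wide & 0xFFu);   /* corresponds to and r0, r0, #255 */
}
```

Neither step requires memory, which is why a vmov (optionally followed by
an and) should be sufficient after the vcvt.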


My GCC version:

Target: armv7a-eabihf
Configured with: --target=armv7a-eabihf --with-arch=armv7-a
--with-fpu=vfpv3-d16 --with-float-abi=hard --with-float=hard
Thread model: single
gcc version 8.0.0 20170510 (experimental) (GCC)
