[Bug c/107415] New: RISCV-gcc: Leaf function compiles as recursive with -O3

2022-10-26 Thread michael.meier at hexagon dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107415

Bug ID: 107415
   Summary: RISCV-gcc: Leaf function compiles as recursive with
-O3
   Product: gcc
   Version: 12.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: michael.meier at hexagon dot com
  Target Milestone: ---

Created attachment 53776
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53776&action=edit
The -save-temps output as requested

I am using gcc 12.1.0 to compile for RISC-V. Compiling the following leaf
function (it's in the .i file in the attachment) results in gcc generating a
recursive call to memset itself, resulting in an endless loop. The code is a
slightly reduced version of the function that led me to finding this issue. It does
not make sense as memset, but it triggers the issue.

void* memset(void* dest, int byte, size_t len)
{
  if ((((uintptr_t)dest | len) & (sizeof(uintptr_t)-1)) == 0) {
    uintptr_t *d = dest;
    *d = byte;
  } else {
    char *d = dest;
    while (d < (char*)(dest + len))
      *d++ = byte;
  }
  return dest;
}


First lines of the objdump output of the generated object file:

00000000 <memset>:
   0:   ff010113    addi    sp,sp,-16
   4:   00812423    sw      s0,8(sp)
   8:   00912223    sw      s1,4(sp)
   c:   00112623    sw      ra,12(sp)
  10:   01010413    addi    s0,sp,16
  14:   00c56733    or      a4,a0,a2
  18:   00377713    andi    a4,a4,3
  1c:   00050493    mv      s1,a0
  20:   02070a63    beqz    a4,54 <.L2>
            20: R_RISCV_BRANCH  .L2

00000024 <.LBB2>:
  24:   00c507b3    add     a5,a0,a2
  28:   00f57a63    bgeu    a0,a5,3c <.L4>
            28: R_RISCV_BRANCH  .L4
  2c:   01859593    slli    a1,a1,0x18

00000030 <.LVL2>:
  30:   4185d593    srai    a1,a1,0x18
  34:   00000097    auipc   ra,0x0
            34: R_RISCV_CALL    memset
            34: R_RISCV_RELAX   *ABS*
  38:   000080e7    jalr    ra # 34 <.LVL2+0x4>

At address 38, inside the function, we find a call to memset. This instruction
can be jumped over if the right conditions on a4 and a0 happen to be true.
However, running the function up to the jalr memset does not change any of
these conditions. After calling memset, the PC will arrive at the jalr memset
again, resulting in an endless loop.
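
As the follow-up in this thread explains, the call comes from the optimizer
recognizing the byte-store loop as a memset pattern. The following is only a
rough sketch of that rewrite, not actual compiler output; the names fill_loop
and fill_as_transformed are hypothetical and do not appear in the attachment.
It shows why the result is harmless in ordinary code but self-recursive when
the loop sits inside memset itself.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name: the byte loop from the else branch, as written. */
static void fill_loop(void *dest, int byte, size_t len)
{
    char *d = dest;
    char *end = (char *)dest + len;
    while (d < end)
        *d++ = byte;
}

/* Hypothetical name: roughly what the loop becomes after the rewrite.
 * The length check mirrors the bgeu at address 28, and the cast mirrors
 * the slli/srai pair that narrows the fill value to a char.
 * If this body lived inside memset, the call below would be the
 * recursive call seen at address 38. */
static void fill_as_transformed(void *dest, int byte, size_t len)
{
    if (len != 0)
        memset(dest, (char)byte, len);
}
```

Both functions leave the buffer in the same state; the difference only matters
when the callee is the function being compiled.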


Output of gcc -v:

Using built-in specs.
COLLECT_GCC=/home/memi/Downloads/riscv/bin/riscv64-unknown-elf-gcc
COLLECT_LTO_WRAPPER=/home/memi/Downloads/riscv/bin/../libexec/gcc/riscv64-unknown-elf/12.1.0/lto-wrapper
Target: riscv64-unknown-elf
Configured with: /tmp/rv_gcc/riscv-gnu-toolchain/gcc/configure
--target=riscv64-unknown-elf --prefix=/opt/riscv --disable-shared
--disable-threads --enable-languages=c,c++ --with-pkgversion=g1ea978e3066
--with-system-zlib --enable-tls --with-newlib
--with-sysroot=/opt/riscv/riscv64-unknown-elf
--with-native-system-header-dir=/include --disable-libmudflap --disable-libssp
--disable-libquadmath --disable-libgomp --disable-nls
--disable-tm-clone-registry --src=/tmp/rv_gcc/riscv-gnu-toolchain/gcc
--enable-multilib
--with-multilib-generator='rv32i-ilp32--;rv32ic-ilp32--;rv32im-ilp32--;rv32imc-ilp32--'
--with-abi=lp64d --with-arch=rv64imafdc --with-tune=rocket --with-isa-spec=2.2
'CFLAGS_FOR_TARGET=-Os   -mcmodel=medlow' 'CXXFLAGS_FOR_TARGET=-Os  
-mcmodel=medlow'
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 12.1.0 (g1ea978e3066) 

Complete command line that triggers the problem:

/home/memi/Downloads/riscv/bin/riscv64-unknown-elf-gcc -c own-memset.c  -O3
-DNDEBUG -march=rv32im -mabi=ilp32 -msmall-data-limit=8 -mno-save-restore
-fmessage-length=0 -fsigned-char -ffunction-sections -fdata-sections -g
-fno-omit-frame-pointer -std=gnu11 -MD -MT
softconsole_oriole_main/CMakeFiles/MIV_Project.dir/ISP_PRO.c.obj  -save-temps

The compiler output (on the console):
None

The preprocessed file is attached.

[Bug tree-optimization/107415] RISCV-gcc: Leaf function compiles as recursive with -O3

2022-10-26 Thread michael.meier at hexagon dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107415

--- Comment #2 from Michael Meier  ---
(In reply to Andrew Pinski from comment #1)
> Gcc can detect memset and optimize it to memset. And this is what is
> happening. 
> This is documented too.

Thanks for your quick reply. I understand what is happening now.

Now that you pointed me in this direction, I found many references to this
behavior online, including that I can use -fno-tree-loop-distribute-patterns to
disable the optimization. However, I didn't find any mention of the feature in
the official documentation. Can you provide me with a reference, please?
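
For reference, the same option can also be applied to a single function
through GCC's optimize function attribute instead of the whole command line.
This is a minimal sketch, assuming a GCC toolchain; my_memset and
NO_LOOP_TO_LIBCALL are hypothetical names chosen for illustration, and the
simple byte loop stands in for the real implementation.

```c
#include <stddef.h>

/* GCC-specific attribute; expands to nothing on other compilers.
 * NO_LOOP_TO_LIBCALL is a hypothetical macro name. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_LOOP_TO_LIBCALL \
    __attribute__((optimize("-fno-tree-loop-distribute-patterns")))
#else
#define NO_LOOP_TO_LIBCALL
#endif

/* my_memset is a hypothetical stand-in for a freestanding memset.
 * With the attribute, the store loop below should stay a loop and
 * not be turned into a library call. */
NO_LOOP_TO_LIBCALL
void *my_memset(void *dest, int byte, size_t len)
{
    unsigned char *d = dest;
    while (len--)
        *d++ = (unsigned char)byte;
    return dest;
}
```

Checking the objdump output for the absence of an R_RISCV_CALL relocation
against memset is the simplest way to confirm the effect on a given build.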