[Bug c++/117037] New: gcc14.2.1 riscv64-unknown-linux-gnu-g++ build opencv 5.x error

2024-10-08 Thread bigmagicreadsun at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117037

Bug ID: 117037
   Summary: gcc14.2.1 riscv64-unknown-linux-gnu-g++ build opencv
5.x error
   Product: gcc
   Version: 14.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bigmagicreadsun at gmail dot com
  Target Milestone: ---

When compiling OpenCV 5.x with riscv64-unknown-linux-gnu-g++, the following
error occurs.

set(CMAKE_C_FLAGS_INIT "-march=rv64imafdcv -mabi=lp64d -O3 -static")
set(CMAKE_CXX_FLAGS_INIT "-march=rv64imafdcv -mabi=lp64d -O3 -static")

opencv/modules/dnn/src/layers/reduce_layer.cpp: In member function 'void
cv::dnn::ReduceLayerImpl::ReduceAllInvoker::operator()(const cv::Range&)
const [with Op = cv::dnn::ReduceLayerImpl::ReduceProd]':
/Local/home/zhaofujin/opencv/opencv/modules/dnn/src/layers/reduce_layer.cpp:290:14:
internal compiler error: in vect_create_partial_epilog, at
tree-vect-loop.cc:5866
  290 | void operator()(const Range& r) const CV_OVERRIDE {
  |  ^~~~
0x7f5a1f485082 __libc_start_main
../csu/libc-start.c:308
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.
make[2]: *** [modules/dnn/CMakeFiles/opencv_dnn.dir/build.make:1181:
modules/dnn/CMakeFiles/opencv_dnn.dir/src/layers/reduce_layer.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs



The code where the error occurs in reduce_layer.cpp is as follows:



ReduceAllInvoker(const Mat& src_, Mat& dst_) : src(src_), dst(dst_) {
    auto shape_src = shape(src);

    n_reduce = std::accumulate(shape_src.begin(), shape_src.end(), 1,
                               std::multiplies());
    loop_size = n_reduce;

    total = 1;
    cost_per_thread = 1;
}

void operator()(const Range& r) const CV_OVERRIDE {
    int start = r.start;
    int end = r.end;

    const dtype* p_src = src.ptr();
    dtype* p_dst = dst.ptr();

    for (int i = start; i < end; ++i) {
        Op accumulator(n_reduce, *p_src);
        for (int l = 0; l < loop_size; ++l) {
            accumulator.update(p_src[l]);
        }
        p_dst[i] = accumulator.get_value();
    }
}

This error appears to be related to how g++ vectorizes code for RVV.

[Bug target/117037] gcc14.2.1 riscv64-unknown-linux-gnu-g++ build opencv 5.x error

2024-10-11 Thread bigmagicreadsun at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117037

--- Comment #3 from fujin zhao  ---
The same project compiles fine with GCC 13.1.1 but fails with this error on
GCC 14.2.1. However, adding the -fwrapv option with GCC 14.2.1 makes the error
go away. Could you please explain the reason for this?

[Bug target/117037] gcc14.2.1 riscv64-unknown-linux-gnu-g++ build opencv 5.x error

2024-10-11 Thread bigmagicreadsun at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117037

--- Comment #4 from fujin zhao  ---
After adding the -freport-bug compilation option, the error message is as
follows:

[ 49%] Building CXX object
modules/dnn/CMakeFiles/opencv_dnn.dir/src/layers/reduce_layer.cpp.o
during GIMPLE pass: vect
opencv/modules/dnn/src/layers/reduce_layer.cpp: In member function 'void
cv::dnn::ReduceLayerImpl::ReduceAllInvoker::operator()(const cv::Range&)
const [with Op = cv::dnn::ReduceLayerImpl::ReduceProd]':
opencv/modules/dnn/src/layers/reduce_layer.cpp:290:14: internal compiler error:
in vect_create_partial_epilog, at tree-vect-loop.cc:5866
  290 | void operator()(const Range& r) const CV_OVERRIDE {
  |  ^~~~
0x7fddb57b7082 __libc_start_main
../csu/libc-start.c:308
Please submit a full bug report, with preprocessed source.
Please include the complete backtrace with any bug report.
See  for instructions.
The bug is not reproducible, so it is likely a hardware or OS problem.
make[2]: *** [modules/dnn/CMakeFiles/opencv_dnn.dir/build.make:1181:
modules/dnn/CMakeFiles/opencv_dnn.dir/src/layers/reduce_layer.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:2564:
modules/dnn/CMakeFiles/opencv_dnn.dir/all] Error 2
make: *** [Makefile:163: all] Error 2


The current version information of g++:

riscv64-unknown-linux-gnu-g++ -v
Using built-in specs.
COLLECT_GCC=riscv64-unknown-linux-gnu-g++
COLLECT_LTO_WRAPPER=/2024-gcc14/gcc/bin/../libexec/gcc/riscv64-unknown-linux-gnu/14.2.1/lto-wrapper
Target: riscv64-unknown-linux-gnu
Configured with: /builds/software/devtools/riscv-gnu-toolchain/gcc/configure
--target=riscv64-unknown-linux-gnu
--prefix=/work/toolchain/install/linux64/glibc/2024-gcc14/gcc
--with-sysroot=/work/toolchain/install/linux64/glibc/2024-gcc14/gcc/sysroot
--with-pkgversion=gd92407a96 --with-system-zlib --enable-shared --enable-tls
--enable-languages=c,c++,fortran --disable-libmudflap --disable-libssp
--disable-libquadmath --disable-libsanitizer --disable-nls --disable-bootstrap
--src=/builds/software/devtools/riscv-gnu-toolchain/gcc --enable-multilib
--with-abi=lp64 --with-arch=rv64ima --with-tune=rocket --with-isa-spec=2.2
'CFLAGS_FOR_TARGET=-O2 -mcmodel=medany' 'CXXFLAGS_FOR_TARGET=-O2
-mcmodel=medany'
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.2.1 20240816 (gd92407a96)

[Bug c/118618] New: RISC-V: Zcmp extension and RVV auto-vectorization are both enabled,the sp register error.

2025-01-22 Thread bigmagicreadsun at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118618

Bug ID: 118618
   Summary: RISC-V: Zcmp extension and RVV auto-vectorization are
both enabled,the sp register error.
   Product: gcc
   Version: 14.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bigmagicreadsun at gmail dot com
  Target Milestone: ---

I found that when both the RISC-V Zve32 and Zcmp extensions are enabled during
compilation, RVV auto-vectorization kicks in and the stack frame (addressed
through sp) is also used to save RVV vector registers. However, Zcmp does not
seem to take this into account when it adjusts sp.

Here is a practical example:

When I enable only Zve32 (-march=rv32imafdc_zve32f):

8000185e :
8000185e:   1101        c.addi  sp,-32
80001860:   c22022f3    csrrs   t0,vlenb,zero
80001864:   00052803    lw      a6,0(a0)
80001868:   ca26        c.swsp  s1,20(sp)
8000186a:   ce06        c.swsp  ra,28(sp)
8000186c:   cc22        c.swsp  s0,24(sp)
8000186e:   c84a        c.swsp  s2,16(sp)
80001870:   c64e        c.swsp  s3,12(sp)
80001872:   c452        c.swsp  s4,8(sp)
80001874:   c256        c.swsp  s5,4(sp)
80001876:   00229313    slli    t1,t0,0x2
8000187a:   4144        c.lw    s1,4(a0)
8000187c:   40610133    sub     sp,sp,t1
80001880:   0a080c63    beq     a6,zero,80001938

80001884:   0c8077d7    vsetvli a5,zero,e16,m1,ta,ma
80001888:   5e05c157    vmv.v.x v2,a1
8000188c:   8526        c.mv    a0,s1
8000188e:   00181a13    slli    s4,a6,0x1
80001892:   4981        c.li    s3,0
80001894:   86aa        c.mv    a3,a0
80001896:   8442        c.mv    s0,a6
80001898:   872a        c.mv    a4,a0
8000189a:   0c8477d7    vsetvli a5,s0,e16,m1,ta,ma
8000189e:   0206d087    vle16.v v1,(a3)
800018a2:   00179613    slli    a2,a5,0x1
800018a6:   8c1d        c.sub   s0,a5
800018a8:   96b2        c.add   a3,a2
800018aa:   021100d7    vadd.vv v1,v1,v2
800018ae:   020750a7    vse16.v v1,(a4)
800018b2:   9732        c.add   a4,a2
800018b4:   f07d        c.bnez  s0,8000189a

800018b6:   00198913    addi    s2,s3,1
800018ba:   9552        c.add   a0,s4
800018bc:   01280463    beq     a6,s2,800018c4

800018c0:   89ca        c.mv    s3,s2
800018c2:   bfc9        c.j     80001894

800018c4:   c2202af3    csrrs   s5,vlenb,zero
800018c8:   002a9793    slli    a5,s5,0x2
800018cc:   415787b3    sub     a5,a5,s5
800018d0:   978a        c.add   a5,sp
800018d2:   4581        c.li    a1,0
800018d4:   4501        c.li    a0,0
800018d6:   02878127    vs1r.v  v2,(a5)
800018da:   7ce000ef    jal     ra,800020a8
800018de:   002a9793    slli    a5,s5,0x2
800018e2:   415787b3    sub     a5,a5,s5
800018e6:   978a        c.add   a5,sp
800018e8:   0287d107    vl1re16.v v2,(a5)
800018ec:   8626        c.mv    a2,s1
800018ee:   86ca        c.mv    a3,s2
800018f0:   8726        c.mv    a4,s1
800018f2:   0c86f7d7    vsetvli a5,a3,e16,m1,ta,ma
800018f6:   02065087    vle16.v v1,(a2)
800018fa:   00179593    slli    a1,a5,0x1
800018fe:   8e9d        c.sub   a3,a5
80001900:   962e        c.add   a2,a1
80001902:   021100d7    vadd.vv v1,v1,v2
80001906:   020750a7    vse16.v v1,(a4)
8000190a:   972e        c.add   a4,a1
8000190c:   f2fd        c.bnez  a3,800018f2

8000190e:   94d2        c.add   s1,s4
80001910:   00140793    addi    a5,s0,1
80001914:   00898463    beq     s3,s0,8000191c

80001918:   843e        c.mv    s0,a5
8000191a:   bfc9        c.j     800018ec

8000191c:   c22022f3    csrrs   t0,vlenb,zero
80001920:   00229313    slli    t1,t0,0x2
80001924:   911a        c.add   sp,t1
80001926:   40f2        c.lwsp  ra,28(sp)
80001928:   4462        c.lwsp  s0,24(sp)
8000192a:   44d2        c.lwsp  s1,20(sp)
8000192c:   4942        c.lwsp  s2,16(sp)
8000192e:   49b2   
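
The stack layout used by the listing above can be sketched in C. This is only
a reading of the instructions shown, assuming vlenb stays constant for the
whole function; the helper names are hypothetical and appear neither in the
report nor in GCC. The vlenb-dependent part of the frame is the portion that,
according to the report, the Zcmp path does not account for.

```c
#include <stdint.h>

/* Hypothetical helpers; all offsets are read off the listing above. */

/* Bytes the prologue subtracts from sp: 32 fixed bytes (c.addi sp,-32)
   plus vlenb << 2 for the vector area (slli t1,t0,2; sub sp,sp,t1). */
uint32_t frame_bytes(uint32_t vlenb)
{
    return 32u + (vlenb << 2);
}

/* Offset from the final sp to the slot used by vs1r.v / vl1re16.v:
   (vlenb << 2) - vlenb (slli a5,s5,2; sub a5,a5,s5; c.add a5,sp). */
uint32_t v2_spill_offset(uint32_t vlenb)
{
    return (vlenb << 2) - vlenb;
}

/* Offset from the final sp to the saved ra. The scalar saves were stored
   at 28(sp) before the vector area was carved out, so they sit above it. */
uint32_t saved_ra_offset(uint32_t vlenb)
{
    return (vlenb << 2) + 28u;
}
```

With vlenb = 16 (VLEN = 128 bits), for instance, the frame is 96 bytes and the
saved ra sits 92 bytes above the final sp, which is why the epilogue adds
vlenb << 2 back to sp before the c.lwsp instructions run.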

[Bug target/117037] gcc14.2.1 riscv64-unknown-linux-gnu-g++ build opencv 5.x error

2025-01-23 Thread bigmagicreadsun at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117037

fujin zhao  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|---                         |FIXED

[Bug target/117037] gcc14.2.1 riscv64-unknown-linux-gnu-g++ build opencv 5.x error

2025-01-23 Thread bigmagicreadsun at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117037

--- Comment #5 from fujin zhao  ---
The error was in my own code; this compiler bug does not exist.

[Bug c/119475] New: RISC-V: After enabling LTO with RVV, compilation errors occur.

2025-03-26 Thread bigmagicreadsun at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119475

Bug ID: 119475
   Summary: RISC-V: After enabling LTO with RVV, compilation
errors occur.
   Product: gcc
   Version: 14.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bigmagicreadsun at gmail dot com
  Target Milestone: ---

When compiling the following code using RISC-V RVV extensions with LTO enabled:

#include <stddef.h>
#include <riscv_vector.h>
typedef float float32_t;
int rvv_cfft_f32(void) {
    size_t avl = 5;
    const float32_t *px;
    size_t vl = __riscv_vsetvl_e32m2(avl);
    vfloat32m2x2_t v_tuple = __riscv_vlseg2e32_v_f32m2x2(px, vl);
    vfloat32m2_t va_im = __riscv_vget_v_f32m2x2_f32m2(v_tuple, 1);
    va_im = __riscv_vfneg_v_f32m2(va_im, vl);
}

The compilation fails with the following error:

lto1: fatal error: target specific builtin not available
compilation terminated.
lto-wrapper: fatal error: riscv32-unknown-elf-gcc returned 1 exit status
compilation terminated.
/home/riscv/bin/../lib/gcc/riscv32-unknown-elf/14.2.0/../../../../riscv32-unknown-elf/bin/ld:
error: lto-wrapper failed
collect2: error: ld returned 1 exit status

Details of the Issue

Toolchain Versions:

Prebuilt riscv-gnu-toolchain (riscv-gnu-toolchain releases 2025.01.20-nightly).
Self-compiled GCC 15 (same error occurs).

riscv32-unknown-elf-gcc -O0 -flto -march=rv32imafc_zve32f -mabi=ilp32f
-nostartfiles test.c

[Bug c/119709] New: RISC-V: Why volatile int16_t variables generate extra shift instructions in compiler output

2025-04-10 Thread bigmagicreadsun at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119709

Bug ID: 119709
   Summary: RISC-V: Why volatile int16_t variables generate extra
shift instructions in compiler output
   Product: gcc
   Version: 14.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bigmagicreadsun at gmail dot com
  Target Milestone: ---

When compiling the following code with RISC-V GCC:


#include <stdint.h>
volatile int16_t x;
int get() {
  return x;
}
The generated assembly is:


get:
        lui     a5,%hi(x)
        lhu     a0,%lo(x)(a5)
        slli    a0,a0,16
        srai    a0,a0,16
        ret
x:
        .zero   2
(Full example: Godbolt link: https://godbolt.org/z/Y93T4c7M7)

Why does the compiler generate redundant shift operations (slli + srai) instead
of directly using lh?

When I change x to volatile uint16_t x, GCC correctly generates an lhu
instruction without shifts. Why does this behavior occur?
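
As a sanity check, the following C sketch models the two instruction sequences
on a 32-bit register and shows that they produce the same value. The function
names are hypothetical, and the sketch does not explain why GCC chooses the
longer form; it only shows that the difference is one of code size and not of
correctness.

```c
#include <stdint.h>

/* Models lhu + slli 16 + srai 16 on a 32-bit register.
   Assumes GCC's documented behaviour: out-of-range unsigned-to-signed
   conversion wraps, and >> on a negative signed value is arithmetic. */
int32_t load_via_lhu_and_shifts(uint16_t raw)
{
    uint32_t reg = raw;          /* lhu: zero-extend the halfword */
    reg <<= 16;                  /* slli a0,a0,16 */
    return (int32_t)reg >> 16;   /* srai a0,a0,16 */
}

/* Models a single lh: sign-extend the halfword.
   Written with portable arithmetic instead of a narrowing cast. */
int32_t load_via_lh(uint16_t raw)
{
    return (int32_t)(raw ^ 0x8000u) - 0x8000;
}
```

Both helpers return -1 for 0xffff and -32768 for 0x8000, for example.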

[Bug c/119830] New: RISC-V:Internal Compiler Error on RISC-V Windows Toolchain (32-bit program) with -march=rv64gc_zbb_zbs

2025-04-16 Thread bigmagicreadsun at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119830

Bug ID: 119830
   Summary: RISC-V:Internal Compiler Error on RISC-V Windows
Toolchain (32-bit program) with -march=rv64gc_zbb_zbs
   Product: gcc
   Version: 14.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bigmagicreadsun at gmail dot com
  Target Milestone: ---

Description:

When compiling the following code using a RISC-V Windows toolchain (32-bit
program) with -march=rv64gc_zbb_zbs -mabi=lp64d -O3, an internal compiler error
occurs:


#include <stdint.h>
void test(int32_t N, int16_t* A, int16_t val) {
    int32_t i, j;
    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++) {
            A[i * N + j] += val;
        }
    }
}
Command & Error:


riscv64-unknown-elf-gcc.exe -Ofast -march=rv64gc_zbb_zbs -mabi=lp64d -c .\xxx.c
-fdump-rtl-all -freport-bug
.\xxx.c: In function 'test':
.\xxx.c:10:1: error: unrecognizable insn:
(insn 264 263 265 28 (set (reg:DI 419)
(and:DI (reg:DI 421)
(const_int 2147483647 [0x7fffffff]))) -1
 (expr_list:REG_EQUAL (const_int -4294901761 [0xffffffff0000ffff])
(nil)))
during RTL pass: vregs
xxx.c.267r.vregs
internal compiler error: in extract_insn, at recog.cc:2812
Additional Context:

In the RISC-V Linux toolchain's RTL dump (Godbolt link:
https://godbolt.org/z/rcooKfc1n), the corresponding insn is correct and uses
the bclridi template:

(insn 329 328 330 28 (set (reg:DI 413)
(and:DI (reg:DI 415)
(const_int -2147483649 [0xffffffff7fffffff]))) 630 {*bclridi}
 (expr_list:REG_EQUAL (const_int -4294901761 [0xffffffff0000ffff])...))
The Windows toolchain produces a mismatched insn:

(insn 264 [...] (const_int 2147483647 [0x7fffffff])))  ; Not matching bclridi
The difference in the generated AND constant (2147483647 vs. -2147483649)
means the insn fails to match the bclridi pattern, which triggers the ICE.
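
The numeric relationship between the two constants can be sketched in C. The
helper names are hypothetical. The sketch only shows that 2147483647 is what
remains of the 64-bit "clear bit 31" mask when just its low 32 bits are kept;
whether a truncation of this kind is what actually happens inside the
Windows-hosted compiler is an assumption, not something the dumps establish.

```c
#include <stdint.h>

/* AND mask that clears only bit `bit` of a 64-bit value (bit < 63). */
int64_t clear_bit_mask64(unsigned bit)
{
    return ~(INT64_C(1) << bit);
}

/* The same mask after keeping only its low 32 bits and zero-extending. */
int64_t low32_zero_extended(int64_t mask)
{
    return (int64_t)(uint32_t)mask;
}
```

For bit 31, clear_bit_mask64 gives -2147483649 (0xffffffff7fffffff), the
constant in the Linux dump, while low32_zero_extended of that value gives
2147483647 (0x7fffffff), the constant in the Windows dump.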

Question: Why does the Windows toolchain generate inconsistent RTL patterns
compared to Linux for the bclridi instruction template?

Compiler Version: riscv64-unknown-elf-gcc 14.2.1

Reproduction Steps:

Compile the code with -march=rv64gc_zbb_zbs -mabi=lp64d -O3.
Observe RTL differences in the vregs pass.

[Bug c/119847] New: RISC-V:GCC fail to optimize repeated patterns in volatile operations

2025-04-17 Thread bigmagicreadsun at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119847

Bug ID: 119847
   Summary: RISC-V:GCC fail to optimize repeated patterns in
volatile operations
   Product: gcc
   Version: 14.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bigmagicreadsun at gmail dot com
  Target Milestone: ---

When compiling a series of volatile memory reads, GCC 13.1 and 14.1 recompute
the common base address for every access instead of computing it once. Below
is my test case:

-march=rv64gc -mabi=lp64d  -Os


#include <stdint.h>
uint16_t test_rd(uint32_t addr)
{
    return (uint16_t)(*(volatile uint32_t *)((uintptr_t)(0xF8000000 + (addr << 2))));
}
void test(uint16_t *buf)
{
    buf[10] = test_rd(0x300ae);
    buf[11] = test_rd(0x300ad);
    buf[12] = test_rd(0x300ac);
}
Generated assembly (GCC 13.1/14.1):


test:
        li      a5,65024000
        slli    a5,a5,6
        addi    a5,a5,696
        lw      a5,0(a5)
        sh      a5,20(a0)
        li      a5,65024000
        slli    a5,a5,6
        addi    a5,a5,692
        lw      a5,0(a5)
        sh      a5,22(a0)
        li      a5,65024000
        slli    a5,a5,6
        addi    a5,a5,688
        lw      a5,0(a5)
        sh      a5,24(a0)
        ret
The sequence li a5,65024000 + slli a5,a5,6 is needlessly repeated for each
test_rd() call. Full example: https://godbolt.org/z/1ooj71f6f
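
The address arithmetic behind the listing can be checked with a short C
sketch. The helper names are hypothetical, and a peripheral base of 0xF8000000
is assumed here because it is the value consistent with the constants in the
assembly: 65024000 << 6 is 0xF80C0000, and the three addi immediates are the
remaining low bits of addr << 2.

```c
#include <stdint.h>

/* Address read for a given register index, assuming base 0xF8000000
   and 32-bit unsigned arithmetic as in the C test case. */
uint32_t mmio_address(uint32_t addr)
{
    return 0xF8000000u + (addr << 2);
}

/* Value built by "li a5,65024000; slli a5,a5,6". */
uint64_t shared_base(void)
{
    return (uint64_t)65024000 << 6;
}

/* Address formed by adding the addi immediate to the shared base. */
uint64_t address_from_asm(int32_t imm)
{
    return shared_base() + (uint64_t)imm;
}
```

All three accesses share shared_base() and differ only in the 12-bit
immediate (696, 692, 688), which is why computing the base once is enough.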

Expected behavior (GCC 12.1):


test:
        li      a5,65024000
        slli    a5,a5,6
        addi    a4,a5,696
        lw      a4,0(a4)
        sh      a4,20(a0)
        addi    a4,a5,692
        addi    a5,a5,688
        lw      a4,0(a4)
        lw      a5,0(a5)
        sh      a4,22(a0)
        sh      a5,24(a0)
        ret
The older compiler correctly avoids redundant base address calculations. Why
does this optimization regress in GCC 13/14?