Hi Kito.
I fixed almost all of the rv32be testcase failures simply by taking
endianness into account on the first line of riscv_subword, which is
used for long long handling on 32-bit.
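For illustration only, here is a minimal standalone sketch of the idea, assuming the fix amounts to choosing the byte offset of the requested half according to endianness. The names subword_byte_offset and WORD_BYTES are hypothetical and are not taken from the GCC sources.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the word size on rv32 (4 bytes). */
enum { WORD_BYTES = 4 };

/* Byte offset of the requested 32-bit half of a 64-bit value.
   Little endian: low half at offset 0, high half at offset 4.
   Big endian: the two halves trade places. */
static unsigned subword_byte_offset(bool high_p, bool big_endian)
{
    return (high_p != big_endian) ? WORD_BYTES : 0;
}

/* Self-check covering all four combinations. */
static void check_subword_byte_offset(void)
{
    assert(subword_byte_offset(false, false) == 0);
    assert(subword_byte_offset(true,  false) == 4);
    assert(subword_byte_offset(false, true)  == 4);
    assert(subword_byte_offset(true,  true)  == 0);
}
```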
Now, I only have one failing testcase (which does not also fail on
little endian), and it's a doozy.
The test in question is gcc.c-torture/compile/pr35318.c. The body of its
function foo is:
double x = 4, y;
__asm__ volatile ("# %0,%1,%2,%3" : "=r,r" (x), "=r,r" (y) : "%0,0" (x),
"m,r" (8));
(The asm comment in the template string was added by me to track what
the actual operand assignments were.)
When compiled with -mbig-endian, this results in an ICE:
---8<---
/tmp/pr35318.c: In function 'foo':
/tmp/pr35318.c:9:1: error: unrecognizable insn:
9 | }
| ^
(insn 12 24 25 2 (parallel [
(set (reg:DF 11 a1 [orig:74 x ] [74])
(asm_operands/v:DF ("# %0,%1,%2,%3") ("=r,r") 0 [
(reg:SI 12 a2 [orig:74 x+4 ] [74])
(mem/c:DF (plus:SI (reg/f:SI 8 s0)
(const_int -40 [0xffffffffffffffd8])) [2 %sfp+-24 S8 A64])
]
[
(asm_input:DF ("%0,0") /tmp/pr35318.c:8)
(asm_input:SI ("m,r") /tmp/pr35318.c:8)
]
[] /tmp/pr35318.c:8))
(set (reg:DF 15 a5 [orig:75 y ] [75])
(asm_operands/v:DF ("# %0,%1,%2,%3") ("=r,r") 1 [
(reg:SI 12 a2 [orig:74 x+4 ] [74])
(mem/c:DF (plus:SI (reg/f:SI 8 s0)
(const_int -40 [0xffffffffffffffd8])) [2 %sfp+-24 S8 A64])
]
[
(asm_input:DF ("%0,0") /tmp/pr35318.c:8)
(asm_input:SI ("m,r") /tmp/pr35318.c:8)
]
[] /tmp/pr35318.c:8))
]) "/tmp/pr35318.c":8:3 -1
(nil))
during RTL pass: reload
dump file: /tmp/pr35318b.txt
/tmp/pr35318.c:9:1: internal compiler error: in extract_constrain_insn, at recog.c:2670
0x101bf90b _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
../../../riscv-gcc/gcc/rtl-error.c:108
0x101bf953 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
../../../riscv-gcc/gcc/rtl-error.c:116
0x10a1193f extract_constrain_insn(rtx_insn*)
../../../riscv-gcc/gcc/recog.c:2670
0x1088fc77 check_rtl
../../../riscv-gcc/gcc/lra.c:2087
0x108971c7 lra(_IO_FILE*)
../../../riscv-gcc/gcc/lra.c:2505
0x1082fcb7 do_reload
../../../riscv-gcc/gcc/ira.c:5827
0x1082fcb7 execute
../../../riscv-gcc/gcc/ira.c:6013
---8<---
This insn looks extremely similar to one that's in the dump-rtl for
little endian:
---8<---
(insn 12 20 21 2 (parallel [
(set (reg:DF 13 a3 [orig:74 x ] [74])
(asm_operands/v:DF ("# %0,%1,%2,%3") ("=r,r") 0 [
(reg:SI 13 a3 [orig:74 x ] [74])
(mem/c:DF (plus:SI (reg/f:SI 8 s0)
(const_int -40 [0xffffffffffffffd8])) [2 %sfp+-24 S8 A64])
]
[
(asm_input:DF ("%0,0") /tmp/pr35318.c:8)
(asm_input:SI ("m,r") /tmp/pr35318.c:8)
]
[] /tmp/pr35318.c:8))
(set (reg:DF 15 a5 [orig:75 y ] [75])
(asm_operands/v:DF ("# %0,%1,%2,%3") ("=r,r") 1 [
(reg:SI 13 a3 [orig:74 x ] [74])
(mem/c:DF (plus:SI (reg/f:SI 8 s0)
(const_int -40 [0xffffffffffffffd8])) [2 %sfp+-24 S8 A64])
]
[
(asm_input:DF ("%0,0") /tmp/pr35318.c:8)
(asm_input:SI ("m,r") /tmp/pr35318.c:8)
]
[] /tmp/pr35318.c:8))
]) "/tmp/pr35318.c":8:3 -1
(nil))
---8<---
So I don't know what's "unrecognizable" about it...
I also don't understand the code that is actually generated in the
little-endian case.
The way I read the asm statement, %2 should be a register (same as %0)
containing the (floating point?) value "4", and %3 should be a memory
location (assuming the first alternative is chosen) containing the
value "8".
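To make the "same as %0" part of that reading concrete, here is a much simpler sketch than the testcase, with a single alternative and no "%" modifier. It assumes a GCC-compatible compiler; pass_through is a hypothetical name. The matching constraint "0" ties the input to the same register as output operand 0, so with an empty template the value comes back unchanged.

```c
#include <assert.h>

/* The input is constrained to live in the same register as output 0.
   The asm template is empty, so that register is left untouched. */
static int pass_through(int in)
{
    int out;
    __asm__ volatile ("" : "=r" (out) : "0" (in));
    return out;
}

/* Self-check: the value placed in the shared register is what we read back. */
static void check_pass_through(void)
{
    assert(pass_through(42) == 42);
}
```

This says nothing about how the two-alternative, commutative form in the testcase is resolved; it only shows the matching-constraint part in isolation.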
However, looking at the generated assembler code, it seems that %2 is
a register (a3) which contains the integer value "8" and %3 is a
memory location (-40(s0)) which contains the floating point value
"4.0". This seems mixed up.
---8<---
foo:
addi sp,sp,-48
sw s0,44(sp)
addi s0,sp,48
lui a5,%hi(.LC0)
fld fa5,%lo(.LC0)(a5)
fsd fa5,-24(s0)
fld fa5,-24(s0)
li a5,8
fsd fa5,-40(s0)
mv a3,a5
#APP
# 8 "/tmp/pr35318.c" 1
# a3,a5,a3,-40(s0)
# 0 "" 2
#NO_APP
sw a3,-40(s0)
sw a4,-36(s0)
fld fa5,-40(s0)
fsd fa5,-24(s0)
sw a5,-32(s0)
sw a6,-28(s0)
nop
lw s0,44(sp)
addi sp,sp,48
jr ra
.size foo, .-foo
.section .rodata
.align 3
.LC0: # little endian double "4.0"
.word 0
.word 1074790400
---8<---
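As a cross-check of the .LC0 constant, the following sketch splits the bit pattern of 4.0 into two 32-bit words. It assumes double is IEEE-754 binary64; split_double is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Split the bit pattern of a double into its low and high 32-bit words.
   Assumes double is a 64-bit IEEE-754 value. */
static void split_double(double d, uint32_t *lo, uint32_t *hi)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);      /* well-defined type punning */
    *lo = (uint32_t)(bits & 0xffffffffu);
    *hi = (uint32_t)(bits >> 32);
}

/* Self-check: 4.0 is 0x4010000000000000, so the words are 0 and 0x40100000. */
static void check_split_double(void)
{
    uint32_t lo, hi;
    split_double(4.0, &lo, &hi);
    assert(lo == 0u);
    assert(hi == 0x40100000u);           /* 1074790400 in decimal */
}
```

The low word (0) is emitted first and the high word (1074790400) second, which matches the little-endian layout in the listing.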
Is this code correct, or is there some deeper issue at play here?
(AFAIU the testcase only checks that the compiler doesn't ICE, not
that the generated code is correct...)
If the code generated for LE is bad, I probably should not try to make
BE generate the same thing. :-/
// Marcus