https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71374
Uroš Bizjak <ubizjak at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |ra
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-06-02
                 CC|                            |vmakarov at gcc dot gnu.org
          Component|target                      |rtl-optimization
     Ever confirmed|0                           |1
--- Comment #1 from Uroš Bizjak <ubizjak at gmail dot com> ---
This is a register allocator failure, as is evident from an even more
simplified testcase:
int a, b, c;
extern inline void fn1 (void *p1, void *p2)
{
  __asm__ ("#": "=&c" (a), "=&D" (b), "=&S" (c): "r" (p2), "2" (p2));
}
LRA gets the following RTX:
(insn 10 4 7 2 (parallel [
            (set (reg:SI 89)
                (asm_operands:SI ("#") ("=&c") 0 [
                        (reg/v/f:DI 88 [ p2 ])
                        (reg/v/f:DI 88 [ p2 ])
                    ]
                     [
                        (asm_input:DI ("r") t.c:4)
                        (asm_input:DI ("2") t.c:4)
                    ]
                     [] t.c:4))
            (set (reg:SI 90)
                (asm_operands:SI ("#") ("=&D") 1 [
                        (reg/v/f:DI 88 [ p2 ])
                        (reg/v/f:DI 88 [ p2 ])
                    ]
                     [
                        (asm_input:DI ("r") t.c:4)
                        (asm_input:DI ("2") t.c:4)
                    ]
                     [] t.c:4))
            (set (reg:SI 91)
                (asm_operands:SI ("#") ("=&S") 2 [
                        (reg/v/f:DI 88 [ p2 ])
                        (reg/v/f:DI 88 [ p2 ])
                    ]
                     [
                        (asm_input:DI ("r") t.c:4)
                        (asm_input:DI ("2") t.c:4)
                    ]
                     [] t.c:4))
            (clobber (reg:CCFP 18 fpsr))
            (clobber (reg:CC 17 flags))
        ]) t.c:4 -1
     (expr_list:REG_DEAD (reg/v/f:DI 88 [ p2 ])
        (expr_list:REG_UNUSED (reg:CCFP 18 fpsr)
            (expr_list:REG_UNUSED (reg:CC 17 flags)
                (nil)))))
Please note how the two asm inputs are tied through the p2 variable. LRA ties
the "2" matching constraint to the "=&S" earlyclobber output, operand 2 (BTW:
matching an earlyclobber output is allowed), but it can't resolve the tie
through p2. This results in:
(insn 10 4 15 2 (parallel [
            (set (reg:SI 2 cx [89])
                (asm_operands:SI ("#") ("=&c") 0 [
                        (reg/v/f:DI 4 si [orig:88 p2 ] [88])
                        (reg/v/f:DI 4 si [orig:88 p2 ] [88])
                    ]
                     [
                        (asm_input:DI ("r") t.c:4)
                        (asm_input:DI ("2") t.c:4)
                    ]
                     [] t.c:4))
            (set (reg:SI 5 di [90])
                (asm_operands:SI ("#") ("=&D") 1 [
                        (reg/v/f:DI 4 si [orig:88 p2 ] [88])
                        (reg/v/f:DI 4 si [orig:88 p2 ] [88])
                    ]
                     [
                        (asm_input:DI ("r") t.c:4)
                        (asm_input:DI ("2") t.c:4)
                    ]
                     [] t.c:4))
            (set (reg:SI 4 si [orig:88 p2 ] [88])
                (asm_operands:SI ("#") ("=&S") 2 [
                        (reg/v/f:DI 4 si [orig:88 p2 ] [88])
                        (reg/v/f:DI 4 si [orig:88 p2 ] [88])
                    ]
                     [
                        (asm_input:DI ("r") t.c:4)
                        (asm_input:DI ("2") t.c:4)
                    ]
                     [] t.c:4))
            (clobber (reg:CCFP 18 fpsr))
            (clobber (reg:CC 17 flags))
        ]) t.c:4 -1
     (nil))
which leaves asm input 0 (the "r" operand) allocated to reg SI. This violates
the earlyclobber requirement that "this operand may not lie in a register that
is read by the instruction or as part of any memory address" for output
operand 2, which is also in reg SI.
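
To make the violated rule concrete, here is a minimal sketch of the check in
plain C. It is not LRA code: the types, the HR_* register names and
find_earlyclobber_violation() are hypothetical and exist only for this
illustration. An input may share a register with an earlyclobber output only
when it is tied to that output by a matching constraint.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a few x86 hard registers (illustration only). */
enum hard_reg { HR_AX, HR_DX, HR_CX, HR_BX, HR_SI, HR_DI };

struct asm_output {
    enum hard_reg reg;   /* hard register assigned to the output */
    bool earlyclobber;   /* true if the constraint contains '&' */
};

struct asm_input {
    enum hard_reg reg;   /* hard register assigned to the input */
    int matches;         /* index of the matched output, or -1 if untied */
};

/* Return the index of the first input that lies in the register of an
   earlyclobber output without being tied to that output, or -1 if the
   assignment respects the earlyclobber rule. */
int find_earlyclobber_violation(const struct asm_output *outs, size_t n_out,
                                const struct asm_input *ins, size_t n_in)
{
    for (size_t i = 0; i < n_in; i++) {
        assert(ins[i].matches < (int) n_out);
        for (size_t o = 0; o < n_out; o++) {
            if (!outs[o].earlyclobber || outs[o].reg != ins[i].reg)
                continue;
            if (ins[i].matches == (int) o)
                continue;   /* tied operand: sharing is intended */
            return (int) i;
        }
    }
    return -1;
}
```

Fed the assignment from the dump above (outputs in cx, di and si; both inputs
in si, the second one tied to operand 2), this sketch flags input 0.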
LRA should copy asm input 0 to a temporary reg of an appropriate class in the
above case.
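
As a rough illustration of what choosing such a temporary involves, the sketch
below picks a register from the operand's class that no earlyclobber output
occupies. It is only a model under stated assumptions: pick_reload_reg() and
the bitmask representation are hypothetical, and LRA's real selection logic is
considerably more involved.

```c
#include <stdint.h>

/* Return the lowest-numbered hard register that belongs to class_mask and
   is not occupied by an earlyclobber output (earlyclobber_mask), or -1 if
   the class has no such register. Bit N of a mask stands for hard reg N. */
int pick_reload_reg(uint32_t class_mask, uint32_t earlyclobber_mask)
{
    uint32_t free_regs = class_mask & ~earlyclobber_mask;

    for (int r = 0; r < 32; r++)
        if (free_regs & (UINT32_C(1) << r))
            return r;
    return -1;
}
```

With the hard register numbers from the dump (cx is 2, si is 4, di is 5) and
an assumed class covering hard regs 0 to 5, the call
pick_reload_reg(0x3F, (1u << 2) | (1u << 4) | (1u << 5)) returns 0, so the
copy of p2 for the "r" input would land outside cx, si and di.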
Confirmed as an rtl-optimization problem.