I'm porting gcc to a new VLIW architecture.
The chip has 11 functional units, 4 of which are DSPs.
I'm now designing the SIMD instruction patterns, and I would rather not rely on
built-in functions to support them.
If I write instruction patterns that involve V4QI
packing/unpacking/arithmetic operations,
can gcc select them automatically?
(To be clear, I have not written any define_expand/define_split that generates
V4QI operations myself.)
For example:
1. My packing instruction pattern ('D' means a DSP register):
(define_insn "*packqi_from_mem"
  [(set (vec_select:QI (match_operand:V4QI 0 "register_operand" "D")
                       (parallel [(match_operand:SI 2 "const_int_operand" "i")]))
        (match_operand:QI 1 "memory_operand" "m"))]
  ""
  "ldub.b%2\\t%0, %1"
)
2. My V4QI + V4QI SIMD operation:
(define_insn "*SIMD_addqi3"
  [(set (match_operand:V4QI 0 "register_operand" "=D")
        (plus:V4QI (match_operand:V4QI 1 "register_operand" "%D")
                   (match_operand:V4QI 2 "register_operand" "D")))]
  ""
  "add.ub\\t%0, %1, %2"
)
Is it possible for gcc, on its own, to load four QImode values into one register
using the pattern "*packqi_from_mem"
and then perform the V4QI + V4QI SIMD add using the pattern "*SIMD_addqi3"?
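
To make the question concrete, here is a minimal C sketch of the kind of source
this is about. The function names add4_scalar and add4_vector are hypothetical
and chosen only for illustration; they are not part of the port. The first
function is plain scalar code: four byte loads and four byte adds, which is the
shape one would hope to see turned into the packing loads plus a single add.ub.
The second expresses the same computation with GCC's generic vector extension
(the vector_size attribute), which is not a target built-in function. Whether
either form ends up matching the two patterns above is exactly the open
question.

```c
#include <stdint.h>
#include <string.h>

/* Scalar form: four independent unsigned byte additions.
   Each result wraps modulo 256 because of the cast back to uint8_t. */
void add4_scalar(const uint8_t *a, const uint8_t *b, uint8_t *out)
{
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)(a[i] + b[i]);
}

/* Vector form: a 4-byte vector of unsigned chars, i.e. V4QI at the RTL level.
   vector_size is a GCC extension, not a machine-specific builtin. */
typedef uint8_t v4qi __attribute__((vector_size(4)));

void add4_vector(const uint8_t *a, const uint8_t *b, uint8_t *out)
{
    v4qi va, vb, vr;

    /* memcpy avoids alignment and aliasing assumptions on the inputs. */
    memcpy(&va, a, sizeof va);
    memcpy(&vb, b, sizeof vb);

    vr = va + vb;               /* element-wise add, each lane wraps modulo 256 */

    memcpy(out, &vr, sizeof vr);
}
```

Both functions compute the same four results, so comparing the code gcc
generates for each of them shows which of the two patterns it picks up.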