On 05/04/16 09:13, Ilya Enkovich wrote:
> 2016-04-05 1:59 GMT+03:00 David Guillen Fandos <da...@davidgf.net>:
>>
>>
>> On 04/04/16 10:55, Ilya Enkovich wrote:
>>> 2016-04-02 3:32 GMT+03:00 David Guillen Fandos <da...@davidgf.net>:
>>>> Hello there!
>>>>
>>>> I'm trying to add some vector registers to a MIPS arch (32 bit). This
>>>> arch has 32 x 128 bit registers that can essentially be seen as V4SF.
>>>> So far I'm using this test:
>>>>
>>>> volatile float foo __attribute__ ((vector_size (16)));
>>>> volatile float bar __attribute__ ((vector_size (16)));
>>>>
>>>> int main() {
>>>>   foo = foo + bar;
>>>> }
>>>>
>>>> Which produces the right SSE/AVX instructions for x86 but fails on my
>>>> mips cross compiler with my modifications.
>>>> The modifications I did so far are:
>>>>
>>>> - Add 32 new regsiters, adding a register class, updating/adding bit
>>>> fields, updating also other macros that deal with reg allocation (like
>>>> caller saved and stuff). Also incremented the first pseudo reg value.
>>>>
>>>> - Add 3 define_insn that load, store and add vectors.
>>>>
>>>> - Tweak some things here and there to let the compiler know about the
>>>> V4SF type being available.
>>>>
>>>> So far the compiler goes back to scalar code, not working properly at
>>>> the veclower pass. My test.c.123t.veclower21 looks like:
>>>>
>>>> <bb 2>:
>>>> foo.0_2 ={v} foo;
>>>> bar.1_3 ={v} bar;
>>>> _6 = BIT_FIELD_REF <foo.0_2, 32, 0>;
>>>> _7 = BIT_FIELD_REF <bar.1_3, 32, 0>;
>>>> _8 = _6 + _7;
>>>> _9 = BIT_FIELD_REF <foo.0_2, 32, 32>;
>>>> _10 = BIT_FIELD_REF <bar.1_3, 32, 32>;
>>>> _11 = _9 + _10;
>>>> _12 = BIT_FIELD_REF <foo.0_2, 32, 64>;
>>>> _13 = BIT_FIELD_REF <bar.1_3, 32, 64>;
>>>> _14 = _12 + _13;
>>>> _15 = BIT_FIELD_REF <foo.0_2, 32, 96>;
>>>> _16 = BIT_FIELD_REF <bar.1_3, 32, 96>;
>>>> _17 = _15 + _16;
>>>> foo.2_4 = {_8, _11, _14, _17};
>>>> foo ={v} foo.2_4;
>>>> return 0;
>>>>
>>>>
>>>> Any ideas on what I'm missing and/or how to further debug this? I don't
>>>> really want autovectorization, just to be able to use vec registers
>>>> "manually".
>>>
>>> Hi.
>>>
>>> Can't say for sure since you didn't attach your patch. But vector
>>> lowering happens for a vector statement which doesn't have corresponding
>>> entry in optab. You must ensure your templates have proper names to
>>> get them added to optabs.
>>>
>>> Thanks,
>>> Ilya
>>>
>>>>
>>>> Thanks!
>>>> David
>>>>
>>
>> Hey Ilya, thanks for the response.
>>
>> My patterns look like this:
>>
>>
>> ;; Vector load.
>> (define_insn "load_v4sf"
>>   [(set (match_operand:V4SF 0 "register_operand" "=kv")
>>         (match_operand:V4SF 1 "memory_operand" "m") )]
>>   ""
>>   "lv.q\t%0,%1"
>> )
>> ;; Vector store.
>> (define_insn "store_v4sf"
>>   [(set (match_operand:V4SF 0 "memory_operand" "=m")
>>         (match_operand:V4SF 1 "register_operand" "kv") )]
>>   ""
>>   "sv.q\t%0,%1"
>> )
>>
>> ;; Add vector.
>> (define_insn "vadd4sf"
>>   [(set (match_operand:V4SF 0 "register_operand" "=kv")
>>         (plus:V4SF (match_operand:V4SF 1 "register_operand" "kv")
>>                    (match_operand:V4SF 2 "register_operand" "kv")))]
>>   ""
>>   "vadd.q\t%0,%1,%2"
>>   [(set_attr "type" "fadd")])
>>
>>
>> kv represents a constraint that maps to a vector register pool of registers.
>> Does it make sense to you?
>
> Your pattern names don't match standard pattern names and therefore are not
> recognized by optabs. It means these vector patterns can't be used by vectorizer
> and corresponding vector statements will be lowered into scalar ones. Look into
> [1] for more details. E.g. for 'add' pattern you should use name 'addv4sf3'.
>
> BR
> Ilya
>
> [1] https://gcc.gnu.org/onlinedocs/gccint/Standard-Names.html
>
>>
>> Many thanks!
>> David

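To make the naming point above concrete, here is a small stand-alone sketch in C. It is not GCC code: it only models the idea that a pattern is picked up when its name has exactly the standard shape, assuming the simple `<op><mode><operand count>` form of names such as 'addv4sf3' (not every standard name in [1] follows this shape). The helper names `standard_name` and `optab_would_find` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Compose a name of the form <op><mode><nops>, e.g. "add" + "v4sf" + 3.
   Hypothetical helper: GCC itself matches pattern names while the
   compiler is being built, not through a function like this one.  */
void standard_name(char *buf, size_t len,
                   const char *op, const char *mode, int nops)
{
    snprintf(buf, len, "%s%s%d", op, mode, nops);
}

/* Return true if one of the N pattern names in NAMES is spelled exactly
   like the standard name for OP on MODE with NOPS operands.  */
bool optab_would_find(const char *const *names, size_t n,
                      const char *op, const char *mode, int nops)
{
    char want[64];
    standard_name(want, sizeof want, op, mode, nops);
    for (size_t i = 0; i < n; i++)
        if (strcmp(names[i], want) == 0)
            return true;
    return false;
}
```

With the pattern names from the quoted patch ("load_v4sf", "store_v4sf", "vadd4sf") the lookup for add on v4sf with 3 operands fails in this model, while a list containing "addv4sf3" succeeds. By the same convention the load and store patterns would presumably live under the standard move name for the mode ("movv4sf"), which is worth checking against [1].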
Thanks again Ilya,

That seems to help solve the problem. Now I'm facing another issue: the
tree-vect-generic pass seems to be promoting my vector operations to BLKmode,
so the VECTOR_MODE_P macro evaluates to false and the pass falls back to
scalar code.

I thought I had it working for a moment, when I had forgotten to fix the
HARD_REGNO_MODE_OK macro, which evaluated to false for vector registers. In
that case I managed to dodge this issue, but I ran into another one regarding
register allocation (obviously! :P)

So the bottom line would be: how do I make sure that my "compute_type" is
V4SF instead of BLKmode? Where does this promotion happen?

Thanks a lot!
David
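
One place where a vector type can end up with BLKmode, assuming GCC sources of this period, is `vector_type_mode` in stor-layout.c: a vector type keeps its vector mode only if the target's TARGET_VECTOR_MODE_SUPPORTED_P hook accepts the mode and at least one hard register can hold it (which in turn depends on HARD_REGNO_MODE_OK). The block below is a simplified stand-alone model of that check, not GCC code. The `toy_` names are hypothetical, and the real function has an extra integer-mode fallback that does not apply to float vectors.

```c
#include <stdbool.h>

/* Toy stand-ins for GCC machine modes; not the real enum.  */
enum toy_mode { TOY_BLKmode, TOY_V4SFmode };

/* Hypothetical summary of what the back end reports for V4SF.  */
struct toy_target {
    bool vector_mode_supported; /* models TARGET_VECTOR_MODE_SUPPORTED_P */
    bool have_regs_of_mode;     /* models: some hard register passes
                                   HARD_REGNO_MODE_OK for this mode */
};

/* Model of the decision: keep the natural vector mode only when both
   conditions hold, otherwise fall back to BLKmode.  */
enum toy_mode toy_vector_type_mode(enum toy_mode natural,
                                   const struct toy_target *t)
{
    if (natural == TOY_V4SFmode
        && (!t->vector_mode_supported || !t->have_regs_of_mode))
        return TOY_BLKmode;
    return natural;
}
```

If this model matches the tree in use, the things to check on the MIPS side would be that `mips_vector_mode_supported_p` in mips.c returns true for V4SFmode and that HARD_REGNO_MODE_OK accepts V4SFmode for the new vector registers.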