Re: Vector registers on MIPS arch
2016-04-06 1:50 GMT+03:00 David Guillen Fandos:
>
> Thanks again Ilya,
>
> That seems to help to solve the problem. Now I'm facing another issue.
> It seems the tree-vect-generic pass is promoting my vector operations to
> BLKmode and therefore the VECTOR_MODE_P macro evaluates to false,
> falling back to scalar mode.
> I thought I got it working for a moment when I forgot to fix the
> HARD_REGNO_MODE_OK macro that evaluated to false for vector registers.
> In that case I manage to dodge this issue but I see another issue
> regarding register allocation (obviously! :P)
>
> So the bottom line would be, how do I make sure that my "compute_type"
> is V4SF instead of BLKmode? Where does this promotion happen?

The TYPE_MODE macro for vectors is actually a call to vector_type_mode. You
should probably look at it first. You may also check what mode_for_vector
returns for a float vector in your case.

Ilya

>
> Thanks a lot!
> David
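To make the suggestion above concrete, here is a toy model of the kind of fallback a function like vector_type_mode can perform. It is a sketch under assumptions, not GCC code: the enum, the struct and the function name are all hypothetical, and the two conditions are only my recollection of what the real function checks (whether the target supports the vector mode, and whether any hard register can hold it). It may still help when tracing where V4SF turns into BLKmode.

```c
#include <stdbool.h>

/* Toy stand-ins for machine modes; hypothetical, not GCC's real enum.  */
enum toy_mode { TOY_BLKmode, TOY_SFmode, TOY_V4SFmode };

/* Hypothetical summary of what a back end reports for a vector mode.  */
struct toy_target
{
  bool vector_mode_supported;  /* assumed to mirror a "mode supported" hook */
  bool have_regs_of_mode;      /* assumed: some hard register accepts the mode */
};

/* Simplified model: a vector type keeps its vector mode only when the
   target supports the mode and has registers for it; otherwise the
   type is demoted to BLKmode.  */
static enum toy_mode
toy_vector_type_mode (enum toy_mode declared, const struct toy_target *t)
{
  if (declared == TOY_V4SFmode
      && (!t->vector_mode_supported || !t->have_regs_of_mode))
    return TOY_BLKmode;
  return declared;
}
```

If the real function behaves like this model, a register-related macro that rejects the vector mode for every register would be enough to produce BLKmode, so that is one place worth checking in the debugger.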
Unnecessary check on phi node in tree if-conversion?
Hi,
Function if_convertible_phi_p has the check below on virtual PHI nodes:

  if (any_mask_load_store)
    return true;

  /* When there were no if-convertible stores, check
     that there are no memory writes in the branches of the loop to be
     if-converted.  */
  if (virtual_operand_p (gimple_phi_result (phi)))
    {
      imm_use_iterator imm_iter;
      use_operand_p use_p;

      if (bb != loop->header)
        {
          if (dump_file && (dump_flags & TDF_DETAILS))
            fprintf (dump_file, "Virtual phi not on loop->header.\n");
          return false;
        }

      FOR_EACH_IMM_USE_FAST (use_p, imm_iter, gimple_phi_result (phi))
        {
          if (gimple_code (USE_STMT (use_p)) == GIMPLE_PHI
              && USE_STMT (use_p) != phi)
            {
              if (dump_file && (dump_flags & TDF_DETAILS))
                fprintf (dump_file, "Difficult to handle this virtual phi.\n");
              return false;
            }
        }
    }

Since the check is short-cut by any_mask_load_store, when the check is
triggered, it means there is a virtual phi node but no conditional
memory stores?  Is this possible?  Plus, any_mask_load_store is set by
the code below in if_convertible_gimple_assign_stmt_p:

  if (ifcvt_can_use_mask_load_store (stmt))
    {
      gimple_set_plf (stmt, GF_PLF_2, true);
      *any_mask_load_store = true;
      return true;
    }

So in theory it's possible to skip the aforementioned check when only a
mask load is encountered.

Any ideas?

Thanks,
bin
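For illustration, a source loop of the following shape is the kind of input where only a conditional load is present while the store is unconditional. This is a sketch with a hypothetical function name; whether GCC actually turns the load into a mask load depends on the target and the options in use, which is an assumption here and not something this message establishes.

```c
/* Conditional load, unconditional store.  The only memory write is
   executed on every iteration, so any virtual PHI it produces should
   sit on the loop header.  */
void
masked_load_only (int *restrict dst, const int *restrict src,
                  const int *restrict cond, int n)
{
  for (int i = 0; i < n; i++)
    {
      int t = 0;
      if (cond[i])
        t = src[i];   /* load happens only under the condition */
      dst[i] = t;     /* store happens on every iteration */
    }
}
```

In a loop like this, any_mask_load_store could be set by the load alone, so the virtual PHI check would be skipped even though no store is conditional.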
Re: Unnecessary check on phi node in tree if-conversion?
On Wed, Apr 6, 2016 at 5:07 PM, Bin.Cheng wrote:
> Hi,
> Function if_convertible_phi_p has below check on virtual PHI nodes:
>
>
>   if (any_mask_load_store)
>     return true;
>
>   /* When there were no if-convertible stores, check
>      that there are no memory writes in the branches of the loop to be
>      if-converted.  */
>   if (virtual_operand_p (gimple_phi_result (phi)))
>     {
>       imm_use_iterator imm_iter;
>       use_operand_p use_p;
>
>       if (bb != loop->header)
>         {
>           if (dump_file && (dump_flags & TDF_DETAILS))
>             fprintf (dump_file, "Virtual phi not on loop->header.\n");
>           return false;
>         }
>
>       FOR_EACH_IMM_USE_FAST (use_p, imm_iter, gimple_phi_result (phi))
>         {
>           if (gimple_code (USE_STMT (use_p)) == GIMPLE_PHI
>               && USE_STMT (use_p) != phi)
>             {
>               if (dump_file && (dump_flags & TDF_DETAILS))
>                 fprintf (dump_file, "Difficult to handle this virtual phi.\n");
>               return false;
>             }
>         }
>     }
>
> Since the check is short-cut by any_mask_load_store, when the check is
> triggered, it means there is virtual phi node but no conditional
> memory stores?  Is this possible?  Plus, any_mask_load_store is set by
> below code in if_convertible_gimple_assign_stmt_p:
>   if (ifcvt_can_use_mask_load_store (stmt))
>     {
>       gimple_set_plf (stmt, GF_PLF_2, true);
>       *any_mask_load_store = true;
>       return true;
>     }
>
> So in theory it's possible to skip aforementioned check when only mask
> load is encountered.
>
> Any ideas?

It's possible to have a loop like:

  .MEM_2232 = PHI <.MEM_574(179), .MEM_1247(183)>
  ...
  if (cond)
    goto
  else
    goto

  :
  // empty

  :
  .MEM_1247 = PHI <.MEM_2232(180), .MEM_2232(181)>
  if (cond2)
    goto
  else
    goto

  :
  goto

So can we handle a PHI like this, which can be degenerated, in if-cvt?

Thanks,
bin
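The second PHI in the dump above has the same argument on both incoming edges, which is what makes it degenerate. A toy check for that property could look like the sketch below. The struct and function are hypothetical and model SSA names as plain version numbers; this is not GCC's own data structure or helper.

```c
#include <stddef.h>

/* Toy PHI node: SSA names are reduced to non-negative version numbers.  */
struct toy_phi
{
  int result;        /* version defined by the PHI, e.g. 1247 */
  const int *args;   /* versions flowing in on each edge */
  size_t nargs;
};

/* Return the single value the PHI can be replaced with when all of its
   arguments agree (arguments equal to the PHI's own result are ignored),
   or -1 when the arguments differ or there is nothing to compare.  */
static int
toy_degenerate_phi_result (const struct toy_phi *phi)
{
  int val = -1;
  for (size_t i = 0; i < phi->nargs; i++)
    {
      int arg = phi->args[i];
      if (arg == phi->result)
        continue;          /* self-reference carries no new value */
      if (val == -1)
        val = arg;         /* first real argument seen */
      else if (arg != val)
        return -1;         /* two different values: not degenerate */
    }
  return val;
}
```

Applied to the dump, .MEM_1247 would collapse to .MEM_2232, while the header PHI .MEM_2232 has two distinct arguments and would not.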
Re: Vector registers on MIPS arch
On 06/04/16 10:44, Ilya Enkovich wrote:
> 2016-04-06 1:50 GMT+03:00 David Guillen Fandos:
>>
>> Thanks again Ilya,
>>
>> That seems to help to solve the problem. Now I'm facing another issue.
>> It seems the tree-vect-generic pass is promoting my vector operations to
>> BLKmode and therefore the VECTOR_MODE_P macro evaluates to false,
>> falling back to scalar mode.
>> I thought I got it working for a moment when I forgot to fix the
>> HARD_REGNO_MODE_OK macro that evaluated to false for vector registers.
>> In that case I manage to dodge this issue but I see another issue
>> regarding register allocation (obviously! :P)
>>
>> So the bottom line would be, how do I make sure that my "compute_type"
>> is V4SF instead of BLKmode? Where does this promotion happen?
>
> TYPE_MODE macro for vectors is actually a call to vector_type_mode. You
> should probably look at it first. You may also check what mode_for_vector
> returns for float vector in your case.
>
> Ilya
>
>>
>> Thanks a lot!
>> David

Thanks a lot Ilya!

I managed to get it working. There were some bugs regarding register
allocation that ended up promoting the class to be BLKmode instead of
V4SFmode. I had to debug it a bit, which is tricky, but in the end I
found my way through it.

Just to finish this: do you think, from your experience, that it is
difficult to implement vector instructions that have variable sizes?
This particular VFU has 4, 3, 2 and 1 element operations with arbitrary
swizzling. That is, we can load a V3SF and perform a dot product
operation with another V3SF to get a V1SF, for instance. Of course the
elements might overlap, so if a vreg is A B C D we can have a 4 element
vector ABCD or a pair of 3 element vregs ABC and BCD; the same logic
applies to have 3 registers of V2SF type, and so forth.
It is very flexible. It also allows column and row arranging, so we can
load 4 vectors in a 4x4 matrix and multiply them with another matrix,
transposing them on the fly.

I guess this is too difficult to expose to gcc, which is more used to
Intel SIMD stuff. In the past I wrote most of the kernels in assembly
and wrapped them in C functions, but if you use classes and inline
functions, having gcc on your side helps a lot (register allocation and
therefore fewer loads/stores to memory).

Thanks a lot for your help!
David
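As a small illustration of the overlapping views described above, the sketch below treats one 4-element register A B C D as either ABC or BCD and takes a 3-element dot product. It uses GCC's generic vector extension only to hold the four lanes; the function name dot3 is hypothetical, and nothing here shows how the VFU's instructions would be described to the compiler.

```c
/* Four single-precision lanes, declared with GCC's vector extension.  */
typedef float v4sf __attribute__ ((vector_size (16)));

/* Dot product over three consecutive lanes of A and B.  FIRST selects
   the view: 0 reads lanes ABC, 1 reads lanes BCD.  */
static float
dot3 (v4sf a, v4sf b, int first)
{
  float sum = 0.0f;
  for (int i = first; i < first + 3; i++)
    sum += a[i] * b[i];
  return sum;
}
```

Both views read from the same register, which is the overlap that a conventional one-mode-per-register description does not capture directly.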
gcc-4.9-20160406 is now available
Snapshot gcc-4.9-20160406 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.9-20160406/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.9 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_9-branch revision 234800

You'll find:

 gcc-4.9-20160406.tar.bz2             Complete GCC

  MD5=de1a3f88aefbb0bc76c88a5691c2c93e
  SHA1=bf9d3e27848d8b1c0be965168cc64ffb25c79897

Diffs from 4.9-20160330 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.9
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.