Hi Richard, Did you have a chance to look at this updated patch?
Thanks. Yuri. 2015-06-18 17:32 GMT+03:00 Yuri Rumyantsev <ysrum...@gmail.com>: > Richard, > > Here is updated patch which does not include your proposal related to > the target hook deletion. > You wrote: >> I still don't understand why you need the new target hook. If we have a >> masked >> load/store then the mask is computed by an assignment with a VEC_COND_EXPR >> (in your example) and thus a test for a zero mask is expressible as just >> > > if (vect__ifc__41.17_167 == { 0, 0, 0, 0... }) >> >> or am I missing something? > Such vector compare produces vector and does not set up cc flags > required for conditional branch (GIMPLE_COND). > If we use such comparison for GIMPLE_COND we got error message, so I > used target hook which does set up cc flags aka ptest instruction and > I left this part. > I moved new routines to loop-vectorizer.c file and both they are static. > I re-wrote is_valid_sink function to use def-use chain as you proposed. > I also add new parameter to control such transformation. > Few redundant tests have also been deleted. > Any comments will be appreciated. > > Thanks. > Yuri. > > 2015-06-18 Yuri Rumyantsev <ysrum...@gmail.com> > > * config/i386/i386.c: Include files stringpool.h and tree-ssanames.h. > (ix86_vectorize_build_zero_vector_test): New function. > (TARGET_VECTORIZE_BUILD_ZERO_VECTOR_TEST): New target macro > * doc/tm.texi.in: Add @hook TARGET_VECTORIZE_BUILD_ZERO_VECTOR_TEST. > * doc/tm.texi: Updated. > * params.def (PARAM_ZERO_TEST_FOR_STORE_MASK): New DEFPARAM. > * params.h (ENABLE_ZERO_TEST_FOR_STORE_MASK): new macros. > * target.def (build_zero_vector_test): New DEFHOOK. > * tree-vect-stmts.c (vectorizable_mask_load_store): Initialize > has_mask_store field of vect_info. > * tree-vectorizer.c: Include files stringpool.h and tree-ssanames.h. > (is_valid_sink): New function. > (optimize_mask_stores): New function. > (vectorize_loops): Invoke optimaze_mask_stores for loops having masked > stores. > * tree-vectorizer.h (loop_vec_info): Add new has_mask_store field and > correspondent macros. > > gcc/testsuite/ChangeLog: > * gcc.target/i386/avx2-vect-mask-store-move1.c: New test. > > > > 2015-06-09 15:13 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >> On Wed, May 20, 2015 at 4:00 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote: >>> Hi All, >>> >>> Here is updated patch to optimize mask stores. The main goal of it is >>> to avoid execution of mask store if its mask is zero vector since >>> loads that follow it can be blocked. >>> The following changes were done: >>> 1. A test on sink legality was added - it simply prohibits to cross >>> statements with non-null vdef or vuse. >>> 2. New phi node is created in join block for moved MASK_STORE statements. >>> 3. Test was changed to check that 2 MASK_STORE statements are not >>> moved to the same block. >>> 4. New field was added to loop_vec_info structure to mark loops >>> having MASK_STORE's. >>> >>> Any comments will be appreciated. >>> Yuri. >> >> I still don't understand why you need the new target hook. If we have a >> masked >> load/store then the mask is computed by an assignment with a VEC_COND_EXPR >> (in your example) and thus a test for a zero mask is expressible as just >> >> if (vect__ifc__41.17_167 == { 0, 0, 0, 0... }) >> >> or am I missing something? As we dont' want this transform on all targets >> (or always) we can control it by either a --param or a new flag which default >> targets can adjust. [Is the hazard still present with Skylake or other >> AVX512 implementations for example?] >> >> +/* Helper for optimize_mask_stores: returns true if STMT sinking to end >> + of BB is valid and false otherwise. */ >> + >> +static bool >> +is_valid_sink (gimple stmt) >> +{ >> >> so STMT is either a masked store or a masked load? You shouldn't >> walk all stmts here but instead rely on the factored use-def def-use >> chains of virtual operands. >> >> +void >> +optimize_mask_stores (struct loop *loop) >> +{ >> + basic_block bb = loop->header; >> + gimple_stmt_iterator gsi; >> + gimple stmt; >> + auto_vec<gimple> worklist; >> + >> + if (loop->dont_vectorize >> + || loop->num_nodes > 2) >> + return; >> >> looks like no longer needed given the place this is called from now >> (or does it guard against outer loop vectorization as well?) >> Also put it into tree-vect-loop.c and make it static. >> >> + /* Loop was not vectorized if mask does not have vector type. */ >> + if (!VECTOR_TYPE_P (TREE_TYPE (mask))) >> + return; >> >> this should be always false >> >> + store_bb->frequency = bb->frequency / 2; >> + efalse->probability = REG_BR_PROB_BASE / 2; >> >> I think the == 0 case should end up as unlikely. >> >> + if (dom_info_available_p (CDI_POST_DOMINATORS)) >> + set_immediate_dominator (CDI_POST_DOMINATORS, bb, store_bb); >> >> post dominators are not available in the vectorizer. >> >> + /* Put definition statement of stored value in STORE_BB >> + if possible. */ >> + arg3 = gimple_call_arg (last, 3); >> + if (TREE_CODE (arg3) == SSA_NAME && has_single_use (arg3)) >> + { >> + def_stmt = SSA_NAME_DEF_STMT (arg3); >> + /* Move def_stmt to STORE_BB if it is in the same bb and >> + it is legal. */ >> + if (gimple_bb (def_stmt) == bb >> + && is_valid_sink (def_stmt)) >> >> ok, so you move arbitrary statements. You can always move non-memory >> statements and if you keep track of the last store you moved you >> can check if gimple_vuse () is the same as on that last store and be >> done with that, or if you sink another store (under the same condition) >> then just update the last store. >> >> Otherwise this looks better now. >> >> Thus - get rid of the target hook and of is_valid_sink. >> >> Thanks, >> Richard. >> >>> 2015-05-20 Yuri Rumyantsev <ysrum...@gmail.com> >>> >>> * config/i386/i386.c: Include files stringpool.h and tree-ssanames.h. >>> (ix86_vectorize_is_zero_vector): New function. >>> (TARGET_VECTORIZE_IS_ZERO_VECTOR): New target macro >>> * doc/tm.texi.in: Add @hook TARGET_VECTORIZE_IS_ZERO_VECTOR. >>> * doc/tm.texi: Updated. >>> * target.def (is_zero_vector): New DEFHOOK. >>> * tree-vect-stmts.c : Include tree-into-ssa.h. >>> (vectorizable_mask_load_store): Initialize has_mask_store field. >>> (is_valid_sink): New function. >>> (optimize_mask_stores): New function. >>> * tree-vectorizer.c (vectorize_loops): Invoke optimaze_mask_stores for >>> loops having masked stores. >>> * tree-vectorizer.h (loop_vec_info): Add new has_mask_store field and >>> correspondent macros. >>> (optimize_mask_stores): Update prototype. >>> >>> gcc/testsuite/ChangeLog: >>> * gcc.target/i386/avx2-vect-mask-store-move1.c: New test.