On Fri, Aug 21, 2015 at 2:17 PM, Ilya Enkovich <enkovich....@gmail.com> wrote: > 2015-08-21 14:00 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >> On Fri, Aug 21, 2015 at 12:49 PM, Ilya Enkovich <enkovich....@gmail.com> >> wrote: >>> 2015-08-21 11:15 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >>>> On Thu, Aug 20, 2015 at 8:46 PM, Jeff Law <l...@redhat.com> wrote: >>>>> On 08/17/2015 10:25 AM, Ilya Enkovich wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> This patch intoriduces a new vectorizer hook use_scalar_mask_p which >>>>>> affects code generated by if-conversion pass (and affects patterns in >>>>>> later >>>>>> patches). >>>>>> >>>>>> Thanks, >>>>>> Ilya >>>>>> -- >>>>>> 2015-08-17 Ilya Enkovich <enkovich....@gmail.com> >>>>>> >>>>>> * doc/tm.texi (TARGET_VECTORIZE_USE_SCALAR_MASK_P): New. >>>>>> * doc/tm.texi.in: Regenerated. >>>>>> * target.def (use_scalar_mask_p): New. >>>>>> * tree-if-conv.c: Include target.h. >>>>>> (predicate_mem_writes): Don't convert boolean predicates into >>>>>> integer when scalar masks are used. >>>>> >>>>> Presumably this is how you prevent the generation of scalar masks rather >>>>> than boolean masks on targets which don't have the former? >>>>> >>>>> I hate to ask, but how painful would it be to go from a boolean to integer >>>>> masks later such as during expansion? Or vice-versa. >>>>> >>>>> WIthout a deep knowledge of the entire patchkit, it feels like we're >>>>> introducing target stuff in a place where we don't want it and that we'd >>>>> be >>>>> better served with a canonical representation through gimple, then >>>>> dropping >>>>> into something more target specific during gimple->rtl expansion. >>> >>> I want a work with bitmasks to be expressed in a natural way using >>> regular integer operations. Currently all masks manipulations are >>> emulated via vector statements (mostly using a bunch of vec_cond). For >>> complex predicates it may be nontrivial to transform it back to scalar >>> masks and get an efficient code. Also the same vector may be used as >>> both a mask and an integer vector. Things become more complex if you >>> additionally have broadcasts and vector pack/unpack code. It also >>> should be transformed into a scalar masks manipulations somehow. >> >> Hmm, I don't see how vector masks are more difficult to operate with. > > There are just no instructions for that but you have to pretend you > have to get code vectorized.
Huh? Bitwise ops should be readily available. >> >>> Also according to vector ABI integer mask should be used for mask >>> operand in case of masked vector call. >> >> What ABI? The function signature of the intrinsics? How would that >> come into play here? > > Not intrinsics. I mean OpenMP vector functions which require integer > arg for a mask in case of 512-bit vector. How do you declare those? >> >>> Current implementation of masked loads, masked stores and bool >>> patterns in vectorizer just reflect SSE4 and AVX. Can (and should) we >>> really call it a canonical representation for all targets? >> >> No idea - we'll revisit when another targets adds a similar capability. > > AVX-512 is such target. Current representation forces multiple scalar > mask -> vector mask and back transformations which are artificially > introduced by current bool patterns and are hard to optimize out. I dislike the bool patterns anyway and we should try to remove those and make the vectorizer handle them in other ways (they have single-use issues anyway). I don't remember exactly what caused us to add them but one reason was there wasn't a vector type for 'bool' (but I don't see how it should be necessary to ask "get me a vector type for 'bool'"). >> >>> Using scalar masks everywhere should probably cause the same conversion >>> problem for SSE I listed above though. >>> >>> Talking about a canonical representation, shouldn't we use some >>> special masks representation and not mixing it with integer and vector >>> of integers then? Only in this case target would be able to >>> efficiently expand it into a corresponding rtl. >> >> That was my idea of vector<bool> ... but I didn't explore it and see where >> it will cause issues. >> >> Fact is GCC already copes with vector masks generated by vector compares >> just fine everywhere and I'd rather leave it as that. > > Nope. Currently vector mask is obtained from a vec_cond <A op B, {0 .. > 0}, {-1 .. -1}>. AND and IOR on bools are also expressed via > additional vec_cond. I don't think vectorizer ever generates vector > comparison. Ok, well that's an implementation detail then. Are you sure about AND and IOR? The comment above vect_recog_bool_pattern says Assuming size of TYPE is the same as size of all comparisons (otherwise some casts would be added where needed), the above sequence we create related pattern stmts: S1' a_T = x1 CMP1 y1 ? 1 : 0; S3' c_T = x2 CMP2 y2 ? a_T : 0; S4' d_T = x3 CMP3 y3 ? 1 : 0; S5' e_T = c_T | d_T; S6' f_T = e_T; thus has vector mask | > And I wouldn't say it's fine 'everywhere' because there is a single > target utilizing them. Masked loads and stored for AVX-512 just don't > work now. And if we extend existing MASK_LOAD and MASK_STORE optabs to > 512-bit vector then we get an ugly inefficient code. The question is > where to fight with this inefficiency: in RTL or in GIMPLE. I want to > fight with it where it appears, i.e. in GIMPLE by preventing bool -> > int conversions applied everywhere even if target doesn't need it. > > If we don't want to support both types of masks in GIMPLE then it's > more reasonable to make bool -> int conversion in expand for targets > requiring it, rather than do it for everyone and then leave it to > target to transform it back and try to get rid of all those redundant > transformations. I'd give vector<bool> a chance to become a canonical > mask representation for that. Well, you are missing the case of bool b = a < b; int x = (int)b; where the bool is used as integer (and thus an integer mask would have to be "expanded"). When the bool is a mask in itself the integer use is either free or a matter of a widening/shortening operation. Richard. > > Thanks, > Ilya > >> >>>> >>>> Indeed. I don't remember my exact comments during the talk at the Cauldron >>>> but the scheme used there was sth like >>>> >>>> mask = GEN_MASK <vec1 < vec2>; >>>> b = a + 1; >>>> x = VEC_COND <mask, a, b> >>>> >>>> to model conditional execution already at the if-conversion stage (for >>>> all scalar >>>> stmts made executed unconditionally rather than just the PHI results). I >>>> was >>>> asking for the condition to be removed from GEN_MASK (patch 1 has this >>>> fixed now AFAICS). And I also asked why it was necessary to do this >>>> "lowering" >>>> here and not simply do >>>> >>>> mask = vec1 < vec2; // regular vector mask! >>>> b = a + 1; >>>> x = VEC_COND <mask, a, b> >>>> >>>> and have the lowering to an integer mask done later. You'd still >>>> change if-conversion >>>> to predicate _all_ statements, not just those with side-effects. So I >>>> think there >>>> still needs to be a new target hook to trigger this, similar to how >>>> the target capabilities >>>> trigger the masked load/store path in if-conversion. >>> >>> I think you mix scalar masks with a loop reminders optimization. I'm >>> not going to do other changes in if-conversion other then in this >>> posted patch to support scalar masks. Statements predication will be >>> used to vectorize loop reminders. And not all of them, only reduction >>> definitions. This will be independent from scalar masks and will work >>> for vector masks also. And these changes are not going to be in >>> if-conversion. >> >> Maybe I misremember. Didn't look at the patch in detail yet. >> >> Richard. >> >>> >>> Thanks, >>> Ilya >>> >>>> >>>> But I don't like changing our IL so much as to allow 'integer' masks >>>> everywhere. >>>> >>>> Richard. >>>> >>>> >>>>> >>>>> Jeff