On Thu, Oct 12, 2017 at 10:32 AM, Jakub Jelinek <ja...@redhat.com> wrote:
> On Thu, Oct 12, 2017 at 08:32:32AM +0200, Uros Bizjak wrote:
>> On Wed, Oct 11, 2017 at 10:59 PM, Jakub Jelinek <ja...@redhat.com> wrote:
>> > As can be seen on the testcase below, the *<rotate_insn><mode>3_mask
>> > insn/splitter is able to optimize only the case when the and is
>> > performed in SImode and then the result subreged into QImode,
>> > while if the computation is already in QImode, we don't handle it.
>> >
>> > Fixed by adding another pattern, bootstrapped/regtested on x86_64-linux and
>> > i686-linux, ok for trunk?
>>
>> We probably want to add this variant to *all* *_mask splitters (there
>> are a few of them in i386.md, please grep for "Avoid useless
>
> Well, not all of them, just the 5 ones that have (subreg:QI (and:SI ...)),
> *jcc_bt<mode>_mask expects SImode argument, so doesn't have this kind of
> problem.
>
>> masking"). Which finally begs a question - should we implement this
>> simplification in a generic, target-independent way? OTOH, we already
>
> The target independent way is SHIFT_COUNT_TRUNCATED, but that isn't
> applicable to targets which aren't consistent in their behavior across
> the whole ISA.  x86 has some instructions that truncate the mask and others
> that saturate and others (b* with memory operand) that use something
> still different.
> So we need to apply it only to the instructions which really truncate
> the shift counts and the best infrastructure for that I'm aware is the
> combiner.
>
> In order to have just one set of the *_mask patterns we'd need to
> change simplify_truncation to canonicalize
> (subreg:QI (and:SI (something:SI) (const_int N)) low)
> into
> (and:QI (subreg:QI (something:SI) low) (const_int lowN))
> and change all these patterns; but I'm afraid that is going to break almost
> all WORD_REGISTER_OPERATIONS targets and various others, those actually
> perform all the operations in word mode and that is why the first form
> is actually preferred for them.  Except that if something is already
> QImode there is no need for the subreg at all, which is why we I'm afraid
> need the two sets of mask patterns.
>
> So, if you aren't against it, I can extend the patch to handle the 4
> other mask patterns; as for other modes, SImode is what is being handled
> already, DImode is not a problem, because the FEs truncate the shift counts
> to integer_type_node already, and for HImode I haven't seen problem
> probably because most tunings avoid HImode math and so it isn't worth
> optimizing.

OK, I think that we can live wtih 4 new patterns. Since these are all
written in the same way (as in the patch you posted), the ammended
patch is pre-approved for mainline.

Thanks,
Uros.

Reply via email to