On Thu, Dec 31, 2015 at 8:12 AM, Jakub Jelinek <ja...@redhat.com> wrote:
> On Thu, Dec 31, 2015 at 07:29:21AM -0800, H.J. Lu wrote:
>> > This patch is just papering over a deeper problem, as Jakub said in
>> > the PR [1]:
>> >
>> > --q--
>> > GCC uses the ix86_legitimate_combined_insn target hook to disallow
>> > misaligned memory into certain SSE instructions.
>> > (subreg:V4SI (reg:TI 245 [ MEM[(const struct bitset &)FeatureEntry_21 + 8] 
>> > ]) 0)
>> > is not misaligned memory, it is a subreg of a pseudo register, so it is 
>> > fine.
>> > If the replacement of the pseudo register with memory happens in some
>> > other pass, then it probably either should use the
>> > legitimate_combined_insn target hook or some other one.  I think we
>> > have already a PR where that happens during live range shrinking.
>> > --/q--
>> >
>> > Please figure out where the memory replacement happens. There are
>> > several other SSE insns (please grep the .md files for the
>> > "ssememalign" attribute) that are affected by this problem, so fixing
>> > a couple of patterns won't solve it completely.
>>
>> LRA turns
>>
>> (insn 64 63 108 6 (set (reg:V4SI 148 [ vect__28.85 ])
>>         (xor:V4SI (reg:V4SI 149)
>>             (subreg:V4SI (reg:TI 147 [ MEM[(const struct bitset
>> &)FeatureEntry_2(D)] ]) 0))) foo.ii:26 3454 {*xorv4si3}
>>      (expr_list:REG_DEAD (reg:V4SI 149)
>>         (expr_list:REG_DEAD (reg:TI 147 [ MEM[(const struct bitset
>> &)FeatureEntry_2(D)] ])
>>             (expr_list:REG_EQUIV (mem/c:V4SI (plus:DI (reg/f:DI 20 frame)
>>                         (const_int -16 [0xfffffffffffffff0])) [3
>> MEM[(unsigned int *)&D.2851]+0 S16 A128])
>>                 (nil)))))
>>
>> into
>>
>> (insn 64 63 108 6 (set (reg:V4SI 21 xmm0 [orig:148 vect__28.85 ] [148])
>>         (xor:V4SI (reg:V4SI 21 xmm0 [149])
>>             (mem:V4SI (reg/v/f:DI 4 si [orig:117 FeatureEntry ] [117])
>> [6 MEM[(const struct bitset &)FeatureEntry_2(D)]+0 S16 A32])))
>> foo.ii:26 3454 {*xorv4si3}
>>      (expr_list:REG_EQUIV (mem/c:V4SI (plus:DI (reg/f:DI 20 frame)
>>                 (const_int -16 [0xfffffffffffffff0])) [3 MEM[(unsigned
>> int *)&D.2851]+0 S16 A128])
>>         (nil)))
>>
>> since
>>
>> (mem:V4SI (reg/v/f:DI 4 si [orig:117 FeatureEntry ] [117]) [6
>> MEM[(const struct bitset &)FeatureEntry_2(D)]+0 S16 A32])
>>
>> satisfies the 'm' constraint.  I don't think LRA should call
>> ix86_legitimate_combined_insn to validate constraints on an
>> instruction.
>
> I think the best would be if we arranged (through some gen* extension?)
> that the backend in *.md files could specify additional code to be run
> during recognition (ideally in a way that it is used only on operands
> that are or might have a vector mode and are or might be a MEM, so that
> it wouldn't be done on every insn) and move the
>           /* For pre-AVX disallow unaligned loads/stores where the
>              instructions don't support it.  */
>           if (!TARGET_AVX
>               && VECTOR_MODE_P (mode)
>               && misaligned_operand (op, mode))
>             {
>               unsigned int min_align = get_attr_ssememalign (insn);
>               if (min_align == 0
>                   || MEM_ALIGN (op) < min_align)
>                 return false;
>             }
> from the legitimate_combined_insn target hook into that.  Or run some
> target hook during recog?  That way it would be checked in all recog
> spots, rather than just in the combiner.
> Or indeed switch the affected operands to different predicates and/or
> constraints that disallow misaligned operands, and perform this check
> in there.
> Unless something in LRA is the only other spot that could try to
> "combine" misaligned memory into vector instructions.

The backend should provide accurate constraint information to the
register allocator so that the proper operand is selected.
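
For illustration only, the same alignment information could also be
exposed to LRA as a constraint, so that constraint matching itself
rejects the under-aligned A32 MEM above instead of folding it into the
xor.  The constraint name below is made up, and this assumes a "special"
memory constraint form is available: a plain define_memory_constraint
would not be suitable here, because reload and LRA assume they can always
satisfy it by reloading the operand's address into a base register,
which cannot fix the alignment.

;; Sketch: vector memory aligned as the pre-AVX encodings require,
;; reusing the predicate sketched above.
(define_special_memory_constraint "Bm"
  "Aligned vector memory operand."
  (match_operand 0 "aligned_vector_memory_operand"))

The memory alternatives of the affected patterns would then use "Bm"
instead of "m", so that LRA never picks an alternative it cannot
actually satisfy.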

-- 
H.J.
