On Sat, Jan 2, 2016 at 3:58 AM, Richard Biener
<richard.guent...@gmail.com> wrote:
> On January 2, 2016 11:32:33 AM GMT+01:00, Uros Bizjak <ubiz...@gmail.com> wrote:
>>On Thu, Dec 31, 2015 at 4:29 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
>>>> On Thu, Dec 31, 2015 at 1:14 AM, Uros Bizjak <ubiz...@gmail.com> wrote:
>>>>> On Wed, Dec 30, 2015 at 9:53 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
>>>>> SSE vector arithmetic and logic instructions only accept aligned
>>>>> memory operands.  This patch adds vector_memory_operand and the "Bm"
>>>>> constraint for aligned SSE memory operands.  They are applied to the
>>>>> SSE any_logic patterns.
>>>>>
>>>>> OK for trunk and release branches if there are no regressions?
>>>>
>>>> This patch is just papering over a deeper problem, as Jakub said in
>>>> the PR [1]:
>>>>
>>>> --q--
>>>> GCC uses the ix86_legitimate_combined_insn target hook to disallow
>>>> misaligned memory into certain SSE instructions.
>>>> (subreg:V4SI (reg:TI 245 [ MEM[(const struct bitset
>>>> &)FeatureEntry_21 + 8] ]) 0)
>>>> is not misaligned memory, it is a subreg of a pseudo register, so it
>>>> is fine.
>>>> If the replacement of the pseudo register with memory happens in
>>>> some other pass, then it probably either should use the
>>>> legitimate_combined_insn target hook or some other one.  I think we
>>>> already have a PR where that happens during live range shrinking.
>>>> --/q--
>>>>
>>>> Please figure out where the memory replacement happens. There are
>>>> several other SSE insns (please grep the .md for the "ssememalign"
>>>> attribute) that are affected by this problem, so fixing a couple of
>>>> patterns won't solve the problem completely.
>>>
>>> LRA turns
>>>
>>> (insn 64 63 108 6 (set (reg:V4SI 148 [ vect__28.85 ])
>>>         (xor:V4SI (reg:V4SI 149)
>>>             (subreg:V4SI (reg:TI 147 [ MEM[(const struct bitset
>>> &)FeatureEntry_2(D)] ]) 0))) foo.ii:26 3454 {*xorv4si3}
>>>      (expr_list:REG_DEAD (reg:V4SI 149)
>>>         (expr_list:REG_DEAD (reg:TI 147 [ MEM[(const struct bitset
>>> &)FeatureEntry_2(D)] ])
>>>             (expr_list:REG_EQUIV (mem/c:V4SI (plus:DI (reg/f:DI 20 frame)
>>>                         (const_int -16 [0xfffffffffffffff0])) [3
>>> MEM[(unsigned int *)&D.2851]+0 S16 A128])
>>>                 (nil)))))
>>>
>>> into
>>>
>>> (insn 64 63 108 6 (set (reg:V4SI 21 xmm0 [orig:148 vect__28.85 ] [148])
>>>         (xor:V4SI (reg:V4SI 21 xmm0 [149])
>>>             (mem:V4SI (reg/v/f:DI 4 si [orig:117 FeatureEntry ] [117])
>>> [6 MEM[(const struct bitset &)FeatureEntry_2(D)]+0 S16 A32])))
>>> foo.ii:26 3454 {*xorv4si3}
>>>      (expr_list:REG_EQUIV (mem/c:V4SI (plus:DI (reg/f:DI 20 frame)
>>>                 (const_int -16 [0xfffffffffffffff0])) [3
>>> MEM[(unsigned int *)&D.2851]+0 S16 A128])
>>>         (nil)))
>>>
>>> since
>>>
>>> (mem:V4SI (reg/v/f:DI 4 si [orig:117 FeatureEntry ] [117])
>>> [6 MEM[(const struct bitset &)FeatureEntry_2(D)]+0 S16 A32])
>>>
>>> satisfies the "m" constraint.  I don't think LRA should call
>>> ix86_legitimate_combined_insn to validate constraints on an
>>> instruction.
>>
>>Hm...
>>
>>If LRA doesn't assume that the generic "m" constraint implies at least
>>natural alignment of the propagated operand, then your patch is the way
>>to go.
>
> I don't think it even considers alignment. Architectures where alignment
> validity depends on the actual instruction should model this with proper
> constraints.
>
>>But in this case, *every* SSE vector memory constraint should be
>>changed to Bm.
>
> I'd say so ...

The "Bm" constraint should be applied only to non-move SSE
instructions with 16-byte memory operands.
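For reference, a predicate/constraint pair of that shape might look like the
sketch below. This is only an illustration, not the committed patch: the names
vector_memory_operand and "Bm" come from the thread, but the bodies are an
assumption built from the standard MEM_ALIGN and GET_MODE_ALIGNMENT macros.

```lisp
;; Sketch only -- not the actual patch.
;; In predicates.md: accept a MEM only when its known alignment is at
;; least the natural alignment of its mode, so an unaligned memory
;; operand is rejected at constraint-checking time instead of being
;; propagated into an alignment-sensitive SSE insn.
(define_predicate "vector_memory_operand"
  (and (match_code "mem")
       (match_test "MEM_ALIGN (op) >= GET_MODE_ALIGNMENT (mode)")))

;; In constraints.md: "Bm" matches only such aligned vector memory.
(define_constraint "Bm"
  "@internal Vector memory operand."
  (match_operand 0 "vector_memory_operand"))
```

One subtlety worth noting: such a constraint arguably cannot be a plain
define_memory_constraint, because reload assumes it can make any memory
operand match a memory constraint by reloading the address alone, which does
not help when the operand's alignment is what the constraint rejects.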

-- 
H.J.
