On January 2, 2016 11:32:33 AM GMT+01:00, Uros Bizjak <ubiz...@gmail.com> wrote:
>On Thu, Dec 31, 2015 at 4:29 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
>> On Thu, Dec 31, 2015 at 1:14 AM, Uros Bizjak <ubiz...@gmail.com> wrote:
>>> On Wed, Dec 30, 2015 at 9:53 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
>>>> SSE vector arithmetic and logic instructions only accept aligned
>>>> memory operands. This patch adds vector_memory_operand and the "Bm"
>>>> constraint for aligned SSE memory operands. They are applied to the
>>>> SSE any_logic patterns.
>>>>
>>>> OK for trunk and release branches if there are no regressions?
>>>
>>> This patch is just papering over a deeper problem, as Jakub said in the PR [1]:
>>>
>>> --q--
>>> GCC uses the ix86_legitimate_combined_insn target hook to disallow
>>> misaligned memory into certain SSE instructions.
>>> (subreg:V4SI (reg:TI 245 [ MEM[(const struct bitset &)FeatureEntry_21 + 8] ]) 0)
>>> is not misaligned memory, it is a subreg of a pseudo register, so it is fine.
>>> If the replacement of the pseudo register with memory happens in some
>>> other pass, then it probably either should use the
>>> legitimate_combined_insn target hook or some other one. I think we
>>> already have a PR where that happens during live range shrinking.
>>> --/q--
>>>
>>> Please figure out where the memory replacement happens. There are several
>>> other SSE insns (please grep the .md files for the "ssememalign" attribute)
>>> that are affected by this problem, so fixing a couple of patterns won't
>>> solve the problem completely.
>>
>> LRA turns
>>
>> (insn 64 63 108 6 (set (reg:V4SI 148 [ vect__28.85 ])
>>         (xor:V4SI (reg:V4SI 149)
>>             (subreg:V4SI (reg:TI 147 [ MEM[(const struct bitset
>> &)FeatureEntry_2(D)] ]) 0))) foo.ii:26 3454 {*xorv4si3}
>>      (expr_list:REG_DEAD (reg:V4SI 149)
>>         (expr_list:REG_DEAD (reg:TI 147 [ MEM[(const struct bitset
>> &)FeatureEntry_2(D)] ])
>>             (expr_list:REG_EQUIV (mem/c:V4SI (plus:DI (reg/f:DI 20 frame)
>>                         (const_int -16 [0xfffffffffffffff0])) [3
>> MEM[(unsigned int *)&D.2851]+0 S16 A128])
>>                 (nil)))))
>>
>> into
>>
>> (insn 64 63 108 6 (set (reg:V4SI 21 xmm0 [orig:148 vect__28.85 ] [148])
>>         (xor:V4SI (reg:V4SI 21 xmm0 [149])
>>             (mem:V4SI (reg/v/f:DI 4 si [orig:117 FeatureEntry ] [117])
>> [6 MEM[(const struct bitset &)FeatureEntry_2(D)]+0 S16 A32])))
>> foo.ii:26 3454 {*xorv4si3}
>>      (expr_list:REG_EQUIV (mem/c:V4SI (plus:DI (reg/f:DI 20 frame)
>>                 (const_int -16 [0xfffffffffffffff0])) [3 MEM[(unsigned
>> int *)&D.2851]+0 S16 A128])
>>         (nil)))
>>
>> since
>>
>> (mem:V4SI (reg/v/f:DI 4 si [orig:117 FeatureEntry ] [117])
>> [6 MEM[(const struct bitset &)FeatureEntry_2(D)]+0 S16 A32])
>>
>> satisfies the "m" constraint. I don't think LRA should call
>> ix86_legitimate_combined_insn to validate constraints on
>> an instruction.
>
>Hm...
>
>If LRA doesn't assume that the generic "m" constraint implies at least
>natural alignment of the propagated operand, then your patch is the way
>to go.
I don't think it even considers alignment. Archs where alignment validity
depends on the actual instruction should model this with proper constraints.

>But in this case, *every* SSE vector memory constraint should be
>changed to Bm.

I'd say so ...

>Let's ask Vladimir for details.
>
>Uros.
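For concreteness, the predicate/constraint pair being discussed would look
roughly like the following in the i386 predicates.md and constraints.md.
This is a sketch of the approach, not the literal patch text; the MEM_ALIGN
test and the TARGET_AVX escape (VEX encodings tolerate unaligned memory
operands) are the essential pieces:

;; predicates.md: accept a MEM only if it is known to be sufficiently
;; aligned for its mode, or if AVX is enabled (VEX-encoded instructions
;; do not fault on unaligned memory operands).
(define_predicate "vector_memory_operand"
  (and (match_operand 0 "memory_operand")
       (ior (match_test "TARGET_AVX")
            (match_test "MEM_ALIGN (op) >= GET_MODE_ALIGNMENT (mode)"))))

;; constraints.md: expose the predicate as a constraint, so that LRA
;; refuses to substitute an under-aligned MEM into such an operand.
(define_constraint "Bm"
  "@internal Vector memory operand."
  (match_operand 0 "vector_memory_operand"))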
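On the sse.md side, applying the constraint means tightening the memory
alternative of each affected pattern from the generic "m" to "Bm". A
simplified single-alternative illustration, assuming a plain SSE2 xor
pattern (the real any_logic patterns use mode iterators, masking, and
several alternatives):

;; Simplified sketch of an SSE logic pattern with the memory
;; alternative restricted to aligned memory via "Bm".
(define_insn "*xorv4si3"
  [(set (match_operand:V4SI 0 "register_operand" "=x")
        (xor:V4SI
          (match_operand:V4SI 1 "register_operand" "%0")
          (match_operand:V4SI 2 "nonimmediate_operand" "xBm")))]
  "TARGET_SSE2"
  "pxor\t{%2, %0|%0, %2}")

With "m", LRA was free to substitute the A32-aligned MEM from the dump
above directly into pxor; with "Bm", the alignment requirement lives in
the constraint itself, so enforcing it no longer depends on
ix86_legitimate_combined_insn being consulted.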