On Thu, Dec 31, 2015 at 8:12 AM, Jakub Jelinek <ja...@redhat.com> wrote:
> On Thu, Dec 31, 2015 at 07:29:21AM -0800, H.J. Lu wrote:
>> > This patch is just papering over a deeper problem, as Jakub said in the PR
>> > [1]:
>> >
>> > --q--
>> > GCC uses the ix86_legitimate_combined_insn target hook to disallow
>> > misaligned memory into certain SSE instructions.
>> > (subreg:V4SI (reg:TI 245 [ MEM[(const struct bitset &)FeatureEntry_21 + 8] ]) 0)
>> > is not misaligned memory, it is a subreg of a pseudo register, so it is
>> > fine.
>> > If the replacement of the pseudo register with memory happens in some
>> > other pass, then it probably either should use the
>> > legitimate_combined_insn target hook or some other one.  I think we
>> > have already a PR where that happens during live range shrinking.
>> > --/q--
>> >
>> > Please figure out where memory replacement happens.  There are several
>> > other SSE insns (please grep the .md for "ssememalign" attribute) that
>> > are affected by this problem, so fixing a couple of patterns won't
>> > solve the problem completely.
>>
>> LRA turns
>>
>> (insn 64 63 108 6 (set (reg:V4SI 148 [ vect__28.85 ])
>>         (xor:V4SI (reg:V4SI 149)
>>             (subreg:V4SI (reg:TI 147 [ MEM[(const struct bitset &)FeatureEntry_2(D)] ]) 0)))
>>      foo.ii:26 3454 {*xorv4si3}
>>      (expr_list:REG_DEAD (reg:V4SI 149)
>>         (expr_list:REG_DEAD (reg:TI 147 [ MEM[(const struct bitset &)FeatureEntry_2(D)] ])
>>             (expr_list:REG_EQUIV (mem/c:V4SI (plus:DI (reg/f:DI 20 frame)
>>                     (const_int -16 [0xfffffffffffffff0]))
>>                 [3 MEM[(unsigned int *)&D.2851]+0 S16 A128])
>>                 (nil)))))
>>
>> into
>>
>> (insn 64 63 108 6 (set (reg:V4SI 21 xmm0 [orig:148 vect__28.85 ] [148])
>>         (xor:V4SI (reg:V4SI 21 xmm0 [149])
>>             (mem:V4SI (reg/v/f:DI 4 si [orig:117 FeatureEntry ] [117])
>>                 [6 MEM[(const struct bitset &)FeatureEntry_2(D)]+0 S16 A32])))
>>      foo.ii:26 3454 {*xorv4si3}
>>      (expr_list:REG_EQUIV (mem/c:V4SI (plus:DI (reg/f:DI 20 frame)
>>                 (const_int -16 [0xfffffffffffffff0]))
>>             [3 MEM[(unsigned int *)&D.2851]+0 S16 A128])
>>         (nil)))
>>
>> since
>>
>> (mem:V4SI (reg/v/f:DI 4 si [orig:117 FeatureEntry ] [117])
>>     [6 MEM[(const struct bitset &)FeatureEntry_2(D)]+0 S16 A32])
>>
>> satisfies the 'm' constraint.  I don't think LRA should call
>> ix86_legitimate_combined_insn to validate constraints on
>> an instruction.
>
> I think best would be if we arrange (through some gen* extension?) that
> the backend in *.md files could specify additional code to be run during
> recognition (ideally have a way that it is used only on operands that are or
> might have vector mode and are or might be MEM, so that it wouldn't be done
> on every insn) and move the
>
>   /* For pre-AVX disallow unaligned loads/stores where the
>      instructions don't support it.  */
>   if (!TARGET_AVX
>       && VECTOR_MODE_P (mode)
>       && misaligned_operand (op, mode))
>     {
>       unsigned int min_align = get_attr_ssememalign (insn);
>
>       if (min_align == 0
>           || MEM_ALIGN (op) < min_align)
>         return false;
>     }
>
> from the legitimate_combined_insn target hook into that.  Or run some target
> hook during recog?  That way it would be checked in all recog spots, rather
> than just the combiner.
> Or indeed move to different predicates and/or constraints for operands that
> disallow misaligned operands and perform this check in there.
> Unless something in LRA is the only other spot that could try to "combine"
> misaligned memory into vector instructions.
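Your last suggestion looks like the right direction to me: express the
alignment requirement in the .md file itself, so that every place which
recognizes or reloads the insn sees it.  A rough, untested sketch of what
I have in mind (the predicate and constraint names below are made up, and
it ignores the per-insn "ssememalign" attribute since a predicate only
sees the operand and its mode):

;; Untested sketch, names made up.  Reject vector memory that pre-AVX
;; SSE instructions can't load directly, so the RA never picks a
;; misaligned memory operand for them.
(define_predicate "aligned_vector_memory_operand"
  (and (match_operand 0 "memory_operand")
       (match_test "TARGET_AVX || !misaligned_operand (op, mode)")))

;; Constraint built on that predicate, to be used by the affected SSE
;; patterns in place of a bare "m", e.g. "xBv" instead of "xm" in the
;; memory alternative of *xorv4si3.
(define_constraint "Bv"
  "@internal Vector memory operand aligned enough for direct SSE use."
  (match_operand 0 "aligned_vector_memory_operand"))

The tricky part is probably telling the register allocator that such a
constraint is still a memory constraint without letting it assume it can
always satisfy it by reloading the address into a base register; a plain
define_memory_constraint promises exactly that, and an address reload
doesn't fix the alignment.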
The backend should provide accurate constraint information to the register
allocator so that the proper operand will be selected.

--
H.J.