On 30 November 2011 22:28, Michael Hope <michael.h...@linaro.org> wrote:
>>> This run also showed the effect of loop unrolling. The loop seems to
>>> be unrolled for loops of <= 64 words and drops off in performance past
>>> around 8 words. When the unrolling finally drops out, performance
>>> increases by 101 %.
>>
>> I see register spills starting from COUNT=36.
>
> Ah. Does the vectoriser cost model take register pressure into
> account? How can I turn this on?

No, but the vectorizer doesn't perform loop unrolling either. The
unrolling here is done by the complete_unroll pass, which runs after
vectorization, and AFAIK it doesn't take register pressure into account.

On 1 December 2011 02:40, Michael Hope <michael.h...@linaro.org> wrote:
> I had a look in the backend and the vld1/vst1 %A operand adds the
> alignment if known. It correctly adds [r1:64] if I feed in an array
> of int64s. The code checks based on MEM_ALIGN and MEM_SIZE of the
> operand:
>   align = MEM_ALIGN (x) >> 3;
>   memsize = INTVAL (MEM_SIZE (x));
>
> Not sure why the backend generates a vldmia instead of a vld1 though.

I don't see how the alignment info set by the vectorizer influences
MEM_ALIGN. The vectorizer sets the align and misalign fields of struct
ptr_info_def. The only place I see them used is expand_expr_real_1, for
MEM_REF, and only to decide whether movmisalign is needed (for unaligned
accesses).

MEM_ALIGN is determined in set_mem_attributes_minus_bitpos from
DECL_ALIGN or TYPE_ALIGN. For the cases where the vectorizer forces
alignment this should work, since we then set DECL_ALIGN (in
vect_compute_data_ref_alignment). But peeling obviously doesn't change
DECL_ALIGN, so I don't understand how we can create an alignment hint in
this case with the current code.

Ira
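
For anyone who wants to reproduce the two alignment cases discussed above, here is a minimal test-case sketch. It is not taken from the thread: the function names, the buffers and the choice of COUNT=36 (the size at which spills were reported) are illustrative assumptions. The first loop works on arrays defined in the same file, where the vectorizer may be able to force alignment by raising DECL_ALIGN. The second takes pointers, so alignment is unknown at compile time and the vectorizer would have to peel or emit unaligned accesses. Which path is actually taken depends on the GCC version and flags.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative size: register spills were reported from COUNT=36. */
#define COUNT 36

/* Case 1: arrays defined in this translation unit.  The vectorizer may
   be able to force their alignment (assumption: this is the DECL_ALIGN
   path described above). */
static int32_t ga[COUNT], gb[COUNT], gc[COUNT];

void add_global(void)
{
    for (int i = 0; i < COUNT; i++)
        gc[i] = ga[i] + gb[i];
}

/* Case 2: pointer arguments.  Alignment is not known at compile time,
   so any alignment would have to come from peeling, which does not
   change DECL_ALIGN. */
void add_pointers(int32_t *restrict dst, const int32_t *restrict a,
                  const int32_t *restrict b)
{
    for (int i = 0; i < COUNT; i++)
        dst[i] = a[i] + b[i];
}

/* Run both variants on the same input and check that they agree.  The
   destination for case 2 is offset by one element so that it is
   unlikely to be vector-aligned. */
void run_both(void)
{
    static int32_t out[COUNT + 1];

    for (int i = 0; i < COUNT; i++) {
        ga[i] = i;
        gb[i] = 2 * i;
    }
    add_global();
    add_pointers(out + 1, ga, gb);
    for (int i = 0; i < COUNT; i++)
        assert(gc[i] == out[i + 1]);
}
```

Compiling this for ARM with something like -O3 -mfpu=neon -S and comparing the loads and stores in the two functions should show whether a [rN:64]-style hint appears in only the first case. -fdump-tree-vect-details prints the vectorizer's own alignment decisions.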