------- Comment #31 from dominiq at lps dot ens dot fr 2007-12-03 18:58 -------
> If there are no loops, then "straight-line parallelization" [SLP] should
> vectorize your manually unrolled sequence in comment #24.

Yes, it should, but it does not after patch #5. The question that remains unanswered is why it does not, and then how to change the patch so that it does. In any case, the "good" vectorization should be along the k loop (length 9 instead of 3). My understanding of my tests is, first, that 5/9 < 2/3 and, more importantly, that the packing/unpacking overhead is a smaller penalty when it is shared, as in the k vectorization.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34265
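
One possible reading of the 5/9 < 2/3 comparison, sketched below, is a count of vector operations per scalar element. This assumes a vector width of 2 (two doubles per register), which the comment does not state; the function name `vector_ops_per_element` is hypothetical and only for illustration.

```python
from fractions import Fraction


def vector_ops_per_element(trip_count, vector_width=2):
    """Operations issued per scalar element when a loop of `trip_count`
    iterations is vectorized with `vector_width` lanes.

    Assumption: a leftover iteration costs one extra operation.
    """
    ops = -(-trip_count // vector_width)  # integer ceiling division
    return Fraction(ops, trip_count)


# Loop of length 3: 2 operations for 3 elements.
short_loop = vector_ops_per_element(3)
# k loop of length 9: 5 operations for 9 elements.
k_loop = vector_ops_per_element(9)

print(short_loop, k_loop, k_loop < short_loop)
```

Under this assumption the k loop issues fewer operations per element. The sketch does not model the packing/unpacking overhead, which the comment considers the more important factor.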