------- Comment #7 from eres at il dot ibm dot com 2009-07-05 08:12 -------
Testing test_fpu on Power7 with the power7 branch shows no significant difference between the version compiled with the misaligned store support patch and the version compiled without it (both built with -mcpu=power7 -ffast-math -funroll-loops -O3). The version with the misaligned store support patch is ~23% faster than the version compiled with -fno-tree-vectorize. So this looks like a tuning issue for x86-64 that might be addressed in the cost model.
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40648