Hi,

This improves the vdup_n intrinsics for the case where one tries to form constant vectors. It uses targetm.fold_builtin to fold these vector initializations to actual vector constants. The vdup_n cases are fine with either endianness, as the vector constant is just duplicated.

In addition I've made the *neon_vmov patterns take a const_zero vector, which allows the compiler to generate vmov.i32 <reg>, #0 for vdup_n_f32 (0.0f) type operations. This has the nice side effect that zero initialization of FP vectors for Neon doesn't need a load from the literal pool.

I will point out that vcreate and a lot of the other intrinsics could be improved in a similar vein (caveat: big-endian).

This helps in a number of cases where we were initially generating a mov of a constant into an integer register and then dupping it across the lanes, and indeed helps the tree optimizers recognize the constant vector for the value that it is.

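For illustration only, here is a minimal sketch of the kind of source this is aimed at. It is not taken from the patch: the helper names zero_lanes_f32 and splat42_lanes_s32 are made up, and the non-Neon fallback exists only so that the snippet builds on other hosts. With the folding in place, the argument to vdup_n in each helper is a compile-time constant, so the whole vector should be visible to the compiler as a constant.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Hypothetical helper: write a zero-initialized 2-lane float vector to OUT.
   On Neon this goes through vdup_n_f32 (0.0f), the case the mail says can
   become vmov.i32 <reg>, #0 rather than a literal pool load.  */
void
zero_lanes_f32 (float out[2])
{
#if HAVE_NEON
  vst1_f32 (out, vdup_n_f32 (0.0f));
#else
  /* Portable fallback, assumed only for non-Neon hosts.  */
  out[0] = 0.0f;
  out[1] = 0.0f;
#endif
}

/* Hypothetical helper: duplicate the integer constant 42 into both lanes.
   This is the "mov into an integer register, then dup" pattern that
   folding to a vector constant is meant to avoid.  */
void
splat42_lanes_s32 (int32_t out[2])
{
#if HAVE_NEON
  vst1_s32 (out, vdup_n_s32 (42));
#else
  out[0] = 42;
  out[1] = 42;
#endif
}
```

Compiling something like this for a Neon target at -O2 and looking at the assembly is one way to check whether the constant was folded; the exact instructions will depend on the compiler version and options.
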
This also needed some work to make a testcase for vabd more robust, which just showed that the folding works! In the process I've also cleaned up a few prototypes, which was obvious.

Tested cross on arm-linux-gnueabi with no regressions.

Ok (to commit as 2 separate patches, one for the prototype cleanup and the other for the vdup case)?

regards,
Ramana

2012-06-20  Ramana Radhakrishnan  <[email protected]>

	* config/arm/arm.c (arm_vector_alignment_reachable): Fix declaration.
	(arm_builtin_support_vector_misalignment): Likewise.
	(arm_preferred_rename_class): Likewise.
	(arm_vectorize_vec_perm_const_ok): Likewise.
	(arm_fold_builtin): New.
	(TARGET_FOLD_BUILTIN): New.
	* config/arm/neon.md (*neon_mov<mode>:VDX, VQX): Add Dz alternative.

testsuite/

	* gcc.target/arm/neon-combine-sub-abs-into-abd.c: Make test more robust.

Attachment: vmovzero.patch
