On 11/09/14 14:55, Jiong Wang wrote:
On 11/09/14 14:43, Christophe Lyon wrote:
Hi Jiong,
On 9 September 2014 12:59, Ramana Radhakrishnan
<ramana....@googlemail.com> wrote:
On Mon, Aug 18, 2014 at 11:31 AM, Jiong Wang <jiong.w...@arm.com> wrote:
This patch enables auto-vectorization of copysignf on arm32 by using the
vector bit-selection instruction when NEON is available.
I've noticed that your new testcase fails (the scan-tree-dump-times
line), in the following cases:
* forcing -march=armv5t in RUNTESTFLAGS (targets
arm-none-linux-gnueabi and arm-none-linux-gnueabihf)
* target armeb-none-linux-gnueabihf
You can have a look at:
http://cbuild.validation.linaro.org/build/cross-validation/gcc/trunk/215067/report-build-info.html
If you go 1 level up at
http://cbuild.validation.linaro.org/build/cross-validation/gcc/trunk/215067/
you'll be able to browse into the per-target subdirs and get the .sum
files if you need them.
Christophe,
the auto-test system is great!
The testcase only passes when both the hardware and the ABI options meet the requirements.
I tried to skip the environments where NEON is not available by using
"dg-require-effective-target arm_neon_hw",
but there may be something it does not cover. I'll have a look.
thanks.
The scan of
"/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */"
is a little fragile: it needs both -mfpu and -mfloat-abi to meet the
requirements, and so it causes trouble when the test environment uses
complicated option combinations, as the Linaro test farm does.
Currently, I haven't found any good way in DejaGnu to accurately detect
which options are in use.
I tried specifying options like -march=armv7 -mfloat-abi=hard and doing a
compile-only test, but even then these options may be overridden by options
the user specifies explicitly, so the test passes the DejaGnu
check-*-target checks and then fails in the actual compile later.
The one way I can think of that is 100% safe in all test environments is:
* keep the test as a "run" test only, since correctness is what matters.
* remove the scan for "vectorized 1 loops".
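
A run-only variant along those lines might look roughly like the sketch below. It is an illustration only, not the committed gcc.target/arm/vect-copysignf.c: the array contents, the helper names ref_bits and count_mismatches, and the exact dg-options string are assumptions. main is left out here; a real run test would add one that calls abort () when count_mismatches () returns nonzero.

```c
/* { dg-do run } */
/* { dg-require-effective-target arm_neon_hw } */
/* { dg-options "-O2 -ftree-vectorize" } */
/* { dg-add-options arm_neon } */

#include <stdint.h>
#include <string.h>

#define N 16

static float a[N], b[N], r[N];

/* Hypothetical helper: reference result built with integer bit
   operations only (magnitude bits of MAG, sign bit of SGN).  */
uint32_t ref_bits(float mag, float sgn)
{
    uint32_t m, s;
    memcpy(&m, &mag, sizeof m);
    memcpy(&s, &sgn, sizeof s);
    return (m & UINT32_C(0x7fffffff)) | (s & UINT32_C(0x80000000));
}

/* Hypothetical helper: run the loop under test and return the number
   of elements whose bit pattern differs from the reference.  */
int count_mismatches(void)
{
    int i, bad = 0;

    /* Illustrative inputs: mixed signs in both arrays.  */
    for (i = 0; i < N; i++) {
        a[i] = (float)(i + 1) * ((i & 1) ? -1.0f : 1.0f);
        b[i] = (i & 2) ? -1.0f : 1.0f;
    }

    /* The loop the vectorizer is expected to handle.  */
    for (i = 0; i < N; i++)
        r[i] = __builtin_copysignf(a[i], b[i]);

    /* Compare bit patterns so that signed zeros would also be caught.  */
    for (i = 0; i < N; i++) {
        uint32_t got;
        memcpy(&got, &r[i], sizeof got);
        if (got != ref_bits(a[i], b[i]))
            bad++;
    }
    return bad;
}
```

Because nothing here scans the vectorizer dump, the result does not depend on which -mfpu/-mfloat-abi combination the test environment ends up using; it only checks that the computed values are right.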
-- Jiong
-- Jiong
Christophe.
for a simple testcase:
for (i = 0; i < N; i++)
r[i] = __builtin_copysignf (a[i], b[i]);
Assuming a vectorization factor of 4, the generated instruction sequence is:
vmov.i32 q10, #2147483648 @ v4si
.L2:
vld1.64 {d18-d19}, [ip:64]
add r3, r3, #16
add ip, ip, #16
vldr d16, [r3, #-16]
vldr d17, [r3, #-8]
vbif q8, q9, q10
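
To make the role of the constant in q10 clearer: #2147483648 is 0x80000000, the float sign bit, and vbif keeps the destination bit where the mask bit is 1 and takes the source bit where it is 0. A scalar sketch of one 32-bit lane, assuming IEEE 754 single precision (bif_lane and copysignf_bitsel are hypothetical names used only for illustration, not part of the patch):

```c
#include <stdint.h>
#include <string.h>

/* One 32-bit lane of "vbif dst, src, mask": where a mask bit is 0 the
   bit is taken from src, where it is 1 the dst bit is left unchanged.  */
uint32_t bif_lane(uint32_t dst, uint32_t src, uint32_t mask)
{
    return (dst & mask) | (src & ~mask);
}

/* copysignf via bit selection: the sign operand plays the role of the
   destination (q8), the magnitude operand the source (q9), and the
   mask is the sign bit (q10).  */
float copysignf_bitsel(float mag, float sgn)
{
    uint32_t m, s, bits;
    float out;

    memcpy(&m, &mag, sizeof m);
    memcpy(&s, &sgn, sizeof s);
    bits = bif_lane(s, m, UINT32_C(0x80000000));
    memcpy(&out, &bits, sizeof out);
    return out;
}
```

For instance, copysignf_bitsel (1.5f, -2.0f) keeps the sign bit of -2.0f and the remaining 31 bits of 1.5f, giving -1.5f.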
Ok.
Ramana
thanks.
gcc/
* config/arm/arm.c (NEON_COPYSIGNF): New enum.
(arm_init_neon_builtins): Support NEON_COPYSIGNF.
(arm_builtin_vectorized_function): Likewise.
* config/arm/arm_neon_builtins.def: New macro for copysignf.
* config/arm/neon.md (neon_copysignf<mode>): New pattern for vector
copysignf.
gcc/testsuite/
* gcc.target/arm/vect-copysignf.c: New testcase.