On 11/09/14 14:55, Jiong Wang wrote:
On 11/09/14 14:43, Christophe Lyon wrote:
Hi Jiong,

On 9 September 2014 12:59, Ramana Radhakrishnan
<ramana....@googlemail.com> wrote:
On Mon, Aug 18, 2014 at 11:31 AM, Jiong Wang <jiong.w...@arm.com> wrote:
This patch enables auto-vectorization of copysignf by using the vector
bit-selection instruction on arm32 when NEON is available.

I've noticed that your new testcase fails (the scan-tree-dump-times
line), in the following cases:
* forcing -march=armv5t in RUNTESTFLAGS (targets
arm-none-linux-gnueabi and arm-none-linux-gnueabihf)
* target armeb-none-linux-gnueabihf

You can have a look at:
http://cbuild.validation.linaro.org/build/cross-validation/gcc/trunk/215067/report-build-info.html

If you go 1 level up at
http://cbuild.validation.linaro.org/build/cross-validation/gcc/trunk/215067/
you'll be able to browse into the per-target subdirs and get the .sum
files if you need them.
Christophe,

    The auto-test system is great!

    The testcase only passes when both the hardware and the ABI options meet
    the requirements.

    I tried to skip environments where NEON is not available by using
    "dg-require-effective-target arm_neon_hw".

    There may be something that is not covered. I'll have a look.

    Thanks.

   The scan

        /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */

   is a little fragile: it needs both -mfpu and -mfloat-abi to meet the
   requirements, and so it causes trouble when the test environment uses
   complicated option combinations, as the Linaro test farm does.

   So far I haven't found a good way in DejaGnu to accurately detect which
   options are in use.

   I tried specifying options like -march=armv7 -mfloat-abi=hard and doing a
   compile-only test, but even then these options may be overridden by options
   the user specifies explicitly, so the test passes the DejaGnu
   check-*-target checks but fails in the later actual compile.

   The one way I can think of that is 100% safe in all test environments is:

     * keep the test as a "run" test only, as the correctness is important.
     * remove the scan of "vectorized 1 loops".
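
   As a rough illustration of the "run only" idea above, the correctness
   check could look something like the sketch below. This is not the
   committed gcc.target/arm/vect-copysignf.c; the array size N, the input
   values, and the helper names bits_of and count_copysignf_mismatches are
   all assumptions made for the example. It compares the loop output bit for
   bit against a reference built from integer masks, so it does not depend on
   whether the loop was actually vectorized.

```c
#include <stdint.h>
#include <string.h>

#define N 64   /* assumed size, not taken from the real testcase */

static float a[N], b[N], r[N];

/* Hypothetical helper: reinterpret a float as its 32-bit pattern. */
static uint32_t bits_of(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Hypothetical check: returns the number of elements where the loop
   result differs from the bitwise reference. */
int count_copysignf_mismatches(void)
{
    int i, bad = 0;

    for (i = 0; i < N; i++) {
        a[i] = (float)(i + 1) * 0.5f;          /* magnitudes */
        b[i] = (i % 3 == 0) ? -1.0f : 1.0f;    /* mixed signs */
    }

    /* Loop under test, same shape as in the patch description. */
    for (i = 0; i < N; i++)
        r[i] = __builtin_copysignf(a[i], b[i]);

    /* Reference: magnitude bits of a[i], sign bit of b[i]. */
    for (i = 0; i < N; i++) {
        uint32_t expect = (bits_of(a[i]) & UINT32_C(0x7fffffff))
                        | (bits_of(b[i]) & UINT32_C(0x80000000));
        if (bits_of(r[i]) != expect)
            bad++;
    }
    return bad;
}
```

   In a DejaGnu run test, main would presumably call abort () when the count
   is non-zero, with the file guarded by "dg-do run" and the
   "dg-require-effective-target arm_neon_hw" directive mentioned earlier.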

-- Jiong


-- Jiong

Christophe.


For a simple testcase:

    for (i = 0; i < N; i++)
      r[i] = __builtin_copysignf (a[i], b[i]);


Assuming a vectorization factor of 4, the generated instruction sequence is:

          vmov.i32        q10, #2147483648  @ v4si
.L2:
          vld1.64 {d18-d19}, [ip:64]
          add     r3, r3, #16
          add     ip, ip, #16
          vldr    d16, [r3, #-16]
          vldr    d17, [r3, #-8]
          vbif    q8, q9, q10
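
For readers less familiar with the bit-select trick, here is a scalar
sketch of what one 32-bit lane of the sequence above appears to compute.
The constant #2147483648 is 0x80000000, the IEEE single-precision sign
bit. As I understand VBIF, it copies bits from the first source into the
destination wherever the mask bit is 0, so the destination keeps only its
sign bit and takes the magnitude from the other operand. The helper name
copysignf_bitselect is hypothetical, and which register holds a[] and
which holds b[] is an assumption, since the listing does not say.

```c
#include <stdint.h>
#include <string.h>

/* Sign-bit mask: the same value as #2147483648 in the vmov.i32. */
#define SIGN_MASK UINT32_C(0x80000000)

/* Hypothetical helper modelling one lane of the bit-select:
   magnitude bits come from `mag`, the sign bit comes from `sgn`. */
static float copysignf_bitselect(float mag, float sgn)
{
    uint32_t m, s, res;
    float out;

    /* memcpy is the portable way to reinterpret float bits in C. */
    memcpy(&m, &mag, sizeof m);
    memcpy(&s, &sgn, sizeof s);

    /* Where the mask bit is 0 take the bit from mag,
       where it is 1 keep the bit from sgn. */
    res = (m & ~SIGN_MASK) | (s & SIGN_MASK);

    memcpy(&out, &res, sizeof out);
    return out;
}
```

Calling copysignf_bitselect (a[i], b[i]) for each element should give the
same result as __builtin_copysignf (a[i], b[i]); the vector form simply
does this for four lanes at once.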

Ok.

Ramana


thanks.

gcc/
    * config/arm/arm.c (NEON_COPYSIGNF): New enum.
    (arm_init_neon_builtins): Support NEON_COPYSIGNF.
    (arm_builtin_vectorized_function): Likewise.
    * config/arm/arm_neon_builtins.def: New macro for copysignf.
    * config/arm/neon.md (neon_copysignf<mode>): New pattern for vector
    copysignf.

gcc/testsuite/
    * gcc.target/arm/vect-copysignf.c: New testcase.



