I'm attaching an updated version of the patch, addressing the comments from
http://gcc.gnu.org/ml/gcc-patches/2012-04/msg01615.html
This patch adds arm32 to targets that support vect_char_mult. In addition,
the test is updated to prevent vectorization of the initialization loop. The
expected number of vectorized loops is adjusted accordingly.
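For reference, here is a minimal standalone sketch of the initialization idiom the test now uses. It is not the test itself: the value of N and the helper names init_arrays and arrays_initialized_p are made up for illustration. The only part taken from the patch is the empty volatile asm in the loop body.

```c
#include <assert.h>

/* Illustrative size only; not taken from slp-perm-8.c.  */
#define N 200

unsigned char input[N];
unsigned char output[N];

/* Initialize both arrays.  The empty volatile asm is opaque to the
   optimizers, which is intended to keep the loop vectorizer from
   transforming this loop.  */
void
init_arrays (void)
{
  int i;

  for (i = 0; i < N; i++)
    {
      input[i] = i;
      output[i] = 0;
      __asm__ volatile ("");
    }
}

/* Return 1 if the arrays hold the expected initial values, 0 otherwise.  */
int
arrays_initialized_p (void)
{
  int i;

  for (i = 0; i < N; i++)
    if (input[i] != (unsigned char) i || output[i] != 0)
      return 0;
  return 1;
}
```

Building something like this with -O3 -fdump-tree-vect-details should make it possible to check in the vect dump whether the initialization loop is left alone on a given target.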
No regressions with check-gcc on qemu for arm-none-eabi (cortex-a9, neon,
softfp) in arm and thumb modes.
OK for trunk?
Thanks,
Greta
ChangeLog
gcc/testsuite
2012-05-30 Greta Yorsh <Greta.Yorsh at arm.com>
* gcc.dg/vect/slp-perm-8.c (main): Prevent vectorization
of the initialization loop.
(dg-final): Adjust the expected number of vectorized loops
depending on vect_char_mult target selector.
* lib/target-supports.exp (check_effective_target_vect_char_mult):
Add arm32 to targets.
> -----Original Message-----
> From: Richard Earnshaw [mailto:[email protected]]
> Sent: 25 April 2012 17:30
> To: Richard Guenther
> Cc: Greta Yorsh; [email protected]; [email protected];
> [email protected]
> Subject: Re: [Patch, testsuite] fix failure in test gcc.dg/vect/slp-
> perm-8.c
>
> On 25/04/12 15:31, Richard Guenther wrote:
> > On Wed, Apr 25, 2012 at 4:27 PM, Greta Yorsh <[email protected]>
> wrote:
> >> Richard Guenther wrote:
> >>> On Wed, Apr 25, 2012 at 3:34 PM, Greta Yorsh <[email protected]>
> >>> wrote:
> >>>> Richard Guenther wrote:
> >>>>> On Wed, Apr 25, 2012 at 1:51 PM, Greta Yorsh
> <[email protected]>
> >>>>> wrote:
> >>>>>> The test gcc.dg/vect/slp-perm-8.c fails on arm-none-eabi with
> neon
> >>>>> enabled:
> >>>>>> FAIL: gcc.dg/vect/slp-perm-8.c scan-tree-dump-times vect "vectorized 1 loops" 2
> >>>>>>
> >>>>>> The test expects 2 loops to be vectorized, while gcc
> successfully
> >>>>> vectorizes
> >>>>>> 3 loops in this test using neon on arm. This patch adjusts the
> >>>>> expected
> >>>>>> output. Fixed test passes on qemu for arm and powerpc.
> >>>>>>
> >>>>>> OK for trunk?
> >>>>>
> >>>>> I think the proper fix is to instead of
> >>>>>
> >>>>> for (i = 0; i < N; i++)
> >>>>> {
> >>>>> input[i] = i;
> >>>>> output[i] = 0;
> >>>>> if (input[i] > 256)
> >>>>> abort ();
> >>>>> }
> >>>>>
> >>>>> use
> >>>>>
> >>>>> for (i = 0; i < N; i++)
> >>>>> {
> >>>>> input[i] = i;
> >>>>> output[i] = 0;
> >>>>> __asm__ volatile ("");
> >>>>> }
> >>>>>
> >>>>> to prevent vectorization of initialization loops.
> >>>>
> >>>> Actually, it looks like both arm and powerpc vectorize this
> >>> initialization loop (line 31), because the control flow is hoisted
> >>> outside the loop by previous optimizations. In addition, arm with
> neon
> >>> vectorizes the second loop (line 39), but powerpc does not:
> >>>>
> >>>> 39: not vectorized: relevant stmt not supported: D.2163_8 = i_40 * 9;
> >>>>
> >>>> If this is the expected behaviour for powerpc, then the patch I
> >>> proposed is still needed to fix the test failure on arm. Also,
> there
> >>> would be no need to disable vectorization of the initialization
> loop,
> >>> right?
> >>>
> >>> Ah, I thought that was what changed. Btw, the if () abort () tries
> to
> >>> disable
> >>> vectorization but does not succeed in doing so.
> >>>
> >>> Richard.
> >>
> >> Here is an updated patch. It prevents vectorization of the
> initialization
> >> loop, as Richard suggested, and updates the expected number of
> vectorized
> >> loops accordingly. This patch assumes that the second loop in main
> (line 39)
> >> should only be vectorized on arm with neon. The test passes for arm
> and
> >> powerpc.
> >>
> >> OK for trunk?
> >
> > If arm cannot handle 9 * i then the appropriate condition would be
> > vect_int_mult, not arm_neon_ok.
> >
>
> The issue is that arm has (well, should be marked as having)
> vect_char_mult. The difference in count of vectorized loops is based
> on
> that.
>
> R.
>
> > Ok with that change.
> >
> > Richard.
> >
> >> Thank you,
> >> Greta
> >>
> >> gcc/testsuite/ChangeLog
> >>
> >> 2012-04-25 Greta Yorsh <[email protected]>
> >>
> >> * gcc.dg/vect/slp-perm-8.c (main): Prevent
> >> vectorization of initialization loop.
> >> (dg-final): Adjust the expected number of
> >> vectorized loops.
> >>
> >>
> >>
> >>
> >
>
diff --git a/gcc/testsuite/gcc.dg/vect/slp-perm-8.c b/gcc/testsuite/gcc.dg/vect/slp-perm-8.c
index d211ef9..c4854d5 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-perm-8.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-perm-8.c
@@ -32,8 +32,7 @@ int main (int argc, const char* argv[])
{
input[i] = i;
output[i] = 0;
- if (input[i] > 256)
- abort ();
+ __asm__ volatile ("");
}
for (i = 0; i < N / 3; i++)
@@ -52,7 +51,8 @@ int main (int argc, const char* argv[])
return 0;
}
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target vect_perm_byte } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target { vect_perm_byte && vect_char_mult } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_perm_byte && {! vect_char_mult } } } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target vect_perm_byte } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index b93dc5c..d249404 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -3462,7 +3462,8 @@ proc check_effective_target_vect_char_mult { } {
set et_vect_char_mult_saved 0
if { [istarget ia64-*-*]
|| [istarget i?86-*-*]
- || [istarget x86_64-*-*] } {
+ || [istarget x86_64-*-*]
+ || [check_effective_target_arm32] } {
set et_vect_char_mult_saved 1
}
}