Hi Avinash!

On 2025-10-21T11:46:04+0530, Avinash Jayakar <[email protected]> wrote:
> Some targets (aarch64 and x86_64 with multilib) reported regression for some
> test cases made for PR104116.

Thanks for looking into this.

I've similarly observed for '--target=amdgcn-amdhsa':

    +PASS: gcc.dg/vect/pr104116-ceil-umod-2.c (test for excess errors)
    +PASS: gcc.dg/vect/pr104116-ceil-umod-2.c execution test
    +FAIL: gcc.dg/vect/pr104116-ceil-umod-2.c scan-tree-dump-times vect "optimized: loop vectorized" 1

    +PASS: gcc.dg/vect/pr104116-ceil-umod-pow2.c (test for excess errors)
    +PASS: gcc.dg/vect/pr104116-ceil-umod-pow2.c execution test
    +FAIL: gcc.dg/vect/pr104116-ceil-umod-pow2.c scan-tree-dump-times vect "optimized: loop vectorized" 1

    +PASS: gcc.dg/vect/pr104116-round-div-2.c (test for excess errors)
    +PASS: gcc.dg/vect/pr104116-round-div-2.c execution test
    +FAIL: gcc.dg/vect/pr104116-round-div-2.c scan-tree-dump-times vect "optimized: loop vectorized" 1

    +PASS: gcc.dg/vect/pr104116-round-div-pow2.c (test for excess errors)
    +PASS: gcc.dg/vect/pr104116-round-div-pow2.c execution test
    +FAIL: gcc.dg/vect/pr104116-round-div-pow2.c scan-tree-dump-times vect "optimized: loop vectorized" 1

    +PASS: gcc.dg/vect/pr104116-round-div.c (test for excess errors)
    +PASS: gcc.dg/vect/pr104116-round-div.c execution test
    +FAIL: gcc.dg/vect/pr104116-round-div.c scan-tree-dump-times vect "optimized: loop vectorized" 1

    +PASS: gcc.dg/vect/pr104116-round-mod-2.c (test for excess errors)
    +PASS: gcc.dg/vect/pr104116-round-mod-2.c execution test
    +FAIL: gcc.dg/vect/pr104116-round-mod-2.c scan-tree-dump-times vect "optimized: loop vectorized" 1

    +PASS: gcc.dg/vect/pr104116-round-mod-pow2.c (test for excess errors)
    +PASS: gcc.dg/vect/pr104116-round-mod-pow2.c execution test
    +FAIL: gcc.dg/vect/pr104116-round-mod-pow2.c scan-tree-dump-times vect "optimized: loop vectorized" 1

    +PASS: gcc.dg/vect/pr104116-round-mod.c (test for excess errors)
    +PASS: gcc.dg/vect/pr104116-round-mod.c execution test
    +FAIL: gcc.dg/vect/pr104116-round-mod.c scan-tree-dump-times vect "optimized: loop vectorized" 1

    +PASS: gcc.dg/vect/pr104116-round-umod-2.c (test for excess errors)
    +PASS: gcc.dg/vect/pr104116-round-umod-2.c execution test
    +FAIL: gcc.dg/vect/pr104116-round-umod-2.c scan-tree-dump-times vect "optimized: loop vectorized" 1

> Turned out an extra loop which was for checking
> results in run-time was also being vectorized and the count of vect loop was 2
> instead of 1. In this patch I have made sure no other loop other than the one
> in interest of test case is vectorized. Ok for master?

> The commit gcc-16-4464-g6883d51304f added 30 new tests for testing
> vectorization of {FLOOR,MOD,ROUND}_{DIV,MOD}_EXPR. Few of them failed
> for certain targets due to the vectorization of runtime-check loop which
> was not intended.
> This patch disables optimization for all of the run-time check loops so
> that the count of vectorized loop is always 1.
>
> 2025-10-21  Avinash Jayakar  <[email protected]>
>
> gcc/testsuite/ChangeLog:
>       PR target/104116
>         * gcc.dg/vect/pr104116.h: disable optimizations.

Here, you should list the individual functions that you're modifying.

> --- a/gcc/testsuite/gcc.dg/vect/pr104116.h
> +++ b/gcc/testsuite/gcc.dg/vect/pr104116.h
> @@ -106,6 +106,7 @@ int cl_div (int x, int y)
>    return q;
>  }
>  
> +__attribute__((optimize("O0")))
>  unsigned int cl_udiv (unsigned int x, unsigned int y)
>  {
>    unsigned int r = x % y;

As far as I can tell, the standard idiom is to put '#pragma GCC novector'
in front of the loop that's not to be vectorized.  That's more expressive
than enforcing '-O0'.  Or is '-O0' necessary for other reasons?
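
To illustrate the idiom, here is a minimal sketch.  It is not taken from
'pr104116.h': the names 'kernel', 'check', 'ref_cl_udiv' and 'run_example'
as well as the divisor 3 are made up for this example.  It assumes a
compiler that understands '#pragma GCC novector' for C; other compilers
typically ignore the unknown pragma, possibly with a warning.

```c
#define N 1024

unsigned int a[N];
unsigned int res[N];

/* Hypothetical scalar reference for unsigned ceiling division, modelled
   on the 'cl_udiv' helper visible in the quoted hunk.  */
static unsigned int
ref_cl_udiv (unsigned int x, unsigned int y)
{
  unsigned int r = x % y;
  unsigned int q = x / y;
  if (r != 0)
    q++;
  return q;
}

/* The loop under test: the only loop we want the vectorizer to report.  */
__attribute__((noinline)) void
kernel (void)
{
  for (int i = 0; i < N; i++)
    res[i] = a[i] / 3 + (a[i] % 3 != 0);
}

/* The run-time check: the pragma asks the vectorizer to leave this loop
   alone, while the rest of the function is still optimized as usual.  */
__attribute__((noinline)) int
check (void)
{
  int mismatches = 0;
#pragma GCC novector
  for (int i = 0; i < N; i++)
    if (res[i] != ref_cl_udiv (a[i], 3))
      mismatches++;
  return mismatches;
}

/* Initialize the input, run the kernel, and return the mismatch count.
   The initialization loop is marked as well, so that it doesn't add to
   the number of vectorized loops.  */
int
run_example (void)
{
#pragma GCC novector
  for (int i = 0; i < N; i++)
    a[i] = (unsigned int) i;
  kernel ();
  return check ();
}
```

With the pragma on every auxiliary loop, a 'scan-tree-dump-times vect
"optimized: loop vectorized" 1' directive would be expected to count only
the loop in 'kernel', without having to drop the helper functions to '-O0'.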


Regards
 Thomas


> @@ -123,6 +124,7 @@ int cl_mod (int x, int y)
>    return r;
>  }
>  
> +__attribute__((optimize("O0")))
>  unsigned int cl_umod (unsigned int x, unsigned int y)
>  {
>    unsigned int r = x % y;
> @@ -141,7 +143,7 @@ int fl_div (int x, int y)
>    return q;
>  }
>  
> -
> +__attribute__((optimize("O0")))
>  int fl_mod (int x, int y)
>  {
>    int r = x % y;
> @@ -150,12 +152,14 @@ int fl_mod (int x, int y)
>    return r;
>  }
>  
> +__attribute__((optimize("O0")))
>  int abs(int x)
>  {
>    if (x < 0) return -x;
>    return x;
>  }
>  
> +__attribute__((optimize("O0")))
>  int rd_mod (int x, int y)
>  {
>    int r = x % y;
> @@ -169,6 +173,7 @@ int rd_mod (int x, int y)
>    return r;
>  }
>  
> +__attribute__((optimize("O0")))
>  int rd_div (int x, int y)
>  {
>    int r = x % y;
> @@ -183,6 +188,7 @@ int rd_div (int x, int y)
>    return q;
>  }
>  
> +__attribute__((optimize("O0")))
>  unsigned int rd_umod (unsigned int x, unsigned int y)
>  {
>    unsigned int r = x % y;
> @@ -191,6 +197,7 @@ unsigned int rd_umod (unsigned int x, unsigned int y)
>    return r;
>  }
>  
> +__attribute__((optimize("O0")))
>  unsigned int rd_udiv (unsigned int x, unsigned int y)
>  {
>    unsigned int r = x % y;
> -- 
> 2.51.0
