[Bug target/100799] Stackoverflow in optimized code on PPC

2024-03-22 Thread aagarwa at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #33 from Ajit Kumar Agarwal  ---
Sent the patch for review.

Here is the patch:
[PATCH] rs6000: Stackoverflow in optimized code on PPC (PR100799)

When using FlexiBLAS with OpenBLAS we noticed corruption of
the parameters passed to OpenBLAS functions. FlexiBLAS
basically provides a BLAS interface where each function
is a stub that forwards the arguments to a real BLAS lib,
like OpenBLAS.

Fix the corruption of the caller's frame by checking whether the
number of arguments is less than or equal to GP_ARG_NUM_REG (8),
excluding hidden unused DECLs.

2024-03-22  Ajit Kumar Agarwal  

gcc/ChangeLog:

PR rtl-optimization/100799
* config/rs6000/rs6000-calls.cc (rs6000_function_arg): Don't
generate parameter save area if the number of arguments passed
is less than or equal to GP_ARG_NUM_REG (8), excluding hidden
parameters.
* function.cc (assign_parms_initialize_all): Check for hidden
parameters in Fortran code and set the flag hidden_string_length
and the number of actual parameters passed, excluding hidden
unused DECLs.
* function.h: Add new fields hidden_string_length and
actual_parm_length to the function structure.
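
The check described in the ChangeLog can be pictured with a small C sketch. This is not the code from the patch: needs_parm_save_area and its parameters are hypothetical names used only for illustration, and the only value taken from the description above is the limit GP_ARG_NUM_REG (8).

```c
#include <assert.h>
#include <stdbool.h>

/* Limit quoted in the patch description; redefined here for illustration.  */
#define GP_ARG_NUM_REG 8

/* Hypothetical helper, not a GCC function.  Returns true when the number
   of parameters that are really passed (declared parameters minus hidden
   unused ones) no longer fits in GP_ARG_NUM_REG registers.  */
static bool
needs_parm_save_area (int total_parms, int hidden_unused_parms)
{
  assert (hidden_unused_parms >= 0 && hidden_unused_parms <= total_parms);
  int actual_parms = total_parms - hidden_unused_parms;
  return actual_parms > GP_ARG_NUM_REG;
}
```

Under these assumptions, a function with nine declared parameters, one of which is an unused hidden string length, gives false, while nine real parameters give true.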

[Bug target/100799] Stackoverflow in optimized code on PPC

2024-03-22 Thread aagarwa at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #34 from Ajit Kumar Agarwal  ---
Sent the patch for review.

Here is the patch:
[PATCH] rs6000: Stackoverflow in optimized code on PPC (PR100799)

When using FlexiBLAS with OpenBLAS we noticed corruption of
the parameters passed to OpenBLAS functions. FlexiBLAS
basically provides a BLAS interface where each function
is a stub that forwards the arguments to a real BLAS lib,
like OpenBLAS.

Fix the corruption of the caller's frame by checking whether the
number of arguments is less than or equal to GP_ARG_NUM_REG (8),
excluding hidden unused DECLs.

2024-03-22  Ajit Kumar Agarwal  

gcc/ChangeLog:

PR rtl-optimization/100799
* config/rs6000/rs6000-calls.cc (rs6000_function_arg): Don't
generate parameter save area if the number of arguments passed
is less than or equal to GP_ARG_NUM_REG (8), excluding hidden
parameters.
* function.cc (assign_parms_initialize_all): Check for hidden
parameters in Fortran code and set the flag hidden_string_length
and the number of actual parameters passed, excluding hidden
unused DECLs.
* function.h: Add new fields hidden_string_length and
actual_parm_length to the function structure.

[Bug target/103784] suboptimal code for returning bool value on target ppc

2023-08-31 Thread aagarwa at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103784

--- Comment #16 from Ajit Kumar Agarwal  ---
This patch https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624751.html

removes the zero extension from the testcase below, which has a different CFG.
My patch does not depend on any particular CFG shape; it is valid for all CFGs
in general.

Testcase from Surya.
#include <stdbool.h>

bool glob1;
bool glob2;

bool foo (int a, bool d)
{
  bool c;
  if (a > 2)
    c = glob1 & glob2;
  else
    c = glob1 | glob2;
  return c^d;
}
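
To illustrate what a different CFG shape means here, the sketch below pairs the branchy function with a straight-line variant that produces the same kind of bool result. The names foo_branchy and foo_straight are made up for this illustration, and the second function is not taken from the PR. In both shapes the returned value is already 0 or 1, which is presumably why a zero extension before the return is unnecessary.

```c
#include <stdbool.h>

bool glob1;
bool glob2;

/* Branchy shape, as in the testcase above: two paths merge before
   the XOR with d.  */
bool foo_branchy (int a, bool d)
{
  bool c;
  if (a > 2)
    c = glob1 & glob2;
  else
    c = glob1 | glob2;
  return c ^ d;
}

/* Hypothetical straight-line shape: a single basic block, no merge.  */
bool foo_straight (bool d)
{
  bool c = glob1 & glob2;
  return c ^ d;
}
```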

[Bug middle-end/82940] Suboptimal code for (a & 0x7f) | (b & 0x80) on powerpc

2023-04-11 Thread aagarwa at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82940

Ajit Kumar Agarwal  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

[Bug middle-end/41742] Unnecessary zero-extension at -O2 but not -O1

2023-04-11 Thread aagarwa at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41742

Ajit Kumar Agarwal  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

[Bug target/103784] suboptimal code for returning bool value on target ppc

2023-04-11 Thread aagarwa at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103784

Ajit Kumar Agarwal  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2023-04-11
             Status|UNCONFIRMED                 |ASSIGNED

[Bug target/65010] ppc backend generates unnecessary signed extension

2023-04-11 Thread aagarwa at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65010

Ajit Kumar Agarwal  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
                 CC|                            |aagarwa at gcc dot gnu.org

[Bug tree-optimization/36010] Loop interchange not performed

2023-10-16 Thread aagarwa at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36010

Ajit Kumar Agarwal  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aagarwa at gcc dot gnu.org

--- Comment #5 from Ajit Kumar Agarwal  ---
Use the following flags to make loop interchange work:
-fassociative-math -fno-signed-zeros -fno-trapping-math

The relevant code is in gcc/gimple-loop-interchange.cc:
@@ -514,8 +514,8 @@ loop_cand::analyze_iloop_reduction_var (tree var)
       if (! (associative_tree_code (code)
              || (code == MINUS_EXPR
                  && use_p->use == gimple_assign_rhs1_ptr (ass)))
           || (FLOAT_TYPE_P (TREE_TYPE (var))
               && ! flag_associative_math))
         return false;
     }
   else

Because of the flag_associative_math condition at line 514, the function
returns false and loop interchange is not performed. The flags above set
flag_associative_math to true, so loop interchange works.
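
A loop nest of the following kind is where this check matters. It is only a sketch: the function name sum_strided and the array size N are assumptions, not the testcase from the PR. The inner loop walks the array with a stride of N doubles and accumulates into a floating-point reduction, so interchanging the loops would change the order of the additions, which the code above only allows when flag_associative_math is set.

```c
#define N 64

/* Hypothetical example: the inner loop varies the row index i, so
   consecutive iterations touch elements N doubles apart.  Swapping the
   two loops would give unit-stride access but reorders the
   floating-point additions into sum.  */
double sum_strided (double a[N][N])
{
  double sum = 0.0;
  for (int j = 0; j < N; j++)
    for (int i = 0; i < N; i++)
      sum += a[i][j];
  return sum;
}
```

Whether GCC actually interchanges this particular nest may depend on the version and its cost model; the sketch only shows the shape of code that the FLOAT_TYPE_P check rejects without the flags listed above.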

[Bug testsuite/109880] [14 regression] gcc.target/powerpc/fold-vec-extract-int.p8.c fails after r14-916-g9417b30499ce09

2023-05-18 Thread aagarwa at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109880

--- Comment #2 from Ajit Kumar Agarwal  ---
Yes, these zero extensions are redundant and can be removed.

I will update the tests and send the patch for review.

[Bug target/91804] [10/11/12/13/14 regression] r265398 breaks gcc.target/powerpc/vec-rlmi-rlnm.c

2023-06-20 Thread aagarwa at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91804

Ajit Kumar Agarwal  changed:

   What|Removed |Added

 CC||aagarwa at gcc dot gnu.org

--- Comment #8 from Ajit Kumar Agarwal  ---
I don't see the extra xxlor with the latest GCC trunk. I think this is fixed
and we should close this PR.

Thanks & Regards
Ajit