[PATCH] Fix test errors after r15-1394 for sizeof(int)==sizeof(long) [PR115545]

2024-06-23 Thread Martin Uecker


This fixes the test failures introduced by the fix for PR115109.

Tested on x86_64 and also tested with -m32.



Fix test errors after r15-1394 for sizeof(int)==sizeof(long) [PR115545]

Some tests added to test the type of redeclarations of enumerators
in r15-1394 fail on architectures where sizeof(long) == sizeof(int).
Adapt tests to use long long and/or accept that long long is selected
as type for the enumerator.

gcc/testsuite/Changelog:
PR testsuite/115545
* gcc.dg/pr115109.c: Adapt test.
* gcc.dg/c23-tag-enum-6.c: Adapt test.
* gcc.dg/c23-tag-enum-7.c: Adapt test.

diff --git a/gcc/testsuite/gcc.dg/c23-tag-enum-6.c b/gcc/testsuite/gcc.dg/c23-tag-enum-6.c
index 29aef7ee3fd..d8d304d9b3d 100644
--- a/gcc/testsuite/gcc.dg/c23-tag-enum-6.c
+++ b/gcc/testsuite/gcc.dg/c23-tag-enum-6.c
@@ -7,10 +7,10 @@ enum E : int { a = 1, b = 2 };
 enum E : int { b = _Generic(a, enum E: 2), a = 1 };
 
 enum H { x = 1 };
-enum H { x = 2UL + UINT_MAX }; /* { dg-error "outside the range" } */
+enum H { x = 2ULL + UINT_MAX };/* { dg-error "outside the range" } */
 
 enum K : int { z = 1 };
-enum K : int { z = 2UL + UINT_MAX };   /* { dg-error "outside the range" } */
+enum K : int { z = 2ULL + UINT_MAX };  /* { dg-error "outside the range" } */
 
 enum F { A = 0, B = UINT_MAX };
 enum F { B = UINT_MAX, A };/* { dg-error "outside the range" } */
diff --git a/gcc/testsuite/gcc.dg/c23-tag-enum-7.c b/gcc/testsuite/gcc.dg/c23-tag-enum-7.c
index d4c787c8f71..974735bf2ef 100644
--- a/gcc/testsuite/gcc.dg/c23-tag-enum-7.c
+++ b/gcc/testsuite/gcc.dg/c23-tag-enum-7.c
@@ -4,23 +4,23 @@
 #include 
 
 // enumerators are all representable in int
-enum E { a = 1UL, b = _Generic(a, int: 2) };
+enum E { a = 1ULL, b = _Generic(a, int: 2) };
 static_assert(_Generic(a, int: 1));
 static_assert(_Generic(b, int: 1));
-enum E { a = 1UL, b = _Generic(a, int: 2) };
+enum E { a = 1ULL, b = _Generic(a, int: 2) };
 static_assert(_Generic(a, int: 1));
 static_assert(_Generic(b, int: 1));
 
 // enumerators are not representable in int
-enum H { c = 1UL << (UINT_WIDTH + 1), d = 2 };
+enum H { c = 1ULL << (UINT_WIDTH + 1), d = 2 };
 static_assert(_Generic(c, enum H: 1));
 static_assert(_Generic(d, enum H: 1));
-enum H { c = 1UL << (UINT_WIDTH + 1), d = _Generic(c, enum H: 2) };
+enum H { c = 1ULL << (UINT_WIDTH + 1), d = _Generic(c, enum H: 2) };
 static_assert(_Generic(c, enum H: 1));
 static_assert(_Generic(d, enum H: 1));
 
 // there is an overflow in the first declaration
-enum K { e = UINT_MAX, f, g = _Generic(e, unsigned int: 0) + _Generic(f, unsigned long: 1) };
+enum K { e = UINT_MAX, f, g = _Generic(e, unsigned int: 0) + _Generic(f, unsigned long: 1, unsigned long long: 1) };
 static_assert(_Generic(e, enum K: 1));
 static_assert(_Generic(f, enum K: 1));
 static_assert(_Generic(g, enum K: 1));
@@ -30,7 +30,7 @@ static_assert(_Generic(f, enum K: 1));
 static_assert(_Generic(g, enum K: 1));
 
 // there is an overflow in the first declaration
-enum U { k = INT_MAX, l, m = _Generic(k, int: 0) + _Generic(l, long: 1) };
+enum U { k = INT_MAX, l, m = _Generic(k, int: 0) + _Generic(l, long: 1, long long: 1) };
 static_assert(_Generic(k, enum U: 1));
 static_assert(_Generic(l, enum U: 1));
 static_assert(_Generic(m, enum U: 1));
diff --git a/gcc/testsuite/gcc.dg/pr115109.c b/gcc/testsuite/gcc.dg/pr115109.c
index 4baee0f3445..8245ff7fadb 100644
--- a/gcc/testsuite/gcc.dg/pr115109.c
+++ b/gcc/testsuite/gcc.dg/pr115109.c
@@ -3,6 +3,6 @@
 
 #include 
 
-enum E { a = 1UL << (ULONG_WIDTH - 5), b = 2 };
-enum E { a = 1ULL << (ULONG_WIDTH - 5), b = _Generic(a, enum E: 2) };
+enum E { a = 1ULL << (ULLONG_WIDTH - 5), b = 2 };
+enum E { a = 1ULL << (ULLONG_WIDTH - 5), b = _Generic(a, enum E: 2) };
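
To illustrate why the UL suffix is not enough where sizeof(long) == sizeof(int),
here is a minimal sketch.  It is not part of the patch, the helper names are
made up for illustration, and it assumes unsigned long long is wider than
unsigned int (true on the usual GCC targets):

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if 2UL + UINT_MAX still fits in int.
   Where unsigned long is only as wide as unsigned int, the sum wraps
   around to 1, so an "outside the range" diagnostic cannot trigger.  */
int
ul_sum_fits_in_int (void)
{
  unsigned long v = 2UL + UINT_MAX;
  return v <= (unsigned long) INT_MAX;
}

/* Hypothetical helper: the same computation with the ULL suffix.
   Assumes unsigned long long is wider than unsigned int, so the sum
   does not wrap and is always larger than INT_MAX.  */
int
ull_sum_fits_in_int (void)
{
  unsigned long long v = 2ULL + UINT_MAX;
  return v <= (unsigned long long) INT_MAX;
}
```

With the ULL form the value is out of range for int regardless of the width of
long, which is what the adapted tests rely on.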
 



Re: [PATCH 6/6] Add a late-combine pass [PR106594]

2024-06-23 Thread Richard Biener
On Sat, Jun 22, 2024 at 6:50 PM Richard Sandiford
 wrote:
>
> Takayuki 'January June' Suwa  writes:
> > On 2024/06/20 22:34, Richard Sandiford wrote:
> >> This patch adds a combine pass that runs late in the pipeline.
> >> There are two instances: one between combine and split1, and one
> >> after postreload.
> >>
> >> The pass currently has a single objective: remove definitions by
> >> substituting into all uses.  The pre-RA version tries to restrict
> >> itself to cases that are likely to have a neutral or beneficial
> >> effect on register pressure.
> >>
> >> The patch fixes PR106594.  It also fixes a few FAILs and XFAILs
> >> in the aarch64 test results, mostly due to making proper use of
> >> MOVPRFX in cases where we didn't previously.
> >>
> >> This is just a first step.  I'm hoping that the pass could be
> >> used for other combine-related optimisations in future.  In particular,
> >> the post-RA version doesn't need to restrict itself to cases where all
> >> uses are substitutable, since it doesn't have to worry about register
> >> pressure.  If we did that, and if we extended it to handle multi-register
> >> REGs, the pass might be a viable replacement for regcprop, which in
> >> turn might reduce the cost of having a post-RA instance of the new pass.
> >>
> >> On most targets, the pass is enabled by default at -O2 and above.
> >> However, it has a tendency to undo x86's STV and RPAD passes,
> >> by folding the more complex post-STV/RPAD form back into the
> >> simpler pre-pass form.
> >>
> >> Also, running a pass after register allocation means that we can
> >> now match define_insn_and_splits that were previously only matched
> >> before register allocation.  This trips things like:
> >>
> >>(define_insn_and_split "..."
> >>  [...pattern...]
> >>  "...cond..."
> >>  "#"
> >>  "&& 1"
> >>  [...pattern...]
> >>  {
> >>...unconditional use of gen_reg_rtx ()...;
> >>  }
> >>
> >> because matching and splitting after RA will call gen_reg_rtx when
> >> pseudos are no longer allowed.  rs6000 has several instances of this.
> >
> > xtensa also has something like that.
> >
> >> xtensa has a variation in which the split condition is:
> >>
> >>  "&& can_create_pseudo_p ()"
> >>
> >> The failure then is that, if we match after RA, we'll never be
> >> able to split the instruction.
> >
> > > To be honest, I'm confused by the possibility of adding a split pattern
> > application opportunity that depends on the optimization options after
> > Rel... ah, LRA and before the existing rtl-split2.
> >
> > Because I just recently submitted a patch that I expected would reliably
> > (i.e. regardless of optimization options, etc.) apply the split pattern
> > first in the rtl-split2 pass after RA, and it was merged.
> >
> >>
> >> The patch therefore disables the pass by default on i386, rs6000
> >> and xtensa.  Hopefully we can fix those ports later (if their
> >> maintainers want).  It seems easier to add the pass first, though,
> >> to make it easier to test any such fixes.
> >>
> >> gcc.target/aarch64/bitfield-bitint-abi-align{16,8}.c would need
> >> quite a few updates for the late-combine output.  That might be
> >> worth doing, but it seems too complex to do as part of this patch.
> >>
> >> I tried compiling at least one target per CPU directory and comparing
> >> the assembly output for parts of the GCC testsuite.  This is just a way
> >> of getting a flavour of how the pass performs; it obviously isn't a
> >> meaningful benchmark.  All targets seemed to improve on average:
> >>
> > >> Target                Tests   Good   Bad   %Good     Delta  Median
> > >> ======                =====   ====   ===   =====     =====  ======
> > >> aarch64-linux-gnu      2215   1975   240  89.16%     -4159      -1
> > >> aarch64_be-linux-gnu   1569   1483    86  94.52%    -10117      -1
> > >> alpha-linux-gnu        1454   1370    84  94.22%     -9502      -1
> > >> amdgcn-amdhsa          5122   4671   451  91.19%    -35737      -1
> > >> arc-elf                2166   1932   234  89.20%    -37742      -1
> > >> arm-linux-gnueabi      1953   1661   292  85.05%    -12415      -1
> > >> arm-linux-gnueabihf    1834   1549   285  84.46%    -11137      -1
> > >> avr-elf                4789   4330   459  90.42%   -441276      -4
> > >> bfin-elf               2795   2394   401  85.65%    -19252      -1
> > >> bpf-elf                3122   2928   194  93.79%     -8785      -1
> > >> c6x-elf                2227   1929   298  86.62%    -17339      -1
> > >> cris-elf               3464   3270   194  94.40%    -23263      -2
> > >> csky-elf               2915   2591   324  88.89%    -22146      -1
> > >> epiphany-elf           2399   2304    95  96.04%    -28698      -2
> > >> fr30-elf               7712   7299   413  94.64%    -99830      -2
> > >> frv-linux-gnu          3332   2877   455  86.34%    -25108      -1
> > >> ft32-elf               2775   2667   108  96.11%    -25029      -1
> > >> h8300-elf              3176   2862   314  9

[PATCH] tree-optimization/115597 - allow CSE of two-operator VEC_PERM nodes

2024-06-23 Thread Richard Biener
The following makes sure to always CSE when there's SLP_TREE_SCALAR_STMTS
as otherwise a chain of two-operator node operations can result in
exponential behavior of the CSE process as likely seen when building
510.parest on aarch64.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

PR tree-optimization/115597
* tree-vect-slp.cc (vect_cse_slp_nodes): Allow to CSE
VEC_PERM nodes.
---
 gcc/tree-vect-slp.cc | 1 -
 1 file changed, 1 deletion(-)

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 9465d94de1a..212d5f97f7d 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -6085,7 +6085,6 @@ static void
 vect_cse_slp_nodes (scalar_stmts_to_slp_tree_map_t *bst_map, slp_tree& node)
 {
   if (SLP_TREE_DEF_TYPE (node) == vect_internal_def
-  && SLP_TREE_CODE (node) != VEC_PERM_EXPR
   /* Besides some VEC_PERM_EXPR, two-operator nodes also
 lack scalar stmts and thus CSE doesn't work via bst_map.  Ideally
 we'd have sth that works for all internal and external nodes.  */
-- 
2.43.0


[PATCH] Fix test errors introduced with fix for PR115157

2024-06-23 Thread Martin Uecker


This should fix the test failures introduced by the fix for PR115157.

Tested on x86_64 and also tested with -m32.


Fix test errors introduced with fix for PR115157.

Fix tests introduced when fixing PR115157 that assume
sizeof(enum)==sizeof(int) by adding the flag -fno-short-enums.

gcc/testsuite/Changelog:
* gcc.dg/enum-alias-1.c: Add flag.
* gcc.dg/enum-alias-2.c: Add flag.
* gcc.dg/enum-alias-3.c: Add flag.
* gcc.dg/enum-alias-4.c: Add flag.

diff --git a/gcc/testsuite/gcc.dg/enum-alias-1.c b/gcc/testsuite/gcc.dg/enum-alias-1.c
index 8fa30eb7897..725d88580b8 100644
--- a/gcc/testsuite/gcc.dg/enum-alias-1.c
+++ b/gcc/testsuite/gcc.dg/enum-alias-1.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -fno-short-enums" } */
 
 enum E { E1 = -1, E2 = 0, E3 = 1 };
 
diff --git a/gcc/testsuite/gcc.dg/enum-alias-2.c b/gcc/testsuite/gcc.dg/enum-alias-2.c
index 7ca3f3b2db8..d988c5ee698 100644
--- a/gcc/testsuite/gcc.dg/enum-alias-2.c
+++ b/gcc/testsuite/gcc.dg/enum-alias-2.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -fno-short-enums" } */
 
 typedef int A;
 
diff --git a/gcc/testsuite/gcc.dg/enum-alias-3.c b/gcc/testsuite/gcc.dg/enum-alias-3.c
index 36a4f02a455..4dbf0c9293a 100644
--- a/gcc/testsuite/gcc.dg/enum-alias-3.c
+++ b/gcc/testsuite/gcc.dg/enum-alias-3.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O2 -flto" } */
+/* { dg-options "-O2 -flto -fno-short-enums" } */
 
 typedef int *A;
 
diff --git a/gcc/testsuite/gcc.dg/enum-alias-4.c b/gcc/testsuite/gcc.dg/enum-alias-4.c
index b78d0451e3e..a1a8613fca0 100644
--- a/gcc/testsuite/gcc.dg/enum-alias-4.c
+++ b/gcc/testsuite/gcc.dg/enum-alias-4.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -fno-short-enums" } */
 
 typedef int A;
 typedef int __attribute__ (( hardbool(0, 1) )) B;






[PATCH] tree-optimization/115599 - reassoc qsort comparator issue

2024-06-23 Thread Richard Biener
The compare_repeat_factors comparator fails qsort checking eventually
because it uses rf2->rank - rf1->rank to compare unsigned numbers
which causes issues when the difference is interpreted as a negative signed value.

Fixed by re-writing the obvious way.  I've also fixed the count
comparison which suffers from truncation as count is 64bit signed
while the comparator result is 32bit int (that's a lot less likely
to hit in practice though).

The testcase from the PR is too large to include.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

PR tree-optimization/115599
* tree-ssa-reassoc.cc (compare_repeat_factors): Use explicit
compares to avoid truncations.
---
 gcc/tree-ssa-reassoc.cc | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/gcc/tree-ssa-reassoc.cc b/gcc/tree-ssa-reassoc.cc
index 4d9f5216d4c..d74352268b5 100644
--- a/gcc/tree-ssa-reassoc.cc
+++ b/gcc/tree-ssa-reassoc.cc
@@ -6414,10 +6414,17 @@ compare_repeat_factors (const void *x1, const void *x2)
   const repeat_factor *rf1 = (const repeat_factor *) x1;
   const repeat_factor *rf2 = (const repeat_factor *) x2;
 
-  if (rf1->count != rf2->count)
-return rf1->count - rf2->count;
+  if (rf1->count < rf2->count)
+return -1;
+  else if (rf1->count > rf2->count)
+return 1;
+
+  if (rf1->rank < rf2->rank)
+return 1;
+  else if (rf1->rank > rf2->rank)
+return -1;
 
-  return rf2->rank - rf1->rank;
+  return 0;
 }
 
 /* Look for repeated operands in OPS in the multiply tree rooted at
-- 
2.43.0
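
For readers unfamiliar with this failure mode, a minimal standalone sketch
follows.  The struct is a simplified stand-in for repeat_factor; its field
types are assumptions based on the description above, not copied from
tree-ssa-reassoc.cc, and all names are hypothetical.  The comparator mirrors
the explicit-compare form used in the patch:

```c
#include <stdlib.h>

/* Simplified, hypothetical stand-in for GCC's repeat_factor.  */
struct demo_factor
{
  long long count;   /* 64-bit signed, as described above.  */
  unsigned rank;     /* Unsigned, so a difference can exceed INT_MAX.  */
};

/* Sort by ascending count, then by descending rank.  Only the values
   -1, 0 and 1 are returned, so nothing is truncated or wrapped.  */
int
compare_demo_factors (const void *x1, const void *x2)
{
  const struct demo_factor *f1 = (const struct demo_factor *) x1;
  const struct demo_factor *f2 = (const struct demo_factor *) x2;

  if (f1->count < f2->count)
    return -1;
  else if (f1->count > f2->count)
    return 1;

  if (f1->rank < f2->rank)
    return 1;
  else if (f1->rank > f2->rank)
    return -1;

  return 0;
}

/* Convenience wrapper around qsort for the sketch.  */
void
sort_demo_factors (struct demo_factor *v, size_t n)
{
  qsort (v, n, sizeof (*v), compare_demo_factors);
}
```

A subtraction-based comparator would go wrong on such inputs: a count
difference of 1LL << 32 typically truncates to 0 when returned as int, and a
large unsigned rank difference typically converts to a negative int.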


RE: [PATCH v1] Ifcvt: Add cond tree reconcile for truncated .SAT_SUB

2024-06-23 Thread Li, Pan2
> You need to refactor this to add to the stmts pattern def sequence
>  (look for append_pattern_def_seq uses for example)

Thanks Richard, that really saves my day; I will give it a try in v2.

Pan
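
For context, the scalar semantics discussed in this thread can be sketched in
plain C as follows.  This only illustrates a truncated unsigned saturating
subtraction; the function names are hypothetical and it says nothing about how
the vectorizer pattern itself should be built:

```c
#include <stdint.h>

/* SAT_U_SUB = X >= Y ? X - Y : 0, computed in the operand type.  */
uint32_t
sat_u_sub_u32 (uint32_t x, uint32_t y)
{
  return x >= y ? x - y : 0;
}

/* The truncated form: the saturating subtraction happens in the wider
   operand type and the result is then converted to the narrower type.  */
uint16_t
sat_u_sub_trunc_u16 (uint32_t x, uint32_t y)
{
  return (uint16_t) sat_u_sub_u32 (x, y);
}
```

The conversion is a separate step after the subtraction, which corresponds to
the extra conversion stmt mentioned in the quoted review below.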

-Original Message-
From: Richard Biener  
Sent: Saturday, June 22, 2024 9:19 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; 
jeffreya...@gmail.com; rdapp@gmail.com
Subject: Re: [PATCH v1] Ifcvt: Add cond tree reconcile for truncated .SAT_SUB

On Fri, Jun 21, 2024 at 4:45 PM Li, Pan2  wrote:
>
> Thanks Richard for the suggestion. I tried the (convert? form with the gimple
> stmt below but got a missing-definition ICE.
> To double-check: *type_out should be the vector type of lhs, and we only need
> to build one cvt stmt from itype to otype here?  Or should we just return the
> call directly and set type_out to v_otype?
>
> static gimple *
> vect_recog_build_binary_gimple_stmt (vec_info *vinfo, gimple *stmt,
>  internal_fn fn, tree *type_out,
>  tree lhs, tree op_0, tree op_1)
> {
>   tree itype = TREE_TYPE (op_0);
>   tree otype = TREE_TYPE (lhs);
>   tree v_itype = get_vectype_for_scalar_type (vinfo, itype);
>   tree v_otype = get_vectype_for_scalar_type (vinfo, otype);
>
>   if (v_itype != NULL_TREE && v_otype != NULL_TREE
> && direct_internal_fn_supported_p (fn, v_itype, OPTIMIZE_FOR_BOTH))
> {
>   gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1);
>   tree itype_ssa = vect_recog_temp_ssa_var (itype, NULL);
>
>   gimple_call_set_lhs (call, itype_ssa);
>   gimple_call_set_nothrow (call, /* nothrow_p */ false);
>   gimple_set_location (call, gimple_location (stmt));
>
>   *type_out = v_otype;
>   gimple *new_stmt = call;
>
>   if (itype != otype)
> {
>   tree otype_ssa = vect_recog_temp_ssa_var (otype, NULL);
>   new_stmt = gimple_build_assign (otype_ssa, CONVERT_EXPR, itype_ssa);
> }
>
>   return new_stmt;

You need to refactor this to add to the stmts pattern def sequence
(look for append_pattern_def_seq uses for example)

> }
>
>   return NULL;
> }
>
> -cut the ice---
>
> zip.test.c: In function ‘test’:
> zip.test.c:4:6: error: missing definition
> 4 | void test (uint16_t *x, unsigned b, unsigned n)
>   |  ^~~~
> for SSA_NAME: patt_40 in statement:
> vect_cst__151 = [vec_duplicate_expr] patt_40;
> during GIMPLE pass: vect
> dump file: zip.test.c.180t.vect
> zip.test.c:4:6: internal compiler error: verify_ssa failed
> 0x1de0860 verify_ssa(bool, bool)
> 
> /home/pli/gcc/555/riscv-gnu-toolchain/gcc/__RISCV_BUILD__/../gcc/tree-ssa.cc:1203
> 0x1919f69 execute_function_todo
> 
> /home/pli/gcc/555/riscv-gnu-toolchain/gcc/__RISCV_BUILD__/../gcc/passes.cc:2096
> 0x1918b46 do_per_function
> 
> /home/pli/gcc/555/riscv-gnu-toolchain/gcc/__RISCV_BUILD__/../gcc/passes.cc:1688
> 0x191a116 execute_todo
>
> Pan
>
>
> -Original Message-
> From: Richard Biener 
> Sent: Friday, June 21, 2024 5:29 PM
> To: Li, Pan2 
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; 
> jeffreya...@gmail.com; rdapp@gmail.com
> Subject: Re: [PATCH v1] Ifcvt: Add cond tree reconcile for truncated .SAT_SUB
>
> On Fri, Jun 21, 2024 at 10:50 AM Li, Pan2  wrote:
> >
> > Thanks Richard for comments.
> >
> > > to match this by changing it to
> >
> > > /* Unsigned saturation sub, case 2 (branch with ge):
> > >SAT_U_SUB = X >= Y ? X - Y : 0.  */
> > > (match (unsigned_integer_sat_sub @0 @1)
> > > (cond^ (ge @0 @1) (convert? (minus @0 @1)) integer_zerop)
> > >  (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
> > >   && types_match (type, @0, @1
> >
> > Do we need another name for this matching?  Adding (convert? here may change
> > the semantics of .SAT_SUB.
> > When we call gimple_unsigned_integer_sat_sub (lhs, ops, NULL), the converted
> > value returned may differ from (minus @0 @1).  Please correct me if my
> > understanding is wrong.
>
> I think gimple_unsigned_integer_sat_sub (lhs, ...) simply matches
> (typeof LHS).SAT_SUB (ops[0], ops[1]) now, I don't think it's necessary to
> handle the case where typeof LHS and typeof ops[0] are equal specially?
>
> > > and when using the gimple_match_* function make sure to consider
> > > that the .SAT_SUB (@0, @1) is converted to the type of the SSA name
> > > we matched?
> >
> > This may be a problem for the vector part, I guess; it required some
> > additional change in vectorize_convert when I tried that previously.
> > Let me double-check and keep you posted.
>
> You are using gimple_unsigned_integer_sat_sub from pattern recognition, the
> thing to do is simply to add a conversion stmt to the pattern sequence in case
> the types differ?
>
> But maybe I'm missing something.
>
> Richard.
>
> > Pan
> >
> > -Original Message-
> > From: Richard Biener 
> > Sent: Friday, June 21, 2024 3:00 PM
> > To: Li, Pan2 

[committed][RISC-V][PR target/114139] Verify we have a CONST_INT before extracting INTVAL

2024-06-23 Thread Jeff Law
Run-of-the-mill checking issue.  We had something like (plus (reg) 
(reg)) and tried to extract INTVAL (XEXP (x, 1)) which of course blows 
up with checking on.


Fixed thusly.   Tested on riscv32-elf in my tester.  riscv64-elf is in 
flight, but won't finish for a while due to other tasks in flight.


Jeff
commit fd536b8412d4dae42aa04739c06f99a915be6261
Author: Jeff Law 
Date:   Sun Jun 23 08:26:25 2024 -0600

[committed][RISC-V][PR target/114139] Verify we have a CONST_INT before extracting INTVAL

Run-of-the-mill checking issue.  We had something like (plus (reg) (reg)) and
tried to extract INTVAL (XEXP (x, 1)) which of course blows up with checking
on.

Fixed thusly.   Tested on riscv32-elf in my tester.  riscv64-elf is in flight,
but won't finish for a while due to other tasks in flight.

PR target/114139
gcc/
* config/riscv/riscv.cc (riscv_macro_fusion_pair_p): Verify object
is a CONST_INT before looking at INTVAL.

gcc/testsuite/

* gcc.target/riscv/pr114139.c: New test.

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index c17141d909a..5c758b95327 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -9242,6 +9242,7 @@ riscv_macro_fusion_pair_p (rtx_insn *prev, rtx_insn *curr)
  && XINT (SET_SRC (prev_set), 1) == UNSPEC_AUIPC
  && (GET_CODE (SET_SRC (curr_set)) == LO_SUM
  || (GET_CODE (SET_SRC (curr_set)) == PLUS
+ && CONST_INT_P (XEXP (SET_SRC (curr_set), 1))
  && SMALL_OPERAND (INTVAL (XEXP (SET_SRC (curr_set), 1))
 
return true;
diff --git a/gcc/testsuite/gcc.target/riscv/pr114139.c b/gcc/testsuite/gcc.target/riscv/pr114139.c
new file mode 100644
index 000..1d4eeb65f5c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr114139.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fpic -mexplicit-relocs -mcpu=sifive-p450" } */
+
+static void *p;
+extern void *a[];
+void
+baz (void)
+{
+  p = 0;
+}
+
+void bar (void);
+void
+foo (int i)
+{
+  bar ();
+  a[i] = p;
+}
+
+
+double *d;
+void
+foobar (int i)
+{
+  for (; i; ++i)
+d[i] = 1;
+}
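
The general shape of the fix (check the code of an operand before reading a
field that only exists for that code) can be sketched outside of GCC as
follows.  The types and names here are hypothetical stand-ins, not GCC's RTL
API, and the signed 12-bit range is an assumption about what SMALL_OPERAND
accepts:

```c
#include <stdbool.h>

/* Hypothetical, much simplified stand-in for an RTL operand.  */
enum demo_code { DEMO_REG, DEMO_CONST_INT };

struct demo_node
{
  enum demo_code code;
  union
  {
    int regno;     /* Meaningful only when code == DEMO_REG.  */
    long value;    /* Meaningful only when code == DEMO_CONST_INT.  */
  } u;
};

/* Assumed analogue of SMALL_OPERAND: a signed 12-bit immediate.  */
bool
demo_small_operand (long v)
{
  return v >= -2048 && v <= 2047;
}

/* Test the code first; the value is only read for constants, which is
   the role the added CONST_INT_P check plays before INTVAL.  */
bool
demo_is_small_const (const struct demo_node *n)
{
  return n->code == DEMO_CONST_INT && demo_small_operand (n->u.value);
}
```

Because && short-circuits, the value field is never read for a register
operand.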


[to-be-committed][V2][RISC-V] Handle bit manipulation of SImode values

2024-06-23 Thread Jeff Law


Nothing actually changed, just attached the wrong patch last time...

--

Last patch in this round of bitmanip work...  At least I think I'm going 
to pause here and switch gears to other projects that need attention 🙂



This patch introduces the ability to generate bitmanip instructions for 
rv64 when operating on SI objects when we know something about the range 
of the bit position (due to masking of the position).


I've got a note that the (7-pos % 8) bit position form was discovered by 
RAU in 500.perl.  I took that and expanded it to the simple (pos & mask) 
form as well as covering bset, binv and bclr.


As far as the implementation is concerned:

First, this turns the recently added define_splits into define_insn_and_split 
constructs.  This allows combine to "see" enough RTL to realize a sign 
extension is unnecessary.  Otherwise we get undesirable sign extensions 
for the new testcases.


Second, it adds new patterns for the logical operations.  Two patterns 
for IOR/XOR and two patterns for AND.


I think a key concept to keep in mind is that once we determine a Zbs 
operation is safe to perform on a SI value, we can rewrite the RTL in 
64bit form.  If we were ever to try and use range information at expand 
time for this stuff (and we probably should investigate that), that's 
the path I'd suggest.


This is notably cleaner than my original implementation which actually 
kept the more complex RTL form through final and emitted 2/3 
instructions (mask the bit position, then the bset/bclr/binv).



Tested in my tester, but waiting for pre-commit CI to report back before 
taking further action.


Jeff

gcc/


* config/riscv/bitmanip.md (bset splitters): Turn into define_insn_and_splits.
Don't depend on combine splitting the "andn with constant" form.
(bset, binv, bclr with masked bit position): New patterns.

gcc/testsuite
* gcc.target/riscv/binv-for-simode.c: New test.
* gcc.target/riscv/bset-for-simode.c: New test.
* gcc.target/riscv/bclr-for-simode.c: New test.


diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 3eedabffca0..f403ba8dbba 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -615,37 +615,140 @@ (define_insn "*bsetdi_2"
 ;; shift constant.  With the limited range we know the SImode sign
 ;; bit is never set, thus we can treat this as zero extending and
 ;; generate the bsetdi_2 pattern.
-(define_split
-  [(set (match_operand:DI 0 "register_operand")
+(define_insn_and_split ""
+  [(set (match_operand:DI 0 "register_operand" "=r")
(any_extend:DI
 (ashift:SI (const_int 1)
(subreg:QI
- (and:DI (not:DI (match_operand:DI 1 "register_operand"))
+ (and:DI (not:DI (match_operand:DI 1 "register_operand" 
"r"))
  (match_operand 2 "const_int_operand")) 0
-   (clobber (match_operand:DI 3 "register_operand"))]
+   (clobber (match_scratch:X 3 "=&r"))]
   "TARGET_64BIT
&& TARGET_ZBS
&& (TARGET_ZBB || TARGET_ZBKB)
&& (INTVAL (operands[2]) & 0x1f) != 0x1f"
-   [(set (match_dup 0) (and:DI (not:DI (match_dup 1)) (match_dup 2)))
-(set (match_dup 0) (zero_extend:DI (ashift:SI
-  (const_int 1)
-  (subreg:QI (match_dup 0) 0])
+  "#"
+  "&& reload_completed"
+   [(set (match_dup 3) (match_dup 2))
+(set (match_dup 3) (and:DI (not:DI (match_dup 1)) (match_dup 3)))
+(set (match_dup 0) (zero_extend:DI
+(ashift:SI (const_int 1) (match_dup 4]
+  { operands[4] = gen_lowpart (QImode, operands[3]); }
+  [(set_attr "type" "bitmanip")])
 
-(define_split
-  [(set (match_operand:DI 0 "register_operand")
-   (any_extend:DI
+(define_insn_and_split ""
+  [(set (match_operand:DI 0 "register_operand" "=r")
+(any_extend:DI
 (ashift:SI (const_int 1)
(subreg:QI
- (and:DI (match_operand:DI 1 "register_operand")
+ (and:DI (match_operand:DI 1 "register_operand" "r")
  (match_operand 2 "const_int_operand")) 0]
   "TARGET_64BIT
&& TARGET_ZBS
&& (INTVAL (operands[2]) & 0x1f) != 0x1f"
-   [(set (match_dup 0) (and:DI (match_dup 1) (match_dup 2)))
-(set (match_dup 0) (zero_extend:DI (ashift:SI
-  (const_int 1)
-  (subreg:QI (match_dup 0) 0])
+  "#"
+  "&& 1"
+  [(set (match_dup 0) (and:DI (match_dup 1) (match_dup 2)))
+   (set (match_dup 0) (zero_extend:DI (ashift:SI
+(const_int 1)
+(subreg:QI (match_dup 0) 0]
+  { }
+  [(set_attr "type" "bitmanip")])
+
+;; Similarly two patterns for IOR/XOR generating bset/binv to
+;; manipulate a bit in a register
+(define_insn_and_split ""
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (any

Re: [PATCH 7/8] vect: Support multiple lane-reducing operations for loop reduction [PR114440]

2024-06-23 Thread Feng Xue OS
>> -  if (slp_node)
>> +  if (slp_node && SLP_TREE_LANES (slp_node) > 1)
> 
> Hmm, that looks wrong.  It looks like SLP_TREE_NUMBER_OF_VEC_STMTS is off
> instead, which is bad.
> 
>> nvectors = SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node);
>>else
>> nvectors = vect_get_num_copies (loop_vinfo, vectype_in);
>> @@ -7478,6 +7472,152 @@ vect_reduction_update_partial_vector_usage (loop_vec_info loop_vinfo,
>>  }
>>  }
>>
>> +/* Check if STMT_INFO is a lane-reducing operation that can be vectorized in
>> +   the context of LOOP_VINFO, and vector cost will be recorded in COST_VEC.
>> +   Now there are three such kinds of operations: dot-prod/widen-sum/sad
>> +   (sum-of-absolute-differences).
>> +
>> +   For a lane-reducing operation, the loop reduction path that it lies in,
>> +   may contain normal operation, or other lane-reducing operation of different
>> +   input type size, an example as:
>> +
>> + int sum = 0;
>> + for (i)
>> +   {
>> + ...
>> + sum += d0[i] * d1[i];   // dot-prod 
>> + sum += w[i];// widen-sum 
>> + sum += abs(s0[i] - s1[i]);  // sad 
>> + sum += n[i];// normal 
>> + ...
>> +   }
>> +
>> +   Vectorization factor is essentially determined by operation whose input
>> +   vectype has the most lanes ("vector(16) char" in the example), while we
>> +   need to choose input vectype with the least lanes ("vector(4) int" in the
>> +   example) for the reduction PHI statement.  */
>> +
>> +bool
>> +vectorizable_lane_reducing (loop_vec_info loop_vinfo, stmt_vec_info stmt_info,
>> +   slp_tree slp_node, stmt_vector_for_cost *cost_vec)
>> +{
>> +  gimple *stmt = stmt_info->stmt;
>> +
>> +  if (!lane_reducing_stmt_p (stmt))
>> +return false;
>> +
>> +  tree type = TREE_TYPE (gimple_assign_lhs (stmt));
>> +
>> +  if (!INTEGRAL_TYPE_P (type) && !SCALAR_FLOAT_TYPE_P (type))
>> +return false;
>> +
>> +  /* Do not try to vectorize bit-precision reductions.  */
>> +  if (!type_has_mode_precision_p (type))
>> +return false;
>> +
>> +  if (!slp_node)
>> +return false;
>> +
>> +  for (int i = 0; i < (int) gimple_num_ops (stmt) - 1; i++)
>> +{
>> +  stmt_vec_info def_stmt_info;
>> +  slp_tree slp_op;
>> +  tree op;
>> +  tree vectype;
>> +  enum vect_def_type dt;
>> +
>> +  if (!vect_is_simple_use (loop_vinfo, stmt_info, slp_node, i, &op,
>> +  &slp_op, &dt, &vectype, &def_stmt_info))
>> +   {
>> + if (dump_enabled_p ())
>> +   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>> +"use not simple.\n");
>> + return false;
>> +   }
>> +
>> +  if (!vectype)
>> +   {
>> + vectype = get_vectype_for_scalar_type (loop_vinfo, TREE_TYPE (op),
>> +slp_op);
>> + if (!vectype)
>> +   return false;
>> +   }
>> +
>> +  if (!vect_maybe_update_slp_op_vectype (slp_op, vectype))
>> +   {
>> + if (dump_enabled_p ())
>> +   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>> +"incompatible vector types for invariants\n");
>> + return false;
>> +   }
>> +
>> +  if (i == STMT_VINFO_REDUC_IDX (stmt_info))
>> +   continue;
>> +
>> +  /* There should be at most one cycle def in the stmt.  */
>> +  if (VECTORIZABLE_CYCLE_DEF (dt))
>> +   return false;
>> +}
>> +
>> +  stmt_vec_info reduc_info = STMT_VINFO_REDUC_DEF (vect_orig_stmt (stmt_info));
>> +
>> +  /* TODO: Support lane-reducing operation that does not directly participate
>> + in loop reduction. */
>> +  if (!reduc_info || STMT_VINFO_REDUC_IDX (stmt_info) < 0)
>> +return false;
>> +
>> +  /* Lane-reducing pattern inside any inner loop of LOOP_VINFO is not
>> + recognized.  */
>> +  gcc_assert (STMT_VINFO_DEF_TYPE (reduc_info) == vect_reduction_def);
>> +  gcc_assert (STMT_VINFO_REDUC_TYPE (reduc_info) == TREE_CODE_REDUCTION);
>> +
>> +  tree vectype_in = STMT_VINFO_REDUC_VECTYPE_IN (stmt_info);
>> +  int ncopies_for_cost;
>> +
>> +  if (SLP_TREE_LANES (slp_node) > 1)
>> +{
>> +  /* Now lane-reducing operations in a non-single-lane slp node should only
>> +come from the same loop reduction path.  */
>> +  gcc_assert (REDUC_GROUP_FIRST_ELEMENT (stmt_info));
>> +  ncopies_for_cost = 1;
>> +}
>> +  else
>> +{
>> +  ncopies_for_cost = vect_get_num_copies (loop_vinfo, vectype_in);
> 
> OK, so the fact that the ops are lane-reducing means they effectively
> change the VF for the result.  That's only possible as we tightly control
> code generation and "adjust" to the expected VF (by inserting the copies
> you mentioned above), but only up to the highest number of outputs
> created in the reduction chain.  In that sense instead of talking and 
> re

[to-be-commited][RISC-V] Fix unrecognizable pattern in riscv_expand_conditional_move()

2024-06-23 Thread Jeff Law
This is Artemiy's patch, just reformatted so that pre-commit can apply 
and test it.  The patch looks good to me and it's spinning in my tester 
as well.


--

Presently, the code fragment:

int x[5];

void
d(int a, int b, int c) {
  for (int i = 0; i < 5; i++)
x[i] = (a != b) ? c : a;
}

causes an ICE when compiled with -O2 -march=rv32i_zicond:

test.c: In function 'd':
test.c: error: unrecognizable insn:
   11 | }
  | ^
(insn 8 5 9 2 (set (reg:SI 139 [ iftmp.0_2 ])
(if_then_else:SI (ne:SI (reg/v:SI 136 [ a ])
(reg/v:SI 137 [ b ]))
(reg/v:SI 136 [ a ])
(reg/v:SI 138 [ c ]))) -1
 (nil))
during RTL pass: vregs

This happens because, as part of one of the optimizations in
riscv_expand_conditional_move(), an if_then_else is generated with both
comparands being register operands, resulting in an unmatchable insn since
Zicond patterns require constant 0 as the second comparand.  Fix this by adding
an extra check before performing this optimization.

The code snippet mentioned above is also included in this patch as a new Zicond
testcase.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_expand_conditional_move): Add a
CONST0_RTX check.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zicond-ice-3.c: New test.

Signed-off-by: Artemiy Volkov 
---
 gcc/config/riscv/riscv.cc |  3 ++-
 gcc/testsuite/gcc.target/riscv/zicond-ice-3.c | 11 +++
 2 files changed, 13 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zicond-ice-3.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 5c758b95327..cca7ffde33a 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -5111,8 +5111,9 @@ riscv_expand_conditional_move (rtx dest, rtx op, rtx cons, rtx alt)
   /* reg, reg  */
   else if (REG_P (cons) && REG_P (alt))
{
- if ((code == EQ && rtx_equal_p (cons, op0))
+ if (((code == EQ && rtx_equal_p (cons, op0))
   || (code == NE && rtx_equal_p (alt, op0)))
+ && op1 == CONST0_RTX (mode))
{
  rtx cond = gen_rtx_fmt_ee (code, GET_MODE (op0), op0, op1);
  alt = force_reg (mode, alt);
diff --git a/gcc/testsuite/gcc.target/riscv/zicond-ice-5.c b/gcc/testsuite/gcc.target/riscv/zicond-ice-5.c
new file mode 100644
index 000..ac6049c9ae5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zicond-ice-5.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zicond -mabi=lp64d" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_zicond -mabi=ilp32f" { target { rv32 } } } */
+
+int x[5];
+
+void
+d(int a, int b, int c) {
+  for (int i = 0; i < 5; i++)
+x[i] = (a != b) ? c : a;
+}


[PATCH] Fix MinGW option -mcrtdll=

2024-06-23 Thread Pali Rohár
Add missing msvcr40* and msvcrtd* cases to CPP_SPEC and
document missing _UCRT macro and msvcr71* case.

Fixes commit 453cb585f0f8673a5d69d1b420ffd4b3f53aca00.
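
As an illustration of the prefix matching expressed by the "*" wildcards, the
following C sketch looks up the __MSVCRT_VERSION__ value for a library name.
It covers only the entries visible in the CPP_SPEC hunks of this patch; the
names crt_entry, crt_table and crt_version_for are hypothetical and do not
exist in GCC.

```c
#include <stddef.h>
#include <string.h>

/* One "%{mcrtdll=PREFIX*:-D__MSVCRT_VERSION__=VERSION}" entry.  */
struct crt_entry
{
  const char *prefix;
  unsigned version;
};

/* Entries copied from the hunk context plus the two added lines.  */
static const struct crt_entry crt_table[] = {
  { "msvcrt10", 0x100 },
  { "msvcrt20", 0x200 },
  { "msvcrt40", 0x400 },
  { "msvcr40", 0x400 },   /* added by this patch */
  { "msvcrtd", 0x600 },   /* added by this patch */
  { "msvcrt-os", 0x700 },
  { "msvcr70", 0x700 },
  { "msvcr71", 0x701 },
};

/* Return the version for the first entry whose prefix matches LIBRARY,
   or 0 if none of the entries listed above applies.  */
unsigned crt_version_for (const char *library)
{
  for (size_t i = 0; i < sizeof crt_table / sizeof crt_table[0]; i++)
    {
      size_t len = strlen (crt_table[i].prefix);
      if (strncmp (library, crt_table[i].prefix, len) == 0)
        return crt_table[i].version;
    }
  return 0;
}
```

With this table, a name such as msvcr40d resolves to 0x400 and msvcrtd to
0x600, which are the two cases the patch adds.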

gcc/
 * config/i386/mingw-w64.h (CPP_SPEC): Add missing -mcrtdll=
 cases: msvcr40*, msvcrtd*.
 * config/mingw/mingw32.h (CPP_SPEC): Add missing -mcrtdll=
 cases: msvcr40*, msvcrtd*.
 * doc/invoke.texi: Add missing -mcrtdll= cases: msvcr40*,
 msvcrtd*, msvcr71*. Express wildcards with *. Document _UCRT.
---
 gcc/config/i386/mingw-w64.h |  2 ++
 gcc/config/mingw/mingw32.h  |  2 ++
 gcc/doc/invoke.texi | 13 +++--
 3 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/gcc/config/i386/mingw-w64.h b/gcc/config/i386/mingw-w64.h
index dde26413e221..0a9986c44d40 100644
--- a/gcc/config/i386/mingw-w64.h
+++ b/gcc/config/i386/mingw-w64.h
@@ -30,6 +30,8 @@ along with GCC; see the file COPYING3.  If not see
 "%{mcrtdll=msvcrt10*:-D__MSVCRT_VERSION__=0x100} " \
 "%{mcrtdll=msvcrt20*:-D__MSVCRT_VERSION__=0x200} " \
 "%{mcrtdll=msvcrt40*:-D__MSVCRT_VERSION__=0x400} " \
+"%{mcrtdll=msvcr40*:-D__MSVCRT_VERSION__=0x400} " \
+"%{mcrtdll=msvcrtd*:-D__MSVCRT_VERSION__=0x600} " \
 "%{mcrtdll=msvcrt-os*:-D__MSVCRT_VERSION__=0x700} " \
 "%{mcrtdll=msvcr70*:-D__MSVCRT_VERSION__=0x700} " \
 "%{mcrtdll=msvcr71*:-D__MSVCRT_VERSION__=0x701} " \
diff --git a/gcc/config/mingw/mingw32.h b/gcc/config/mingw/mingw32.h
index fa6e307476ca..da8e1e8949e7 100644
--- a/gcc/config/mingw/mingw32.h
+++ b/gcc/config/mingw/mingw32.h
@@ -99,6 +99,8 @@ along with GCC; see the file COPYING3.  If not see
 "%{mcrtdll=msvcrt10*:-D__MSVCRT_VERSION__=0x100} " \
 "%{mcrtdll=msvcrt20*:-D__MSVCRT_VERSION__=0x200} " \
 "%{mcrtdll=msvcrt40*:-D__MSVCRT_VERSION__=0x400} " \
+"%{mcrtdll=msvcr40*:-D__MSVCRT_VERSION__=0x400} " \
+"%{mcrtdll=msvcrtd*:-D__MSVCRT_VERSION__=0x600} " \
 "%{mcrtdll=msvcrt-os*:-D__MSVCRT_VERSION__=0x700} " \
 "%{mcrtdll=msvcr70*:-D__MSVCRT_VERSION__=0x700} " \
 "%{mcrtdll=msvcr71*:-D__MSVCRT_VERSION__=0x701} " \
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c790e2f35184..d715430dc54f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -36453,13 +36453,14 @@ enabled by default on those targets.
 @opindex mcrtdll
 @item -mcrtdll=@var{library}
 Preprocess, compile or link with specified C RunTime DLL @var{library}.
-This option adjust predefined macros @code{__CRTDLL__}, @code{__MSVCRT__}
-and @code{__MSVCRT_VERSION__} for specified CRT @var{library}, choose
-start file for CRT @var{library} and link with CRT @var{library}.
+This option adjust predefined macros @code{__CRTDLL__}, @code{__MSVCRT__},
+@code{_UCRT} and @code{__MSVCRT_VERSION__} for specified CRT @var{library},
+choose start file for CRT @var{library} and link with CRT @var{library}.
 Recognized CRT library names for proprocessor are:
-@code{crtdll}, @code{msvcrt10}, @code{msvcrt20}, @code{msvcrt40},
-@code{msvcrt-os}, @code{msvcr70}, @code{msvcr80}, @code{msvcr90},
-@code{msvcr100}, @code{msvcr110}, @code{msvcr120} and @code{ucrt}.
+@code{crtdll*}, @code{msvcrt10*}, @code{msvcrt20*}, @code{msvcrt40*},
+@code{msvcr40*}, @code{msvcrtd*}, @code{msvcrt-os*},
+@code{msvcr70*}, @code{msvcr71*}, @code{msvcr80*}, @code{msvcr90*},
+@code{msvcr100*}, @code{msvcr110*}, @code{msvcr120*} and @code{ucrt*}.
 If this options is not specified then the default MinGW import library
 @code{msvcrt} is used for linking and no other adjustment for
 preprocessor is done. MinGW import library @code{msvcrt} is just a
-- 
2.20.1



[C PATCH] C: Error message for incorrect use of static in array declarations

2024-06-23 Thread Martin Uecker


This adds an explicit error message for [static] and [static*] 
(the same as clang has) instead of the generic "error: expected
expression before ']' token", which is not entirely accurate.
For function definitions the subsequent error "[*] can not be
used outside function prototype scope" is then suppressed.
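
For reference, a small sketch of the accepted forms next to the rejected ones.
The rejected declarations are kept in a comment so that the snippet compiles,
and the function names are made up for illustration.

```c
#include <stddef.h>

/* Valid: 'static' together with a size expression tells the compiler
   that buf points to at least n elements.  */
size_t count_nonzero (size_t n, const char buf[static n])
{
  size_t count = 0;
  for (size_t i = 0; i < n; i++)
    if (buf[i] != 0)
      count++;
  return count;
}

/* Valid: 'static' with a constant size.  */
char first_of_four (const char buf[static 4])
{
  return buf[0];
}

/* Rejected with the new diagnostics:
     void f (char buf[static]);      'static' without an array size
     void g (char buf[static *]);    'static' with an unspecified
                                     variable length array size  */
```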


Bootstrapped and regression tested on x86_64.



commit 1157d04764eeeb51fa1098727813dbc092e11dd2
Author: Martin Uecker 
Date:   Sat Nov 4 14:39:19 2023 +0100

C: Error message for incorrect use of static in array declarations.

Add an explicit error message when C99's static is
used without a size expression in an array declarator.

gcc/:
c/c-parser.cc (c_parser_direct_declarator_inner): Add
error message.

gcc/testsuite:
gcc.dg/c99-arraydecl-4.c: New test.

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index e83e9c683f7..91b8d24ca78 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -4732,41 +4732,29 @@ c_parser_direct_declarator_inner (c_parser *parser, bool id_present,
false, false, false, false, cla_prefer_id);
   if (!quals_attrs->declspecs_seen_p)
quals_attrs = NULL;
-  /* If "static" is present, there must be an array dimension.
-Otherwise, there may be a dimension, "*", or no
-dimension.  */
-  if (static_seen)
+
+  star_seen = false;
+  if (c_parser_next_token_is (parser, CPP_MULT)
+ && c_parser_peek_2nd_token (parser)->type == CPP_CLOSE_SQUARE)
{
- star_seen = false;
- dimen = c_parser_expr_no_commas (parser, NULL);
+ star_seen = true;
+ c_parser_consume_token (parser);
}
-  else
+  else if (!c_parser_next_token_is (parser, CPP_CLOSE_SQUARE))
+   dimen = c_parser_expr_no_commas (parser, NULL);
+
+  if (static_seen && star_seen)
{
- if (c_parser_next_token_is (parser, CPP_CLOSE_SQUARE))
-   {
- dimen.value = NULL_TREE;
- star_seen = false;
-   }
- else if (c_parser_next_token_is (parser, CPP_MULT))
-   {
- if (c_parser_peek_2nd_token (parser)->type == CPP_CLOSE_SQUARE)
-   {
- dimen.value = NULL_TREE;
- star_seen = true;
- c_parser_consume_token (parser);
-   }
- else
-   {
- star_seen = false;
- dimen = c_parser_expr_no_commas (parser, NULL);
-   }
-   }
- else
-   {
- star_seen = false;
- dimen = c_parser_expr_no_commas (parser, NULL);
-   }
+ error_at (c_parser_peek_token (parser)->location,
+   "%<static%> may not be used with an unspecified "
+   "variable length array size");
+ /* Prevent further errors. */
+ star_seen = false;
}
+  else if (static_seen && NULL_TREE == dimen.value)
+   error_at (c_parser_peek_token (parser)->location,
+ "%<static%> may not be used without an array size");
+
   if (c_parser_next_token_is (parser, CPP_CLOSE_SQUARE))
c_parser_consume_token (parser);
   else
diff --git a/gcc/testsuite/gcc.dg/c99-arraydecl-4.c b/gcc/testsuite/gcc.dg/c99-arraydecl-4.c
new file mode 100644
index 000..bfc26196433
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c99-arraydecl-4.c
@@ -0,0 +1,15 @@
+/* { dg-do "compile" } */
+/* { dg-options "-std=c99 -pedantic-errors" } */
+
+void fo(char buf[static]); /* { dg-error "'static' may not be used without an array size" } */
+void fo(char buf[static]) { }  /* { dg-error "'static' may not be used without an array size" } */
+
+void fu(char buf[static *]);   /* { dg-error "'static' may not be used with an unspecified variable length array size" } */
+void fu(char buf[static *]) { }/* { dg-error "'static' may not be used with an unspecified variable length array size" } */
+
+void fe(int n, char buf[static n]);
+void fe(int n, char buf[static *]) { } /* { dg-error "'static' may not be used with an unspecified variable length array size" } */
+
+void fa(int *n, char buf[static *n]);
+void fa(int *n, char buf[static *n]) { }
+







Re: [PATCH] RISC-V: Fix unrecognizable pattern in riscv_expand_conditional_move()

2024-06-23 Thread Jeff Law




On 6/21/24 8:46 AM, Artemiy Volkov wrote:

Presently, the code fragment:

int x[5];

void
d(int a, int b, int c) {
   for (int i = 0; i < 5; i++)
 x[i] = (a != b) ? c : a;
}

causes an ICE when compiled with -O2 -march=rv32i_zicond:

test.c: In function 'd':
test.c: error: unrecognizable insn:
11 | }
   | ^
(insn 8 5 9 2 (set (reg:SI 139 [ iftmp.0_2 ])
 (if_then_else:SI (ne:SI (reg/v:SI 136 [ a ])
 (reg/v:SI 137 [ b ]))
 (reg/v:SI 136 [ a ])
 (reg/v:SI 138 [ c ]))) -1
  (nil))
during RTL pass: vregs

This happens because, as part of one of the optimizations in
riscv_expand_conditional_move(), an if_then_else is generated with both
comparands being register operands, resulting in an unmatchable insn since
Zicond patterns require constant 0 as the second comparand.  Fix this by adding
an extra check before performing this optimization.

The code snippet mentioned above is also included in this patch as a new Zicond
testcase.

gcc/ChangeLog:

 * config/riscv/riscv.cc (riscv_expand_conditional_move): Add a
 CONST0_RTX check.

gcc/testsuite/ChangeLog:

 * gcc.target/riscv/zicond-ice-3.c: New test.

I've pushed this to the trunk.  Thanks Artemiy!

Jeff



[PATCH] Fortran: fix passing of optional dummy as actual to optional argument [PR55978]

2024-06-23 Thread Harald Anlauf
Dear all,

the attached patch fixes issues exhibited by the testcase in comment#19 of 
PR55978.

First, when passing an allocatable optional dummy array to an optional dummy,
we need to prevent accessing the data component of the array when the argument
is not present, and pass a null pointer instead.  This is straightforward.

Second, the case of a missing pointer optional dummy array should have worked,
but the presence check surprisingly did not work as expected at -O0 or -Og,
only at higher optimization levels.  Interestingly, the tree dump looked right,
but running under gdb or inspecting the assembler revealed that the order
of tests in a logical AND expression was the opposite of what the tree dump
showed.  Replacing TRUTH_AND_EXPR by TRUTH_ANDIF_EXPR and checking the optimized
dump confirmed that this does fix the issue.

Note that the tree dump is not changed by this replacement.  Does this mean
that AND and ANDIF currently are not differentiated at this level?
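
In C terms, TRUTH_ANDIF_EXPR corresponds to the short-circuit "&&", while
TRUTH_AND_EXPR always evaluates its second operand.  The sketch below makes
the difference observable with a counter instead of an actual invalid load;
the struct and function names are hypothetical and only model the situation
of an absent optional argument.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for an array descriptor with a data component.  */
struct descriptor
{
  int *data;
};

/* Counts how often the data component is inspected.  */
int loads;

/* Stand-in for the "data != NULL" test.  The NULL guard keeps the
   sketch well defined; generated code without it would read through
   an absent argument.  */
bool data_is_set (const struct descriptor *desc)
{
  loads++;
  return desc != NULL && desc->data != NULL;
}

/* ANDIF-like: the data test only runs if the argument is present.  */
bool present_andif (const struct descriptor *desc)
{
  return (desc != NULL) && data_is_set (desc);
}

/* AND-like: both operands are evaluated, in no guaranteed order.  */
bool present_and (const struct descriptor *desc)
{
  return (desc != NULL) & data_is_set (desc);
}
```

Calling both functions with a null pointer returns false in either case, but
only the AND-like variant touches the descriptor, which matches the failure
described above.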

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Would it be ok to backport this to 14-branch, too?

Thanks,
Harald

From 94e4c66d8374a12be38637620f362acf1fba5343 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Sun, 23 Jun 2024 22:36:43 +0200
Subject: [PATCH] Fortran: fix passing of optional dummy as actual to optional
 argument [PR55978]

gcc/fortran/ChangeLog:

	PR fortran/55978
	* trans-array.cc (gfc_conv_array_parameter): Do not dereference
	data component of a missing allocatable dummy array argument for
	passing as actual to optional dummy.  Harden logic of presence
	check for optional pointer dummy by using TRUTH_ANDIF_EXPR instead
	of TRUTH_AND_EXPR.

gcc/testsuite/ChangeLog:

	PR fortran/55978
	* gfortran.dg/optional_absent_12.f90: New test.
---
 gcc/fortran/trans-array.cc| 20 ++---
 .../gfortran.dg/optional_absent_12.f90| 30 +++
 2 files changed, 46 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/optional_absent_12.f90

diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc
index 19d69aec9c0..26237f43bec 100644
--- a/gcc/fortran/trans-array.cc
+++ b/gcc/fortran/trans-array.cc
@@ -8703,6 +8703,10 @@ gfc_conv_array_parameter (gfc_se * se, gfc_expr * expr, bool g77,
 	&& (sym->backend_decl != parent))
 this_array_result = false;

+  /* Passing an optional dummy argument as actual to an optional dummy?  */
+  bool pass_optional;
+  pass_optional = fsym && fsym->attr.optional && sym && sym->attr.optional;
+
   /* Passing address of the array if it is not pointer or assumed-shape.  */
   if (full_array_var && g77 && !this_array_result
   && sym->ts.type != BT_DERIVED && sym->ts.type != BT_CLASS)
@@ -8740,6 +8744,14 @@ gfc_conv_array_parameter (gfc_se * se, gfc_expr * expr, bool g77,
 	  if (size)
 	array_parameter_size (&se->pre, tmp, expr, size);
 	  se->expr = gfc_conv_array_data (tmp);
+	  if (pass_optional)
+	{
+	  tree cond = gfc_conv_expr_present (sym);
+	  se->expr = build3_loc (input_location, COND_EXPR,
+ TREE_TYPE (se->expr), cond, se->expr,
+ fold_convert (TREE_TYPE (se->expr),
+		   null_pointer_node));
+	}
   return;
 }
 }
@@ -8989,8 +9001,8 @@ gfc_conv_array_parameter (gfc_se * se, gfc_expr * expr, bool g77,
 	  tmp = fold_build2_loc (input_location, NE_EXPR, logical_type_node,
  fold_convert (TREE_TYPE (tmp), ptr), tmp);

-	  if (fsym && fsym->attr.optional && sym && sym->attr.optional)
-	tmp = fold_build2_loc (input_location, TRUTH_AND_EXPR,
+	  if (pass_optional)
+	tmp = fold_build2_loc (input_location, TRUTH_ANDIF_EXPR,
    logical_type_node,
    gfc_conv_expr_present (sym), tmp);

@@ -9024,8 +9036,8 @@ gfc_conv_array_parameter (gfc_se * se, gfc_expr * expr, bool g77,
   tmp = fold_build2_loc (input_location, NE_EXPR, logical_type_node,
 			 fold_convert (TREE_TYPE (tmp), ptr), tmp);

-  if (fsym && fsym->attr.optional && sym && sym->attr.optional)
-	tmp = fold_build2_loc (input_location, TRUTH_AND_EXPR,
+  if (pass_optional)
+	tmp = fold_build2_loc (input_location, TRUTH_ANDIF_EXPR,
 			   logical_type_node,
 			   gfc_conv_expr_present (sym), tmp);

diff --git a/gcc/testsuite/gfortran.dg/optional_absent_12.f90 b/gcc/testsuite/gfortran.dg/optional_absent_12.f90
new file mode 100644
index 000..1e61d91fb6d
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/optional_absent_12.f90
@@ -0,0 +1,30 @@
+! { dg-do run }
+! { dg-additional-options "-fcheck=array-temps" }
+!
+! PR fortran/55978 - comment#19
+!
+! Test passing of (missing) optional dummy to optional array argument
+
+program test
+  implicit none
+  integer, pointer :: p(:) => null()
+  call one (p)
+  call one (null())
+  call one ()
+  call three ()
+contains
+  subroutine one (y)
+integer, pointer, optional, intent(in) :: y(:)
+call two (y)
+  end subroutine one
+
+  subroutine three (z)
+integer, allocatable, optional, intent(in) ::

[PATCH V2] [x86] Optimize a < 0 ? -1 : 0 to (signed)a >> 31.

2024-06-23 Thread liuhongt
> I think the check for TYPE_UNSIGNED should be of TREE_TYPE (@0) rather
> than type here.

Changed

> Or maybe you need `types_match (type, TREE_TYPE (@0))` too.
And use tree_nop_conversion_p (type, TREE_TYPE (@0)) and add view_convert to 
rshift.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?


Try to optimize x < 0 ? -1 : 0 into (signed) x >> 31
and x < 0 ? 1 : 0 into (unsigned) x >> 31.
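
The identity can be checked per element with plain scalars.  The following
sketch uses 32-bit integers and hypothetical function names; it relies on GCC
treating a right shift of a negative signed value as sign-extending, which
ISO C leaves implementation-defined.  The patch itself applies the rewrite to
vector integer types only.

```c
#include <stdint.h>

/* x < 0 ? -1 : 0, written as a select.  */
int32_t sign_mask_select (int32_t x)
{
  return x < 0 ? -1 : 0;
}

/* The same value via an arithmetic right shift by precision - 1.  */
int32_t sign_mask_shift (int32_t x)
{
  return x >> 31;
}

/* x < 0 ? 1 : 0, written as a select.  */
int32_t sign_bit_select (int32_t x)
{
  return x < 0 ? 1 : 0;
}

/* The same value via a logical right shift on the unsigned type.  */
int32_t sign_bit_shift (int32_t x)
{
  return (int32_t) ((uint32_t) x >> 31);
}
```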

Move the optimization done in ix86_expand_int_vcond to match.pd.

gcc/ChangeLog:

PR target/114189
* match.pd: Simplify a < 0 ? -1 : 0 to (signed) a >> 31 and a <
0 ? 1 : 0 to (unsigned) a >> 31 for vector integer type.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx2-pr115517.c: New test.
* gcc.target/i386/avx512-pr115517.c: New test.
* g++.target/i386/avx2-pr115517.C: New test.
* g++.target/i386/avx512-pr115517.C: New test.
* g++.dg/tree-ssa/pr88152-1.C: Adjust testcase.
---
 gcc/match.pd  | 31 
 gcc/testsuite/g++.dg/tree-ssa/pr88152-1.C |  2 +-
 gcc/testsuite/g++.target/i386/avx2-pr115517.C | 60 
 .../g++.target/i386/avx512-pr115517.C | 70 +++
 gcc/testsuite/gcc.target/i386/avx2-pr115517.c | 33 +
 .../gcc.target/i386/avx512-pr115517.c | 70 +++
 6 files changed, 265 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.target/i386/avx2-pr115517.C
 create mode 100644 gcc/testsuite/g++.target/i386/avx512-pr115517.C
 create mode 100644 gcc/testsuite/gcc.target/i386/avx2-pr115517.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512-pr115517.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 3d0689c9312..1d10451d0de 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5927,6 +5927,37 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(if (VECTOR_INTEGER_TYPE_P (type)
&& target_supports_op_p (type, MINMAX, optab_vector))
 (minmax @0 @1
+
+/* Try to optimize x < 0 ? -1 : 0 into (signed) x >> 31
+   and x < 0 ? 1 : 0 into (unsigned) x >> 31.  */
+(simplify
+  (vec_cond (lt @0 integer_zerop) integer_all_onesp integer_zerop)
+   (if (VECTOR_INTEGER_TYPE_P (TREE_TYPE (@0))
+   && !TYPE_UNSIGNED (TREE_TYPE (@0))
+   && tree_nop_conversion_p (type, TREE_TYPE (@0))
+   && target_supports_op_p (TREE_TYPE (@0), RSHIFT_EXPR, optab_scalar))
+(with
+  {
+   unsigned int prec = element_precision (TREE_TYPE (@0));
+  }
+(view_convert:type
+  (rshift @0 { build_int_cst (integer_type_node, prec - 1);})
+
+(simplify
+  (vec_cond (lt @0 integer_zerop) integer_onep integer_zerop)
+   (if (VECTOR_INTEGER_TYPE_P (TREE_TYPE (@0))
+   && !TYPE_UNSIGNED (TREE_TYPE (@0))
+   && tree_nop_conversion_p (type, TREE_TYPE (@0))
+   && target_supports_op_p (unsigned_type_for (TREE_TYPE (@0)),
+   RSHIFT_EXPR, optab_scalar))
+(with
+  {
+   unsigned int prec = element_precision (TREE_TYPE (@0));
+   tree utype = unsigned_type_for (TREE_TYPE (@0));
+  }
+(view_convert:type
+  (rshift (view_convert:utype @0)
+ { build_int_cst (integer_type_node, prec - 1);})
 #endif
 
 (for cnd (cond vec_cond)
diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr88152-1.C b/gcc/testsuite/g++.dg/tree-ssa/pr88152-1.C
index 423ec897c1d..21299b886f0 100644
--- a/gcc/testsuite/g++.dg/tree-ssa/pr88152-1.C
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr88152-1.C
@@ -1,7 +1,7 @@
 // PR target/88152
 // { dg-do compile }
 // { dg-options "-O2 -std=c++14 -fdump-tree-forwprop1" }
-// { dg-final { scan-tree-dump-times " (?:<|>=) \{ 0\[, ]" 120 "forwprop1" } }
+// { dg-final { scan-tree-dump-times " (?:(?:<|>=) \{ 0\[, \]|>> (?:7|15|31|63))" 120 "forwprop1" } }
 
 template 
 using V [[gnu::vector_size (sizeof (T) * N)]] = T;
diff --git a/gcc/testsuite/g++.target/i386/avx2-pr115517.C b/gcc/testsuite/g++.target/i386/avx2-pr115517.C
new file mode 100644
index 000..ec000c57542
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/avx2-pr115517.C
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx2 -O2" } */
+/* { dg-final { scan-assembler-times "vpsrlq" 2 } } */
+/* { dg-final { scan-assembler-times "vpsrld" 2 } } */
+/* { dg-final { scan-assembler-times "vpsrlw" 2 } } */
+
+typedef short v8hi __attribute__((vector_size(16)));
+typedef short v16hi __attribute__((vector_size(32)));
+typedef int v4si __attribute__((vector_size(16)));
+typedef int v8si __attribute__((vector_size(32)));
+typedef long long v2di __attribute__((vector_size(16)));
+typedef long long v4di __attribute__((vector_size(32)));
+
+v8hi
+foo (v8hi a)
+{
+  v8hi const1_op = __extension__(v8hi){1,1,1,1,1,1,1,1};
+  v8hi const0_op = __extension__(v8hi){0,0,0,0,0,0,0,0};
+  return a < const0_op ? const1_op : const0_op;
+}
+
+v16hi
+foo2 (v16hi a)
+{
+  v16hi const1_op = __extension__(v16hi){1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1};
+  v16hi const0_op = __extension__(v16hi){0,0,0,0,0,0,0,0

Re: [PATCH 01/11] Output CodeView data about variables

2024-06-23 Thread Jeff Law




On 6/17/24 6:17 PM, Mark Harmstone wrote:

Parse the DW_TAG_variable DIEs, and outputs S_GDATA32 (for global variables)
and S_LDATA32 (static global variables) symbols into the .debug$S section.

 gcc/
 * dwarf2codeview.cc (S_LDATA32, S_GDATA32): Define.
 (struct codeview_symbol): New structure.
 (sym, last_sym): New variables.
 (write_data_symbol): New function.
 (write_codeview_symbols): Call write_data_symbol.
 (add_variable, codeview_debug_early_finish): New functions.
 * dwarf2codeview.h (codeview_debug_early_finish): Prototype.
 * dwarf2out.cc
 (dwarf2out_early_finish): Call codeview_debug_early_finish.

Thanks.  I've pushed this to the trunk.

Just one question.  I note you use #defines for the various constants. 
Any reason not to use a const object?  At least with those you can print 
them in a debugger rather than having to look them up in the source code.
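
A sketch of what that could look like with an enum, which debuggers can print
by name.  The type and function names are hypothetical, and the two values are
the record kinds as I understand them from the CodeView format; treat them as
an assumption rather than a quote from the patch.

```c
/* Symbol record kinds as enumerators instead of #defines.  */
enum cv_symbol_kind
{
  S_LDATA32 = 0x110c,  /* static (file-local) data */
  S_GDATA32 = 0x110d   /* global data */
};

/* Pick the record kind for a variable: local data for static
   variables, global data otherwise.  */
enum cv_symbol_kind data_symbol_kind (int is_static)
{
  return is_static ? S_LDATA32 : S_GDATA32;
}
```

A variable of type enum cv_symbol_kind then shows up symbolically in gdb,
whereas a macro value is only visible when the program is built with macro
debug information.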


Just a thought.  Naturally I'll be working my way through the rest of 
this kit.


jeff


[to-be-committed][RISC-V][V2]

2024-06-23 Thread Jeff Law
This is primarily Sergei's work, my contributions were limited to 
merging his expander with the one that's on the trunk, allowing 
non-constant value and trivial testsuite adjustments due to option renaming.


I'm doing setmem first because it's the easiest.  The others will follow 
soon enough.


I've tested this in my system, waiting on pre-commit CI to render its 
verdict before moving forward.


Jeff

gcc/ChangeLog

* config/riscv/riscv-protos.h (riscv_vector::expand_vec_setmem): New
function declaration.

* config/riscv/riscv-string.cc (riscv_vector::expand_vec_setmem): New
function: this generates an inline vectorised memory set, if and only if
we know the entire operation can be performed in a single vector store.

* config/riscv/riscv.md (setmem): Try riscv_vector::expand_vec_setmem
for constant lengths.  Do not require operand 2 to be a constant.

gcc/testsuite/ChangeLog
* gcc.target/riscv/rvv/base/setmem-1.c: New tests
* gcc.target/riscv/rvv/base/setmem-2.c: New tests
* gcc.target/riscv/rvv/base/setmem-3.c: New tests
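
The LMUL selection described above can be sketched as a standalone C function.
This is a simplified model: min_vlen_bits stands in for TARGET_MIN_VLEN, the
function name is hypothetical, and the TARGET_VECTOR, stringop strategy and
fixed rvv_max_lmul checks of the real helper are left out.

```c
/* Return the smallest LMUL in {1, 2, 4, 8} whose register group can
   hold LENGTH bytes in one vector store, or 0 if the operation should
   not be vectorised.  */
int smallest_lmul_for (long length, long min_vlen_bits)
{
  long bytes_per_register = min_vlen_bits / 8;

  /* Tiny operations are left to the default expansion.  */
  if (length < bytes_per_register)
    return 0;

  for (int lmul = 1; lmul <= 8; lmul <<= 1)
    if (length <= lmul * bytes_per_register)
      return lmul;

  /* Too large for a single store even at LMUL = 8.  */
  return 0;
}
```

With a minimum VLEN of 128 bits, a 16-byte setmem selects LMUL 1, a 65-byte
one selects LMUL 8, and anything above 128 bytes falls back to the generic
path.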

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index d6473d0cd85..a3380d4250d 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -678,6 +678,7 @@ void expand_popcount (rtx *);
 void expand_rawmemchr (machine_mode, rtx, rtx, rtx, bool = false);
 bool expand_strcmp (rtx, rtx, rtx, rtx, unsigned HOST_WIDE_INT, bool);
 void emit_vec_extract (rtx, rtx, rtx);
+bool expand_vec_setmem (rtx, rtx, rtx);
 
 /* Rounding mode bitfield for fixed point VXRM.  */
 enum fixed_point_rounding_mode
diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index 4702001bd9b..1ddebdcee3f 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -1516,4 +1516,93 @@ expand_strcmp (rtx result, rtx src1, rtx src2, rtx nbytes,
   return true;
 }
 
+/* Check we are permitted to vectorise a memory operation.
+   If so, return true and populate lmul_out.
+   Otherwise, return false and leave lmul_out unchanged.  */
+static bool
+check_vectorise_memory_operation (rtx length_in, HOST_WIDE_INT &lmul_out)
+{
+  /* If we either can't or have been asked not to vectorise, respect this.  */
+  if (!TARGET_VECTOR)
+return false;
+  if (!(stringop_strategy & STRATEGY_VECTOR))
+return false;
+
+  /* If we can't reason about the length, don't vectorise.  */
+  if (!CONST_INT_P (length_in))
+return false;
+
+  HOST_WIDE_INT length = INTVAL (length_in);
+
+  /* If it's tiny, default operation is likely better; maybe worth
+ considering fractional lmul in the future as well.  */
+  if (length < (TARGET_MIN_VLEN / 8))
+return false;
+
+  /* If we've been asked to use a specific LMUL,
+ check the operation fits and do that.  */
+  if (rvv_max_lmul != RVV_DYNAMIC)
+{
+  lmul_out = TARGET_MAX_LMUL;
+  return (length <= ((TARGET_MAX_LMUL * TARGET_MIN_VLEN) / 8));
+}
+
+  /* Find smallest lmul large enough for entire op.  */
+  HOST_WIDE_INT lmul = 1;
+  while ((lmul <= 8) && (length > ((lmul * TARGET_MIN_VLEN) / 8)))
+{
+  lmul <<= 1;
+}
+
+  if (lmul > 8)
+return false;
+
+  lmul_out = lmul;
+  return true;
+}
+
+/* Used by setmemdi in riscv.md.  */
+bool
+expand_vec_setmem (rtx dst_in, rtx length_in, rtx fill_value_in)
+{
+  HOST_WIDE_INT lmul;
+  /* Check we are able and allowed to vectorise this operation;
+ bail if not.  */
+  if (!check_vectorise_memory_operation (length_in, lmul))
+return false;
+
+  machine_mode vmode
+  = riscv_vector::get_vector_mode (QImode, BYTES_PER_RISCV_VECTOR * lmul)
+   .require ();
+  rtx dst_addr = copy_addr_to_reg (XEXP (dst_in, 0));
+  rtx dst = change_address (dst_in, vmode, dst_addr);
+
+  rtx fill_value = gen_reg_rtx (vmode);
+  rtx broadcast_ops[] = { fill_value, fill_value_in };
+
+  /* If the length is exactly vlmax for the selected mode, do that.
+ Otherwise, use a predicated store.  */
+  if (known_eq (GET_MODE_SIZE (vmode), INTVAL (length_in)))
+{
+  emit_vlmax_insn (code_for_pred_broadcast (vmode), UNARY_OP,
+ broadcast_ops);
+  emit_move_insn (dst, fill_value);
+}
+  else
+{
+  if (!satisfies_constraint_K (length_in))
+ length_in = force_reg (Pmode, length_in);
+  emit_nonvlmax_insn (code_for_pred_broadcast (vmode), UNARY_OP,
+ broadcast_ops, length_in);
+  machine_mode mask_mode
+ = riscv_vector::get_vector_mode (BImode, GET_MODE_NUNITS (vmode))
+ .require ();
+  rtx mask = CONSTM1_RTX (mask_mode);
+  emit_insn (gen_pred_store (vmode, dst, mask, fill_value, length_in,
+ get_avl_type_rtx (riscv_vector::NONVLMAX)));
+}
+
+  return true;
+}
+
 }
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 7a9454de430..78cf83c9252 100644
--

Re: [PATCH 02/11] Handle CodeView base types

2024-06-23 Thread Jeff Law




On 6/17/24 6:17 PM, Mark Harmstone wrote:

Adds a get_type_num function to translate type DIEs into CodeView
numbers, along with a hash table for this.  For now we just deal with
the base types (integers, Unicode chars, floats, and bools).

 gcc/
 * dwarf2codeview.cc (struct codeview_type): New structure.
 (struct die_hasher): Likewise.
 (types_htab): New variable.
 (codeview_debug_finish): Free types_htab if allocated.
 (get_type_num_base_type, get_type_num): New function.
 (add_variable): Call get_type_num.
 * dwarf2codeview.h (T_CHAR, T_SHORT, T_LONG, T_QUAD): Define.
 (T_UCHAR, T_USHORT, T_ULONG, T_UQUAD, T_BOOL08): Likewise.
 (T_REAL32, T_REAL64, T_REAL80, T_REAL128, T_RCHAR): Likewise.
 (T_WCHAR, T_INT4, T_UINT4, T_CHAR16, T_CHAR32, T_CHAR8): Likewise.

Thanks.  I've pushed this patch to the trunk.

jeff



Re: [PATCH 03/11] Handle typedefs for CodeView

2024-06-23 Thread Jeff Law




On 6/17/24 6:17 PM, Mark Harmstone wrote:

 gcc/
 * dwarf2codeview.cc (get_type_num): Handle typedefs.

Thanks.  I've pushed this to the trunk.

jeff


Re: [PATCH 04/11] Handle pointers for CodeView

2024-06-23 Thread Jeff Law




On 6/17/24 6:17 PM, Mark Harmstone wrote:

Translates DW_TAG_pointer_type DIEs into LF_POINTER symbols, which get
output into the .debug$T section.

 gcc/
 * dwarf2codeview.cc (FIRST_TYPE): Define.
 (struct codeview_custom_type): New structure.
 (custom_types, last_custom_type): New variables.
 (get_type_num): Prototype.
 (write_lf_pointer, write_custom_types): New functions.
 (codeview_debug_finish): Call write_custom_types.
 (add_custom_type, get_type_num_pointer_type): New functions.
 (get_type_num): Handle DW_TAG_pointer_type DIEs.
 * dwarf2codeview.h (T_VOID): Define.
 (CV_POINTER_32, CV_POINTER_64): Likewise.
 (T_32PVOID, T_64PVOID): Likewise.
 (CV_PTR_NEAR32, CV_PTR64, LF_POINTER): Likewise.

Thanks.  I've pushed this to the trunk.

jeff


Re: [PATCH 05/11] Handle const and varible modifiers for CodeView

2024-06-23 Thread Jeff Law




On 6/17/24 6:17 PM, Mark Harmstone wrote:

Translate DW_TAG_const_type and DW_TAG_volatile_type DIEs into
LF_MODIFIER symbols.

 gcc/
 * dwarf2codeview.cc
 (struct codeview_custom_type): Add lf_modifier to union.
 (write_cv_padding, write_lf_modifier): New functions.
 (write_custom_types): Call write_lf_modifier.
 (get_type_num_const_type): New function.
 (get_type_num_volatile_type): Likewise.
 (get_type_num): Handle DW_TAG_const_type and
 DW_TAG_volatile_type DIEs.
 * dwarf2codeview.h (MOD_const, MOD_volatile): Define.
 (LF_MODIFIER): Likewise.
---



@@ -903,6 +908,76 @@ write_lf_pointer (codeview_custom_type *t)
asm_fprintf (asm_out_file, "%LLcv_type%x_end:\n", t->num);
  }
  
+/* All CodeView type definitions have to be aligned to a four-byte boundary,
+   so write some padding bytes if necessary.  These have to be specific values:
+   f3, f2, f1.  */
Consider changing the magic numbers to a #define or const object or an 
enum as a follow-up.
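
A sketch of how that follow-up might look, with the base value named once and
the padding computed from the record length.  The names are hypothetical, and
the byte order (counting down to f1) is an assumption based on the order in
which the quoted comment lists the values.

```c
#include <stddef.h>

/* Padding bytes are 0xf0 plus the number of bytes still to be
   written, so three bytes of padding read f3, f2, f1.  */
enum { CV_PAD_BASE = 0xf0 };

/* Number of bytes needed to round LEN up to a multiple of four.  */
size_t cv_padding_needed (size_t len)
{
  return (4 - len % 4) % 4;
}

/* Store the padding for a record of LEN bytes in OUT and return how
   many bytes were written (0 to 3).  */
size_t cv_fill_padding (size_t len, unsigned char out[static 3])
{
  size_t n = cv_padding_needed (len);
  for (size_t i = 0; i < n; i++)
    out[i] = (unsigned char) (CV_PAD_BASE + (n - i));
  return n;
}
```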




+
+  ct = (codeview_custom_type *) xmalloc (sizeof (codeview_custom_type));
So presumably you're freeing these objects elsewhere?  I see the free 
(custom_types), but I don't see where you free any subobjects.  Did I miss 
something?


I'll go ahead and commit, but please double check for memory leaks.

Jeff


Re: [PATCH 06/11] Handle enums for CodeView

2024-06-23 Thread Jeff Law




On 6/17/24 6:17 PM, Mark Harmstone wrote:

Translates DW_TAG_enumeration_type DIEs into LF_ENUM symbols.

 gcc/
 * dwarf2codeview.cc (MAX_FIELDLIST_SIZE): Define.
 (struct codeview_integer): New structure.
 (struct codeview_subtype): Likewise
 (struct codeview_custom_type): Add lf_fieldlist and lf_enum
 to union.
 (write_cv_integer, cv_integer_len): New functions.
 (write_lf_fieldlist, write_lf_enum): Likewise.
 (write_custom_types): Call write_lf_fieldlist and write_lf_enum.
 (add_enum_forward_def): New function.
 (get_type_num_enumeration_type): Likewise.
 (get_type_num): Handle DW_TAG_enumeration_type DIEs.
 * dwarf2codeview.h (LF_FIELDLIST, LF_INDEX, LF_ENUMERATE): Define.
 (LF_ENUM, LF_CHAR, LF_SHORT, LF_USHORT, LF_LONG): Likewise.
 (LF_ULONG, LF_QUADWORD, LF_UQUADWORD): Likewise.
 (CV_ACCESS_PRIVATE, CV_ACCESS_PROTECTED): Likewise.
 (CV_ACCESS_PUBLIC, CV_PROP_FWDREF): Likewise.
---
  gcc/dwarf2codeview.cc | 524 ++
  gcc/dwarf2codeview.h  |  17 ++
  2 files changed, 541 insertions(+)




@@ -978,6 +1022,292 @@ write_lf_modifier (codeview_custom_type *t)
asm_fprintf (asm_out_file, "%LLcv_type%x_end:\n", t->num);
  }
  
+/* Write a CodeView extensible integer.  If the value is non-negative and
+   < 0x8000, the value gets written directly as an uint16_t.  Otherwise, we
+   output two bytes for the integer type (LF_CHAR, LF_SHORT, ...), and the
+   actual value follows.  */

As a follow-up, can you include a brief description of the return value?
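
For illustration, one plausible reading of the encoding rule as a length
computation, with the return value documented.  This is a sketch under stated
assumptions, not the function from the patch: the name is hypothetical, the
value is taken as a signed 64-bit integer, and the choice of the smallest
fitting type is my interpretation of the quoted comment.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the number of bytes needed to encode VALUE as an extensible
   integer: 2 if it is stored directly as a uint16_t, otherwise 2 bytes
   for the type marker plus the size of the smallest type that can
   represent the value.  */
size_t cv_integer_len_sketch (int64_t value)
{
  if (value >= 0 && value < 0x8000)
    return 2;                       /* stored directly */
  if (value >= INT8_MIN && value <= INT8_MAX)
    return 2 + 1;                   /* LF_CHAR */
  if (value >= INT16_MIN && value <= UINT16_MAX)
    return 2 + 2;                   /* LF_SHORT or LF_USHORT */
  if (value >= INT32_MIN && value <= UINT32_MAX)
    return 2 + 4;                   /* LF_LONG or LF_ULONG */
  return 2 + 8;                     /* LF_QUADWORD or LF_UQUADWORD */
}
```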

Pushed to the trunk.  Thanks!

jeff


Re: [PATCH version 2] rs6000, altivec-1-runnable.c update the, require-effective-target

2024-06-23 Thread Kewen.Lin
Hi,

on 2024/6/22 00:15, Carl Love wrote:
> GCC maintainers:
> 
> version 2, update the dg options per the feedback.  Retested the patch on 
> Power 10 with no regressions.
> 
> This patch updates the dg options.
> 
> The patch has been tested on Power 10 with no regression failures.
> 
> Please let me know if this patch is acceptable for mainline.  Thanks.
> 
> Carl 
> 
> -- 
> rs6000, altivec-1-runnable.c update the require-effective-target
> 
> Update the dg test directives.
> 
> gcc/testsuite/ChangeLog:
>   * gcc.target/powerpc/altivec-1-runnable.c: Change the
>   require-effective-target for the test.
> ---
>  gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c 
> b/gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c
> index da8ebbc30ba..3f084c91798 100644
> --- a/gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c
> +++ b/gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c
> @@ -1,6 +1,7 @@
> -/* { dg-do compile { target powerpc*-*-* } } */
> -/* { dg-require-effective-target powerpc_altivec_ok } */
> -/* { dg-options "-maltivec" } */
> +/* { dg-do run { target vmx_hw } } */
> +/* { dg-do compile { target { ! vmx_hw } } } */
> +/* { dg-options "-O2 -maltivec" } */
> +/* { dg-require-effective-target powerpc_altivec } */

This one needs rebasing, "powerpc_altivec" has been adjusted on trunk.

BR,
Kewen

>  
>  #include 
>  


Re: [PATCH][_Hashtable] Fix some implementation inconsistencies

2024-06-23 Thread François Dumont

Hi

Still no time ?

Thanks


On 06/06/2024 19:02, François Dumont wrote:

No chance ?

On 22/05/2024 06:50, François Dumont wrote:

Ping ?

On 13/05/2024 06:33, François Dumont wrote:

libstdc++: [_Hashtable] Fix some implementation inconsistencies

    Get rid of the different usages of the mutable keyword except in
    _Prime_rehash_policy where it is preserved for ABI compatibility reasons.

    Fix comment to explain that we need the computation of the bucket index
    to be noexcept to be able to rehash the container when needed.

    For Standard instantiations through std::unordered_xxx containers we
    already force caching of the hash code when the hash functor is not
    noexcept, so it is guaranteed.

    The static_assert purpose in _Hashtable on _M_bucket_index is thus limited
    to usages of _Hashtable with exotic _Hashtable_traits.

    libstdc++-v3/ChangeLog:

    * include/bits/hashtable_policy.h (_NodeBuilder<>::_S_build): Remove
    const qualification on _NodeGenerator instance.
    (_ReuseOrAllocNode<>::operator()(_Args&&...)): Remove const qualification.
    (_ReuseOrAllocNode<>::_M_nodes): Remove mutable.
    (_Insert_base<>::_M_insert_range): Remove _NodeGetter const qualification.
    (_Hash_code_base<>::_M_bucket_index(const _Hash_node_value<>&, size_t)):
    Simplify noexcept declaration, we already static_assert that _RangeHash
    functor is noexcept.
    * include/bits/hashtable.h: Rework comments. Remove const qualifier on
    _NodeGenerator& arguments.
Tested under Linux x64, ok to commit ?

François
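
To illustrate the const/mutable change in isolation, here is a minimal sketch. It assumes a simplified, hypothetical node generator (ReuseOrAlloc) and helper (insert_one); neither is the libstdc++ implementation. The generator hands out spare nodes before allocating, so calling it changes its state. Taking it by non-const reference makes that visible in the signature, where a const reference would need a mutable member to compile.

```cpp
#include <cassert>
#include <vector>

// Hypothetical node generator: reuses spare nodes first, then allocates.
// operator() modifies the pool, so it is a non-const member and no member
// has to be declared mutable.
struct ReuseOrAlloc
{
  std::vector<int*> spare;   // nodes available for reuse
  int allocated = 0;         // number of fresh allocations performed

  int* operator()(int value)
  {
    if (!spare.empty())
      {
        int* node = spare.back();
        spare.pop_back();
        *node = value;       // recycle an existing node
        return node;
      }
    ++allocated;
    return new int(value);   // the caller owns the new node
  }
};

// The generator is taken by non-const reference: inserting may consume
// one of its spare nodes, and the signature says so.
template<typename NodeGen>
int* insert_one(int value, NodeGen& gen)
{ return gen(value); }
```

With a `const NodeGen&` parameter the call `gen(value)` would not compile for this type unless `operator()` were const and `spare` and `allocated` were mutable, which is the pattern the patch removes.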
diff --git a/libstdc++-v3/include/bits/hashtable.h b/libstdc++-v3/include/bits/hashtable.h
index 361da2b3b4d..7ed68bb6c3c 100644
--- a/libstdc++-v3/include/bits/hashtable.h
+++ b/libstdc++-v3/include/bits/hashtable.h
@@ -49,7 +49,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 using __cache_default
   =  __not_<__and_,
-  // Mandatory to have erase not throwing.
+  // Mandatory for the rehash process.
   __is_nothrow_invocable>>;
 
   // Helper to conditionally delete the default constructor.
@@ -484,7 +484,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   template
void
-   _M_assign(_Ht&&, const _NodeGenerator&);
+   _M_assign(_Ht&&, _NodeGenerator&);
 
   void
   _M_move_assign(_Hashtable&&, true_type);
@@ -922,7 +922,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   template
std::pair
-   _M_insert_unique(_Kt&&, _Arg&&, const _NodeGenerator&);
+   _M_insert_unique(_Kt&&, _Arg&&, _NodeGenerator&);
 
   template
static __conditional_t<
@@ -942,7 +942,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   template
std::pair
-   _M_insert_unique_aux(_Arg&& __arg, const _NodeGenerator& __node_gen)
+   _M_insert_unique_aux(_Arg&& __arg, _NodeGenerator& __node_gen)
{
  return _M_insert_unique(
_S_forward_key(_ExtractKey{}(std::forward<_Arg>(__arg))),
@@ -951,7 +951,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   template
std::pair
-   _M_insert(_Arg&& __arg, const _NodeGenerator& __node_gen,
+   _M_insert(_Arg&& __arg, _NodeGenerator& __node_gen,
  true_type /* __uks */)
{
  using __to_value
@@ -962,7 +962,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   template
iterator
-   _M_insert(_Arg&& __arg, const _NodeGenerator& __node_gen,
+   _M_insert(_Arg&& __arg, _NodeGenerator& __node_gen,
  false_type __uks)
{
  using __to_value
@@ -975,7 +975,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
iterator
_M_insert(const_iterator, _Arg&& __arg,
- const _NodeGenerator& __node_gen, true_type __uks)
+ _NodeGenerator& __node_gen, true_type __uks)
{
  return
_M_insert(std::forward<_Arg>(__arg), __node_gen, __uks).first;
@@ -985,7 +985,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
iterator
_M_insert(const_iterator, _Arg&&,
- const _NodeGenerator&, false_type __uks);
+ _NodeGenerator&, false_type __uks);
 
   size_type
   _M_erase(true_type __uks, const key_type&);
@@ -1415,7 +1415,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   void
   _Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal,
 _Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
-  _M_assign(_Ht&& __ht, const _NodeGenerator& __node_gen)
+  _M_assign(_Ht&& __ht, _NodeGenerator& __node_gen)
   {
__buckets_ptr __buckets = nullptr;
if (!_M_buckets)
@@ -1657,8 +1657,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 ~_Hashtable() noexcept
 {
   // Getting a bucket index from a node shall not throw because it is used
-  // in methods (erase, swap...) that shall not throw. Need a complete
-  // type to check this, so do it in the destructor not a


Re: [PATCH version 4] rs6000, altivec-2-runnable.c update the, require-effective-target

2024-06-23 Thread Kewen.Lin
Hi Carl,

on 2024/6/22 00:15, Carl Love wrote:
> GCC maintainers:
> 
> version 4:  Additional dg option updates per the feedback.  Retested the
> patch on Power 10, no regressions.
> 
> version 3:  Updated per the feedback from Peter, Kewen and Segher.  Note,
> Peter suggested the -mdejagnu-cpu= value must be power7.  The test fails
> if -mdejagnu-cpu= is set to power7; it needs to be power8.  The patch has
> been retested on a Power 10 box; it succeeds with 2 passes and no fails.
> 
> Per the additional feedback after patch: 
> 
>   commit c892525813c94b018464d5a4edc17f79186606b7
>   Author: Carl Love 
>   Date:   Tue Jun 11 14:01:16 2024 -0400
> 
>   rs6000, altivec-2-runnable.c should be a runnable test
> 
>   The test case has "dg-do compile" set not "dg-do run" for a runnable
>   test.  This patch changes the dg-do command argument to run.
> 
>   gcc/testsuite/ChangeLog:
>   * gcc.target/powerpc/altivec-2-runnable.c: Change dg-do
>   argument to run.
> 
> was approved and committed, I have updated the dg-require-effective-target
> and dg-options as requested so the test will compile with -O2 on a 
> machine that has a minimum support of Power 8 vector hardware.
> 
> The patch has been tested on Power 10 with no regression failures.
> 
> Please let me know if this patch is acceptable for mainline.  Thanks.

OK, thanks!

BR,
Kewen

> 
> Carl 
> 
> --
> rs6000, altivec-2-runnable.c update the require-effective-target
> 
> The test requires a minimum of Power8 vector HW and a compile level
> of -O2.  Update the dg test directives.
> 
> gcc/testsuite/ChangeLog:
>   * gcc.target/powerpc/altivec-2-runnable.c: Change the
>   require-effective-target for the test.
> ---
>  gcc/testsuite/gcc.target/powerpc/altivec-2-runnable.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/powerpc/altivec-2-runnable.c b/gcc/testsuite/gcc.target/powerpc/altivec-2-runnable.c
> index 17b23eb9d50..660669f69fd 100644
> --- a/gcc/testsuite/gcc.target/powerpc/altivec-2-runnable.c
> +++ b/gcc/testsuite/gcc.target/powerpc/altivec-2-runnable.c
> @@ -1,6 +1,6 @@
> -/* { dg-do run } */
> -/* { dg-options "-mvsx" } */
> -/* { dg-additional-options "-mdejagnu-cpu=power8" { target { ! has_arch_pwr8 } } } */
> +/* { dg-do run { target p8vector_hw } } */
> +/* { dg-do compile { target { ! p8vector_hw } } } */
> +/* { dg-options "-O2  -mdejagnu-cpu=power8" } */
>  /* { dg-require-effective-target powerpc_vsx } */
>  
>  #include