[Bug sanitizer/114494] false-positive with -O2 -Wstringop-overflow=2 -fsanitize=address

2024-03-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114494

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #4 from Andrew Pinski  ---
Dup of bug 99673.

```
  _14 = &MEM[(struct ip_header *)saved_buffer_5(D) + 4B].ip_ver_len;
...
  _3 = _14 + _2;
...
  MEM[(char * {ref-all})_3] = _10;
```

Without -fsanitize=address, there is no `&MEM[(struct ip_header
*)saved_buffer_5(D) + 4B].ip_ver_len` but rather just `eth_payload_data_6 =
saved_buffer_5(D) + 4`.

See the duplicate bug for more analysis of the issue.
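
As a rough illustration of the two address forms mentioned above, here is a minimal sketch. The `ip_header` layout and the helper names are assumptions made up for this example; they are not taken from the reporter's testcase. Both helpers compute the same address, but the first names a one-byte member while the second is plain pointer arithmetic on the buffer, which is presumably why the warning reasons differently about them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical header layout, for illustration only.
struct ip_header {
    std::uint8_t ip_ver_len;  // first member, so it sits at offset 0
    std::uint8_t ip_tos;
    std::uint16_t ip_len;
};

// Form kept with -fsanitize=address: address of a member of a struct
// located at saved_buffer + 4.
inline unsigned char *via_member(unsigned char *saved_buffer) {
    auto *hdr = reinterpret_cast<ip_header *>(saved_buffer + 4);
    return reinterpret_cast<unsigned char *>(&hdr->ip_ver_len);
}

// Form seen without the sanitizer: plain pointer arithmetic.
inline unsigned char *via_offset(unsigned char *saved_buffer) {
    return saved_buffer + 4;
}

static_assert(offsetof(ip_header, ip_ver_len) == 0,
              "the member address equals the struct address");

// Runtime check that both forms yield the same pointer value.
inline void check_same_address(unsigned char *buf) {
    assert(via_member(buf) == via_offset(buf));
}
```

Calling `check_same_address` on any suitably aligned buffer of at least 8 bytes should pass; only the static type information attached to the address differs.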

*** This bug has been marked as a duplicate of bug 99673 ***

[Bug sanitizer/99673] [11/13/14 Regression] bogus -Wstringop-overread warning with address sanitizer due to member address substitution

2024-03-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99673

Andrew Pinski  changed:

   What|Removed |Added

 CC||akihiko.odaki at daynix dot com

--- Comment #14 from Andrew Pinski  ---
*** Bug 114494 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/88443] [meta-bug] bogus/missing -Wstringop-overflow warnings

2024-03-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88443
Bug 88443 depends on bug 114494, which changed state.

Bug 114494 Summary: false-positive with -O2 -Wstringop-overflow=2 
-fsanitize=address
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114494

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

[Bug c++/114167] Capturing a auto..., then unpacking it in a lambda taking Ts..., confuses GCC

2024-03-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114167

Andrew Pinski  changed:

   What|Removed |Added

 CC||andipeer at gmx dot net

--- Comment #1 from Andrew Pinski  ---
*** Bug 114495 has been marked as a duplicate of this bug. ***

[Bug c++/114495] Capture error in lambda fold

2024-03-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114495

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #1 from Andrew Pinski  ---
Dup.

*** This bug has been marked as a duplicate of bug 114167 ***

[Bug c++/114167] Capturing a auto..., then unpacking it in a lambda taking Ts..., confuses GCC

2024-03-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114167

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Keywords||rejects-valid
   Last reconfirmed||2024-03-27

--- Comment #2 from Andrew Pinski  ---
Confirmed.

[Bug c++/114479] [14 Regression] std::is_array_v changed from false to true in GCC 14

2024-03-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114479

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P1

[Bug target/114414] [14 Regression] 15-18% exec time slowdown of 433.milc on Zen2

2024-03-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114414

--- Comment #1 from Richard Biener  ---
I'll note the performance is now again close to that of 12/13, but improvements
have been lost (so it might not be called a 14 regression).

[Bug middle-end/114482] remove_unreachable_eh_regions could use a work queue instead of being recursive

2024-03-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114482

Richard Biener  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2024-03-27
 Status|UNCONFIRMED |NEW

--- Comment #1 from Richard Biener  ---
Confirmed, looks quite easy to do.
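
For readers unfamiliar with the transformation being suggested, a minimal sketch of replacing recursion with an explicit work queue follows. The `Region` type and both functions are hypothetical stand-ins; this is not GCC's `eh_region` structure nor the actual `remove_unreachable_eh_regions` code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a nested region tree.
struct Region {
    bool reachable = false;
    std::vector<Region *> inner;  // nested regions
};

// Recursive form: call-stack depth grows with the nesting depth.
inline int count_unreachable_recursive(const Region *r) {
    int n = r->reachable ? 0 : 1;
    for (const Region *child : r->inner)
        n += count_unreachable_recursive(child);
    return n;
}

// Work-queue form: same traversal, but pending regions live in a
// heap-allocated vector, so the call stack stays flat.
inline int count_unreachable_worklist(const Region *root) {
    int n = 0;
    std::vector<const Region *> work{root};
    while (!work.empty()) {
        const Region *r = work.back();
        work.pop_back();
        if (!r->reachable)
            ++n;
        for (const Region *child : r->inner)
            work.push_back(child);
    }
    return n;
}

// Both forms must agree on any tree.
inline void check_equivalent(const Region *root) {
    assert(count_unreachable_recursive(root) == count_unreachable_worklist(root));
}
```

The visiting order differs between the two forms, which does not matter for a count but would need care if the real function depends on processing order.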

[Bug libstdc++/114477] The user-defined constructor of filter_view::iterator is not fully compliant with the standard

2024-03-27 Thread de34 at live dot cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114477

--- Comment #5 from Jiang An  ---
(In reply to 康桓瑋 from comment #0)
> Since P3059R0 is closed (although I feel bad about this)

BTW, now I think this is somehow unfortunate.
P3059 behaved like a follow-up paper of P2711 IMO. Both papers effectively
suggested that "some design choices of C++23 views are better, let's apply them
to C++20 views".

[Bug web/114496] New: Documentation: "Non-Bugs" page should update/mention something about -Wsign-conversion

2024-03-27 Thread Explorer09 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114496

Bug ID: 114496
   Summary: Documentation: "Non-Bugs" page should update/mention
something about -Wsign-conversion
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: web
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Explorer09 at gmail dot com
  Target Milestone: ---

Section 14.8 of GCC manual
"Certain Changes We Don’t Want to Make"
(https://gcc.gnu.org/onlinedocs/gcc-13.2.0/gcc/Non-bugs.html)

This item:

> Warning about assigning a signed value to an unsigned variable.
> Such assignments must be very common; warning about them would cause more 
> annoyance than good.

Perhaps GCC should clarify this a bit. GCC now has the "-Wsign-conversion"
option that warns about similar signed-to-unsigned conversions. So what feature
in particular did GCC choose not to implement? The difference between this
"non-bug" item and the existing "-Wsign-conversion" is what I wish to see.

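To make the question concrete, here is a minimal sketch of the kind of signed-to-unsigned assignment both the "non-bug" item and `-Wsign-conversion` are concerned with. The function name is made up for this example; the static assertions only check the conversion result, not whether any warning is issued.

```cpp
#include <limits>

// Implicit signed-to-unsigned conversion on assignment.
constexpr unsigned to_unsigned(int s) {
    unsigned u = s;  // value is reduced modulo 2^N, N = width of unsigned
    return u;
}

static_assert(to_unsigned(5) == 5u, "non-negative values are preserved");
static_assert(to_unsigned(-1) == std::numeric_limits<unsigned>::max(),
              "-1 wraps around to the largest unsigned value");
```

Compiling a file like this with and without `-Wsign-conversion` is one way to see which assignments the option diagnoses.
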
[Bug libstdc++/114477] The user-defined constructor of filter_view::iterator is not fully compliant with the standard

2024-03-27 Thread hewillk at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114477

--- Comment #6 from 康桓瑋  ---
(In reply to Jiang An from comment #5)
> (In reply to 康桓瑋 from comment #0)
> > Since P3059R0 is closed (although I feel bad about this)
> 
> BTW, now I think this is somehow unfortunate.
> P3059 behaved like a follow-up paper of P2711 IMO. Both papers effectively
> suggested that "some design choices of C++23 views are better, let's apply
> them to C++20 views".

You are absolutely right. Is there any way to reopen it? I recall that it was
just because the committee didn't want to spend time on it.

[Bug other/114496] Documentation: "Non-Bugs" page should update/mention something about -Wsign-conversion

2024-03-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114496

--- Comment #1 from Andrew Pinski  ---
Confirmed.

[Bug tree-optimization/114485] [13/14 Regression] Wrong code with -O3 -march=rv64gcv on riscv or `-O3 -march=armv9-a` for aarch64

2024-03-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114485

--- Comment #3 from Richard Biener  ---
Huh.

  _75 = [vec_duplicate_expr] pretmp_34;
  _76 = -_75;
  _77 = VEC_PERM_EXPR <_75, _76, { 0, POLY_INT_CST [4, 4], 1, POLY_INT_CST [5, 4], 2, POLY_INT_CST [6, 4], ... }>;

  # c_lsm.7_8 = PHI <_2(9), pretmp_34(19)>
  vect__2.17_79 = -_77;
  _2 = -c_lsm.7_8;

   [local count: 94607391]:
  # i_101 = PHI 
  # vect__2.17_102 = PHI 
  # loop_mask_103 = PHI 
  # vect_iftmp.24_104 = PHI 
  _68 = ni_gap.12_67;
  _93 = .EXTRACT_LAST (loop_mask_103, vect_iftmp.24_104);
  iftmp.1_59 = _93;
  _82 = .EXTRACT_LAST (loop_mask_103, vect__2.17_102);

it looks OK to me?  But maybe the poly-int-cst permute is wrong?  Should
be an interleave.
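
For reference, a fixed-length sketch of what an interleaving two-input permute looks like. This is only a model: it assumes the usual VEC_PERM_EXPR convention that selector values below N pick from the first input and values from N upward pick from the second, and it fixes N at compile time, whereas the dump above uses a length-agnostic selector where POLY_INT_CST [4, 4] presumably stands for the runtime element count. The helper names are made up.

```cpp
#include <array>
#include <cstddef>

// Two-input permute on fixed-length vectors: selector values < N index
// into a, values >= N index into b.
template <std::size_t N>
constexpr std::array<int, N> vec_perm(const std::array<int, N> &a,
                                      const std::array<int, N> &b,
                                      const std::array<std::size_t, N> &sel) {
    std::array<int, N> r{};
    for (std::size_t i = 0; i < N; ++i)
        r[i] = sel[i] < N ? a[sel[i]] : b[sel[i] - N];
    return r;
}

// Selector { 0, N, 1, N+1, 2, N+2, ... }: interleave the low halves.
template <std::size_t N>
constexpr std::array<std::size_t, N> interleave_low_selector() {
    std::array<std::size_t, N> sel{};
    for (std::size_t i = 0; i < N; ++i)
        sel[i] = (i % 2 == 0) ? i / 2 : N + i / 2;
    return sel;
}

// With a = splat(x) and b = splat(-x), as in the dump, the result
// alternates x, -x, x, -x.
constexpr std::array<int, 4> splat_pos{7, 7, 7, 7};
constexpr std::array<int, 4> splat_neg{-7, -7, -7, -7};
constexpr auto zipped =
    vec_perm<4>(splat_pos, splat_neg, interleave_low_selector<4>());

static_assert(zipped[0] == 7 && zipped[1] == -7, "lanes alternate");
static_assert(zipped[2] == 7 && zipped[3] == -7, "lanes alternate");
```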

[Bug tree-optimization/114485] [13/14 Regression] Wrong code with -O3 -march=rv64gcv on riscv or `-O3 -march=armv9-a` for aarch64

2024-03-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114485

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2

[Bug other/114496] Documentation: "Non-Bugs" page should update/mention something about -Wsign-conversion

2024-03-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114496

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2024-03-27
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #2 from Andrew Pinski  ---
.

[Bug c++/114480] g++: internal compiler error: Segmentation fault signal terminated program cc1plus

2024-03-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114480

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
  Known to fail||14.0
   Keywords||ra
 Ever confirmed|0   |1
 CC||vmakarov at gcc dot gnu.org
   Last reconfirmed|2024-03-26 00:00:00 |2024-03-27

--- Comment #10 from Richard Biener  ---
I see on x86_64-linux w/ release checking

 tree SSA rewrite       :  76.99 ( 31%)   0.09 (  5%)  77.11 ( 31%)    96M (  9%)
 integrated RA          :  92.31 ( 37%)   0.15 (  8%)  92.49 ( 37%)   105M ( 10%)
 LRA create live ranges :  54.01 ( 22%)   0.00 (  0%)  54.02 ( 22%)   885k (  0%)
 TOTAL                  : 246.34          1.88        248.43         1039M
246.34user 2.02system 4:08.92elapsed 99%CPU (0avgtext+0avgdata 3287072maxresident)k
70416inputs+0outputs (110major+1229628minor)pagefaults 0swaps

tree SSA rewrite is interesting, probably bitmap slowness and cache dependent.

With -O1:

 tree PTA               :  85.65 ( 14%)   0.21 (  3%)  85.89 ( 14%)   348M (  2%)
 tree SSA rewrite       :  76.05 ( 13%)   0.10 (  1%)  76.14 ( 12%)    96M (  1%)
 tree SSA incremental   : 181.52 ( 30%)   0.03 (  0%) 181.50 ( 30%) 10031k (  0%)
 expand vars            :  66.72 ( 11%)   0.00 (  0%)  66.74 ( 11%)  6132k (  0%)
 expand                 :  64.33 ( 11%)   0.02 (  0%)  64.39 ( 11%)   172M (  1%)
 TOTAL                  : 603.55          7.72        611.61        19327M
603.55user 7.83system 10:11.78elapsed 99%CPU (0avgtext+0avgdata 19809792maxresident)k
21520inputs+0outputs (48major+5102514minor)pagefaults 0swaps

definitely "interesting" testcase.

The profile for -O0 shows IDF compute (that's SSA rewrite, a usual suspect)
and other bits that might be interesting for the RA part.

Samples: 1M of event 'cycles:u', Event count (approx.): 1332096582355
Overhead   Samples  Command  Shared Object  Symbol
  24.78%    243663  cc1plus  cc1plus        [.] compute_idf
  11.29%    115134  cc1plus  cc1plus        [.] make_hard_regno_dead
  10.29%    104126  cc1plus  cc1plus        [.] process_bb_node_lives
   5.29%     53680  cc1plus  cc1plus        [.] mark_pseudo_regno_live
   4.95%     50051  cc1plus  cc1plus        [.] mark_ref_dead
   3.95%     40075  cc1plus  cc1plus        [.] update_allocno_pressure
   2.73%     27977  cc1plus  cc1plus        [.] lra_create_live_ranges_
   2.48%     25136  cc1plus  cc1plus        [.] inc_register_pressure
   2.37%     24268  cc1plus  cc1plus        [.] update_pseudo_point
   2.23%     21976  cc1plus  cc1plus        [.] mergesort
   2.19%     22208  cc1plus  cc1plus        [.] make_object_dead
   2.09%     21316  cc1plus  cc1plus        [.] sparseset_clear_bit
   1.99%     20181  cc1plus  cc1plus        [.] bitmap_set_bit

I'll note this was all tested on trunk, GCC 11 might behave even worse and
quite some deep recursion issues have been fixed in newer releases.

[Bug target/114487] ICE when building libsdl2 on -mfpmath=sse x86 with LTO

2024-03-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114487

Richard Biener  changed:

   What|Removed |Added

Summary|[14 regression] ICE when|ICE when building libsdl2
   |building libsdl2 on |on -mfpmath=sse x86 with
   |-mfpmath=sse x86 with LTO   |LTO
  Known to fail||12.3.1, 13.2.1, 14.0, 7.5.0
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-03-27
   Keywords||lto
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
I can confirm the ICE.  We're expanding

;; gamma_4 = SDLTest_RandomUnitFloat ();

(call_insn/u 5 4 6 (set (reg:SF 20 xmm0)
(call (mem:QI (symbol_ref:SI ("SDLTest_RandomUnitFloat") [flags 0x3] 
) [0
SDLTest_RandomUnitFloat S1 A8])
(const_int 0 [0]))) "testautomation-testautomation_pixels.i":15:17
-1
 (expr_list:REG_CALL_DECL (symbol_ref:SI ("SDLTest_RandomUnitFloat") [flags
0x3]  )
(expr_list:REG_EH_REGION (const_int 0 [0])
(nil)))
(nil))

(insn 6 5 0 (set (reg/v:SF 99 [ gamma ])
(reg:SF 20 xmm0)) "testautomation-testautomation_pixels.i":15:17 -1
 (nil))

I'm not sure what's wrong - looks like a target issue to me.

I'll note -mmmx -msse isn't necessary, -msse2 is enough.  Likewise -fPIC
isn't required.  I've reproduced with -m32 added on x86_64.

GCC 13/12 are also broken the same way and GCC 7 doesn't terminate compiling.

So I'm not sure whether this is a regression (the reduced testcase, that is).
I can't find a version that works.

[Bug c/114493] [11/12/13/14 Regression] internal compiler error: in fld_incomplete_type_of, at ipa-free-lang-data.cc:257

2024-03-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114493

Richard Biener  changed:

   What|Removed |Added

 CC||hubicka at gcc dot gnu.org
   Priority|P3  |P2

[Bug target/114302] [14 Regression] GCN regressions after: vect: Tighten vect_determine_precisions_from_range [PR113281]

2024-03-27 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114302

--- Comment #1 from Richard Sandiford  ---
The decision to stop narrowing division was deliberate, see the comments in
PR113281 for details.  Is the purpose of the test to check vectorisation
quality, or to check for the right ABI routines?

[Bug target/114481] [14 Regression] 14% exec time slowdown of 433.milc on aarch64

2024-03-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114481

--- Comment #1 from Richard Biener  ---
Looks like speed is back to 12/13, so possibly not a regression.

[Bug target/114302] [14 Regression] GCN regressions after: vect: Tighten vect_determine_precisions_from_range [PR113281]

2024-03-27 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114302

--- Comment #2 from Andrew Stubbs  ---
The execution test checks that each of the libgcc routines work correctly, and
the scan assembler tests make sure that we're getting coverage of all of them.

In this case, the failure indicates that we're not testing the routine we were
aiming for (but I think it does execute correctly and give a good result).

[Bug target/114302] [14 Regression] GCN regressions after: vect: Tighten vect_determine_precisions_from_range [PR113281]

2024-03-27 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114302

--- Comment #3 from Richard Sandiford  ---
Ah, ok.  If the main aim is to test the libgcc routines, it might be safer to
use something like:

typedef char v64qi __attribute__((vector_size(64)));
v64qi f(v64qi x, v64qi y) { return x / y; }

instead of relying on vectorisation.
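
A small harness around that suggestion might look like the sketch below. It assumes the GNU vector extension (GCC or Clang); the `lanes_match_scalar_division` helper and its input values are made up for this example and are not part of the GCN testsuite. On targets without 64-byte vector registers the compiler may emit an ABI note for `f`, which should be harmless here.

```cpp
#include <cassert>

typedef char v64qi __attribute__((vector_size(64)));

// The function from the comment above: element-wise division.
v64qi f(v64qi x, v64qi y) { return x / y; }

// Compare every lane of f(x, y) against scalar division.
inline bool lanes_match_scalar_division() {
    v64qi x = {}, y = {};
    for (int i = 0; i < 64; ++i) {
        x[i] = static_cast<char>(i + 1);      // 1..64, fits in char
        y[i] = static_cast<char>(i % 7 + 1);  // 1..7, never zero
    }
    v64qi q = f(x, y);
    for (int i = 0; i < 64; ++i)
        if (q[i] != static_cast<char>((i + 1) / (i % 7 + 1)))
            return false;
    return true;
}

inline void check_division() { assert(lanes_match_scalar_division()); }
```

Because the division is written directly on vector types, the test no longer depends on the vectoriser choosing a particular element width.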

[Bug middle-end/88670] [meta-bug] generic vector extension issues

2024-03-27 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88670
Bug 88670 depends on bug 112787, which changed state.

Bug 112787 Summary: Codegen regression of large GCC vector extensions when 
enabling SVE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112787

   What|Removed |Added

 Status|REOPENED|RESOLVED
 Resolution|--- |FIXED

[Bug target/112787] Codegen regression of large GCC vector extensions when enabling SVE

2024-03-27 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112787

Eric Botcazou  changed:

   What|Removed |Added

 Status|REOPENED|RESOLVED
 Resolution|--- |FIXED

--- Comment #14 from Eric Botcazou  ---
> They have both been backported, @Eric the tests should be passing again now.

Confirmed, thanks a lot!

[Bug target/114487] ICE when building libsdl2 on -mfpmath=sse x86 with LTO

2024-03-27 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114487

--- Comment #2 from Uroš Bizjak  ---
(In reply to Richard Biener from comment #1)

> (insn 6 5 0 (set (reg/v:SF 99 [ gamma ])
> (reg:SF 20 xmm0)) "testautomation-testautomation_pixels.i":15:17 -1
>  (nil))
> 
> I'm not sure what's wrong - looks like a target issue to me.

We are working with SFmode, so -msse is enough to trigger the bug.

This is a known issue. GCC assumes that at least moves of all hard registers are
working, which is not the case when LTO-compiling
testautomation-testautomation_pixels.i with SDL_test_fuzzer.o (that enables and
uses XMM registers via -msse -mfpmath=sse).

It looks to me that the compiler hits this part in function_value_32 when
LTO-compiling:

--cut here--
  /* Override FP return register with %xmm0 for local functions when
     SSE math is enabled or for functions with sseregparm attribute.  */
  if ((fn || fntype) && (mode == SFmode || mode == DFmode))
    {
      int sse_level = ix86_function_sseregparm (fntype, fn, false);
      if (sse_level == -1)
        {
          error ("calling %qD with SSE calling convention without "
                 "SSE/SSE2 enabled", fn);
          sorry ("this is a GCC bug that can be worked around by adding "
                 "attribute used to function called");
        }
      else if ((sse_level >= 1 && mode == SFmode)
               || (sse_level == 2 && mode == DFmode))
        regno = FIRST_SSE_REG;
    }
--cut here--

Adding -msse to the second compilation works OK, removing -mfpmath=sse from the
first compilation also works OK.
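
The three outcomes of the quoted branch can be modelled in isolation as below. This is a hypothetical standalone model, not GCC code: the `Mode` and `Outcome` enums and the `fp_return_override` function are invented for the illustration, and `sse_level` simply stands in for whatever `ix86_function_sseregparm` returns.

```cpp
// Hypothetical model of the quoted branch; all names are illustrative.
enum class Mode { SF, DF, Other };
enum class Outcome { Unchanged, UseXmm0, Error };

constexpr Outcome fp_return_override(bool have_fn_info, Mode mode, int sse_level) {
    if (have_fn_info && (mode == Mode::SF || mode == Mode::DF)) {
        if (sse_level == -1)
            return Outcome::Error;  // the error/sorry path hit in this PR
        if ((sse_level >= 1 && mode == Mode::SF) ||
            (sse_level == 2 && mode == Mode::DF))
            return Outcome::UseXmm0;  // regno = FIRST_SSE_REG
    }
    return Outcome::Unchanged;  // return register left as chosen earlier
}

static_assert(fp_return_override(true, Mode::SF, 1) == Outcome::UseXmm0, "");
static_assert(fp_return_override(true, Mode::DF, 1) == Outcome::Unchanged, "");
static_assert(fp_return_override(true, Mode::SF, -1) == Outcome::Error, "");
```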

[Bug target/114487] ICE when building libsdl2 on -mfpmath=sse x86 with LTO

2024-03-27 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114487

--- Comment #3 from Uroš Bizjak  ---
(In reply to Uroš Bizjak from comment #2)
> Adding -msse to the second compilation works OK, removing -mfpmath=sse from
> the first compilation also works OK.

Which makes this PR a LTO reincarnation of PR66047.

[Bug target/114487] ICE when building libsdl2 on -mfpmath=sse x86 with LTO

2024-03-27 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114487

--- Comment #4 from Uroš Bizjak  ---
(In reply to Uroš Bizjak from comment #3)
> (In reply to Uroš Bizjak from comment #2)
> > Adding -msse to the second compilation works OK, removing -mfpmath=sse from
> > the first compilation also works OK.
> 
> Which makes this PR a LTO reincarnation of PR66047.

Please see the FIXME in ix86_function_sseregparm:

  /* Refuse to produce wrong code when local function with SSE enabled
     is called from SSE disabled function.
     FIXME: We need a way to detect these cases cross-ltrans partition
     and avoid using SSE calling conventions on local functions called
     from function with SSE disabled.  For now at least delay the
     warning until we know we are going to produce wrong code.
     See PR66047  */

[Bug libstdc++/114477] The user-defined constructor of filter_view::iterator is not fully compliant with the standard

2024-03-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114477

--- Comment #7 from Jonathan Wakely  ---
The notes say it was closed because you didn't want to work on it.

https://github.com/cplusplus/papers/issues/1726#issuecomment-2014094319

It sounds like the Ranges study group supported the direction. If you want to
pursue it, just submit a revised paper. I think it would be sufficient to
expand the discussion of the libstdc++ implementation to point out that it
can't break GCC users because those constructors don't exist anyway. The MSVC
implementation should be inspected, and advice from MSVC STL maintainers
sought, to determine the impact on their users. Does range-v3 provide those
constructors? What about other third-party ranges libraries?

[Bug target/114416] calling convention incompatibility with vendor compiler for V9

2024-03-27 Thread jakub.kulik at oracle dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114416

Jakub Kulik  changed:

   What|Removed |Added

 CC||jakub.kulik at oracle dot com

--- Comment #6 from Jakub Kulik  ---
Thank you for the proposed fix! I tested it with several programs that I used
to find/reproduce the issue and it seems to work now (I talked about this with
Rainer initially).

As for the ABI being potentially unclear, I am in no way a SPARCv9 ABI expert,
so I asked internally, and was told that the ABI should be clear about this
case:

"""
See page 3P-10 (PDF page 46) where it says this:

%f0,%f1,%f2,%f3
(%d0, %d2)
(%q0)
Floating-point return values appear in the floating-point registers.
Single-precision values occupy %f0; double-precision values occupy %d0;
quad-precision values occupy %q0. (Refer to the SPARC(TM) Architecture Manual,
Version 9 for details on the register numbering scheme). Otherwise, these are
scratch registers.

and

%f0 through %f7
(%d0 through %d6)
(%q0 and %q4)
Floating-point fields from structure return values with a total size of 32
bytes or less appear in the floating-point registers.

Then on page 3P-13 (PDF page 49) it says this:
Structure or Union return values
Structure and union return types up to thirty-two bytes in size are returned in
registers. The registers are assigned as if the value was being passed as the
first argument to a function with a known prototype.

So we have to refer back to "Structure and Union arguments" on page 3P-12 (PDF
page 48) where it says:

"Structure or union types are always left-justified, whether stored in
registers or memory. *The individual fields of a structure (or containing
storage unit in the case of bit fields) are subject to promotion into registers
based on their type using the same rules as apply to scalar values* (with the
addition that a single-precision floating-point number assigned to the left
half of an argument slot will be promoted into the corresponding even-numbered
float register.)." [sic; emphasis added.] 
"""

[Bug target/114416] calling convention incompatibility with vendor compiler for V9

2024-03-27 Thread jakub.kulik at oracle dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114416

--- Comment #7 from Jakub Kulik  ---
Hmm, I just realized that you referred to the same sections, so my previous
comment might not make it clearer...

[Bug target/114416] calling convention incompatibility with vendor compiler for V9

2024-03-27 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114416

--- Comment #8 from Eric Botcazou  ---
> Hmm, I just realized that you referred to the same sections, so my previous
> comment might not make it clearer...

Yes, the fields in question have array types so the rules about scalar values
do not obviously apply to them.  This is a bit of circular reasoning but, if
the rule had been crystal clear, GCC would have implemented it at some point
during the last quarter of a century.  My interpretation is that the writers of
the ABI document probably overlooked the specific cases of arrays, which cannot
appear as types of standalone parameters but can as types of fields in
structures.
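
The point about arrays can be checked at compile time with a small sketch. The `with_array_fields` struct and `takes_array` declaration are hypothetical examples, not the types from this PR: a structure may contain a floating-point array field, whereas an array written as a standalone parameter is adjusted to a pointer and so never reaches the scalar-parameter rules as an array.

```cpp
#include <type_traits>

// Hypothetical structure with a floating-point array field, small enough
// to be returned in registers under the "32 bytes or less" rule.
struct with_array_fields {
    float f[2];
    double d;
};

// An array "parameter" is adjusted to a pointer by the language.
void takes_array(float p[2]);

static_assert(sizeof(with_array_fields) <= 32, "fits the 32-byte limit");
static_assert(std::is_array<decltype(with_array_fields::f)>::value,
              "the field keeps its array type");
static_assert(std::is_same<decltype(takes_array), void(float *)>::value,
              "the parameter does not");
```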

[Bug target/114416] calling convention incompatibility with vendor compiler for V9

2024-03-27 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114416

--- Comment #9 from Eric Botcazou  ---
> Thank you for the proposed fix! I tested it with several programs that I
> used to find/reproduce the issue and it seems to work now (I talked about
> this with Rainer initially).

OK, thanks for the testing!

[Bug fortran/113885] [13/14 Regression] ice in gimplify_expr, at gimplify.cc:18658 with finalization

2024-03-27 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113885

Paul Thomas  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |pault at gcc dot gnu.org

--- Comment #2 from Paul Thomas  ---
Created attachment 57820
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57820&action=edit
Draft fix for this PR

Many thanks for the report.

The attachment needs some cleaning up and testing with other variants that
might generate the problem.

In fact, this is a double regression since the testcase below does not give the
right result for 'x' in the calls to test1 and test2.

The first regression is associated with the derived type having zero components
messing up the finalization calls. Strictly, this is not a regression since the
older versions of gfortran did not attempt the finalization.

The second regression is due to the attempt to place finalization calls in the
correct place relative to the evaluation of the rhs and the assignment to the
lhs. This is the cause of the incorrect results for the testcase below. I
believe that the correct output is:
after test1 x =    2   3
no. final calls =    4
after test2 x =    6   8
no. final calls =   12

nagfor agrees but ifort gives 3 and 8 respectively for the no. of
finalizations.

To my astonishment, given the current stage of the fix, it even regtests OK :-)

Paul

module types
  type t
     integer :: i
   contains
     final :: finalize
  end type t
  integer :: ctr = 0
contains
  impure elemental subroutine finalize(x)
    type(t), intent(inout) :: x
    ctr = ctr + 1
  end subroutine finalize
end module types

impure elemental function elem(x)
  use types
  type(t), intent(in) :: x
  type(t) :: elem
  elem%i = x%i + 1
end function elem

impure elemental function elem2(x, y)
  use types
  type(t), intent(in) :: x, y
  type(t) :: elem2
  elem2%i = x%i + y%i
end function elem2

subroutine test1(x)
  use types
  interface
     impure elemental function elem(x)
       use types
       type(t), intent(in) :: x
       type(t) :: elem
     end function elem
  end interface
  type(t) :: x(:)
  x = elem(x)
end subroutine test1

subroutine test2(x)
  use types
  interface
     impure elemental function elem(x)
       use types
       type(t), intent(in) :: x
       type(t) :: elem
     end function elem
     impure elemental function elem2(x, y)
       use types
       type(t), intent(in) :: x, y
       type(t) :: elem2
     end function elem2
  end interface
  type(t) :: x(:)
  x = elem2(elem(x), elem(x))
end subroutine test2

program test113885
  use types
  interface
    subroutine test1(x)
      use types
      type(t) :: x(:)
    end subroutine
    subroutine test2(x)
      use types
      type(t) :: x(:)
    end subroutine
  end interface
  type(t) :: x(2) = [t(1),t(2)]
  call test1 (x)
  print "(a, 2i4)", "after test1 x = ", x
  print "(a, i4)", "no. final calls = ", ctr
  call test2 (x)
  print "(a, 2i4)", "after test2 x = ", x
  print "(a, i4)", "no. final calls = ",ctr
end

[Bug c++/114497] New: Alias CTAD crashes

2024-03-27 Thread hokein.wu at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114497

Bug ID: 114497
   Summary: Alias CTAD crashes
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hokein.wu at gmail dot com
  Target Milestone: ---

Crashes on gcc trunk.

See https://gcc.godbolt.org/z/EErMqe44o

```
template  typename T>
struct K {};

template 
class Foo {};

template  typename TTP,
  int...N>
using Bar = Foo, N...>; 

template 
class Container {};
Bar t = Foo, 1>();

```

[Bug tree-optimization/114057] [14 Regression] 435.gromacs fails verification with -Ofast -march={znver2,znver4} and PGO after r14-7272-g57f611604e8bab

2024-03-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114057

--- Comment #12 from Richard Biener  ---
OK, so I think the change is that we get to "correctly" notice

-vec.h:380:9: note: node (external) 0x6a2e9d8 (max_nunits=2, refcnt=1) vector(2) float
-vec.h:380:9: note: stmt 0 _164 = MEM[(const real *)_27 + 8B];
-vec.h:380:9: note: stmt 1 _158 = MEM[(const real *)_27];
+vec.h:380:9: note: node (external) 0x5a823a8 (max_nunits=2, refcnt=1) vector(2) float
+vec.h:380:9: note: [l] stmt 0 _164 = MEM[(const real *)_27 + 8B];
+vec.h:380:9: note: [l] stmt 1 _158 = MEM[(const real *)_27];

for the loads we do not handle because of gaps and promoted external.  That
leads to extra costs.

But also

+vec.h:380:9: note: node 0x5a81770 (max_nunits=2, refcnt=2) vector(2) float
 vec.h:380:9: note: op template: x_160 = _158 - _159;
 vec.h:380:9: note: stmt 0 x_160 = _158 - _159;
-vec.h:380:9: note: [l] stmt 1 y_163 = _161 - _162;
+vec.h:380:9: note: stmt 1 y_163 = _161 - _162;

so y_163 isn't considered live for some reason.  We find

_123 = _117 * y_163;

is vectorized as part of a reduction.  On the costing side we then see

-_161 - _162 1 times scalar_stmt costs 12 in body
-MEM[(const real *)_27 + 4B] 1 times scalar_load costs 12 in body
-MEM[(const real *)_24 + 4B] 1 times scalar_load costs 12 in body

which is the live (and dependent) stmts no longer costed on the scalar
side but also

+MEM[(const real *)_27 + 8B] 1 times vec_to_scalar costs 4 in epilogue
+MEM[(const real *)_24 + 8B] 1 times vec_to_scalar costs 4 in epilogue

costed in the vector epilog.  This is because we're conservative as we
don't really know whether we'll be able to code-generate the live
operation.  The costing side here is also not in sync as can be seen
from the _161 - _162 op removed.

I should also note that the setting of PURE_SLP is done a bit too early,
before we analyze operations and eventually throw away instances or
prune it by promoting ops external.

For reductions we also falsely claim all root stmts are vectorized - we
do have remain ops.  Fixing this restores the LIVE on them and in some
way restores vectorization.

I'm going to test this as fix for now.

[Bug target/114490] Optimization: x86 "shl" condition codes never reused

2024-03-27 Thread Explorer09 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114490

--- Comment #4 from Kang-Che Sung  ---
1. I just read "AMD64 Architecture Programmer's Manual - Volume 3:
General-Purpose and System Instructions"
(https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/24594.pdf)

It has a clearer wording in the "SAL / SHL" section:

"If the shift count is 0, no flags are modified."

Just mentioning it for reference.

2. I still don't believe there is no chance of optimizing this thing, but it
requires GCC to track the state of the FLAGS register. I don't know if GCC can
do this internally. If GCC can't do this for now, that's OK for me (the example
I posted can be rewritten to another pattern that might produce even smaller
code in x86). But maybe label this as a WONTFIX and not INVALID?
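
To illustrate the pattern under discussion, here is a hypothetical function (not the example posted earlier in this PR) that tests the result of a variable-count shift. Since a count of 0 leaves the flags untouched, the flags after `shl` may not describe the result, so reusing them would presumably require the compiler to know the count is non-zero.

```cpp
#include <cstdint>

// Hypothetical example: the comparison depends on the shifted value.
// The count is masked to stay within the width of the operand.
constexpr bool shifted_is_zero(std::uint32_t x, unsigned count) {
    return (x << (count & 31u)) == 0;
}

static_assert(!shifted_is_zero(1u, 0u), "count 0 leaves the value unchanged");
static_assert(shifted_is_zero(0x80000000u, 1u), "the top bit is shifted out");
```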

[Bug c++/114497] Alias CTAD crashes

2024-03-27 Thread centurionn009 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114497

Centurion  changed:

   What|Removed |Added

 CC||centurionn009 at gmail dot com

--- Comment #1 from Centurion  ---
Looks like a duplicate of PR114377. It is the same problem, a missed TEMPLATE_DECL in the
condition: parm gets nullptr from DECL_INITIAL instead of getting
TEMPLATE_TEMPLATE_PARM from TREE_TYPE.

https://gcc.gnu.org/pipermail/gcc-patches/2024-March/648338.html

[Bug tree-optimization/114057] [14 Regression] 435.gromacs fails verification with -Ofast -march={znver2,znver4} and PGO after r14-7272-g57f611604e8bab

2024-03-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114057

--- Comment #13 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:0b02da5b99e89347f5f8bf875ec8318f84adff18

commit r14-9687-g0b02da5b99e89347f5f8bf875ec8318f84adff18
Author: Richard Biener 
Date:   Wed Mar 27 11:37:16 2024 +0100

tree-optimization/114057 - handle BB reduction remain defs as LIVE

The following makes sure to record the scalars we add to the BB
reduction vectorization result as scalar uses for the purpose of
computing live lanes.  This restores vectorization in the
bondfree.c TU of 435.gromacs.

PR tree-optimization/114057
* tree-vect-slp.cc (vect_bb_slp_mark_live_stmts): Mark
BB reduction remain defs as scalar uses.

[Bug tree-optimization/114057] [14 Regression] 435.gromacs fails verification with -Ofast -march={znver2,znver4} and PGO after r14-7272-g57f611604e8bab

2024-03-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114057

--- Comment #14 from Richard Biener  ---
I wasn't able to reproduce the miscompare on a Zen4 machine (but with
-march=znver2).  But the original vectorization of bondfree.c should be
restored and thus the miscompare gone.  I'll verify tomorrow when I'm back
at the machine on which I was able to reproduce the issue.

[Bug libstdc++/90745] [11/12/13/14 Regression] std::tuple::operator= parameter causes error outside immediate context

2024-03-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90745

Jonathan Wakely  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|NEW |RESOLVED

--- Comment #7 from Jonathan Wakely  ---
Slightly reduced:

#include <tuple>

template <class Op, class P0, class P1>
struct Expr
{
    Op op;
    std::tuple<P0, P1> t;

    using R = decltype(op(*(std::get<0>(t).p), *(std::get<1>(t).p)));
    operator R() { return op(*(std::get<0>(t).p), *(std::get<1>(t).p)); }
};

template <class Op, class P0, class P1> inline auto
expr(Op && op, P0 && p0, P1 && p1)
{
    return Expr<Op, P0, P1> { op, { p0, p1 } };
}

template <class T>
struct cell
{
    T * p = nullptr;

    template <class X> decltype(auto) operator =(X && x)
    {
        expr([](auto && y, auto && x)
             {
                 y = x;
             },
            *this, x);
    }
};

int main()
{
  cell n;
  cell c;
  n = c;
}


I think GCC is correct to reject this, other compilers agree.

Overload resolution for 'n = c' considers cell::operator=(X&&) as well as
cell's implicit assignment ops, but it _also_ considers cell::operator=(X&&) due to ADL.

Because that overload has a decltype(auto) return type it has to instantiate
the body, which instantiates Expr, cell>
which triggers the error in the std::tuple::operator= outside the immediate
context.

[Bug libstdc++/90745] [11/12/13/14 Regression] std::tuple::operator= parameter causes error outside immediate context

2024-03-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90745

--- Comment #8 from Jonathan Wakely  ---
Replacing decltype(auto) on the cell::operator=(X&&) function (or constraining
that function to not be instantiated for non-assignable cells) will fix the
code.
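
A minimal sketch of the second option, under stated assumptions: Slot, store
and can_store are hypothetical names, not taken from the reduced testcase.
Spelling the return type as a trailing decltype puts the assignment expression
into the declaration, so an unusable argument type drops the overload during
overload resolution instead of forcing the body to be instantiated.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct Slot { int value = 0; };   // hypothetical stand-in for an assignable cell

// The assignment expression is part of the declaration, so an ill-formed
// `s.value = x` removes this overload (SFINAE) rather than being a hard error.
template <class X>
auto store(Slot& s, X&& x) -> decltype(s.value = std::forward<X>(x), void())
{
    s.value = std::forward<X>(x);
}

// Detection helper: true when store(Slot&, X) is a viable call.
template <class X, class = void>
struct can_store : std::false_type {};

template <class X>
struct can_store<X, decltype(store(std::declval<Slot&>(), std::declval<X>()))>
    : std::true_type {};

static_assert(can_store<int>::value, "int can be assigned to Slot::value");
static_assert(!can_store<const char*>::value,
              "a pointer cannot, and asking the question is not an error");
```

With a decltype(auto) return type the same question could not be answered
without instantiating the function body, which is where the error outside the
immediate context comes from.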

[Bug libstdc++/100381] [11/12/13/14 Regression] new static_assert((std::__is_complete_or_unbounded(...)) failure from g++ 11.1.0

2024-03-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100381

Jonathan Wakely  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #5 from Jonathan Wakely  ---
The code is definitely invalid.

Your code instantiates std::function<void(CallParameter)> where CallParameter
is IndexGroupsAndNames, which is an incomplete type. Constructing a
std::function checks that the callable argument is invocable with the argument
types in the std::function's call signature, which is void(CallParameter).

That is done using std::is_invocable_r. The spec for std::is_invocable_r has this requirement:

"Fn, R, and all types in the template parameter pack ArgTypes shall be complete
types, cv void, or arrays of unknown bound."

So you're trying to construct std::function<void(IncompleteType)> which needs
to check if you can call a function with an argument of type IncompleteType,
which cannot be determined because the type might be non-copyable.

Using std::function would work, so would ensuring
the type is complete before trying to use it in a std::function call signature.
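
A minimal sketch of a signature that avoids the problem, assuming a
const-reference parameter is acceptable (the exact signature suggested above is
not preserved in this archive). Params, record, make_callback and last_seen are
hypothetical names. A reference to an incomplete type can be named in the call
signature, because invoking the callable only needs to bind a reference.

```cpp
#include <cassert>
#include <functional>

struct Params;                       // forward declaration only: incomplete here

void record(const Params&);          // declared while Params is still incomplete

// Assumption: the call signature takes the argument by const reference,
// so no copy of the incomplete type has to be considered.
std::function<void(const Params&)> make_callback()
{
    return &record;
}

struct Params { int n; };            // the type is completed later

int last_seen = 0;
void record(const Params& p) { last_seen = p.n; }
```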

[Bug target/114302] [14 Regression] GCN regressions after: vect: Tighten vect_determine_precisions_from_range [PR113281]

2024-03-27 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114302

--- Comment #4 from Andrew Stubbs  ---
Yes, that's what the simd-math-3* tests do.

The simd-math-5* tests are explicitly supposed to be doing this in the context
of the autovectorizer.

If these tests are being compiled as (newly) intended then we should change the
expected results.

So, questions:

1. Are the new results actually correct? (So far I only know that being
different is expected.)

2. Is there some other testcase form that would exercise the previously
intended routines?

3. Is the new behaviour configurable? I don't think the 16-bit shift bug ever
existed on GCN (in which "short" vectors actually have excess bits in each
lane, much like scalar registers do).

[Bug target/114490] Optimization: x86 "shl" condition codes never reused

2024-03-27 Thread Explorer09 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114490

Kang-Che Sung  changed:

   What|Removed |Added

 Resolution|INVALID |---
 Status|RESOLVED|UNCONFIRMED

--- Comment #5 from Kang-Che Sung  ---
(Trying to mark this bug as UNCONFIRMED again in the hope of getting some
attention. I'm not sure if this is the right way of using GCC's Bugzilla here.)

[Bug target/114302] [14 Regression] GCN regressions after: vect: Tighten vect_determine_precisions_from_range [PR113281]

2024-03-27 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114302

--- Comment #5 from Richard Sandiford  ---
(In reply to Andrew Stubbs from comment #4)
> Yes, that's what the simd-math-3* tests do.
Ah, OK.

> The simd-math-5* tests are explicitly supposed to be doing this in the
> context of the autovectorizer.
> 
> If these tests are being compiled as (newly) intended then we should change
> the expected results.
> 
> So, questions:
> 
> 1. Are the new results actually correct? (So far I only know that being
> different is expected.)
I believe so.  We now do the division in 32 bits, as in the original gimple.

> 2. Is there some other testcase form that would exercise the previously
> intended routines?
It should be possible in languages that don't have C's integer
promotion rules, if you're up for some Ada or Rust.

> 3. Is the new behaviour configurable? I don't think the 16-bit shift bug
> ever existed on GCN (in which "short" vectors actually have excess bits in
> each lane, much like scalar registers do).
Not AFAIK.  The problem is that the gimple→gimple transformation changes
the gimple-level semantics of the code.  Shifts by out-of-range values
are undefined rather than target-defined.  (And in other cases that's useful,
because it means we don't need to preserve whatever value the target
happens to give for an out-of-range shift.)
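
As a small illustration of the promotion rules mentioned above (a sketch, not
code from the testsuite; wide_div and wide_shl are hypothetical names, and int
is assumed to be wider than short): operands narrower than int are promoted
before the operation, so the source-level division and shift are 32-bit
operations even when every variable is a short.

```cpp
#include <cassert>
#include <type_traits>

// short operands are promoted to int before the division, so the quotient
// -32768 / -1 is representable even though it does not fit in a short.
inline int wide_div(short a, short b)
{
    static_assert(std::is_same<decltype(a / b), int>::value,
                  "short / short is computed as int");
    return a / b;
}

// The same holds for shifts: the shift is done on int, so a count of 20 is
// in range here although it would be out of range for a 16-bit operation.
inline int wide_shl(unsigned short v, int n)
{
    static_assert(std::is_same<decltype(v << n), int>::value,
                  "unsigned short << int is computed as int");
    return v << n;
}
```

Narrowing such an operation back to 16 bits is only valid if the narrow form
cannot introduce an out-of-range shift or an overflow that the wide form did
not have.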

[Bug libstdc++/114498] New: Consider deprecating then removing TR1 headers

2024-03-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114498

Bug ID: 114498
   Summary: Consider deprecating then removing TR1 headers
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: minor
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redi at gcc dot gnu.org
  Target Milestone: ---

We should decide whether we want to keep std::tr1::shared_ptr etc. forever.

Those headers are virtually unmaintained, and just increase testing burden.

They do provide functionality that isn't otherwise available from libstdc++ in
C++98/C++03 mode. But does anybody care? Is anybody stuck in C++98/C++03 mode,
but also upgrading to modern GCC versions, and also relying on non-standard TR1
components that aren't actually in the C++03 standard? Can they just use Boost
instead?

At some point we might want to have the same discussion for
 and LFTSv1 components, and 

[Bug target/114487] ICE when building libsdl2 on -mfpmath=sse x86 with LTO

2024-03-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114487

Richard Biener  changed:

   What|Removed |Added

 CC||hubicka at gcc dot gnu.org,
   ||rguenth at gcc dot gnu.org

--- Comment #5 from Richard Biener  ---
Ah.  But note in this case we only have a single ltrans unit.  We might be
confused by the fact that wrong prototype will result in a
call through a float(*)() type while the actual local function doesn't
use any FP registers and has void return.

OTOH the 'void' can be exchanged for 'float' in SDL_test_fuzzer.i and the
issue still reproduces (with a single LTRANS unit).

[Bug target/112919] LoongArch: Alignments in tune parameters are not precise and they regress performance

2024-03-27 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112919

--- Comment #18 from Xi Ruoyao  ---
(In reply to chenglulu from comment #17)

> The results of spec2006 on LA464 are:
> -falign-labels=4 -falign-functions=32 -falign-loops=16 -falign-jumps=16

Would you send a patch for them, or would you prefer that I do it?

[Bug libstdc++/100667] [11/12/13/14 Regression] std::tuple cannot be constructed from A&&, if A not defined (only forward declared)

2024-03-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100667

--- Comment #6 from Jonathan Wakely  ---
(In reply to Piotr Nycz from comment #0)
> It looks that std library code start requiring this to pass:
> std::is_nothrow_constructible...

Indeed, that's what the standard requires (Clang and MSVC reject this the same
way). The standard also says that using traits like is_constructible requires
complete types.

However, that's clearly silly for is_constructible because we know that
reference binding is valid for any A whether it's complete or not.

This is the subject of:
https://cplusplus.github.io/LWG/issue2939
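
To illustrate the point about reference binding (a sketch with hypothetical
function names): both functions below are defined while A is only forward
declared, and they compile because binding or forwarding a reference never
needs the definition of A.

```cpp
#include <cassert>

struct A;   // forward-declared only at this point

// Binding a const lvalue reference to an A needs no definition of A.
const A& as_const_ref(const A& a) { return a; }

// Neither does passing an rvalue reference through.
A&& as_rvalue_ref(A&& a) { return static_cast<A&&>(a); }

struct A { int v; };   // completed afterwards
```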

[Bug target/114487] ICE when building libsdl2 on -mfpmath=sse x86 with LTO

2024-03-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114487

--- Comment #6 from Richard Biener  ---
I'd say ix86_function_sseregparm should be decided at a specific point and
recorded for later use.  Alternatively there needs to be a (target) IPA
phase where we can mark functions we cannot turn into sseregparm.

[Bug other/111966] GCN '--with-arch=[...]' not considered for 'mkoffload' default 'elf_arch'

2024-03-27 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111966

Thomas Schwinge  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |burnus at gcc dot gnu.org
 CC||rguenth at gcc dot gnu.org
 Status|REOPENED|ASSIGNED

--- Comment #5 from Thomas Schwinge  ---
Tobias is working on this.

[Bug libstdc++/114498] Consider deprecating then removing TR1 headers

2024-03-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114498

--- Comment #1 from Richard Biener  ---
I'd say deprecating them for a release aka hiding behind a
-D_YES_I_WANT_TR1_HEADERS and otherwise issuing #error and then axing them
should be OK.

Preferably tell people about a suitable replacement (or point to a URL)
within that #error.

If you are quick you can do the deprecation for GCC 14 ...

Are the headers usable with -std=c++11 or later?  Only allowing them with
-std=c++98/c++03 might be another option, so at least conflicts with new
features shouldn't be an issue then.

[Bug libstdc++/100667] [11/12/13/14 Regression] std::tuple cannot be constructed from A&&, if A not defined (only forward declared)

2024-03-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100667

--- Comment #7 from Jonathan Wakely  ---
(In reply to Viktor Ostashevskyi from comment #1)
> I have another example, but probably related:

No, this is a completely different problem. See Bug 102257

[Bug bootstrap/112534] [14 regression] build failure after r14-5424-gdb50aea6259545 using gcc 4.8.5

2024-03-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112534

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed||2024-03-27
 Status|UNCONFIRMED |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |arsen at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #13 from Richard Biener  ---
Looks like the patch is still not reviewed.  I haven't had issues with using
system gettext though.

[Bug c++/102257] call of overloaded 'tuple' is ambiguous

2024-03-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102257

--- Comment #5 from Jonathan Wakely  ---
I think this is the same bug, reduced from Bug 100667 comment 1 (where it
wasn't related):

struct allocator_arg_t { explicit allocator_arg_t() = default; };
class string{};
class Foo{};

struct tuple
{
  template<typename Alloc>
    tuple(allocator_arg_t, const Alloc&) { }

  template
  tuple(const string&, const Foo&) { }
};

tuple bar()
{
return { {}, Foo{}};
}


Clang and EDG accept this.

[Bug target/53192] Incorrect arguments to AVX2's gather intrinsics

2024-03-27 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53192

--- Comment #8 from Jakub Jelinek  ---
Looking at other intrinsics with {,unsigned }__int64{, const} * arguments, I
see
void _mm_maskstore_epi64 (__int64* mem_addr, __m128i mask, __m128i a)
void _mm256_maskstore_epi64 (__int64* mem_addr, __m256i mask, __m256i a)
unsigned __int64 _mulx_u64 (unsigned __int64 a, unsigned __int64 b, unsigned
__int64* hi)
int _rdrand64_step (unsigned __int64* val)
unsigned char _addcarry_u64 (unsigned char c_in, unsigned __int64 a, unsigned
__int64 b, unsigned __int64 * out)
unsigned char _addcarryx_u64 (unsigned char c_in, unsigned __int64 a, unsigned
__int64 b, unsigned __int64 * out)
int _rdseed64_step (unsigned __int64 * val)
unsigned char _subborrow_u64 (unsigned char c_in, unsigned __int64 a, unsigned
__int64 b, unsigned __int64 * out)
__m128i _mm_i32gather_epi64 (__int64 const* base_addr, __m128i vindex, const
int scale)
__m128i _mm_mask_i32gather_epi64 (__m128i src, __int64 const* base_addr,
__m128i vindex, __m128i mask, const int scale)
__m256i _mm256_i32gather_epi64 (__int64 const* base_addr, __m128i vindex, const
int scale)
__m256i _mm256_mask_i32gather_epi64 (__m256i src, __int64 const* base_addr,
__m128i vindex, __m256i mask, const int scale)
__m128i _mm_i64gather_epi64 (__int64 const* base_addr, __m128i vindex, const
int scale)
__m128i _mm_mask_i64gather_epi64 (__m128i src, __int64 const* base_addr,
__m128i vindex, __m128i mask, const int scale)
__m256i _mm256_i64gather_epi64 (__int64 const* base_addr, __m256i vindex, const
int scale)
__m256i _mm256_mask_i64gather_epi64 (__m256i src, __int64 const* base_addr,
__m256i vindex, __m256i mask, const int scale)
__m128i _mm_maskload_epi64 (__int64 const* mem_addr, __m128i mask)
__m256i _mm256_maskload_epi64 (__int64 const* mem_addr, __m256i mask)
in the intrinsic guide.  And both GCC and LLVM consistently use long
long/unsigned long long pointers for all of those.  And that type isn't
predefined by either of the compilers, so I'd just say that the hypothetical
__int64/unsigned __int64 is long long/unsigned long long on Linux, not
int64_t/uint64_t.
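
A short sketch of why the distinction matters (load_ll is a hypothetical
stand-in, not a real intrinsic): long long and int64_t have the same width, but
they need not be the same type, so a pointer to one does not implicitly convert
to a pointer to the other.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

static_assert(sizeof(long long) == sizeof(std::int64_t),
              "same width on the targets discussed here");

// Which builtin type int64_t aliases depends on the platform ABI; on LP64
// Linux it is commonly `long`, in which case `int64_t *` and `long long *`
// are distinct, incompatible pointer types.
constexpr bool int64_is_long_long = std::is_same<std::int64_t, long long>::value;
constexpr bool int64_is_long      = std::is_same<std::int64_t, long>::value;

// Hypothetical stand-in for an intrinsic declared with `long long const *`.
inline long long load_ll(const long long* p) { return *p; }
```

Where int64_is_long_long is false, calling load_ll with an int64_t pointer
needs an explicit cast.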

[Bug c++/102257] call of overloaded 'tuple' is ambiguous

2024-03-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102257

--- Comment #6 from Jonathan Wakely  ---
(In reply to Andrew Pinski from comment #2)
> See https://wg21.link/cwg1228 this might be invalid code and GCC is correct
> in rejecting it.

So dup of PR 84849 ?

[Bug target/53192] Incorrect arguments to AVX2's gather intrinsics

2024-03-27 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53192

--- Comment #9 from Jakub Jelinek  ---
(In reply to Andrew Pinski from comment #7)
> The other option is to change how intrinsics work on x86 and use resolve
> overloads inside the backend like how aarch64, arm and rs6000 backends all
> handle intrinsics these days.

Ugh no, that is terrible.  Not being able to actually figure out what the
header provides as intrinsics, with what arguments etc. from anything but
documentation is bad.

[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer

2024-03-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #58 from Richard Biener  ---
Thus fixed.

[Bug target/113652] [14 regression] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450

2024-03-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2

--- Comment #18 from Richard Biener  ---
Would be nice to fix but not a blocker since it will go away with release
checking or persists on branches when checking is enabled.  P2.

[Bug libstdc++/112858] nvptx: 'unresolved symbol __cxa_thread_atexit_impl'

2024-03-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112858

Richard Biener  changed:

   What|Removed |Added

Summary|[14 Regression] nvptx:  |nvptx: 'unresolved symbol
   |'unresolved symbol  |__cxa_thread_atexit_impl'
   |__cxa_thread_atexit_impl'   |
   Target Milestone|14.0|---
   Assignee|aoliva at gcc dot gnu.org  |unassigned at gcc dot gnu.org

--- Comment #9 from Richard Biener  ---
I'm assuming it is not.

[Bug target/114499] New: MVE: scatter base offset constraints incorrect

2024-03-27 Thread kevin.bracey at alifsemi dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114499

Bug ID: 114499
   Summary: MVE: scatter base offset constraints incorrect
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kevin.bracey at alifsemi dot com
  Target Milestone: ---

An attempt to use

uint32x4_t base;
float32x4_t value;
vstrwq_scatter_base_wb_f32(&base, -sizeof(float), value);

generates an "unsupported" error. It does not accept -4 as a valid offset, but
it should. It's looking for a multiple of 8 from -1016 to +1016, not a multiple
of 4 from -508 to +508 as it should.

Looking at mve.md, I see a number of scatter/gather_base operations have
incorrect constraints; they're rather random.

Offsets for VLDRW/VSTRW are always 7-bit with a sign bit, representing +/-0 to
+/-127*memory size. So the W and D base forms all take -508 to 508 multiples of
4 ("O"?) or -1016 to +1016 multiples of 8 ("Ri").
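
The range rule in the previous paragraph can be written down as a small check.
This is only a sketch of the arithmetic; base_offset_ok is a hypothetical
helper and not code from mve.md or the Arm backend.

```cpp
#include <cassert>

// Hypothetical helper: an immediate is encodable when it is a multiple of the
// element size and its magnitude is at most 127 element sizes
// (a 7-bit magnitude plus a sign bit, scaled by the element size).
constexpr bool base_offset_ok(long offset, long elem_bytes)
{
    return offset % elem_bytes == 0
        && offset >= -127 * elem_bytes
        && offset <= 127 * elem_bytes;
}

// Word forms (4-byte elements): multiples of 4 in [-508, +508].
static_assert(base_offset_ok(-4, 4), "-4 is a valid word offset");
static_assert(base_offset_ok(508, 4), "upper bound for words");
static_assert(!base_offset_ok(512, 4), "out of range for words");

// Doubleword forms (8-byte elements): multiples of 8 in [-1016, +1016].
static_assert(!base_offset_ok(-4, 8), "-4 is not a multiple of 8");
static_assert(base_offset_ok(-1016, 8), "lower bound for doublewords");
```

The failure described above corresponds to the word instruction being checked
with elem_bytes equal to 8 instead of 4.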

The "Rl" constraint was wrongly added for just
mve_vstrwq_scatter_base_wb_p_fv4sf
(https://github.com/gcc-mirror/gcc/commit/ae180f26109bfaebb4ab0f4d45035fd075cf02c8),
and it is not required. If it was really needed for a halfword instruction its
range should be -254 to +254. It seems that mve_vector_mem_operand() handles
this range correctly for non-scatter/gather.

Some corrections I think are needed are:

mve_vldrwq_gather_base_v4si i -> O
mve_vldrwq_gather_base_v2di i -> Ri
mve_vldrwq_gather_base_z_v2di i -> Ri
mve_vldrwq_gather_base_fv4sf i -> O
mve_vldrwq_gather_base_z_fv4sf i -> O
mve_vldrwq_gather_base_wb_v4si Ri -> O
mve_vldrwq_gather_base_wb_z_v4si Ri -> O
mve_vldrwq_gather_base_wb_fv4sf  Ri -> O
mve_vldrwq_gather_base_wb_z_fv4sf  Ri -> O

mve_vstrwq_scatter_base_v4si i -> O
mve_vstrwq_scatter_base_fv4sf i -> O
mve_vstrwq_scatter_base_wb_v4si Ri -> O
mve_vstrwq_scatter_base_wb_p_v4si Ri -> O
mve_vstrwq_scatter_base_wb_fv4sf Ri -> O
mve_vstrwq_scatter_base_wb_p_fv4sf Rl -> O

But I don't know that that's exhaustive.

[Bug target/112919] LoongArch: Alignments in tune parameters are not precise and they regress performance

2024-03-27 Thread chenglulu at loongson dot cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112919

--- Comment #19 from chenglulu  ---
(In reply to Xi Ruoyao from comment #18)
> (In reply to chenglulu from comment #17)
> 
> > The results of spec2006 on LA464 are:
> > -falign-labels=4 -falign-functions=32 -falign-loops=16 -falign-jumps=16
> 
> Would you send a patch for them or prefer I to do it?

I'll send a patch tomorrow.

[Bug libstdc++/100667] [11/12/13/14 Regression] std::tuple cannot be constructed from A&&, if A not defined (only forward declared)

2024-03-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100667

--- Comment #8 from Jonathan Wakely  ---
(In reply to Jonathan Wakely from comment #6)
> (In reply to Piotr Nycz from comment #0)
> > It looks that std library code start requiring this to pass:
> > std::is_nothrow_constructible...
> 
> Indeed, that's what the standard requires (Clang and MSVC reject this the
> same way). The standard also says that using traits like is_constructible
> requires complete types.
> 
> However, that's clearly silly for is_constructible because we know
> that reference binding is valid for any A whether it's complete or not.
> 
> This is the subject of:
> https://cplusplus.github.io/LWG/issue2939

We can fix this in std::tuple by adding && to the source objects in all
is_constructible and is_convertible conditions. Or we could fix it in the type
traits themselves.

But I think it would be best to fix it in the compiler, so that we always allow
directly binding T&& or const T& to T, even if T is incomplete. Otherwise we'll
be playing whackamole all over the library.

[Bug libstdc++/100667] [11/12/13/14 Regression] std::tuple cannot be constructed from A&&, if A not defined (only forward declared)

2024-03-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100667

--- Comment #9 from Jonathan Wakely  ---
The changes in
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1285r0.pdf mean that
it's only undefined if the result of is_constructible_v would change
were T completed. So there's no benefit to enforcing the completeness
requirement here, it's not UB anyway.

[Bug c++/114377] [13/14 Regression] GCC crashes on an example of CTAD for alias templates

2024-03-27 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114377

--- Comment #3 from Patrick Palka  ---
*** Bug 114497 has been marked as a duplicate of this bug. ***

[Bug c++/114497] Alias CTAD crashes

2024-03-27 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114497

Patrick Palka  changed:

   What|Removed |Added

 CC||ppalka at gcc dot gnu.org
 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #2 from Patrick Palka  ---
Great, let's close this as a dup then.

*** This bug has been marked as a duplicate of bug 114377 ***

[Bug tree-optimization/114485] [13/14 Regression] Wrong code with -O3 -march=rv64gcv on riscv or `-O3 -march=armv9-a` for aarch64

2024-03-27 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114485

--- Comment #4 from Robin Dapp  ---
Yes, the vectorization looks ok.  The extracted live values are not used
afterwards and therefore the whole vectorized loop is being thrown away.
Then we do one iteration of the epilogue loop, inverting the original c and end
up with -8 instead of 8.  This is pretty similar to what's happening in the
related PR.

We properly populate the phi in question in slpeel_update_phi_nodes_for_guard1:

c_lsm.7_64 = PHI <_56(23), pretmp_34(17)>

but vect_update_ivs_after_vectorizer changes that into

c_lsm.7_64 = PHI .

Just as a test, commenting out

  if (!LOOP_VINFO_EARLY_BREAKS_VECT_PEELED (loop_vinfo))
vect_update_ivs_after_vectorizer (loop_vinfo, niters_vector_mult_vf,
  update_e);

at least makes us keep the VEC_EXTRACT and not fail anymore.

[Bug libstdc++/100667] [11/12/13/14 Regression] std::tuple cannot be constructed from A&&, if A not defined (only forward declared)

2024-03-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100667

--- Comment #10 from Jonathan Wakely  ---
Oh interestingly __is_constructible(Incomplete&&, Incomplete) is already
allowed, but __is_nothrow_constructible and __is_convertible give errors:


__is_nothrow_constructible(Incomplete&&, Incomplete)

: In function ‘int main()’:
:117:68: error: invalid use of incomplete type ‘struct Incomplete’
[-fpermissive]
:110:8: note: forward declaration of ‘struct Incomplete’

__is_convertible(Incomplete, Incomplete&&)

: In function ‘int main()’:
:117:58: error: invalid use of incomplete type ‘struct Incomplete’
[-fpermissive]
:110:8: note: forward declaration of ‘struct Incomplete’

[Bug c++/114377] [13/14 Regression] GCC crashes on an example of CTAD for alias templates

2024-03-27 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114377

Patrick Palka  changed:

   What|Removed |Added

   Keywords||patch

--- Comment #4 from Patrick Palka  ---
Patch from Centurion:
https://gcc.gnu.org/pipermail/gcc-patches/2024-March/648338.html

[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519

2024-03-27 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

--- Comment #13 from Jakub Jelinek  ---
Created attachment 57821
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57821&action=edit
gcc14-pr112303.patch

This patch fixes the ICE for me.
Seems we already did something like that in other spots (e.g. in apply_scale).

[Bug target/101523] Huge number of combine attempts

2024-03-27 Thread sarah.kriesch at opensuse dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101523

--- Comment #49 from Sarah Julia Kriesch  ---
(In reply to Sam James from comment #44)
> I'm really curious as to if there's other test cases which could be shared,
> as Andreas mentioned distributions were complaining about this even. That's
> unlikely if it's a single degenerate case.
> 
> Even listing some example package names could help.

Sorry for the late response! I am a volunteer and went through all constraints
files from the last few years (I added to multiple packages). Most
memory-related issues have been already resolved.
But I found some easter eggs for you today:

1) nodejs21 with 11,5GB on s390x, 2,5GB on x86, 3,7GB on PPCle, 2,5GB on
aarch64 and 2,4GB on armv7:
https://build.opensuse.org/package/show/devel:languages:nodejs/nodejs21

2) PDAL with 9,7GB on s390x, 2,2GB on x86 and 2,2GB on aarch64:
https://build.opensuse.org/package/show/openSUSE:Factory:zSystems/PDAL

3) python-numpy with 15,2GB on s390x, 8,6GB on PPCle, 9,3GB on x86, 1,9GB on
armv7, 9,3GB on aarch64:
https://build.opensuse.org/package/show/devel:languages:python:numeric/python-numpy

I wish you a happy Easter!

[Bug c++/111426] [11/12/13/14 Regression] "error: use of deleted function" printed twice

2024-03-27 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111426

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
These are two different diagnostics.
One error is about using deleted D::D() and explains why it is deleted, the
other is about using deleted D::~D() and explains why it is deleted.
Do you mean we shouldn't diagnose using deleted D::~D() when we already
diagnosed some other special member (D::D() in that case)?

[Bug target/114492] Invalid use of gcc_assert (notably in gcc/config/aarch64/aarch64-ldp-fusion.cc)

2024-03-27 Thread hp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114492

--- Comment #3 from Hans-Peter Nilsson  ---
(In reply to Andrew Pinski from comment #1)
> >Please be advised that the argument is *not* evaluated with release checking
> 
> Actually it is evaluated with release checking as release checking enables
> assert checking.

Ah, I should have followed ENABLE_ASSERT_CHECKING.  Still worrisome.

> The 2 I see which might be an issue is:
>   gcc_assert (crtl->ssa->verify_insn_changes (changes));
> 
> gcc_assert (rtl_ssa::restrict_movement_ignoring (*changes[i],
> is_changing));

(Four instances, two each of these two.)
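
To illustrate why a side-effecting call inside an assertion argument is worrisome, here is a minimal sketch. `MY_ASSERT` and `MY_ASSERT_ENABLED` are hypothetical names made up for this example; this is not GCC's actual `gcc_assert` definition, only the general pattern in which the disabled form leaves its argument unevaluated. `verify_and_count` stands in for a checker that also does work.

```cpp
#include <cstdlib>

// Hypothetical assertion macro for illustration (NOT GCC's real gcc_assert).
// MY_ASSERT_ENABLED is an assumed configuration switch for this sketch.
#ifndef MY_ASSERT_ENABLED
#define MY_ASSERT_ENABLED 1
#endif

#if MY_ASSERT_ENABLED
#define MY_ASSERT(expr) ((void) (!(expr) ? (std::abort (), 0) : 0))
#else
// The expression is still type-checked, but never evaluated at run time.
#define MY_ASSERT(expr) ((void) (0 && (expr)))
#endif

// A checker with a side effect: it counts how often it was called.
constexpr bool verify_and_count (int &calls)
{
  ++calls;
  return true;
}

// Risky: the side effect disappears if assertions are compiled out.
constexpr int risky ()
{
  int calls = 0;
  MY_ASSERT (verify_and_count (calls));
  return calls;
}

// Safer: evaluate unconditionally, then assert on the stored result.
constexpr int safer ()
{
  int calls = 0;
  bool ok = verify_and_count (calls);
  MY_ASSERT (ok);
  (void) ok;
  return calls;
}
```

With the switch set to 1 both functions call the checker once; with it set to 0 only `safer` still does, which is the behaviour difference the report is concerned about.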

[Bug c++/92067] __is_constructible(incomplete_type) should make the program ill-formed

2024-03-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92067

--- Comment #7 from Jonathan Wakely  ---
(In reply to Jason Merrill from comment #3)
> Hmm? but the standard says that a precondition for std::is_constructible is
> the type being complete, and we enforce that with a static_assert (since
> PR71579).  Why would it be a problem for the builtin to enforce it as well?

It's a problem when the standard is wrong :-)
https://cplusplus.github.io/LWG/issue2939

We certainly need a complete type for is_constructible and
is_constructible and for is_assignable (which is PR 109997),
but we don't need a complete type for:

__is_constructible(T&&, T) // currently true
__is_constructible(T&, T)  // currently false
__is_nothrow_constructible(T&&, T) // currently ill-formed!
__is_convertible(T, T&&)   // currently ill-formed!

The last two seem wrong, we should be able to give a correct answer. See PR
100667.

https://wg21.link/p1285r0 changed the library traits to say that it's only
undefined to instantiate the traits with incomplete types if "instantiation
could yield a different result were T hypothetically completed".

So std::is_constructible_v<T&&, T> is not undefined (because completing T
doesn't change the fact that you can always bind T&& to an rvalue of type T)
and so we should not reject it.
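
A small sketch of the underlying language rule, namely that binding `T&&` to an rvalue of type `T` does not require `T` to be complete. The names `pass_through` and `demo` are invented for this example, and it deliberately avoids the traits themselves, since their current behaviour is what is under discussion.

```cpp
struct T;   // incomplete at this point

// Binding T&& to an rvalue of type T needs no knowledge of T's layout,
// so this definition is valid while T is still incomplete.
constexpr T&& pass_through (T&& t) { return static_cast<T&&> (t); }

struct T { int value; };   // T is completed only now

// Uses the function that was defined before T was complete.
constexpr int demo ()
{
  T t{42};
  T&& r = pass_through (static_cast<T&&> (t));
  return r.value;
}
```

Completing `T` later does not change whether the binding in `pass_through` is valid, which is the property the P1285R0 wording relies on.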

[Bug libstdc++/114498] Consider deprecating then removing TR1 headers

2024-03-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114498

--- Comment #2 from Jonathan Wakely  ---
(In reply to Richard Biener from comment #1)
> I'd say deprecating them for a release aka hiding behind a
> -D_YES_I_WANT_TR1_HEADERS and otherwise issuing #error and then axing them
> should be OK.
> 
> Preferably tell people about a suitable replacement (or point to an URL)
> within that #error.

Yeah, we can use a deprecated attribute with a suggestion. We already have
_GLIBCXX_DEPRECATED_SUGGEST for that. And if we remove them completely, we can
keep the headers and use #error to give suggestions.

> If you are quick you can do the deprecation for GCC 14 ...

I'm not going to try, there are higher priority things I can do for 14 :-)

> Are the headers usable with -std=c++11 or later?

Yes, but mostly not useful, because most of the features were added to C++11
anyway, e.g. std::tr1::shared_ptr is superseded by std::shared_ptr,
std::tr1::function is superseded by std::function, etc.

> Only allowing them with
> -std=c++98/c++03 might be another option, so at least conflicts with new
> features shouldn't be an issue then.

Conflicts aren't an issue. The only real reason to remove them is just to have
less code to maintain + test. But the maintenance burden is very low, so it's
just the time spent testing them in a testsuite that's already large and slow.

[Bug sanitizer/97696] ICE since ASAN_MARK does not handle poly_int sized varibales

2024-03-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97696

--- Comment #6 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Richard Sandiford:

https://gcc.gnu.org/g:86b80b049167d28a9ef43aebdfbb80ae5deb0888

commit r13-8501-g86b80b049167d28a9ef43aebdfbb80ae5deb0888
Author: Richard Sandiford 
Date:   Wed Mar 27 15:30:19 2024 +

asan: Handle poly-int sizes in ASAN_MARK [PR97696]

This patch makes the expansion of IFN_ASAN_MARK let through
poly-int-sized objects.  The expansion itself was already generic
enough, but the tests for the fast path were too strict.

gcc/
PR sanitizer/97696
* asan.cc (asan_expand_mark_ifn): Allow the length to be a
poly_int.

gcc/testsuite/
PR sanitizer/97696
* gcc.target/aarch64/sve/pr97696.c: New test.

(cherry picked from commit fca6f6fddb22b8665e840f455a7d0318d4575227)

[Bug c++/111075] [14 Regression] ICE on g++.dg/torture/tail-padding1.C on darwin

2024-03-27 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111075

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
I can reproduce with a cross, doesn't ICE with -Os -fno-elide-constructors, on
x86_64-linux doesn't ICE with/without -Os -f{,no-}elide-constructors.
The ICE is on
3025  /* We used to shortcut trivial constructor/op= here, but nowadays
3026 we can only get a trivial function here with
-fno-elide-constructors.  */
3027  gcc_checking_assert (!trivial_fn_p (fun)
3028   || !flag_elide_constructors
3029   /* We don't elide constructors when processing
3030  a noexcept-expression.  */
3031   || cp_noexcept_operand);
where fun is X::X(X const&) and is trivial.

[Bug other/114496] Documentation: "Non-Bugs" page should update/mention something about -Wsign-conversion

2024-03-27 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114496

--- Comment #3 from Eric Gallager  ---
Maybe the update could be just to clarify the "EnabledBy" rules for the
warning? i.e., something like "-Wsign-conversion is only and will only ever be
enabled by -Wconversion in C, and we will never have it enabled by -Wall or
-Wextra (unlike -Wsign-compare, which is enabled by -Wall in C++, and -Wextra
in C)."
(and maybe also include something about the new -Warith-conversion flag too?)
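
For readers unsure how the two warnings differ, a minimal sketch follows. The function names are made up for this example; the comments describe when g++ is expected to diagnose each line, and `constexpr` is used only so the results can be checked at compile time.

```cpp
// -Wsign-conversion: an implicit conversion that may change the sign
// of a value.  It has to be requested explicitly in C++.
constexpr unsigned to_unsigned (int i)
{
  unsigned u = i;        // diagnosed with -Wsign-conversion
  return u;
}

// -Wsign-compare: a comparison between signed and unsigned operands.
constexpr bool less_than (int i, unsigned u)
{
  return i < u;          // diagnosed with -Wall in C++ (-Wextra in C)
}
```

Both lines are well-defined code, which is why the choice of which option enables which warning matters: `to_unsigned (-1)` wraps to the largest `unsigned` value, and `less_than (-1, 1u)` is false because `-1` is converted to `unsigned` before the comparison.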

[Bug jit/102824] building pdf/dvi documentation for libgccjit fails

2024-03-27 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102824

--- Comment #13 from Eric Gallager  ---
(In reply to Iain Sandoe from comment #12)
> what input is this waiting for at the moment?

From checking the bug history, it looks like Martin Liška was the one to put
this in the WAITING status, which came along with this comment:

(In reply to Martin Liška from comment #7)
> Well, running 'make latexpdf' works if you jump into gcc/jit/docs folder. Do
> I miss something?

...which I thought we'd answered, but to make it a bit more clear: we shouldn't
have to do that to get the jit docs to build properly. They should build
properly when doing `make dvi` and/or `make pdf` from the top-level, rather
than requiring their own special procedures.

[Bug c++/111426] [11/12/13/14 Regression] "error: use of deleted function" printed twice

2024-03-27 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111426

--- Comment #3 from Marek Polacek  ---
I meant that g++5 emitted

111426.C:7:3: error: use of deleted function ‘D::D()’
 D d;
   ^
111426.C:6:7: note: ‘D::D()’ is implicitly deleted because the default
definition would be ill-formed:
 class D : public X { };
   ^
111426.C:6:7: error: use of deleted function ‘X::~X()’
111426.C:3:3: note: declared here
   ~X() = delete;
   ^

which seems more user-friendly than 4 errors, and saying that X::~X() is
deleted twice.  clang++ emits only one error.

But, maybe it's not that bad after all.  Feel free to close this.

[Bug libgcc/113402] Incorrect symbol versions for __builtin_nested_func_ptr_created, __builtin_nested_func_ptr in libgcc_s.so.1

2024-03-27 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113402

Eric Gallager  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED
 CC||egallager at gcc dot gnu.org

--- Comment #11 from Eric Gallager  ---
(In reply to dave.anglin from comment #10)
> Warning is fixed on hppa.

OK, closing as FIXED, then.

[Bug go/114500] New: go.test/test/fixedbugs/issue23781.go FAILs

2024-03-27 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114500

Bug ID: 114500
   Summary: go.test/test/fixedbugs/issue23781.go FAILs
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: go
  Assignee: ian at airs dot com
  Reporter: ro at gcc dot gnu.org
  Target Milestone: ---
  Host: i386-pc-solaris2.11
Target: amd64-pc-solaris2.11
 Build: i386-pc-solaris2.11

The go.test/test/fixedbugs/issue23781.go test FAILs on Solaris/x86 with a
32-bit-default compiler, but targeting 64-bit x86:

FAIL: go.test/test/fixedbugs/issue23781.go   -O  (test for excess errors)

Excess errors:
/vol/gcc/src/hg/master/local/gcc/testsuite/go.test/test/fixedbugs/issue23781.go:10:17:
error: index value overflow

The test just PASSes when using a 64-bit-default compiler instead.

[Bug rtl-optimization/101523] Huge number of combine attempts

2024-03-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101523

--- Comment #50 from GCC Commits  ---
The master branch has been updated by Segher Boessenkool :

https://gcc.gnu.org/g:839bc42772ba7af66af3bd16efed4a69511312ae

commit r14-9692-g839bc42772ba7af66af3bd16efed4a69511312ae
Author: Segher Boessenkool 
Date:   Wed Mar 27 14:09:52 2024 +

combine: Don't combine if I2 does not change

In some cases combine will "combine" an I2 and I3, but end up putting
exactly the same thing back as I2 as was there before.  This is never
progress, so we shouldn't do it, it will lead to oscillating behaviour
and the like.

If we want to canonicalise things, that's fine, but this is not the
way to do it.

2024-03-27  Segher Boessenkool  

PR rtl-optimization/101523
* combine.cc (try_combine): Don't do a 2-insn combination if
it does not in fact change I2.
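
The idea of the patch can be sketched with a toy model. This is not the code in combine.cc: `Pattern` and `accept_two_insn_result` are hypothetical stand-ins, and a real RTL pattern is far richer than three integers.

```cpp
// Toy stand-in for an RTL pattern: an opcode plus two operands.
struct Pattern
{
  int code, op0, op1;

  constexpr bool operator== (const Pattern &o) const
  { return code == o.code && op0 == o.op0 && op1 == o.op1; }
};

// Accept a two-insn combination result only if it actually changes I2.
// Putting back an identical I2 is never progress and can oscillate.
constexpr bool accept_two_insn_result (const Pattern &old_i2,
                                       const Pattern &new_i2)
{
  return !(old_i2 == new_i2);
}
```

In this model, a "combination" whose new I2 compares equal to the old one is rejected, while any real change to I2 is still allowed through.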

[Bug c++/111075] [13/14 Regression] ICE on g++.dg/torture/tail-padding1.C on darwin

2024-03-27 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111075

Jakub Jelinek  changed:

   What|Removed |Added

Summary|[14 Regression] ICE on  |[13/14 Regression] ICE on
   |g++.dg/torture/tail-padding |g++.dg/torture/tail-padding
   |1.C on darwin   |1.C on darwin
   Target Milestone|14.0|13.3

--- Comment #4 from Jakub Jelinek  ---
Actually, it doesn't seem to be a regression from 13.x, if one builds 13 branch
with --enable-checking=yes rather than --enable-checking=release, it ICEs too.
12 branch doesn't ICE though.
So P2 is right.

[Bug tree-optimization/109925] [11/12/13/14 Regression] Wrong code at -O2 on x86_64-linux-gnu since GCC-12

2024-03-27 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109925

--- Comment #4 from Jakub Jelinek  ---
Doesn't reproduce on the trunk since
r14-4089-gd45ddc2c04e471d0dcee016b6edacc00b8341b16
Doesn't reproduce on 13 branch either, the PR113372 fixed it there.
So, I think we should just add the testcase to the testsuite and remove 13/14
markers.
Will handle that.
PR113372 hasn't been backported to 12/11 yet.

[Bug rtl-optimization/101523] Huge number of combine attempts

2024-03-27 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101523

--- Comment #51 from Segher Boessenkool  ---
(In reply to Richard Biener from comment #46)
> Maybe combine already knows that it just "keeps i2" rather than replacing it?

It never does that.  Instead, it thinks it is making a new I2, but that ends
up being exactly the same instruction.  This is not a good thing to do: combine
can, for example, change the whole thing back to the previous shape when it
feels like it (combine does not make canonical forms ever!)

> When !newi2pat we seem to delete i2.  Anyway, somebody more familiar with
> combine should produce a good(TM) patch.

Yes, the most common combinations delete I2, they combine 2->1 or 3->1 or 4->1.
When this isn't possible combine tries to combine to two instructions.  It has
various strategies for this: the backend can do it explicitly (via a
define_split), or it can break apart the expression that was the src in the
one set that was the ->1 result, hoping that the two instructions it gets that
way are valid insns.  It tries only one way to do this, and it isn't very
smart about it, just very heuristic.

[Bug rtl-optimization/101523] Huge number of combine attempts

2024-03-27 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101523

Segher Boessenkool  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #52 from Segher Boessenkool  ---
Fixed.  (On trunk only, no backports planned, this goes back aaages).

[Bug rtl-optimization/114452] Functions invoked through compile-time table of function pointers not inlined

2024-03-27 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114452

Martin Jambor  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|DUPLICATE   |---
   Last reconfirmed||2024-03-27
 Ever confirmed|0   |1

--- Comment #4 from Martin Jambor  ---
This does not look like a duplicate of PR 111573.

Nevertheless, it is not quite obvious what to do here.  Inlining
happens before unrolling and I am not sure we'd consider unrolling in
early optimizations.  And without unrolling, the load from the array
is not easy to fold.

In this testcase all (well, both) functions referenced from the array
are semantically equivalent which is recognized by ICF but making it
be able to pass this information to the inliner would be
non-trivial... and is this the common case worth optimizing for?
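
The shape of code being discussed looks roughly like the sketch below. It is a hypothetical stand-in, not the testcase attached to the bug; `constexpr` is used only so the result can be checked at compile time, whereas the missed optimization concerns the ordinary run-time call.

```cpp
// Two functions that are semantically equivalent, as in the report.
constexpr int add_one (int x) { return x + 1; }
constexpr int plus_one (int x) { return 1 + x; }

using fn_t = int (*) (int);

// Compile-time table of function pointers.
constexpr fn_t table[] = { add_one, plus_one };

// The callee is only known once the load table[i] is folded to a
// constant, which in practice needs the loop to be unrolled first.
constexpr int apply_all (int x)
{
  for (unsigned i = 0; i < sizeof table / sizeof table[0]; ++i)
    x = table[i] (x);
  return x;
}
```

Because inlining runs before unrolling, the indirect calls through `table[i]` are still opaque at the point where the inliner would need to see them.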

[Bug rtl-optimization/114452] Functions invoked through compile-time table of function pointers not inlined

2024-03-27 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114452

Xi Ruoyao  changed:

   What|Removed |Added

 Status|REOPENED|NEW
   Keywords||missed-optimization

[Bug c++/111075] [13/14 Regression] ICE on g++.dg/torture/tail-padding1.C on darwin

2024-03-27 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111075

--- Comment #5 from Jakub Jelinek  ---
Started with r13-6145-gb2287a4d9a640fdc2caef6a067830ea65044deb7
I must say I have no idea what is different from this POV on Darwin vs. Linux.

[Bug sanitizer/97696] ICE since ASAN_MARK does not handle poly_int sized varibales

2024-03-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97696

--- Comment #7 from GCC Commits  ---
The releases/gcc-12 branch has been updated by Richard Sandiford:

https://gcc.gnu.org/g:51e1629bc11f0ae4b8050712b26521036ed360aa

commit r12-10296-g51e1629bc11f0ae4b8050712b26521036ed360aa
Author: Richard Sandiford 
Date:   Wed Mar 27 17:38:09 2024 +

asan: Handle poly-int sizes in ASAN_MARK [PR97696]

This patch makes the expansion of IFN_ASAN_MARK let through
poly-int-sized objects.  The expansion itself was already generic
enough, but the tests for the fast path were too strict.

gcc/
PR sanitizer/97696
* asan.cc (asan_expand_mark_ifn): Allow the length to be a
poly_int.

gcc/testsuite/
PR sanitizer/97696
* gcc.target/aarch64/sve/pr97696.c: New test.

(cherry picked from commit fca6f6fddb22b8665e840f455a7d0318d4575227)

[Bug bootstrap/31418] Bootstrap failure with -O2 -funroll-loops -funsafe-math-optimizations options on PPC

2024-03-27 Thread meissner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31418

Michael Meissner  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED
 CC||meissner at gcc dot gnu.org

--- Comment #2 from Michael Meissner  ---
I built the current GCC 14 development compiler using -O2 -funroll-loops
-funsafe-math-optimizations, and it built fine.  I suspect it had been fixed
ages ago.

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2024-03-27 Thread dimula73 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

--- Comment #42 from Dmitry Kazakov  ---
Hi, Avraham!

> Does it remain true that the only option to get around this bug without 
> killing all AVX2 is to pass "-Wa,-muse-unaligned-vector-move" when compiling 
> using GCC on Windows 64? Thank you

I'm not sure about your particular issue, but in our case we managed to work
around this issue by passing AVX2-related structures by reference (or
const-reference, when possible).

[Bug libstdc++/100667] [11/12/13/14 Regression] std::tuple cannot be constructed from A&&, if A not defined (only forward declared)

2024-03-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100667

--- Comment #11 from Jonathan Wakely  ---
(In reply to Jonathan Wakely from comment #8)
> But I think it would be best to fix it in the compiler, so that we always
> allow directly binding T&& or const T& to T, even if T is incomplete.
> Otherwise we'll be playing whackamole all over the library.

Actually the workarounds would only be needed in <type_traits>:

--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -1187,6 +1187,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
"template argument must be a complete class or an unbounded array");
 };

+  template
+struct is_nothrow_constructible<_Tp&, _Up>
+: __is_nothrow_constructible_impl<_Tp&, __add_rval_ref_t<_Up>>
+{ };
+
+  template
+struct is_nothrow_constructible<_Tp&&, _Up>
+: __is_nothrow_constructible_impl<_Tp&&, __add_rval_ref_t<_Up>>
+{ };
+
   /// is_nothrow_default_constructible
   template
 struct is_nothrow_default_constructible
@@ -1496,7 +1506,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #if _GLIBCXX_USE_BUILTIN_TRAIT(__is_convertible)
   template
 struct is_convertible
-: public __bool_constant<__is_convertible(_From, _To)>
+: public __bool_constant<__is_convertible(__add_rval_ref_t<_From>, _To)>
 { };
 #else
   template
 inline constexpr bool is_nothrow_convertible_v
-  = __is_nothrow_convertible(_From, _To);
+  = __is_nothrow_convertible(__add_rval_ref_t<_From>, _To);

   /// is_nothrow_convertible
   template



I think this should be OK but I haven't tested it yet.

[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519

2024-03-27 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

--- Comment #14 from Jan Hubicka  ---
> This patch fixes the ICE for me.
> Seems we already did something like that in other spots (e.g. in apply_scale).

In general, if the overflow happens, some pass must have misbehaved and
done something crazy when updating the profile.  But indeed we probably ought
to cap here instead of randomly getting to uninitialized.  It may make
sense to make these checking-only ICEs.  I will look into why
the overflow happens.

Honza
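
As a rough illustration of what "capping" means here, the sketch below saturates a scaled counter instead of letting it overflow. It is not GCC's profile_count code: `max_count` is an assumed limit chosen for the example, `scale_capped` is a hypothetical helper, and `den` is assumed to be non-zero.

```cpp
#include <cstdint>

// Assumed upper bound for this sketch; the real limit in GCC may differ.
constexpr std::uint64_t max_count = (std::uint64_t{1} << 61) - 1;

// Compute count * num / den, saturating at max_count instead of
// overflowing.  Precondition: den != 0.
constexpr std::uint64_t scale_capped (std::uint64_t count,
                                      std::uint64_t num,
                                      std::uint64_t den)
{
  // The multiplication itself would overflow 64 bits: cap right away.
  if (num != 0 && count > UINT64_MAX / num)
    return max_count;

  std::uint64_t scaled = count * num / den;
  return scaled > max_count ? max_count : scaled;
}
```

A checking-only assertion could sit next to the cap, so that release builds saturate quietly while checking builds still flag the pass that produced the bogus profile.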

[Bug libstdc++/113663] [MinGW] std::filesystem::hard_link_count always returns 1

2024-03-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113663

Jonathan Wakely  changed:

   What|Removed |Added

   Keywords||patch
 URL||https://gcc.gnu.org/pipermail/gcc-patches/2024-January/644473.html

--- Comment #5 from Jonathan Wakely  ---
I've only just noticed you submitted a patch for this:
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/644473.html

Libstdc++ patches need to be CC'd to the libstdc++ list, or they won't be seen
by the right people. I've found it now though, so I'll review it ASAP, thanks.

[Bug rtl-optimization/93565] [11/12/13/14 regression] Combine duplicates instructions

2024-03-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93565

--- Comment #29 from Andrew Pinski  ---
Looking back at this one:
(In reply to Wilco from comment #8)
> Here is a much simpler example:
> 
> void f (int *p, int y)
> {
>   int a = y & 14;
>   *p = a | p[a];
> }
After r14-9692-g839bc42772ba7af66af3bd16efed4a69511312ae, we now get:
f:
.LFB0:
.cfi_startproc
and w2, w1, 14
mov x1, x2
ldr w2, [x0, x2, lsl 2]
orr w1, w2, w1
str w1, [x0]
ret
.cfi_endproc

There is still an extra move, but the duplicated `and` is gone. (With
-frename-registers added, the move is gone as REE is able to remove the zero
extend, but then there is a live range conflict so it can't remove the move too.)

So maybe this should be closed as fixed for GCC 14 and the cost changes for clz
reverted.

```
Trying 7 -> 9:
7: r105:SI=r115:SI&0xe
  REG_DEAD r115:SI
9: r110:DI=zero_extend(r105:SI)
Failed to match this instruction:
(parallel [
(set (reg:DI 110 [ _1 ])
(and:DI (subreg:DI (reg:SI 115) 0)
(const_int 14 [0xe])))
(set (reg/v:SI 105 [ a ])
(and:SI (reg:SI 115)
(const_int 14 [0xe])))
])
Failed to match this instruction:
(parallel [
(set (reg:DI 110 [ _1 ])
(and:DI (subreg:DI (reg:SI 115) 0)
(const_int 14 [0xe])))
(set (reg/v:SI 105 [ a ])
(and:SI (reg:SI 115)
(const_int 14 [0xe])))
])
Successfully matched this instruction:
(set (reg/v:SI 105 [ a ])
(and:SI (reg:SI 115)
(const_int 14 [0xe])))
Successfully matched this instruction:
(set (reg:DI 110 [ _1 ])
(and:DI (subreg:DI (reg:SI 115) 0)
(const_int 14 [0xe])))
allowing combination of insns 7 and 9
original costs 4 + 4 = 8
replacement costs 4 + 4 = 8
i2 didn't change, not doing this
```
