date:20241120

Re: Use branch prediction from .gcda files

2024-11-20 Thread Richard Biener via Gcc

On Tue, Nov 19, 2024 at 11:56 AM Kamil Belter via Gcc wrote:
>
> Hello,
>
> I would like to set branch prediction based on .gcda files (I know I
> could have it automatically with -fprofile-use, but with my specific
> use case I can't do it).
>
> I've tried to use gcov-dump but I can't find any spec how to interpret
> this output.

The info doesn't represent a profile but just counter data - you have to
reconstruct a profile from that.
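
As a rough illustration of what "reconstructing a profile" from counters means, here is a minimal sketch. It assumes the execution counts of both outgoing edges of a conditional branch have already been recovered; it does not parse .gcda files, and the helper name taken_percent is hypothetical, not part of gcov or GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of gcov or GCC: turn the raw execution
   counters of the two outgoing edges of a conditional branch into the
   probability, in percent, that the "taken" edge is followed.
   Returns -1.0 when the branch was never executed, because no
   probability can be derived from zero counts.  */
double
taken_percent (uint64_t taken_count, uint64_t not_taken_count)
{
  /* Guard against overflow of the sum.  */
  assert (taken_count <= UINT64_MAX - not_taken_count);

  uint64_t total = taken_count + not_taken_count;
  if (total == 0)
    return -1.0;
  return 100.0 * (double) taken_count / (double) total;
}
```

With counts of 27 and 73 this yields 27.0, the same shape of figure as the percentages in the dump quoted further down in this message.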

> I've also tried to use -fprofile-use with -fdump-rtl-all,
> -fdump-tree-all, -fdump-ipa-all, -fopt-info and
> -ftree-vectorizer-verbose=n flags and see GCC debug output

You probably want to restrict that to -fdump-ipa-profile which is the
point where the .gcda files are read and applied in a -fprofile-use
build.

> I got ~200
> new debug files per source file. I see that some of the files contain
> some format of source code. Some conditions are marked with percentage
> values, for example:
>
>   if (_11 != 0)
>     goto ; [27.00%]
>   else
>     goto ; [73.00%]
>
>   [local count: 51244186]:
>
> I'm guessing that this is prediction, but I found in doc that GCC uses
> its own heuristics for branch prediction. How to be sure if I'm
> looking at predictions from gcda files?

Internally we know which counts were based on feedback and which not
but I don't think we dump that info.

Richard.

> And side question what does
> "local count" mean? If this is relevant information I'm working with
> ARM.
>
> I would very much appreciate your help with my questions.
>
> Best regards,
> Kamil Belter

Re: libdiagnostics name clash

2024-11-20 Thread Joel Sherrill via Gcc

On Wed, Nov 20, 2024, 3:50 PM Mark Wielaard wrote:

> On Wed, Nov 20, 2024 at 04:11:16PM -0500, David Malcolm via Gcc wrote:
> > I merged libdiagnostics to GCC trunk on Monday:
> >   https://gcc.gnu.org/wiki/libdiagnostics
> >
> > Unfortunately I've since discovered there's at least one libdiagnostics
> > .so already in Debian:
> > https://tracker.debian.org/pkg/diagnostics
> > https://packages.debian.org/search?searchon=contents&keywords=libdiag&mode=filename&suite=unstable&arch=any
> >
> > so I've been asked to change the name.
> >
> > I'd prefer to avoid having "gcc" in the name.
> >
> > Some name ideas:
> >
> > * libdiag
> > * libgdiagnostics (where we can be ambiguous about what the "g" stands
> > for)
> > * libgdiag (less typing)
> > * libcomplain
> > * libcomplaint
> > * libwhining
> > * libwhine (but sounds like the Windows compat software)
> >
> > Any ideas?
>
> Cool, a naming bikeshed! My suggestion[s] would be some variant of:
>
> [lib][g](code|lang)diag[nostics]
>
> Have fun picking a color! :)
>

Shouldn't "not the Debian version" be encoded? 

Sorry. Couldn't resist.

>
> Cheers,
>
> Mark
>

Re: libdiagnostics name clash

2024-11-20 Thread Mark Wielaard

On Wed, Nov 20, 2024 at 04:11:16PM -0500, David Malcolm via Gcc wrote:
> I merged libdiagnostics to GCC trunk on Monday:
>   https://gcc.gnu.org/wiki/libdiagnostics
> 
> Unfortunately I've since discovered there's at least one libdiagnostics
> .so already in Debian:
> https://tracker.debian.org/pkg/diagnostics
> https://packages.debian.org/search?searchon=contents&keywords=libdiag&mode=filename&suite=unstable&arch=any
> 
> so I've been asked to change the name.
> 
> I'd prefer to avoid having "gcc" in the name.
> 
> Some name ideas:
> 
> * libdiag
> * libgdiagnostics (where we can be ambiguous about what the "g" stands
> for)
> * libgdiag (less typing)
> * libcomplain
> * libcomplaint
> * libwhining
> * libwhine (but sounds like the Windows compat software)
> 
> Any ideas?

Cool, a naming bikeshed! My suggestion[s] would be some variant of:

[lib][g](code|lang)diag[nostics]

Have fun picking a color! :)

Cheers,

Mark

libdiagnostics name clash

2024-11-20 Thread David Malcolm via Gcc

I merged libdiagnostics to GCC trunk on Monday:
  https://gcc.gnu.org/wiki/libdiagnostics

Unfortunately I've since discovered there's at least one libdiagnostics
.so already in Debian:
https://tracker.debian.org/pkg/diagnostics
https://packages.debian.org/search?searchon=contents&keywords=libdiag&mode=filename&suite=unstable&arch=any

so I've been asked to change the name.

I'd prefer to avoid having "gcc" in the name.

Some name ideas:

* libdiag
* libgdiagnostics (where we can be ambiguous about what the "g" stands
for)
* libgdiag (less typing)
* libcomplain
* libcomplaint
* libwhining
* libwhine (but sounds like the Windows compat software)

Any ideas?

Thanks
Dave

Re: Understanding peephole2

2024-11-20 Thread Richard Biener via Gcc

On Wed, Nov 20, 2024 at 11:29 AM Georg-Johann Lay via Gcc wrote:
>
> Consider the following RTL peephole from avr.md:
>
> (define_peephole2   ; avr.md:5387
>   [(match_scratch:QI 3 "d")
>    (parallel [(set (match_operand:ALL4 0 "register_operand" "")
>                    (ashift:ALL4 (match_operand:ALL4 1 "register_operand" "")
>                                 (match_operand:QI 2 "const_int_operand" "")))
>               (clobber (reg:CC REG_CC))])]
>   ""
>   [(parallel [(set (match_dup 0)
>                    (ashift:ALL4 (match_dup 1)
>                                 (match_dup 2)))
>               (clobber (match_dup 3))
>               (clobber (reg:CC REG_CC))])])
>
> As far as I understand, its purpose is to provide a QImode
> scratch register provided such a scratch is available.
>
> However, in the .peephole2 RTL dump with -da I see the following:
>
> Splitting with gen_peephole2_100 (avr.md:5387)
> ...
> (insn 24 8 15 2 (parallel [
>  (set (reg:SI 22 r22 [orig:47 _3 ] [47])
>   (ashift:SI (reg:SI 20 r20 [orig:48 x ] [48])
>  (const_int 7 [0x7])))
>  (clobber (reg:QI 24 r24))
>  (clobber (reg:CC 36 cc))
>  ])
>   (nil))
>
> That is, the scratch r24:QI is overlapping the output in
> r22:SI.  All hard registers are 8-bit regs and hence r22:SI
> extends from r22...r25.
>
> A scratch that overlaps the operands is pretty much useless
> or even plain wrong.  recog.cc::peep2_find_free_register()
> has this comment:  /* Don't use registers set or clobbered by the insn.  */
>
>   from = peep2_buf_position (peep2_current + from);
>   to = peep2_buf_position (peep2_current + to);
>
>   gcc_assert (peep2_insn_data[from].insn != NULL_RTX);
>   REG_SET_TO_HARD_REG_SET (live, peep2_insn_data[from].live_before);
>
>   while (from != to)
>     {
>       gcc_assert (peep2_insn_data[from].insn != NULL_RTX);
>
>       /* Don't use registers set or clobbered by the insn.  */
>       FOR_EACH_INSN_DEF (def, peep2_insn_data[from].insn)
>         SET_HARD_REG_BIT (live, DF_REF_REGNO (def));
>
>       from = peep2_buf_position (from + 1);
>     }
>
> So is this bogus in that it assumes all registers extend only
> over one hard reg?

Yes, looks like a bug to me.
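
To make the suspected problem concrete, here is a toy model in plain C. It is only a sketch under stated assumptions: the struct toy_def and the three functions are illustrative names that do not exist in GCC, the register file is modelled as a 32-bit mask, and this is not the actual fix for recog.cc. It contrasts marking only the first hard register of a definition with marking every hard register the value occupies.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a register definition: the first hard register number
   and how many consecutive hard registers the value occupies.
   Illustrative only; not a GCC data structure.  */
struct toy_def
{
  unsigned regno;
  unsigned nregs;
};

/* Mark only the first hard register of DEF as live.  This mirrors the
   behaviour questioned in the thread.  */
uint32_t
mark_first_reg_only (uint32_t live, struct toy_def def)
{
  assert (def.regno < 32);
  return live | (UINT32_C (1) << def.regno);
}

/* Mark every hard register covered by DEF as live.  */
uint32_t
mark_all_regs (uint32_t live, struct toy_def def)
{
  assert (def.nregs >= 1 && def.regno + def.nregs <= 32);
  for (unsigned r = def.regno; r < def.regno + def.nregs; r++)
    live |= UINT32_C (1) << r;
  return live;
}

/* Return nonzero if hard register REGNO is not marked in LIVE.  */
int
reg_is_free (uint32_t live, unsigned regno)
{
  assert (regno < 32);
  return (live & (UINT32_C (1) << regno)) == 0;
}
```

For a definition starting at register 22 and spanning 4 registers (like r22:SI on AVR), the first variant still reports register 24 as free, while the second marks 22 through 25 and so rules it out as a scratch.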

>
> FYI, the purpose is to provide a scratch without increasing the register
> pressure (which "match_scratch" would do).  Therefore, the RTL peephole
> is used instead of forcing reload to come up with a scratch.
>
> More specifically, I see this with
>
> $ avr-gcc bogus-peep2.c -S -Os -da
>
> long ashl32_7 (int i, long x)
> {
>  return x << 7;
> }
>
> with the attached WIP patch atop trunk b222ee10045d.
>
> Johann
>
> Target: avr
> Configured with: ../../source/gcc-master/configure --target=avr
> --disable-nls --with-dwarf2 --with-gnu-as --with-gnu-ld --disable-shared
>   --enable-languages=c,c++
> Thread model: single
> Supported LTO compression algorithms: zlib
> gcc version 15.0.0 20241119 (experimental) (GCC)

Request to Contribute to GNU Compiler Collection Project for GSoC Preparation

2024-11-20 Thread Vaibhav Arora via Gcc

Dear GNU Compiler Collection  Team,

I hope this email finds you well. My name is Vaibhav Arora, and I am an
enthusiastic software developer with a strong interest in contributing to
open-source projects. I am writing to express my keen interest in GNU
Compiler Collection, and to seek your guidance on how I can start
contributing to your codebase.

As a newcomer to the Google Summer of Code (GSoC) program, I am eager to
gain experience and develop my skills by working on impactful projects. I
believe that contributing to GNU Compiler Collection would be an invaluable
opportunity for me to learn from experienced developers and to make
meaningful contributions to a project that plays a crucial role in software
internationalization.

I have a solid foundation in C++, and I am committed to dedicating my time
and effort to contribute effectively.

Could you please provide me with some guidance on how to get started? Any
resources, documentation, or initial tasks that you could point me towards
would be greatly appreciated. Additionally, if there are any ongoing
projects or issues that would be suitable for a newcomer, I would be eager
to take them on.

Thank you for considering my request. I am looking forward to the
possibility of contributing to the GNU Compiler Collection and preparing
myself for a potential GSoC application in 2025.

Best regards,

Vaibhav arora
 7vaibhavarora2...@gmail.com

Re: Understanding peephole2

2024-11-20 Thread Georg-Johann Lay via Gcc


On 20.11.24 at 11:33, Richard Biener wrote:
> On Wed, Nov 20, 2024 at 11:29 AM Georg-Johann Lay via Gcc wrote:
> >
> > Consider the following RTL peephole from avr.md:
> >
> > (define_peephole2   ; avr.md:5387
> >   [(match_scratch:QI 3 "d")
> >    (parallel [(set (match_operand:ALL4 0 "register_operand" "")
> >                    (ashift:ALL4 (match_operand:ALL4 1 "register_operand" "")
> >                                 (match_operand:QI 2 "const_int_operand" "")))
> >               (clobber (reg:CC REG_CC))])]
> >   ""
> >   [(parallel [(set (match_dup 0)
> >                    (ashift:ALL4 (match_dup 1)
> >                                 (match_dup 2)))
> >               (clobber (match_dup 3))
> >               (clobber (reg:CC REG_CC))])])
> >
> > As far as I understand, its purpose is to provide a QImode
> > scratch register provided such a scratch is available.
> >
> > However, in the .peephole2 RTL dump with -da I see the following:
> >
> > Splitting with gen_peephole2_100 (avr.md:5387)
> > ...
> > (insn 24 8 15 2 (parallel [
> >             (set (reg:SI 22 r22 [orig:47 _3 ] [47])
> >                 (ashift:SI (reg:SI 20 r20 [orig:48 x ] [48])
> >                     (const_int 7 [0x7])))
> >             (clobber (reg:QI 24 r24))
> >             (clobber (reg:CC 36 cc))
> >         ])
> >      (nil))
> >
> > That is, the scratch r24:QI is overlapping the output in
> > r22:SI.  All hard registers are 8-bit regs and hence r22:SI
> > extends from r22...r25.
> >
> > A scratch that overlaps the operands is pretty much useless
> > or even plain wrong.  recog.cc::peep2_find_free_register()
> > has this comment:  /* Don't use registers set or clobbered by the insn.  */
> >
> >   from = peep2_buf_position (peep2_current + from);
> >   to = peep2_buf_position (peep2_current + to);
> >
> >   gcc_assert (peep2_insn_data[from].insn != NULL_RTX);
> >   REG_SET_TO_HARD_REG_SET (live, peep2_insn_data[from].live_before);
> >
> >   while (from != to)
> >     {
> >       gcc_assert (peep2_insn_data[from].insn != NULL_RTX);
> >
> >       /* Don't use registers set or clobbered by the insn.  */
> >       FOR_EACH_INSN_DEF (def, peep2_insn_data[from].insn)
> >         SET_HARD_REG_BIT (live, DF_REF_REGNO (def));
> >
> >       from = peep2_buf_position (from + 1);
> >     }
> >
> > So is this bogus in that it assumes all registers extend only
> > over one hard reg?
>
> Yes, looks like a bug to me.

Is this eligible for a PR? ...due to the early WIP patch
that's required.

Johann

> > FYI, the purpose is to provide a scratch without increasing the register
> > pressure (which "match_scratch" would do).  Therefore, the RTL peephole
> > is used instead of forcing reload to come up with a scratch.
> >
> > More specifically, I see this with
> >
> > $ avr-gcc bogus-peep2.c -S -Os -da
> >
> > long ashl32_7 (int i, long x)
> > {
> >   return x << 7;
> > }
> >
> > with the attached WIP patch atop trunk b222ee10045d.
> >
> > Johann
> >
> > Target: avr
> > Configured with: ../../source/gcc-master/configure --target=avr
> > --disable-nls --with-dwarf2 --with-gnu-as --with-gnu-ld --disable-shared
> > --enable-languages=c,c++
> > Thread model: single
> > Supported LTO compression algorithms: zlib
> > gcc version 15.0.0 20241119 (experimental) (GCC)

Re: Understanding peephole2 PR117699

2024-11-20 Thread Georg-Johann Lay via Gcc


On 20.11.24 at 11:33, Richard Biener wrote:
> On Wed, Nov 20, 2024 at 11:29 AM Georg-Johann Lay via Gcc wrote:
> >
> > Consider the following RTL peephole from avr.md:
> >
> > (define_peephole2   ; avr.md:5387
> >   [(match_scratch:QI 3 "d")
> >    (parallel [(set (match_operand:ALL4 0 "register_operand" "")
> >                    (ashift:ALL4 (match_operand:ALL4 1 "register_operand" "")
> >                                 (match_operand:QI 2 "const_int_operand" "")))
> >               (clobber (reg:CC REG_CC))])]
> >   ""
> >   [(parallel [(set (match_dup 0)
> >                    (ashift:ALL4 (match_dup 1)
> >                                 (match_dup 2)))
> >               (clobber (match_dup 3))
> >               (clobber (reg:CC REG_CC))])])
> >
> > As far as I understand, its purpose is to provide a QImode
> > scratch register provided such a scratch is available.
> >
> > However, in the .peephole2 RTL dump with -da I see the following:
> >
> > Splitting with gen_peephole2_100 (avr.md:5387)
> > ...
> > (insn 24 8 15 2 (parallel [
> >             (set (reg:SI 22 r22 [orig:47 _3 ] [47])
> >                 (ashift:SI (reg:SI 20 r20 [orig:48 x ] [48])
> >                     (const_int 7 [0x7])))
> >             (clobber (reg:QI 24 r24))
> >             (clobber (reg:CC 36 cc))
> >         ])
> >      (nil))
> >
> > That is, the scratch r24:QI is overlapping the output in
> > r22:SI.  All hard registers are 8-bit regs and hence r22:SI
> > extends from r22...r25.
> >
> > A scratch that overlaps the operands is pretty much useless
> > or even plain wrong.  recog.cc::peep2_find_free_register()
> > has this comment:  /* Don't use registers set or clobbered by the insn.  */
> >
> >   from = peep2_buf_position (peep2_current + from);
> >   to = peep2_buf_position (peep2_current + to);
> >
> >   gcc_assert (peep2_insn_data[from].insn != NULL_RTX);
> >   REG_SET_TO_HARD_REG_SET (live, peep2_insn_data[from].live_before);
> >
> >   while (from != to)
> >     {
> >       gcc_assert (peep2_insn_data[from].insn != NULL_RTX);
> >
> >       /* Don't use registers set or clobbered by the insn.  */
> >       FOR_EACH_INSN_DEF (def, peep2_insn_data[from].insn)
> >         SET_HARD_REG_BIT (live, DF_REF_REGNO (def));
> >
> >       from = peep2_buf_position (from + 1);
> >     }
> >
> > So is this bogus in that it assumes all registers extend only
> > over one hard reg?
>
> Yes, looks like a bug to me.

Reported as https://gcc.gnu.org/PR117699

Johann

> > FYI, the purpose is to provide a scratch without increasing the register
> > pressure (which "match_scratch" would do).  Therefore, the RTL peephole
> > is used instead of forcing reload to come up with a scratch.
> >
> > More specifically, I see this with
> >
> > $ avr-gcc bogus-peep2.c -S -Os -da
> >
> > long ashl32_7 (int i, long x)
> > {
> >   return x << 7;
> > }
> >
> > with the attached WIP patch atop trunk b222ee10045d.
> >
> > Johann
> >
> > Target: avr
> > Configured with: ../../source/gcc-master/configure --target=avr
> > --disable-nls --with-dwarf2 --with-gnu-as --with-gnu-ld --disable-shared
> > --enable-languages=c,c++
> > Thread model: single
> > Supported LTO compression algorithms: zlib
> > gcc version 15.0.0 20241119 (experimental) (GCC)

Understanding peephole2

2024-11-20 Thread Georg-Johann Lay via Gcc


Consider the following RTL peephole from avr.md:

(define_peephole2   ; avr.md:5387
  [(match_scratch:QI 3 "d")
   (parallel [(set (match_operand:ALL4 0 "register_operand" "")
                   (ashift:ALL4 (match_operand:ALL4 1 "register_operand" "")
                                (match_operand:QI 2 "const_int_operand" "")))
              (clobber (reg:CC REG_CC))])]
  ""
  [(parallel [(set (match_dup 0)
                   (ashift:ALL4 (match_dup 1)
                                (match_dup 2)))
              (clobber (match_dup 3))
              (clobber (reg:CC REG_CC))])])

As far as I understand, its purpose is to provide a QImode
scratch register provided such a scratch is available.

However, in the .peephole2 RTL dump with -da I see the following:

Splitting with gen_peephole2_100 (avr.md:5387)
...
(insn 24 8 15 2 (parallel [
            (set (reg:SI 22 r22 [orig:47 _3 ] [47])
                (ashift:SI (reg:SI 20 r20 [orig:48 x ] [48])
                    (const_int 7 [0x7])))
            (clobber (reg:QI 24 r24))
            (clobber (reg:CC 36 cc))
        ])
     (nil))

That is, the scratch r24:QI is overlapping the output in
r22:SI.  All hard registers are 8-bit regs and hence r22:SI
extends from r22...r25.

A scratch that overlaps the operands is pretty much useless
or even plain wrong.  recog.cc::peep2_find_free_register()
has this comment:  /* Don't use registers set or clobbered by the insn.  */

  from = peep2_buf_position (peep2_current + from);
  to = peep2_buf_position (peep2_current + to);

  gcc_assert (peep2_insn_data[from].insn != NULL_RTX);
  REG_SET_TO_HARD_REG_SET (live, peep2_insn_data[from].live_before);

  while (from != to)
    {
      gcc_assert (peep2_insn_data[from].insn != NULL_RTX);

      /* Don't use registers set or clobbered by the insn.  */
      FOR_EACH_INSN_DEF (def, peep2_insn_data[from].insn)
        SET_HARD_REG_BIT (live, DF_REF_REGNO (def));

      from = peep2_buf_position (from + 1);
    }

So is this bogus in that it assumes all registers extend only
over one hard reg?

FYI, the purpose is to provide a scratch without increasing the register
pressure (which "match_scratch" would do).  Therefore, the RTL peephole
is used instead of forcing reload to come up with a scratch.

More specifically, I see this with

$ avr-gcc bogus-peep2.c -S -Os -da

long ashl32_7 (int i, long x)
{
  return x << 7;
}

with the attached WIP patch atop trunk b222ee10045d.

Johann

Target: avr
Configured with: ../../source/gcc-master/configure --target=avr 
--disable-nls --with-dwarf2 --with-gnu-as --with-gnu-ld --disable-shared 
 --enable-languages=c,c++

Thread model: single
Supported LTO compression algorithms: zlib
gcc version 15.0.0 20241119 (experimental) (GCC)
diff --git a/gcc/config/avr/avr-protos.h b/gcc/config/avr/avr-protos.h
index d316e0182a2..f2474f46d6b 100644
--- a/gcc/config/avr/avr-protos.h
+++ b/gcc/config/avr/avr-protos.h
@@ -169,6 +169,8 @@ extern rtx cc_reg_rtx;
 extern rtx ccn_reg_rtx;
 extern rtx cczn_reg_rtx;
 
+extern bool avr_shift_is_3op;
+
 #endif /* RTX_CODE */
 
 #ifdef REAL_VALUE_TYPE
diff --git a/gcc/config/avr/avr.cc b/gcc/config/avr/avr.cc
index 508e2d147bf..8b8801e44ec 100644
--- a/gcc/config/avr/avr.cc
+++ b/gcc/config/avr/avr.cc
@@ -229,6 +229,14 @@ bool avr_need_clear_bss_p = false;
 bool avr_need_copy_data_p = false;
 bool avr_has_rodata_p = false;
 
+/* Whether some shift insn alternatives are a 3-operand insn
+   or a 2-operand insn.  This is used when shift insns are
+   split by ???.  The splitted alternatives allow the source
+   and the destination register of the shift to be different
+   right from the start, because the split will split off
+   a byte-shift which allows 3 operands.  */
+bool avr_shift_is_3op = false;
+
 
 /* Transform UP into lowercase and write the result to LO.
You must provide enough space for LO.  Return LO.  */
@@ -437,6 +445,8 @@ avr_set_core_architecture (void)
 static void
 avr_option_override (void)
 {
+  avr_shift_is_3op = true;
+
   /* caller-save.cc looks for call-clobbered hard registers that are assigned
  to pseudos that cross calls and tries so save-restore them around calls
  in order to reduce the number of stack slots needed.
@@ -6590,7 +6600,7 @@ avr_out_cmp_ext (rtx xop[], rtx_code code, int *plen)
 
 
 /* Generate asm equivalent for various shifts.  This only handles cases
-   that are not already carefully hand-optimized in ?sh??i3_out.
+   that are not already carefully hand-optimized in ?sh3_out.
 
OPERANDS[0] resp. %0 in TEMPL is the operand to be shifted.
OPERANDS[2] is the shift count as CONST_INT, MEM or REG.
diff --git a/gcc/config/avr/avr.md b/gcc/config/avr/avr.md
index 04d838ef8a7..73306f6b85a 100644
--- a/gcc/config/avr/avr.md
+++ b/gcc/config/avr/avr.md
@@ -184,73 +184,75 @@ (define_attr "adjust_len"
 ;; no_xmega: non-XMEGA core  xmega : XMEGA core
 ;; no_adiw:  ISA has no ADIW, SBIW   adiw  : ISA has ADIW, SBIW

RE: [RFC] Enabling SVE with offloading to nvptx

2024-11-20 Thread Prathamesh Kulkarni via Gcc



> -Original Message-
> From: Gcc  On Behalf
> Of Prathamesh Kulkarni via Gcc
> Sent: 14 November 2024 13:59
> To: Andrew Stubbs ; Jakub Jelinek 
> Cc: Richard Biener ; Richard Biener
> ; gcc@gcc.gnu.org; Thomas Schwinge
> 
> Subject: RE: [RFC] Enabling SVE with offloading to nvptx
> 
> 
> > -Original Message-
> > From: Andrew Stubbs 
> > Sent: 12 November 2024 20:23
> > To: Prathamesh Kulkarni ; Jakub Jelinek
> > 
> > Cc: Richard Biener ; Richard Biener
> > ; gcc@gcc.gnu.org; Thomas Schwinge
> > 
> > Subject: Re: [RFC] Enabling SVE with offloading to nvptx
> >
> >
> > On 12/11/2024 06:01, Prathamesh Kulkarni via Gcc wrote:
> > >
> > >
> > >> -Original Message-
> > >> From: Jakub Jelinek 
> > >> Sent: 04 November 2024 21:44
> > >> To: Prathamesh Kulkarni 
> > >> Cc: Richard Biener ; Richard Biener
> > >> ; gcc@gcc.gnu.org; Thomas Schwinge
> > >> 
> > >> Subject: Re: [RFC] Enabling SVE with offloading to nvptx
> > >>
> > >>
> > >> On Sat, Nov 02, 2024 at 03:53:34PM +, Prathamesh Kulkarni wrote:
> > >>> The attached patch adds a new bitfield needs_max_vf_lowering to loop,
> > >>> and sets that in expand_omp_simd for loops that need delayed lowering
> > >>> of safelen and omp simd arrays.  The patch defines a new macro
> > >>> OMP_COMMON_MAX_VF (arbitrarily set to 16), as a placeholder value for
> > >>> max_vf (instead of INT_MAX), and is later replaced by appropriate
> > >>> max_vf during omp_adjust_max_vf pass.  Does that look OK ?
> > >>
> > >> No.
> > >> The thing is, if user doesn't specify safelen, it defaults to
> > >> infinity (which we represent as INT_MAX), if user specifies it, then
> > >> that is the maximum for it (currently in OpenMP specification it is
> > >> just an integral value, so can't be a poly int).
> > >> And then the lowering uses the max_vf as another limit, what the hw
> > >> can do at most and sizes the magic arrays with it.  So, one needs to
> > >> use minimum of what user specified and what the hw can handle.
> > >> So using 16 as some magic value is just wrong, safelen(16) can be
> > >> specified in the source as well, or safelen(8), or safelen(32) or
> > >> safelen(123).
> > >>
> > >> Thus, the fact that the hw minimum hasn't been determined yet needs
> > >> to be represented in some other flag, not in loop->safelen value, and
> > >> before that is determined, loop->safelen should then represent what
> > >> the user wrote (or was implied) and the later pass should use minimum
> > >> from loop->safelen and the picked hw maximum.  Of course if the
> > >> picked hw maximum is POLY_INT-ish, the big question is how to compare
> > >> that against the user supplied integer value, either one can just
> > >> handle the INT_MAX (aka infinity) special case, or say query the
> > >> backend on what is the maximum value of the POLY_INT at runtime and
> > >> only use the POLY_INT if it is always known to be smaller or equal to
> > >> the user supplied safelen.
> > >>
> > >> Another thing (already mentioned in the thread Andrew referenced) is
> > >> that max_vf is used in two separate places.  One is just to size of
> > >> the magic arrays and one of the operands of the minimum (the other is
> > >> user specified safelen).  In this case, it is generally just fine to
> > >> pick later value than strictly necessary (as long as it is never
> > >> larger than user supplied safelen).
> > >> The other case is simd modifier on schedule clause.  That value
> > >> should better be the right one or slightly larger, but not too much.
> > >> I think currently we just use the INTEGER_CST we pick as the maximum,
> > >> if this sizing is deferred, maybe it needs to be another internal
> > >> function that asks the value (though, it can refer to a loop vf in
> > >> another function, which complicates stuff).
> > >>
> > >> Regarding Richi's question, I'm afraid the OpenMP simd loop lowering
> > >> can't be delayed until some later pass.
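
The rule described in the quoted reply above (use the minimum of what the user specified and what the hardware can handle, with INT_MAX standing for "no safelen given") can be sketched as follows. This is a minimal illustration, not GCC code: the function name effective_safelen is hypothetical, and the hardware limit is assumed to be a plain compile-time integer, so the POLY_INT case discussed in the thread is deliberately not modelled.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper, not GCC code.  USER_SAFELEN is the value from
   the safelen clause, or INT_MAX when the clause is absent (treated as
   infinity).  HW_MAX_VF is the constant maximum vectorization factor
   picked for the target; a POLY_INT limit is not modelled here.  */
int
effective_safelen (int user_safelen, int hw_max_vf)
{
  assert (user_safelen >= 1 && hw_max_vf >= 1);
  return user_safelen < hw_max_vf ? user_safelen : hw_max_vf;
}
```

Because the user value is kept separate from the hardware limit until this point, safelen(16) in the source and "limit not yet known" cannot be confused, which is the objection raised against using 16 as a placeholder.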
> > > Hi Jakub,
> > > Thanks for the suggestions! The attached patch makes the following changes:
> > > (1) Delays setting of safelen for offloading by introducing a new
> > > bitfield needs_max_vf_lowering in loop, which is true with offloading
> > > enabled, and safelen is then set to min(safelen, max_vf) for the target
> > > later in omp_device_lower pass.
> > > Comparing user-specified safelen with poly_int max_vf may not be
> > > always possible at compile-time (say 32 and 16+16x), and even if we
> > > determine runtime VL based on -mcpu flags, I guess relying on that
> > > won't be portable ?
> > > The patch works around this by taking constant_lower_bound (max_vf),
> > > and comparing it with safelen instead, with the downside that
> > > constant_lower
Re: libdiagnostics name clash

2024-11-20 Thread Eli Zaretskii via Gcc

> Date: Wed, 20 Nov 2024 16:11:16 -0500
> From: David Malcolm via Gcc 
> 
> I merged libdiagnostics to GCC trunk on Monday:
>   https://gcc.gnu.org/wiki/libdiagnostics
> 
> Unfortunately I've since discovered there's at least one libdiagnostics
> .so already in Debian:
> https://tracker.debian.org/pkg/diagnostics
> https://packages.debian.org/search?searchon=contents&keywords=libdiag&mode=filename&suite=unstable&arch=any
> 
> so I've been asked to change the name.
> 
> I'd prefer to avoid having "gcc" in the name.

Why do you prefer not to mention "gcc" in the name?

> Some name ideas:
> 
> * libdiag
> * libgdiagnostics (where we can be ambiguous about what the "g" stands
> for)
> * libgdiag (less typing)
> * libcomplain
> * libcomplaint
> * libwhining
> * libwhine (but sounds like the Windows compat software)
> 
> Any ideas?

FWIW, libgdiag sounds best (but libgccdiag or libdiag-gcc would be
even better).

HTH
