from:"Andi Kleen"

Re: [PATCH] testsuite: Disable musttail tests if target uses SJLJ exceptions

2025-07-11 Thread Andi Kleen

Dimitar Dimitrov writes: > A few tests started failing recently on pru-unknown-elf because it uses > SJLJ implementation for exceptions: > FAIL: g++.dg/ext/musttail3.C -std=c++11 (test for excess errors) > .../gcc/gcc/testsuite/g++.dg/ext/musttail3.C:12:34: error: cannot > tail-call: caller

Re: make autoprofiledbootstrap with LTO meaningful

2025-07-11 Thread Andi Kleen

On Fri, Jul 11, 2025 at 12:14:46PM +0200, Jan Hubicka wrote: > Hello, > currently autoprofiled bootstrap produces auto-profiles for cc1 and > cc1plus binaries. Those are used to build respective frontend files. > For backend cc1plus.fda is used. This does not work well with LTO > bootstrap where

Re: [committed] i386: Introduce crc_revsi4 expanders [PR120719]

2025-06-27 Thread Andi Kleen

On Fri, Jun 27, 2025 at 08:11:29AM +0200, Uros Bizjak wrote: > On Fri, Jun 27, 2025 at 7:27 AM Andi Kleen wrote: > > > > Uros Bizjak writes: > > > > > Introduce crc_revsi4 expanders to generate CRC32 instruction when > > > using > > > __

Re: [committed] i386: Introduce crc_revsi4 expanders [PR120719]

2025-06-26 Thread Andi Kleen

Uros Bizjak writes: > Introduce crc_revsi4 expanders to generate CRC32 instruction when using > __builtin_rev_crc32_data* builtins with 0x1EDC6F41 polynomial and -mcrc32. > > PR target/120719 > > gcc/ChangeLog: > > * config/i386/i386.md (crc_revsi4): New expander. > > gcc/testsuite/Change

Re: [AUTOFDO][AARCH64] Add support for profilebootstrap

2025-06-06 Thread Andi Kleen

On 2025-06-06 12:42, Jan Hubicka wrote: Hi, also after fixing this issue my bootstrap fails with: Permission error mapping pages. Consider increasing /proc/sys/kernel/perf_event_mlock_kb, or try again with a smaller value of -m/--mmap_pages. (current value: 4294967295,0) Permission error mappin

Re: [AUTOFDO][AARCH64] Add support for profilebootstrap

2025-05-15 Thread Andi Kleen

On Wed, May 14, 2025 at 02:46:15AM +0000, Kugan Vivekanandarajah wrote: > Adding Eugene and Andi to CC as Sam suggested. > > > On 13 May 2025, at 12:57 am, Richard Sandiford > wrote: > > > > External email: Use caution opening links or attachments > > > > > > Kugan Vivekanandarajah writes: > >>

Re: [PATCH 2/3] x86: Add a pass to fold tail call

2025-05-06 Thread Andi Kleen

On 2025-05-06 09:48, H.J. Lu wrote: On Mon, May 5, 2025 at 9:56 PM Andi Kleen wrote: On Mon, May 05, 2025 at 06:20:40AM -0700, Andi Kleen wrote: > > If the branch edge destination is a basic block with only a direct > > sibcall, change the jcc target to the sibcall target, d

Re: [PATCH 2/3] x86: Add a pass to fold tail call

2025-05-05 Thread Andi Kleen

On Mon, May 05, 2025 at 06:20:40AM -0700, Andi Kleen wrote: > > If the branch edge destination is a basic block with only a direct > > sibcall, change the jcc target to the sibcall target, decrement the > > destination basic block entry label use count and redirect the edge >

Re: [PATCH 2/3] x86: Add a pass to fold tail call

2025-05-05 Thread Andi Kleen

> If the branch edge destination is a basic block with only a direct > sibcall, change the jcc target to the sibcall target, decrement the > destination basic block entry label use count and redirect the edge > to the exit basic block. Call delete_unreachable_blocks to delete > the unreachable bas

[PATCH] Add diffsummary.py to contrib

2025-04-29 Thread Andi Kleen

This adds an automatic downloader for the latest test results from the mailing list archive and supports diffing test_summary to it. Useful if you don't want to run your own baseline. contrib/ChangeLog: * diffsummary.py: New file. --- contrib/diffsummary.py | 104

Re: [PATCH] Add a bootstrap-native build config

2025-04-25 Thread Andi Kleen

On 2025-04-23 10:18, Richard Biener wrote: On Tue, Apr 22, 2025 at 5:43 PM Andi Kleen wrote: On 2025-04-22 13:22, Richard Biener wrote: > On Sat, Apr 12, 2025 at 5:09 PM Andi Kleen wrote: >> >> From: Andi Kleen >> >> ... that uses -march=native -mtune=native to bu

Re: [PATCH] asf: Enable pass at O2 or higher

2025-04-22 Thread Andi Kleen

On Wed, Jan 29, 2025 at 10:33:14AM +0100, Christoph Müllner wrote: > The avoid-store-forwarding pass is disabled by default and therefore > in the risk of bit-rotting. This patch addresses this by enabling > the pass at O2 or higher. > > The assembly patterns in `bitfield-bitint-abi-align16.c` an

Re: [PHASE1 PATCH] Use optimize free lists for alloc_pages

2025-04-22 Thread Andi Kleen

On Tue, Apr 22, 2025 at 01:27:34PM +0200, Richard Biener wrote: > I assume this passed bootstrap & regtest? Yes it did > > This is OK for trunk after we've released GCC 15.1. Thanks. Andi

[PATCH] Add a bootstrap-native build config

2025-04-12 Thread Andi Kleen

From: Andi Kleen ... that uses -march=native -mtune=native to build a compiler optimized for the host. config/ChangeLog: * bootstrap-native.mk: New file. gcc/ChangeLog: * doc/install.texi: Document bootstrap-native. --- config/bootstrap-native.mk | 1 + gcc/doc/install.texi

[PATCH] Add diffsummary.py to contrib

2025-04-11 Thread Andi Kleen

This adds an automatic downloader for the latest test results from the mailing list archive and supports diffing test_summary to it. Useful if you don't want to run your own baseline. contrib/ChangeLog: * diffsummary.py: New file. --- contrib/diffsummary.py | 104

[PHASE1 PATCH] Use optimize free lists for alloc_pages

2025-04-11 Thread Andi Kleen

Right now ggc has a single free list for multiple sizes. In some cases the list can get mixed by orders and then the allocator may spend a lot of time walking the free list to find the right sizes. This patch splits the free list into multiple free lists by order which allows O(1) access in most c
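The idea of splitting one mixed free list into per-order lists can be sketched as below. This is only an illustration under stated assumptions, not the ggc code from the patch: the names `NUM_ORDERS`, `struct page`, `struct free_lists`, `free_page` and `alloc_page` are all hypothetical.

```c
#include <stddef.h>

#define NUM_ORDERS 8   /* hypothetical number of size classes */

/* A cached free page; `order` selects its size class. */
struct page {
    struct page *next;
    unsigned order;
};

/* One singly linked free list per order instead of one mixed list. */
struct free_lists {
    struct page *head[NUM_ORDERS];
};

/* Return a page to the list matching its order.
   Assumes p->order < NUM_ORDERS. */
static void free_page(struct free_lists *fl, struct page *p)
{
    p->next = fl->head[p->order];
    fl->head[p->order] = p;
}

/* Take a page of exactly `order`, or NULL if none is cached.
   Only the head of one list is inspected, so no pages of the
   wrong size have to be skipped. */
static struct page *alloc_page(struct free_lists *fl, unsigned order)
{
    struct page *p;

    if (order >= NUM_ORDERS)
        return NULL;
    p = fl->head[order];
    if (p != NULL) {
        fl->head[order] = p->next;
        p->next = NULL;
    }
    return p;
}
```

With a single mixed list, the lookup would instead have to walk entries until it found one of the requested order, which is the slow path the message describes.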

[PATCH v3] Don't instrument exit edges after musttail

2025-04-05 Thread Andi Kleen

When -fprofile-generate is used musttail often fails because the compiler adds instrumentation after the tail calls. This patch prevents adding extra exit edges after musttail because for a tail call the execution leaves the function and can never come back even on an unwind or exception. This is

[PATCH] PR119482: Avoid mispredictions in bitmap_set_bit

2025-04-01 Thread Andi Kleen

From: Andi Kleen This isn't a regression, but it's a very simple patch with high performance improvement, so perhaps suitable in the current stage. --- bitmap_set_bit checks the original value of the bit to return it to the caller and then only writes the new value back if it cha
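The snippet above is cut off before it shows the actual change, so the following is only a sketch of the general pattern, assuming the branch in question is the "write back only if the bit changed" test. The functions `set_bit_branchy` and `set_bit_unconditional` are hypothetical and operate on a single 64-bit word rather than on GCC's bitmap structure; both return the previous value of the bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Branchy variant: store only when the bit actually changes.
   The data-dependent branch can be hard to predict.
   Assumes bit < 64. */
static bool set_bit_branchy(uint64_t *word, unsigned bit)
{
    uint64_t mask = (uint64_t)1 << bit;
    bool was_set = (*word & mask) != 0;

    if (!was_set)
        *word |= mask;
    return was_set;
}

/* Unconditional variant: always store, so no branch depends on
   the old value of the bit.  Assumes bit < 64. */
static bool set_bit_unconditional(uint64_t *word, unsigned bit)
{
    uint64_t mask = (uint64_t)1 << bit;
    uint64_t old = *word;

    *word = old | mask;
    return (old & mask) != 0;
}
```

Both variants leave the word in the same state and return the same value; whether the second is actually faster depends on the compiler and the workload, which is what the PR measures.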

Re: Patch ping [PATCH] tailc: Don't fail musttail calls if they use or could use local arguments, instead warn [PR119376]

2025-04-01 Thread Andi Kleen

> I'd like to ping the > https://gcc.gnu.org/pipermail/gcc-patches/2025-March/679182.html > patch. > I know it is quite controversial and if clang wouldn't be the first > to implement this I'd certainly not go that way; I am willing to change > the warning option names or move the maybe one from -W

Re: [PATCH] testsuite: Fix up musttail2.C test

2025-03-26 Thread Andi Kleen

> You're right (although I don't remember which targets are > non-external_musttail). Several flavors of ARM and Power at least.

Re: [PATCH] tailc: Only diagnose musttail failures during tailc or musttail passes [PR119376]

2025-03-26 Thread Andi Kleen

Jakub Jelinek writes: > --- gcc/testsuite/g++.dg/opt/musttail2.C.jj 2025-03-24 13:27:44.329204196 > +0100 > +++ gcc/testsuite/g++.dg/opt/musttail2.C 2025-03-24 13:28:08.975867389 > +0100 > @@ -0,0 +1,14 @@ > +// PR ipa/119376 > +// { dg-do compile { target musttail } } I think this need

Re: [PATCH] tailc: Don't fail musttail calls if they use or could use local arguments, instead warn [PR119376]

2025-03-25 Thread Andi Kleen

> This can be rewritten as > > void foo(int v) > { > { > int a; > capture(&a); > if (condition) > goto tail_position; > // do something with a > } > tail_position: > tailcall(v); > } > > or with 'do { ... if (...) break; ...} while (0)' when one prefers that to > goto

Re: [PATCH] tailc: Don't fail musttail calls if they use or could use local arguments, instead warn [PR119376]

2025-03-25 Thread Andi Kleen

On Tue, Mar 25, 2025 at 07:43:28PM +0300, Alexander Monakov wrote: > Hello, > > FWIW I think Clang made a mistake in bending semantics in a way that is > clearly > misaligned with the general design of C and C++, where a language-native, so > to > speak, solution was available: introduce a scope

Re: [PATCH v3] Don't instrument exit edges after musttail

2025-03-25 Thread Andi Kleen

> 2025-03-25 Jakub Jelinek > Andi Kleen > > PR gcov-profile/118442 > * profile.cc (branch_prob): Ignore EDGE_FAKE edges from musttail calls > to EXIT. > > * c-c++-common/pr118442.c: New test. > > --- gcc/profile.cc.jj 2025-

[PATCH] PR118442: Don't instrument exit edges after musttail

2025-03-22 Thread Andi Kleen

From: Andi Kleen When -fprofile-generate is used musttail often fails because the compiler adds instrumentation after the tail calls. This patch prevents adding extra exit edges after musttail because for a tail call the execution leaves the function and can never come back even on an unwind or

Re: [PATCH v2 2/2] PR119376: Disable clang musttail

2025-03-20 Thread Andi Kleen

On Thu, Mar 20, 2025 at 06:25:26PM +0100, Jakub Jelinek wrote: > On Thu, Mar 20, 2025 at 10:01:02AM -0700, Andi Kleen wrote: > > So it could be as simple as that patch? It solves your test case at least > > for x86. > > Not sure I like this, but if others (e.g. Richi, Josep

Re: [PATCH v2 2/2] PR119376: Disable clang musttail

2025-03-20 Thread Andi Kleen

On Thu, Mar 20, 2025 at 05:28:48PM +0100, Jakub Jelinek wrote: > On Thu, Mar 20, 2025 at 09:19:02AM -0700, Andi Kleen wrote: > > The inlining was just one of the issue, there are some related to > > different semantics of escaped locals. gcc always errors out while > > LLVM

Re: [PATCH v2 2/2] PR119376: Disable clang musttail

2025-03-20 Thread Andi Kleen

On Thu, Mar 20, 2025 at 11:45:33AM -0400, Jason Merrill wrote: > On 3/19/25 9:31 PM, Andi Kleen wrote: > > From: Andi Kleen > > > > There are multiple reports (see PR 119376) now where semantic differences > > in the gcc musttail implementation break existing programs

[PATCH v2 2/2] PR119376: Disable clang musttail

2025-03-19 Thread Andi Kleen

From: Andi Kleen There are multiple reports (see PR 119376) now where semantic differences in the gcc musttail implementation break existing programs written for the clang variant. Even though that can be all hopefully fixed eventually, for the gcc 15 release it seems safer to disable clang

[PATCH v2 1/2] PR118442: Don't instrument exit edges after musttail

2025-03-19 Thread Andi Kleen

From: Andi Kleen When -fprofile-generate is used musttail often fails because the compiler adds instrumentation after the tail calls. This patch prevents adding extra exit edges after musttail because for a tail call the execution leaves the function and can never come back even on an unwind or

Re: [PATCH] PR118442: Don't instrument exit edges after musttail

2025-03-19 Thread Andi Kleen

> This looks wrong to me. Even tail calls can be terminated with exit, > perform longjmp, do other things for which stmt_can_terminate_bb_p > should return true. stmt_can_terminate_bb_p is used in many places, not > just in the predict instrumentation. Okay so the check should be only used for s

Re: [PATCH] PR118442: Don't instrument exit edges after musttail

2025-03-19 Thread Andi Kleen

Andi Kleen writes: > diff --git a/gcc/input.cc b/gcc/input.cc > index fabfbfb6eaa..d3b12037ba8 100644 > --- a/gcc/input.cc > +++ b/gcc/input.cc > @@ -1325,6 +1325,8 @@ dump_line_table_statistics (void) >if (s.num_expanded_macros != 0) > fprintf (stderr, "Av

Re: The COBOL front end, version 3, now in 14 easy pieces (+NIST)

2025-02-24 Thread Andi Kleen

"James K. Lowden" writes: >> Having a minimal harness in GCCs testsuite is critical - I'd expect a >> gcc/testsuite/gcobol.dg/dg.exp supporting execution tests. I assume >> Cobol has a way to exit OK or fatally and this should be >> distinguished as testsuite PASS or FAIL. > > Yes, a COBOL pro

[COMMITTED PATCH] Fix description of file-cache-lines/file-cache-files params

2025-02-18 Thread Andi Kleen

From: Andi Kleen The file-cache-lines / file-cache-files tunables were documented in the wrong section. Fix that. Reported-by: Filip Kastl Committed as obvious. gcc/ChangeLog: * doc/invoke.texi: --- gcc/doc/invoke.texi | 20 ++-- 1 file changed, 10 insertions(+), 10

[COMMITTED] Fix file cache tunables documentation

2025-02-04 Thread Andi Kleen

From: Andi Kleen Document new params in invoke.texi. The auto tuning description was on the wrong tunable, move to lines. Committed as obvious. gcc/ChangeLog: * doc/invoke.texi: Document file cache tunables. * params.opt: Move auto tuning description to lines. --- gcc/doc

Re: [PATCH v2 6/7] Enable vectorization for input.cc find_end_of_line function

2025-02-02 Thread Andi Kleen

On Tue, Jan 28, 2025 at 09:50:41AM +0100, Richard Biener wrote: > On Mon, Jan 27, 2025 at 9:59 PM David Malcolm wrote: > > > > On Sat, 2025-01-25 at 23:31 -0800, Andi Kleen wrote: > > > From: Andi Kleen > > > > > > This is the hot function in input.cc &

Re: [PATCH v2 4/7] Add a cache of recent lines

2025-02-02 Thread Andi Kleen

> > If I'm reading this right, calls to get_next_line lead to insertions into > the ring buffer whilst the buffer is empty or the last line in the ring > buffer cache is m_line_num - 1. > > There are a few places where we update m_line_num, but this caching > code doesn't seem to touch those places

Re: [PATCH v2 7/7] Add a unit test for random access in the file cache

2025-02-02 Thread Andi Kleen

On Sun, Feb 02, 2025 at 09:35:52PM -0800, Andi Kleen wrote: > > Patch 7 is OK otherwise, and I'm taking a look at the rest of the > > patches now; thanks. > > Any comments on the other patches? nm. I see you already commented. somehow i missed that. -Andi

Re: [PATCH v2 7/7] Add a unit test for random access in the file cache

2025-02-02 Thread Andi Kleen

> Patch 7 is OK otherwise, and I'm taking a look at the rest of the > patches now; thanks. Any comments on the other patches? Thanks, -Andi

[PATCH v2 5/7] Size input line cache based on file size

2025-01-25 Thread Andi Kleen

From: Andi Kleen While the input line cache size is now tunable it's better if the compiler auto tunes it. Otherwise large files needing random file access will still have to search many lines to find the right lines. Add support for allocating one line anchor per hundred input lines. This
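The "one line anchor per hundred input lines" rule amounts to a small sizing calculation. The sketch below is an assumption-laden illustration, not the code from input.cc: the function name `anchors_for_lines`, the rounding up, and the `MIN_ANCHORS` floor are all hypothetical choices.

```c
#include <stddef.h>

#define LINES_PER_ANCHOR 100  /* from the patch description */
#define MIN_ANCHORS 16        /* hypothetical floor for small files */

/* Number of line anchors to allocate for a file with `total_lines`
   lines: one per hundred lines, rounded up, but never below the floor. */
static size_t anchors_for_lines(size_t total_lines)
{
    size_t n = (total_lines + LINES_PER_ANCHOR - 1) / LINES_PER_ANCHOR;

    return n < MIN_ANCHORS ? MIN_ANCHORS : n;
}
```

With such a rule a random access never has to scan more than about a hundred lines past the nearest anchor, regardless of file size.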

PR118168: Updated fix

2025-01-25 Thread Andi Kleen

This is a fix for slowness accessing random lines in the source file for diagnostics. In this version I added a unit test as requested by David, and also added an x86 vectorization hint for the hot line search function (with the early break work the vectorizer is powerful enough to handle it now). If

[PATCH v2 7/7] Add a unit test for random access in the file cache

2025-01-25 Thread Andi Kleen

From: Andi Kleen gcc/ChangeLog: * input.cc (check_line): New. (test_replacement): New function to test line caching. (input_cc_tests): Call test_replacement --- gcc/input.cc | 46 ++ 1 file changed, 46 insertions(+) diff

[PATCH v2 6/7] Enable vectorization for input.cc find_end_of_line function

2025-01-25 Thread Andi Kleen

From: Andi Kleen This is the hot function in input.cc. The vectorizer can vectorize it now, but in a generic cpu O2 x86 build it isn't. Add an automatic target clone to handle it for x86 and build that function with O3. The ifdef here is ugly, perhaps gcc should have a more convenient "

[PATCH v2 1/7] Add tunables for input buffer

2025-01-25 Thread Andi Kleen

From: Andi Kleen The input machinery to read the source code independent of the lexer has a range of hard coded maximum array sizes that can impact performance. Make them tunable. input.cc is part of libcommon so it cannot directly access params without a level of indirection. gcc/ChangeLog

[PATCH v2 2/7] Rebalance file_cache input line cache dynamically

2025-01-25 Thread Andi Kleen

From: Andi Kleen The input context file_cache maintains an array of anchors to speed up accessing lines before the previous line. The array has a fixed upper size and the algorithm relies on the linemap reporting the maximum number of lines in the file in advance to compute the position of each

[PATCH v2 3/7] Remove m_total_lines support from input cache

2025-01-25 Thread Andi Kleen

From: Andi Kleen With the new cache maintenance algorithm we don't need the maximum number of lines anymore. Remove all the code for that. gcc/ChangeLog: PR preprocessor/118168 * input.cc (total_lines_num): Remove. (file_cache_slot::evict):

[PATCH v2 4/7] Add a cache of recent lines

2025-01-25 Thread Andi Kleen

From: Andi Kleen For larger files the file_cache line index will be spread out to make the index fit into the fixed buffer, so any access to the non latest line will need some skipping of lines. Most accesses for line are near the latest line because a diagnostic is likely near where the
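A small cache of the most recently read lines, as motivated above, can be sketched as a fixed-size ring buffer. This is a minimal sketch under assumptions, not the data structure from the patch: `RECENT`, `struct line_pos`, `struct recent_lines`, `recent_add` and `recent_find` are hypothetical names, and the capacity is arbitrary.

```c
#include <stdbool.h>
#include <stddef.h>

#define RECENT 8  /* hypothetical capacity of the recent-line cache */

/* Byte range of one source line inside the file buffer. */
struct line_pos {
    size_t line_num;
    size_t start;
    size_t end;
};

/* Ring buffer holding the last RECENT lines that were read. */
struct recent_lines {
    struct line_pos slot[RECENT];
    size_t count;  /* number of valid slots, at most RECENT */
    size_t next;   /* slot that the next insertion overwrites */
};

/* Record a newly read line, evicting the oldest one when full. */
static void recent_add(struct recent_lines *r, struct line_pos p)
{
    r->slot[r->next] = p;
    r->next = (r->next + 1) % RECENT;
    if (r->count < RECENT)
        r->count++;
}

/* Look up a line near the latest position without touching the
   coarser anchor index.  Returns false on a miss. */
static bool recent_find(const struct recent_lines *r, size_t line_num,
                        struct line_pos *out)
{
    for (size_t i = 0; i < r->count; i++) {
        if (r->slot[i].line_num == line_num) {
            *out = r->slot[i];
            return true;
        }
    }
    return false;
}
```

On a miss the caller would fall back to the spread-out line index and skip forward from the nearest anchor.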

[PATCH] Describe inline assembler parsing

2025-01-18 Thread Andi Kleen

From: Andi Kleen Correct the description of inline assembler to say that gcc does limited assembler parsing to estimate the length of inline assembler statements, and document that certain assembler primitives can confuse it. gcc/ChangeLog: * doc/extend.texi: Document assembler parsing

[COMMITTED] Fix an incorrect file header comment for the core2 scheduling model

2025-01-15 Thread Andi Kleen

From: Andi Kleen Committed as obvious. gcc/ChangeLog: * config/i386/x86-tune-sched-core.cc: Fix incorrect comment. --- gcc/config/i386/x86-tune-sched-core.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/i386/x86-tune-sched-core.cc b/gcc/config/i386

Re: [PATCH] docs: Fix up inline asm documentation

2025-01-15 Thread Andi Kleen

On Wed, Jan 15, 2025 at 10:41:11PM +0100, Jakub Jelinek wrote: > Hi! > > When writing the gcc-15/changes.html patch posted earlier, I've been > wondering where significant part of the Basic asm chapter went and the > problem was the insertion of a new @node in the middle of the Basic Asm > @node,

Re: [PING] [PATCH 1/6] Add tunables for input buffer

2025-01-08 Thread Andi Kleen

On Wed, Jan 08, 2025 at 07:47:27PM -0500, David Malcolm wrote: > On Wed, 2025-01-08 at 07:48 -0800, Andi Kleen wrote: > > > > I wanted to ping this patch series. Thanks. > > > > -Andi > > > > Thanks for the patches, and sorry about not getting back

[PING] [PATCH 1/6] Add tunables for input buffer

2025-01-08 Thread Andi Kleen

I wanted to ping this patch series. Thanks. -Andi

Re: [PATCH] c++: Fix up ICEs on constexpr inline asm strings in templates [PR118277]

2025-01-07 Thread Andi Kleen

On Tue, Jan 07, 2025 at 08:36:29PM +0100, Jakub Jelinek wrote: > Hi! > > The following patch fixes ICEs when the new inline asm syntax > to use C++26 static_assert-like constant expressions in place > of string literals is used in templates. > As finish_asm_stmt doesn't do any checking for > proce

Re: [PATCH] tree-switch-conversion: don't apply switch size limit on jump tables

2025-01-05 Thread Andi Kleen

Mark Wielaard writes: > commit 56946c801a7c ("gimple: Add limit after which slower switchlower > algs are used [PR117091] [PR117352]") introduced a limit on the number > of cases of a switch. It also bails out on finding jump tables if the > switch is too large. This introduces a compile time reg

[PATCH 6/6] Size input line cache based on file size

2024-12-26 Thread Andi Kleen

From: Andi Kleen While the input line cache size is now tunable it's better if the compiler auto tunes it. Otherwise large files needing random file access will still have to search many lines to find the right lines. Add support for allocating one line anchor per hundred input lines. This

[PATCH 3/6] Remove m_total_lines support from input cache

2024-12-26 Thread Andi Kleen

From: Andi Kleen With the new cache maintenance algorithm we don't need the maximum number of lines anymore. Remove all the code for that. gcc/ChangeLog: PR preprocessor/118168 * input.cc (total_lines_num): Remove. (file_cache_slot::evict):

[PATCH 2/6] Rebalance file_cache input line cache dynamically

2024-12-26 Thread Andi Kleen

From: Andi Kleen The input context file_cache maintains an array of anchors to speed up accessing lines before the previous line. The array has a fixed upper size and the algorithm relies on the linemap reporting the maximum number of lines in the file in advance to compute the position of each

[PATCH 5/6] Add a cache of recent lines

2024-12-26 Thread Andi Kleen

From: Andi Kleen For larger files the file_cache line index will be spread out to make the index fit into the fixed buffer, so any access to the non latest line will need some skipping of lines. Most accesses for line are near the latest line because a diagnostic is likely near where the

Fix file_cache for large files

2024-12-26 Thread Andi Kleen

This patch kit fixes scaling issues for the input cache, especially for C, motivated by PR118168. In overall in number of lines it is practically neutral: gcc/input.cc | 261 -- gcc/inp

[PATCH 4/6] Move ferror out of hot loop of file cache

2024-12-26 Thread Andi Kleen

From: Andi Kleen glibc ferror is surprisingly expensive. Move it out of the hot loop of finding lines by setting a flag after the actual IO operations. gcc/ChangeLog: PR preprocessor/118168 * input.cc (file_cache_slot::m_error): New field. (file_cache_slot::create
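The technique of caching the error state in a flag next to the I/O call, so the hot loop never calls ferror itself, can be sketched as follows. This is an illustration only and does not reproduce file_cache_slot: `struct reader`, `reader_fill` and `count_lines` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct reader {
    FILE *fp;
    bool error;  /* sticky flag, set once right after a failed read */
};

/* Refill `buf`.  ferror is consulted only here, directly after the
   actual I/O operation, and only when the read came up short. */
static size_t reader_fill(struct reader *r, char *buf, size_t cap)
{
    size_t n = fread(buf, 1, cap, r->fp);

    if (n < cap && ferror(r->fp))
        r->error = true;
    return n;
}

/* Hot loop: scans for newlines and tests only the cached flag. */
static size_t count_lines(struct reader *r)
{
    char buf[4096];
    size_t lines = 0;
    size_t n;

    while (!r->error && (n = reader_fill(r, buf, sizeof buf)) > 0) {
        for (size_t i = 0; i < n; i++) {
            if (buf[i] == '\n')
                lines++;
        }
    }
    return lines;
}
```

The per-character work now touches only local data; the library call happens at most once per buffer refill.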

[PATCH 1/6] Add tunables for input buffer

2024-12-26 Thread Andi Kleen

From: Andi Kleen The input machinery to read the source code independent of the lexer has a range of hard coded maximum array sizes that can impact performance. Make them tunable. input.cc is part of libcommon so it cannot directly access params without a level of indirection. gcc/ChangeLog

Re: The COBOL front end, in 8 notes

2024-12-13 Thread Andi Kleen

"James K. Lowden" writes: > The following 8 patches constitute the 80 files needed to build and > document the COBOL front end. They assume that following exist: > > gcc/cobol/ChangeLog > libgcobol/ChangeLog > > The messages are grouped by files in a more or less logical order, > but gro

Re: [PATCH] gimple: Add limit after which slower switchlower algs are used [PR117091] [PR117352]

2024-12-05 Thread Andi Kleen

> > What do you think, Andi and Richi? I myself slightly prefer keeping the DP > > but > > I would be fine with either option. > > I think we can keep both, though I have no strong opinion. Keeping both is fine for me. -Andi

Re: [PATCH] gimple: Add limit after which slower switchlower algs are used [PR117091] [PR117352]

2024-12-05 Thread Andi Kleen

> > But yeah, thinking about it some more, 1 seems like a lot. Maybe the > > limit > > could be 1000. That's also big enough. I could try to run the testcase > > set to > > 1000 on my not-so-powerful laptop this time and check that even on that > > machine > > it finishes "fast" (under a

Re: [PATCH] PR117350: Keep assembler name for abstract decls for autofdo

2024-11-26 Thread Andi Kleen

On Tue, Nov 26, 2024 at 04:06:37PM -0800, Andrew Pinski wrote: > On Thu, Oct 31, 2024 at 1:41 PM Andi Kleen wrote: > > > > From: Andi Kleen > > > > autofdo looks up inline stacks and tries to match them with the profile > > data using their symbol name. Mak

Re: [PATCH] gimple: Add limit after which slower switchlower algs are used [PR117091] [PR117352]

2024-11-21 Thread Andi Kleen

On Fri, Nov 15, 2024 at 10:43:57AM +0100, Filip Kastl wrote: > Hi, > > Andi's greedy bit test finding algorithm was reverted. I found a fix for the > problem that caused the revert. I made this patch to reintroduce the greedy > alg into GCC. However I think we should keep the old slow but more

Re: [PATCH] Add a bootstrap-native build config

2024-11-06 Thread Andi Kleen

On Tue, Jul 30, 2024 at 09:40:42AM -0700, Andi Kleen wrote: > From: Andi Kleen > > ... that uses -march=native -mtune=native to build a compiler optimized > for the host. > > config/ChangeLog: > > * bootstrap-native.mk: New file. > > gcc/ChangeLog:

Re: [PATCH v3] Remove sys/user time in -ftime-report

2024-11-06 Thread Andi Kleen

On Fri, Nov 01, 2024 at 02:01:18PM -0400, John David Anglin wrote: > This breaks build on hppa64-hp-hpux11.11. This target has clock_gettime > but it doesn't have CLOCK_MONOTONIC. It has CLOCK_REALTIME. I modified > timevar.cc as follows to restore build. Alternative would be to check for CLOCK

Re: [PATCH] PR117350: Keep assembler name for abstract decls for autofdo

2024-11-05 Thread Andi Kleen

On Tue, Nov 05, 2024 at 09:47:17AM +0100, Richard Biener wrote: > On Tue, Nov 5, 2024 at 2:02 AM Jason Merrill wrote: > > > > On 10/31/24 4:40 PM, Andi Kleen wrote: > > > From: Andi Kleen > > > > > > autofdo looks up inline stacks and tries to match th

[PATCH] Update gcc-auto-profile / gen_autofdo_event.py

2024-10-31 Thread Andi Kleen

From: Andi Kleen - Fix warnings with newer python versions about bad escapes by making all the python string raw. - Add a fallback for using the builtin perf event list if the CPU model number is unknown. - Regenerate the shipped gcc-auto-profile with the changes. contrib/ChangeLog

[PATCH] Enable autofdo bootstrap for lto/fortran

2024-10-31 Thread Andi Kleen

From: Andi Kleen When autofdo bootstrap support was originally implemented there were issues with the LTO bootstrap, that is why it wasn't enabled for them. I retested this now and it works on x86_64-linux. Fortran was also missing, not sure why. Also enabled now. gcc/fortran/Chan

[PATCH] PR117350: Keep assembler name for abstract decls for autofdo

2024-10-31 Thread Andi Kleen

From: Andi Kleen autofdo looks up inline stacks and tries to match them with the profile data using their symbol name. Make sure all decls that can be in an inline stack have a valid assembler name. This fixes a bootstrap problem with autoprofiledbootstrap and LTO. 2024-10-30 Jason Merrill

Re: [PATCH v3] Remove sys/user time in -ftime-report

2024-10-31 Thread Andi Kleen

> I'm getting a build failure: > > timevar.cc:163: undefined reference to `clock_gettime' > > Our frozen build tools are intended to produce binaries that work > "everywhere", so they're a few years old, but apparently something didn't > configure correctly. > > I see that libbacktrace configure

Re: [PATCH v3] Remove sys/user time in -ftime-report

2024-10-30 Thread Andi Kleen

On Wed, Oct 23, 2024 at 02:56:51PM +0200, Richard Biener wrote: > On Wed, Oct 9, 2024 at 6:18 PM Andi Kleen wrote: > > > > From: Andi Kleen > > > > Retrieving sys/user time in timevars is quite expensive because it > > always needs a system call. Only getting

Re: [PATCH v3 1/2][RFC] Provide more contexts for -Warray-bounds, -Wstringop-* warning messages due to code movements from compiler transformation [PR109071]

2024-10-30 Thread Andi Kleen

Qing Zhao writes: > Control this with a new option -fdiagnostics-details. It would be useful to be also able to print the inline call stack, maybe with a separate option. In some array bounds cases I looked at the problem was hidden in some inlines and it wasn't trivial to figure it out. I wro

Re: [PATCH v2 3/3] Simplify switch bit test clustering algorithm

2024-10-29 Thread Andi Kleen

> > However this exposes PR117352 which is a negative interaction of the > > more aggressive bit test conversion. I don't think it's a show stopper, > > this can be sorted out later. > > I think it is a show stopper for GCC 15 because it is a pretty big > performance regression with targets that

Re: [PATCH v2 3/3] Simplify switch bit test clustering algorithm

2024-10-29 Thread Andi Kleen

On Tue, Oct 29, 2024 at 01:50:57PM +0100, Richard Biener wrote: > On Mon, Oct 28, 2024 at 9:58 PM Andi Kleen wrote: > > > > From: Andi Kleen > > > > The current switch bit test clustering enumerates all possible case > > clusters combinations to find ones

[PATCH v2 2/3] Only do switch bit test clustering when multiple labels point to same bb

2024-10-28 Thread Andi Kleen

From: Andi Kleen The bit cluster code generation strategy is only beneficial when multiple case labels point to the same code. Do a quick check if that is the case before trying to cluster. This fixes the switch part of PR117091 where all case labels are unique however it doesn't addres
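The "quick check" described above, whether any two case labels jump to the same code, can be sketched as a single linear pass. This is a hedged illustration, not the tree-switch-conversion code: the function `any_shared_target`, the representation of targets as small block ids, and the `num_blocks` parameter are all assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* `target[i]` is a hypothetical id (< num_blocks) of the basic block
   that case label i jumps to.  Returns true if at least two labels
   share a target, i.e. bit test clustering could pay off. */
static bool any_shared_target(const unsigned *target, size_t n,
                              unsigned num_blocks)
{
    bool shared = false;
    bool *seen;

    if (n < 2 || num_blocks == 0)
        return false;

    seen = calloc(num_blocks, sizeof *seen);
    if (seen == NULL)
        return true;  /* be conservative: let the caller try clustering */

    for (size_t i = 0; i < n && !shared; i++) {
        if (seen[target[i]])
            shared = true;
        else
            seen[target[i]] = true;
    }
    free(seen);
    return shared;
}
```

The pass is linear in the number of labels, so it stays cheap even for the very large switches that motivated the PR.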

[PATCH v2 3/3] Simplify switch bit test clustering algorithm

2024-10-28 Thread Andi Kleen

From: Andi Kleen The current switch bit test clustering enumerates all possible case clusters combinations to find ones that fit the bit test constrains best. This causes performance problems with very large switches. For bit test clustering which happens naturally in word sized chunks I don't

[PATCH v2 1/3] Disable -fbit-tests and -fjump-tables at -O0

2024-10-28 Thread Andi Kleen

From: Andi Kleen gcc/ChangeLog: * common.opt: Enable -fbit-tests and -fjump-tables only at -O1. * opts.cc (default_options_table): Ditto. --- gcc/common.opt | 4 ++-- gcc/opts.cc| 2 ++ 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/gcc/common.opt b/gcc

[PATCH 2/2] Only do switch bit test clustering when multiple labels point to same bb

2024-10-16 Thread Andi Kleen

From: Andi Kleen The bit cluster code generation strategy is only beneficial when multiple case labels point to the same code. Do a quick check if that is the case before trying to cluster. This fixes the switch part of PR117091 where all case labels are unique however it doesn't addres

[PATCH 1/2] Disable -fbit-tests and -fjump-tables at -O0

2024-10-16 Thread Andi Kleen

From: Andi Kleen gcc/ChangeLog: * common.opt: Enable -fbit-tests and -fjump-tables only at -O1. * tree-switch-conversion.h (jump_table_cluster::is_enabled): Ditto. --- gcc/common.opt | 4 ++-- gcc/tree-switch-conversion.h | 5 +++-- 2 files changed, 5

[PATCH] PR116510: Add missing fold_converts into tree switch if conversion

2024-10-15 Thread Andi Kleen

From: Andi Kleen Passes test suite. Ok to commit? gcc/ChangeLog: PR middle-end/116510 * tree-if-conv.cc (predicate_bbs): Add missing fold_converts. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-switch-ifcvt-3.c: New test. --- gcc/testsuite/gcc.dg/vect/vect-switch-ifcvt

[PATCH v3] Remove sys/user time in -ftime-report

2024-10-09 Thread Andi Kleen

From: Andi Kleen Retrieving sys/user time in timevars is quite expensive because it always needs a system call. Only getting the wall time is much cheaper because operating systems have optimized paths for this. The sys time isn't that interesting for a compiler and wall time is usually
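Reading only the wall clock can be sketched with POSIX clock_gettime. This is a minimal sketch, not the timevar.cc change: the function `wall_seconds` is hypothetical, and the fallback to CLOCK_REALTIME when CLOCK_MONOTONIC is not defined is an assumption prompted by the HP-UX report elsewhere in this listing.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* expose clock_gettime under strict -std= */
#endif
#include <time.h>

/* Current wall-clock reading in seconds, or 0.0 if the clock
   cannot be read. */
static double wall_seconds(void)
{
    struct timespec ts;
#ifdef CLOCK_MONOTONIC
    clockid_t id = CLOCK_MONOTONIC;  /* not affected by clock adjustments */
#else
    clockid_t id = CLOCK_REALTIME;   /* fallback for targets without it */
#endif

    if (clock_gettime(id, &ts) != 0)
        return 0.0;
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}
```

A phase would be timed by calling `wall_seconds()` before and after and subtracting. On some older C libraries clock_gettime lives in librt, so linking may need `-lrt`.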

Re: [PATCH v2] Add -ftime-report-wall

2024-10-09 Thread Andi Kleen

> So, shouldn't we go without the new option and simply change > -ftime-report behavior? I think it's fine (given the constraints I outlined earlier). It will slightly change the output, but I guess there aren't that many users that parse it mechanically. I can do that unless someone else objects.

[PATCH v2] Add -ftime-report-wall

2024-10-05 Thread Andi Kleen

From: Andi Kleen Time vars normally use times(2) to get the user/sys/wall time, which is always a system call. I don't think the system time is very useful because most overhead is in user time. If we only use the wall (or monotonic) time modern OS have an optimized path to get it directly

Re: [PATCH v1] Add -ftime-report-wall

2024-10-03 Thread Andi Kleen

> The only consumer I know of for the JSON time report data is in the > integration tests I wrote for -fanalyzer, which assumes that all fields > are present when printing, and then goes on to use the "user" times for > summarizing; see this commit FWIW: > https://github.com/davidmalcolm/gcc-analyz

Re: [PATCH] testsuite: Fix tail_call and musttail effective targets [PR116080]

2024-10-03 Thread Andi Kleen

On Thu, Oct 03, 2024 at 01:48:35PM +0000, Christophe Lyon wrote: > Some of the musttail tests (eg musttail7.c) fail on arm-eabi because > check_effective_target_musttail pass, but the actual code in the test > is rejected. Looks good to me. Thanks. -Andi

Re: [PATCH v1] Add -ftime-report-wall

2024-10-03 Thread Andi Kleen

> Note that if the user requests SARIF output e.g. with > -fdiagnostics-format=sarif-stderr > then any timevar data from -ftime-report is written in JSON form as > part of the SARIF, rather than in text form to stderr (see > 75d623946d4b6ea80a777b789b116d4b4a2298dc). > > I see that the proposed

[PATCH v1] Add -ftime-report-wall

2024-10-02 Thread Andi Kleen

From: Andi Kleen Time vars normally use times(2) to get the user/sys/wall time, which is always a system call. I don't think the system time is very useful because most overhead is in user time. If we only use the wall (or monotonic) time, modern OSes have an optimized path to get it directly

Re: [RFC PATCH] Allow limited extended asm at toplevel

2024-10-02 Thread Andi Kleen

Jakub Jelinek writes: > And for kernel perhaps we should add some new option which allows > some dumb parsing of the toplevel asms and gather something from that > parsing. See also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107779 > The restrictions I've implemented are: > 1) asm qualifiers

Re: [PING^4] [PATCH] Add a bootstrap-native build config

2024-09-10 Thread Andi Kleen

On Tue, Sep 10, 2024 at 03:29:08AM +, Ramana Radhakrishnan wrote: > > diff --git a/config/bootstrap-native.mk b/config/bootstrap-native.mk > > new file mode 100644 > > index ..a4a3d8594089 > > --- /dev/null > > +++ b/config/bootstrap-native.mk > > @@ -0,0 +1

[PING^4] [PATCH] Add a bootstrap-native build config

2024-09-09 Thread Andi Kleen

Andi Kleen writes: Ping^4 Could someone please approve this (nearly trivial) patch? Thanks, -Andi > Andi Kleen writes: > > Ping^3 > >> Andi Kleen writes: >> >> PING^2 for the patch. >> >> (not sure if there is any maintainer to cc here, this is gen

[PING^3] [PATCH] PR116080: Fix test suite checks for musttail

2024-09-02 Thread Andi Kleen

Andi Kleen writes: PING^3 > Andi Kleen writes: > > PING^2 for https://gcc.gnu.org/pipermail/gcc-patches/2024-July/658602.html > > This fixes some musttail related test suite failures that cause noise on > various targets. > >> Andi Kleen writes: >> >>

[PING^3] [PATCH] Add a bootstrap-native build config

2024-09-02 Thread Andi Kleen

Andi Kleen writes: Ping^3 > Andi Kleen writes: > > PING^2 for the patch. > > (not sure if there is any maintainer to cc here, this is generic build > infrastructure) > >> Andi Kleen writes: >> >> I wanted to ping this patch: >> >> https:/

[PATCH] Fix test failing on sparc

2024-08-27 Thread Andi Kleen

From: Andi Kleen SPARC does not support vectorizing conditions, which this test relies on. Use vect_condition as effective target. Committed as obvious. PR testsuite/116500 gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-switch-ifcvt-1.c: Use vect_condition to check if

[PING^2] [PATCH] Add a bootstrap-native build config

2024-08-25 Thread Andi Kleen

Andi Kleen writes: PING^2 for the patch. (not sure if there is any maintainer to cc here, this is generic build infrastructure) > Andi Kleen writes: > > I wanted to ping this patch: > > https://gcc.gnu.org/pipermail/gcc-patches/2024-July/658729.html > > >> From:

Re: [PING^2] [PATCH] PR116080: Fix test suite checks for musttail

2024-08-25 Thread Andi Kleen

Andi Kleen writes: PING^2 for https://gcc.gnu.org/pipermail/gcc-patches/2024-July/658602.html This fixes some musttail related test suite failures that cause noise on various targets. > Andi Kleen writes: > > I wanted to ping this patch. It fixes test suite noise on various

[PING] [PATCH v2] Support if conversion for switches

2024-08-13 Thread Andi Kleen

Andi Kleen writes: I wanted to ping this patch. I believe Richard ok'ed most of it earlier, but I need an ok for the changes resulting from his review too (but they were mostly only test suite and comment fixes apart from some minor tweaks) -Andi > The gimple-if-to-switch pass con

[PING] [PATCH] PR116080: Fix test suite checks for musttail

2024-08-12 Thread Andi Kleen

Andi Kleen writes: I wanted to ping this patch. It fixes test suite noise on various targets. https://gcc.gnu.org/pipermail/gcc-patches/2024-July/658602.html > From: Andi Kleen > > This is a new attempt to fix PR116080. The previous try was reverted > because it just broke a bu
