Re: [PATCH] [contrib] Add process_make.py

2025-07-18 Thread Dhruv Chawla
nvidia.com wrote: From: Dhruv Chawla This is a script that makes it easier to visualize the output from make. It filters out most of the output, leaving only (mostly) messages about files being compiled, installed and linked. It is not 100% accurate in the matching, but I feel it does a good enoug

Re: [PATCH 0/1] [RFC][AutoFDO]: Source filename tracking in GCOV

2025-07-15 Thread Dhruv Chawla
On 08/07/25 18:01, Jan Hubicka wrote: Hi Honza, On 8 Jul 2025, at 2:26 am, Jan Hubicka wrote: Hi, as discussed also on the autofdo pull request, LLVM solves the same problem

Re: [PATCH 1/1] [RFC][AutoFDO] Propagate information to outline copies if not inlined

2025-07-01 Thread Dhruv Chawla
On 02/07/25 07:26, Kugan Vivekanandarajah wrote: Given the latest few patches that you have committed, is this patch necessary anymore? I have not fully understood the new logic as I was on holiday last week, but it looks like the propagation is occurring correctly now? I think you are ref

Re: [PATCH 1/1] [RFC][AutoFDO] Propagate information to outline copies if not inlined

2025-07-01 Thread Dhruv Chawla
On 17/06/25 18:35, Jan Hubicka wrote: From: Dhruv Chawla This patch modifies afdo_set_bb_count to propagate profile information to outline copies of functions if they are not inlined. This information gets lost otherwise. Signed-off

Re: [PATCH] [RFC][AutoFDO] Source filename tracking in GCOV

2025-06-18 Thread Dhruv Chawla
On 16/06/25 22:31, Jan Hubicka wrote: gcc/ChangeLog: * auto-profile.cc (AUTO_PROFILE_VERSION): Bump from 2 to 3. (string_table::get_real_name): Define new member function. (string_table::get_file_name): Likewise.

Re: [AutoFDO] Fix get_original_name to strip only names that are generated after auto-profile

2025-06-18 Thread Dhruv Chawla
On 18/06/25 14:21, Kugan Vivekanandarajah wrote: Hi, On 17 Jun 2025, at 4:51 pm, Kugan Vivekanandarajah wrote: On 17 Jun 2025, at 4:18 pm, Dhruv Chawla wrote: On 17/06/25 06:10, Kugan Vivekanandarajah wrote:

Re: [PATCH 1/1] [RFC][AutoFDO] Propagate information to outline copies if not inlined

2025-06-17 Thread Dhruv Chawla
On 17/06/25 18:35, Jan Hubicka wrote: From: Dhruv Chawla This patch modifies afdo_set_bb_count to propagate profile information to outline copies of functions if they are not inlined. This information gets lost otherwise. Signed-off

Re: [AutoFDO] Fix get_original_name to strip only names that are generated after auto-profile

2025-06-16 Thread Dhruv Chawla
On 17/06/25 06:10, Kugan Vivekanandarajah wrote: Hi, As discussed earlier, get_original_name is used to match profile binary names to the symbol names in the IR during auto-profile pass. I think it could be good to add a link to the mai

Re: [PATCH] [RFC][AutoFDO] Source filename tracking in GCOV

2025-06-16 Thread Dhruv Chawla
On 16/06/25 22:31, Jan Hubicka wrote: gcc/ChangeLog: * auto-profile.cc (AUTO_PROFILE_VERSION): Bump from 2 to 3. (string_table::get_real_name): Define new member function. (string_table::get_file_name): Likewise.

Re: [PATCH 0/1] [RFC][AutoFDO]: Source filename tracking in GCOV

2025-06-16 Thread Dhruv Chawla
me function name. Maybe there could be a warning for entries in the GCOV profile which were never accessed? Honza Bootstrapped and regtested on aarch64-linux-gnu. Dhruv Chawla (1): [RFC][AutoFDO] Source filename tracking in GCOV gcc/auto-profile.cc | 101

Re: [PATCH 0/1] [RFC][AutoFDO] Propagate inline information to outline definitions if not inlined

2025-06-13 Thread Dhruv Chawla
On 13/06/25 14:51, Jan Hubicka wrote: From: Dhruv Chawla Hi, For reasons explained in the patch, this patch prevents the loss of profile information when inlining occurs in the profiled binary but not in the auto-profile pass as a

Re: [PATCH] widening_mul: Make better use of overflowing operations in codegen of min/max(a, add/sub(a, b))

2025-06-05 Thread Dhruv Chawla
On 05/06/25 12:01, Richard Biener wrote: On Wed, Jun 4, 2025 at 7:44 PM Andrew Pinski wrote: On Wed, Jun 4, 2025 at 6:27 AM Richard Biener wrote: On Thu, May 29, 2025 at 10:04 AM wrote: From: Dhruv Chawla This patch folds the

Re: [PATCH] widening_mul: Make better use of overflowing operations in codegen of min/max(a, add/sub(a, b))

2025-06-04 Thread Dhruv Chawla
On 30/05/25 13:35, Andrew Pinski wrote: On Thu, May 29, 2025 at 1:05 AM wrote: From: Dhruv Chawla This patch folds the following patterns: - max (a, add (a, b)) -> [sum, ovf] = addo (a, b); !ovf ? sum : a - max (a, sub (a
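The first fold quoted above can be sketched in scalar C++ as follows. This is only an illustration of the identity, not the patch's implementation: it assumes unsigned 32-bit operands, the function names are hypothetical, and `__builtin_add_overflow` (a GCC/Clang builtin) stands in for the `addo` operation. For unsigned values, a wrapped sum is always smaller than `a`, so the overflow flag alone selects the maximum.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Reference form: max (a, add (a, b)) with wrapping unsigned addition.
// (Hypothetical helper name; operand width is an assumption.)
inline std::uint32_t max_add_reference(std::uint32_t a, std::uint32_t b) {
    return std::max(a, static_cast<std::uint32_t>(a + b));
}

// Folded form: [sum, ovf] = addo (a, b); !ovf ? sum : a
// If the addition wraps, sum < a, so a is the maximum; otherwise sum >= a.
inline std::uint32_t max_add_folded(std::uint32_t a, std::uint32_t b) {
    std::uint32_t sum;
    const bool ovf = __builtin_add_overflow(a, b, &sum);
    return !ovf ? sum : a;
}

// Spot-check that both forms agree, including on a wrapping input.
inline void check_max_add_fold() {
    assert(max_add_reference(1u, 2u) == max_add_folded(1u, 2u));
    assert(max_add_reference(0xFFFFFFFFu, 1u) == max_add_folded(0xFFFFFFFFu, 1u));
}
```

The folded form replaces the comparison with the overflow flag that the addition already produces.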

Re: [PATCH] widening_mul: Make better use of overflowing operations in codegen of min/max(a, add/sub(a, b))

2025-06-04 Thread Dhruv Chawla
On 04/06/25 23:14, Andrew Pinski wrote: On Wed, Jun 4, 2025 at 6:27 AM Richard Biener wrote: On Thu, May 29, 2025 at 10:04 AM wrote: From: Dhruv Chawla This patch folds the following patterns: - max (a, add (a, b)) -> [sum,

Re: [AUTOFDO] Merge profiles of clones before annotating

2025-05-26 Thread Dhruv Chawla
On 26/05/25 12:58, Jan Hubicka wrote: Hi, Ping? Sorry for the delay. I think I finally got auto-fdo running on my box and indeed I see that if function is cloned later, the profile is lost. There are .suffixes added before afdo pass (su

Re: [PATCH] aarch64: Use LDR for first-element loads for Advanced SIMD

2025-05-25 Thread Dhruv Chawla
On 08/05/25 18:43, Richard Sandiford wrote: Dhruv Chawla writes: This patch modifies Advanced SIMD assembly generation to emit an LDR instruction when a vector is created using a load to the first element with the other elements being

Re: [PATCH v4 1/2] aarch64: Match unpredicated shift patterns for ADR, SRA and ADDHNB instructions

2025-05-23 Thread Dhruv Chawla
On 22/05/25 15:56, Richard Sandiford wrote: writes: From: Dhruv Chawla This patch modifies the shift expander to immediately lower constant shifts without unspec. It also modifies the ADR, SRA and ADDHNB patterns to match the

Re: [PATCH v4 2/2] aarch64: Fold lsl+lsr+orr to rev for half-width shifts

2025-05-22 Thread Dhruv Chawla
er-approval approval and adding myself to the MAINTAINERS file :) Thanks for the sponsor! -- >8 -- [PATCH] aarch64: Fold lsl+lsr+orr to rev for half-width shifts This patch folds the following pattern: lsl , , lsr , , orr , , to: revb/h/w , when the shift
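The identity behind this fold can be modelled per element in scalar C++. The sketch below is an assumption-laden illustration, not code from the patch: the function names are hypothetical, and it models a single vector element instead of SVE registers. Shifting left and right by half the element width and OR-ing the results swaps the two halves of the element, which for a 16-bit element is a byte swap.

```cpp
#include <cassert>
#include <cstdint>

// lsl #8, lsr #8, orr on a 16-bit element: the two bytes trade places.
// (Hypothetical helper; models one element, not a whole vector.)
inline std::uint16_t rotate_half_16(std::uint16_t x) {
    return static_cast<std::uint16_t>((x << 8) | (x >> 8));
}

// lsl #16, lsr #16, orr on a 32-bit element: the two halfwords trade places.
inline std::uint32_t rotate_half_32(std::uint32_t x) {
    return (x << 16) | (x >> 16);
}

// Spot-check the 16-bit case against the GCC/Clang byte-swap builtin.
inline void check_rotate_half() {
    assert(rotate_half_16(0x1234) == __builtin_bswap16(0x1234));
}
```

Because the shift amount is exactly half the width, no bits are lost: every bit shifted out of one half lands in the other, which is why the three-instruction sequence can collapse into a single reverse.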

Re: [PATCH 2/2] aarch64: Fold lsl+lsr+orr to rev for half-width shifts

2025-05-21 Thread Dhruv Chawla
On 20/05/25 16:35, Richard Sandiford wrote: Dhruv Chawla writes: On 06/05/25 21:57, Richard Sandiford wrote: Dhruv Chawla writes: This patch modifies the intrinsic expanders

Re: [PATCH 2/2] aarch64: Fold lsl+lsr+orr to rev for half-width shifts

2025-05-14 Thread Dhruv Chawla
On 06/05/25 21:57, Richard Sandiford wrote: Dhruv Chawla writes: This patch modifies the intrinsic expanders to expand svlsl and svlsr to unpredicated forms when the predicate is a ptrue. It also folds the following pattern: lsl

Re: [PATCH 1/2] aarch64: Match unpredicated shift patterns for ADR, SRA, and ADDHNB instructions

2025-05-14 Thread Dhruv Chawla
have too much space. Attaching is fine if that's easier. Hi, I have tried using git send-email for the next round of patches. Please let me know if the formatting is still broken! Thanks. Dhruv Chawla writes: diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-s

Re: [PATCH] aarch64: Use LDR for first-element loads for Advanced SIMD

2025-05-06 Thread Dhruv Chawla
On 06/01/25 11:44, Andrew Pinski wrote: On Sun, Jan 5, 2025 at 10:06 PM Dhruv Chawla wrote: This patch modifies Advanced SIMD assembly generation to emit an LDR instruction when a vector is created using a load to the first element

[PATCH 2/2] aarch64: Fold lsl+lsr+orr to rev for half-width shifts

2025-05-02 Thread Dhruv Chawla
to: revb/h/w , when the shift amount is equal to half the bitwidth of the register. Bootstrapped and regtested on aarch64-linux-gnu. Signed-off-by: Dhruv Chawla gcc/ChangeLog: * config/aarch64/aarch64-sve-builtins-base.cc (svlsl_impl::expand): Define. (svlsr_impl):

[PATCH 1/2] aarch64: Match unpredicated shift patterns for ADR, SRA, and ADDHNB instructions

2025-05-02 Thread Dhruv Chawla
On 07/12/24 00:08, Richard Sandiford wrote: Sorry for the slow reply. Dhruv Chawla writes: This patch modifies the intrinsic expanders to expand svlsl and svlsr to unpredicated forms when the predicate is a ptrue. It also folds the

[PATCH] libstdc++: Add missing feature-test macro in <memory>

2025-05-02 Thread Dhruv Chawla
Per version.syn#2, <memory> is required to define __cpp_lib_addressof_constexpr as 201603L. Bootstrapped and tested on aarch64-linux-gnu. Signed-off-by: Dhruv Chawla libstdc++-v3/ChangeLog: * include/std/memory: Define __glibcxx_want_addressof_constexpr. * testsuite/20_util/headers

[PATCH] aarch64: Add support for -mcpu=olympus

2025-03-21 Thread Dhruv Chawla
. * config/aarch64/aarch64-tune.md: Regenerate. * doc/invoke.texi (AArch64 Options): Document the above. Signed-off-by: Dhruv Chawla --- gcc/config/aarch64/aarch64-cores.def | 3 +++ gcc/config/aarch64/aarch64-tune.md | 2 +- gcc/doc/invoke.texi | 2 +- 3 files changed

[PATCH] aarch64: Use LDR for first-element loads for Advanced SIMD

2025-01-05 Thread Dhruv Chawla
+ +LDR_NARROW (float16x4_t, float16_t, f16) +LDR_NARROW (float32x2_t, float32_t, f32) +LDR_NARROW (float64x1_t, float64_t, f64) + +LDR_NARROW (bfloat16x4_t, bfloat16_t, bf16) + +/* { dg-final { scan-assembler-times "\\tldr" 24 } } */ +/* { dg-final { scan-assembler-not "\\tmov" } } */ -

Ping: [RFC][PATCH] aarch64: Fold lsl+lsr+orr to rev for half-width shifts

2024-12-04 Thread Dhruv Chawla
Ping. On 27/11/24 10:23, Dhruv Chawla wrote: This patch modifies the intrinsic expanders to expand svlsl and svlsr to unpredicated forms when the predicate is a ptrue. It also folds the following pattern: lsl , , lsr , , orr

[RFC][PATCH] aarch64: Fold lsl+lsr+orr to rev for half-width shifts

2024-11-26 Thread Dhruv Chawla
do this in a better way. The patch was bootstrapped and regtested on aarch64-linux-gnu. -- Regards, Dhruv From 026c972dba99b59c24771cfca632f3cd4e1df323 Mon Sep 17 00:00:00 2001 From: Dhruv Chawla Date: Sat, 16 Nov 2024 19:40:03 +0530 Subject: [PATCH] aarch64: Fold lsl+lsr+orr to rev for half-widt

[PATCH] libstdc++: Add missing feature-test macro in various headers

2024-08-29 Thread Dhruv Chawla
On 28/08/24 15:40, Jonathan Wakely wrote: On Wed, 28 Aug 2024 at 06:47, Dhruv Chawla wrote: version.syn#2 requires to define __cpp_lib_allocator_traits_is_always_equal. The attached patch therefore defines the

[PATCH] libstdc++: Add missing feature-test macro in

2024-08-27 Thread Dhruv Chawla
-by: Dhruv Chawla -- Regards, Dhruv From 40c0b154f2ef11a18fd318008ae366560d4c8d79 Mon Sep 17 00:00:00 2001 From: Dhruv Chawla Date: Mon, 26 Aug 2024 11:09:19 +0530 Subject: [PATCH] libstdc++: Add missing feature-test macro in version.syn#2 requires to define