Re: [PATCH] aarch64: Use LDR for first-element loads for Advanced SIMD

2025-05-29 Thread Richard Sandiford
Dhruv Chawla writes:
> On 08/05/25 18:43, Richard Sandiford wrote:
>> Otherwise it looks good. But I think we should think about how we
>> plan to integrate the related optimisation for register inputs. E.g.:
>>
>> int32x4_t foo(int32_t x) {
>>   return vsetq_lane_s32(x, vdupq_n_s32(0), 0);
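The quoted example builds the vector from a scalar that arrives in a register rather than from memory. A minimal portable sketch of the same semantics follows. It uses GCC/Clang generic vector types instead of `arm_neon.h` so that it builds on any target. The names `s32x4` and `set_first_lane_s32` are hypothetical and do not come from the thread, and the snippet above is cut off before any expected code generation is stated, so none is claimed here.

```c
#include <stdint.h>

/* Generic 128-bit vector of four int32_t lanes (GCC/Clang extension).
   Assumption: used here as a portable stand-in for int32x4_t.  */
typedef int32_t s32x4 __attribute__ ((vector_size (16)));

/* Hypothetical helper mirroring vsetq_lane_s32 (x, vdupq_n_s32 (0), 0):
   lane 0 takes the scalar argument, lanes 1..3 stay zero.  */
s32x4
set_first_lane_s32 (int32_t x)
{
  s32x4 v = { 0, 0, 0, 0 };  /* corresponds to vdupq_n_s32 (0) */
  v[0] = x;                  /* corresponds to setting lane 0 */
  return v;
}
```

Calling `set_first_lane_s32 (7)` yields a vector whose lanes are 7, 0, 0, 0.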

Re: [PATCH] aarch64: Use LDR for first-element loads for Advanced SIMD

2025-05-25 Thread Dhruv Chawla
On 08/05/25 18:43, Richard Sandiford wrote:
> Dhruv Chawla writes:
>> This patch modifies Advanced SIMD assembly generation to emit an LDR instruction when a vector is created using a load to the first element with the other elements being z

Re: [PATCH] aarch64: Use LDR for first-element loads for Advanced SIMD

2025-05-08 Thread Richard Sandiford
Dhruv Chawla writes:
> This patch modifies Advanced SIMD assembly generation to emit an LDR
> instruction when a vector is created using a load to the first element with the
> other elements being zero.
>
> This is similar to what *aarch64_combinez already does.
>
> Example:
>
> uint8x16_t foo(
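As a rough illustration of the source pattern the patch description refers to (one element loaded from memory into lane 0, every other lane zero), here is a minimal sketch. It is not the example from the patch, which is truncated above. It uses GCC/Clang generic vector types rather than `arm_neon.h` so that it compiles on any target, and the names `u8x16` and `load_first_lane_u8` are hypothetical. On AArch64, a scalar LDR into a SIMD&FP register clears the remaining bits of the full vector register, which is presumably why a single LDR can cover this pattern.

```c
#include <stdint.h>

/* Generic 128-bit vector of sixteen uint8_t lanes (GCC/Clang extension).
   Assumption: used here as a portable stand-in for uint8x16_t.  */
typedef uint8_t u8x16 __attribute__ ((vector_size (16)));

/* Hypothetical helper: lane 0 is loaded from *p, lanes 1..15 are zero.
   This is the "load to the first element with the other elements being
   zero" shape described in the patch summary.  */
u8x16
load_first_lane_u8 (const uint8_t *p)
{
  u8x16 v = { 0 };  /* omitted initializers are zero, so all lanes are 0 */
  v[0] = *p;        /* single element load into lane 0 */
  return v;
}
```

Given `uint8_t x = 42;`, the call `load_first_lane_u8 (&x)` returns a vector with 42 in lane 0 and zero in the other fifteen lanes.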

Re: [PATCH] aarch64: Use LDR for first-element loads for Advanced SIMD

2025-05-06 Thread Dhruv Chawla
uint8_t, u8)
+LDR_NARROW (uint16x4_t, uint16_t, u16)
+LDR_NARROW (uint32x2_t, uint32_t, u32)
+LDR_NARROW (uint64x1_t, uint64_t, u64)
+
+LDR_NARROW (float16x4_t, float16_t, f16)
+LDR_NARROW (float32x2_t, float32_t, f32)
+LDR_NARROW (float64x1_t, float64_t, f64)
+
+LDR_NARROW (bfloat16x4_t, bfloat16_t, bf16

Re: [PATCH] aarch64: Use LDR for first-element loads for Advanced SIMD

2025-01-05 Thread Andrew Pinski
On Sun, Jan 5, 2025 at 10:06 PM Dhruv Chawla wrote:
>
> This patch modifies Advanced SIMD assembly generation to emit an LDR
> instruction when a vector is created using a load to the first element with the
> other elements being zero.
>
> This is similar to what *aarch64_combinez already does.

[PATCH] aarch64: Use LDR for first-element loads for Advanced SIMD

2025-01-05 Thread Dhruv Chawla
+
+LDR_NARROW (float16x4_t, float16_t, f16)
+LDR_NARROW (float32x2_t, float32_t, f32)
+LDR_NARROW (float64x1_t, float64_t, f64)
+
+LDR_NARROW (bfloat16x4_t, bfloat16_t, bf16)
+
+/* { dg-final { scan-assembler-times "\\tldr" 24 } } */
+/* { dg-final { scan-assembler-not "\\tmov" } } */
-