[Bug c/103738] New: No warning when setting deprecated fields using designated initializers

2021-12-15 Thread gcc at haasn dot dev via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103738

Bug ID: 103738
   Summary: No warning when setting deprecated fields using
designated initializers
   Product: gcc
   Version: 11.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gcc at haasn dot dev
  Target Milestone: ---

Created attachment 52009
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52009&action=edit
No deprecation warning produced

The `deprecated` attribute on a struct member is ignored when that field is set
through a designated initializer. The attached example should produce a
-Wdeprecated-declarations warning, but it does not.
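
The attachment itself is not reproduced in this thread. As a minimal sketch of the kind of reproducer described, assuming a `struct foo` whose only member `bar` is marked deprecated (the helper `first_member` and the function name `init_with_designator` are hypothetical and exist only to make the sketch testable):

```c
/* Hypothetical reproducer; the real attachment is not shown in this thread. */
struct foo {
    int bar __attribute__((deprecated));
};

/* Reads the first member through a pointer so this helper never names `bar`. */
static int first_member(const struct foo *f)
{
    return *(const int *)f;
}

int init_with_designator(void)
{
    /* Per the report, GCC 11.2.1 emits no -Wdeprecated-declarations
       warning for this designated initializer. */
    struct foo foo = { .bar = 5 };
    return first_member(&foo);
}
```

Other compilers or GCC versions may diagnose the initializer; the sketch only illustrates the construct the report is about.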

Contrast the following alternative code:

struct foo foo;
foo.bar = 5;

This does produce a deprecation warning as expected:

no_warning.c: In function ‘main’:
no_warning.c:8:5: warning: ‘bar’ is deprecated [-Wdeprecated-declarations]
    8 |     foo.bar = 5;
      |     ^~~
no_warning.c:2:9: note: declared here
    2 |     int bar __attribute((deprecated));
      |         ^~~

[Bug c/103738] No warning when setting deprecated fields using designated initializers

2021-12-15 Thread gcc at haasn dot dev via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103738

Niklas Haas changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #52009|no_warning.c                |warning.c
           filename|                            |
  Attachment #52009|No deprecation warning      |Deprecation warning
        description|produced                    |produced

--- Comment #1 from Niklas Haas ---
Comment on attachment 52009
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52009
Deprecation warning produced

Accidentally attached the wrong version of the file, fixing.

[Bug c/103738] No warning when setting deprecated fields using designated initializers

2021-12-15 Thread gcc at haasn dot dev via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103738

--- Comment #2 from Niklas Haas ---
Created attachment 52010
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52010&action=edit
No deprecation warning produced

[Bug tree-optimization/119294] New: Strange (buggy?) codegen when passing cleared vector as argument

2025-03-14 Thread gcc at haasn dot dev via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119294

Bug ID: 119294
   Summary: Strange (buggy?) codegen when passing cleared vector
as argument
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gcc at haasn dot dev
  Target Milestone: ---

### Description

For some reason, in this program, GCC needlessly writes the vector argument,
normally passed in xmm0, to the stack, sometimes without even adjusting
the stack pointer.

It's furthermore strange that `set_indirect()` compiles to different code than
`set()`, even though the former should simply inline the latter.

You can see that the problem goes away when the value to write is a
compile-time constant, rather than taken from a register argument.

### Code

typedef char vec_t __attribute__((vector_size(16)));

void func(vec_t x);

void set(vec_t x, const char val)
{
    for (int i = 0; i < 16; i++)
        x[i] = val;

    func(x);
}

void set_indirect(vec_t x, const char val)
{
    set(x, val);
}

void setFF(vec_t x)
{
    set(x, 0xFF);
}

void set123(vec_t x)
{
    set(x, 123);
}

void set0(vec_t x)
{
    set(x, 0);
}

### Expected Output

set:
        vmovd   xmm0, edi
        vpbroadcastb    xmm0, xmm0
        jmp     func@PLT

set_indirect:
        vmovd   xmm0, edi
        vpbroadcastb    xmm0, xmm0
        jmp     func@PLT

setFF:
        vpcmpeqd        xmm0, xmm0, xmm0
        jmp     func@PLT

.LCPI3_1:
        .zero   4,123
set123:
        vbroadcastss    xmm0, dword ptr [rip + .LCPI3_1]
        jmp     func@PLT

set0:
        vxorps  xmm0, xmm0, xmm0
        jmp     func@PLT

### Actual Output

set:
        vmovd   xmm0, edi
        sub     rsp, 24
        vpbroadcastb    xmm0, xmm0
        vmovdqa XMMWORD PTR [rsp], xmm0
        call    func
        add     rsp, 24
        ret
set_indirect:
        vmovd   xmm0, edi
        vpbroadcastb    xmm0, xmm0
        vmovdqa XMMWORD PTR [rsp-24], xmm0
        jmp     func
setFF:
        vpcmpeqd        xmm0, xmm0, xmm0
        jmp     func
set123:
        mov     eax, 2071690107
        vmovd   xmm0, eax
        vpbroadcastd    xmm0, xmm0
        jmp     func
set0:
        vpxor   xmm0, xmm0, xmm0
        jmp     func
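
A note on the `set123` output: the immediate 2071690107 is 0x7B7B7B7B, i.e. the byte 123 repeated four times, which `vpbroadcastd` then replicates across the register. A small sketch of that arithmetic (the helper name `splat_byte_u32` is made up for illustration):

```c
#include <stdint.h>

/* Replicate one byte into all four bytes of a 32-bit word.
   Multiplying by 0x01010101 copies the byte into each byte lane. */
uint32_t splat_byte_u32(uint8_t b)
{
    return (uint32_t)b * UINT32_C(0x01010101);
}
```

With `b = 123` this yields 2071690107, matching the constant GCC loads into eax.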

### See Also:

https://godbolt.org/z/1hrjKqf8Y

[Bug tree-optimization/119103] New: Very suboptimal AVX2 code generation of simple shift loop

2025-03-03 Thread gcc at haasn dot dev via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119103

Bug ID: 119103
   Summary: Very suboptimal AVX2 code generation of simple shift
loop
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gcc at haasn dot dev
  Target Milestone: ---

== Summary ==

On x86_64 with -mavx2, GCC has a very hard time optimizing a shift by a small,
unknown unsigned amount, even if I add knowledge that the shift amount is
sufficiently small.

In particular, GCC always chooses vpslld instead of vpsllw, and there seems to
be no way to convince it otherwise short of hand-written asm or intrinsics.

See demonstration here: https://godbolt.org/z/4YobqhsG4

== Code ==

#include <stdint.h>

void lshift(uint16_t *x, uint8_t amount)
{
    if (amount > 15)
        __builtin_unreachable();

    for (int i = 0; i < 16; i++)
        x[i] <<= amount;
}

== Output of `gcc -O3 -mavx2 -ftree-vectorize` ==

lshift:
        vmovdqu ymm1, YMMWORD PTR [rdi]
        movzx   eax, sil
        vmovq   xmm2, rax
        vpmovzxwd       ymm0, xmm1
        vextracti128    xmm1, ymm1, 0x1
        vpmovzxwd       ymm1, xmm1
        vpslld  ymm0, ymm0, xmm2
        vpslld  ymm1, ymm1, xmm2
        vpxor   xmm2, xmm2, xmm2
        vpblendw        ymm0, ymm2, ymm0, 85
        vpblendw        ymm2, ymm2, ymm1, 85
        vpackusdw       ymm0, ymm0, ymm2
        vpermq  ymm0, ymm0, 216
        vmovdqu YMMWORD PTR [rdi], ymm0
        vzeroupper
        ret

== Expected result ==

lshift:
        vmovdqu ymm1, YMMWORD PTR [rdi]
        movzx   esi, sil
        vmovd   xmm0, esi
        vpsllw  ymm0, ymm1, xmm0
        vmovdqu YMMWORD PTR [rdi], ymm0
        vzeroupper
        ret

Compiled from:

#include <stdint.h>
#include <immintrin.h>

void lshift(uint16_t *x, uint8_t amount)
{
    __m256i data = _mm256_loadu_si256((__m256i *) x);
    __m128i shift_amount = _mm_cvtsi32_si128(amount);
    __m256i shifted = _mm256_sll_epi16(data, shift_amount);
    _mm256_storeu_si256((__m256i *) x, shifted);
}

[Bug tree-optimization/119103] shift not demotated when shift amount range is known

2025-03-04 Thread gcc at haasn dot dev via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119103

--- Comment #12 from Niklas Haas ---
Out of curiosity, is there a work-around that I could use to get current
versions of GCC to compile the right thing, but without breaking cross-platform
compatibility?

I did try replacing the assertion by "x[i] << (amount & 0xF)" but the vector
version of the code at least still compiles down to vpslld.
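
The exact code tried is not shown in the comment. A sketch of what the masked variant presumably looks like, assuming the same loop as the original `lshift` with the `__builtin_unreachable()` check replaced by a mask (the name `lshift_masked` is hypothetical):

```c
#include <stdint.h>

/* Variant of lshift() that masks the shift amount to 0..15 instead of
   asserting the range. Unlike the original, amounts above 15 wrap
   rather than being undefined. */
void lshift_masked(uint16_t *x, uint8_t amount)
{
    for (int i = 0; i < 16; i++)
        x[i] <<= (amount & 0xF);
}
```

According to the comment, GCC still vectorizes this form with vpslld rather than vpsllw.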

[Bug tree-optimization/119103] shift not demotated when shift amount range is known

2025-03-05 Thread gcc at haasn dot dev via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119103

--- Comment #16 from Niklas Haas ---
(In reply to Alexander Monakov from comment #15)
> (In reply to Niklas Haas from comment #12)
> > Out of curiosity, is there a work-around that I could use to get current
> > versions of GCC to compile the right thing, but without breaking
> > cross-platform compatibility?
> > 
> > I did try replacing the assertion by "x[i] << (amount & 0xF)" but the vector
> > version of the code at least still compiles down to vpslld.
> 
> This quote seems apropos: "autovectorization is not a programming model".

This is true, but if I need to write portable C code I have no choice but to
rely on auto-vectorization, unless I want to pepper my code with `#ifdef
__GNUC__` and provide multiple implementations for everything.

> 
> Not sure if you know that already, but with generic vectors you can write:
> 
> typedef uint16_t u16v16 __attribute__((vector_size(32)));
> typedef u16v16 u16v16_u __attribute__((aligned(2)));
> 
> void lshift_register(u16v16_u *x, uint8_t amount)
> {
>     *x <<= amount;
> }

Thanks, I'll use that as a GCC-specific work-around for now in this particular
case.
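
For illustration, a sketch of how the generic-vector approach might be applied to a plain `uint16_t` buffer. This is a variant of the quoted suggestion, not the code from the thread: it uses `memcpy` for the load and store instead of the under-aligned `u16v16_u` pointer type, and the name `lshift_buffer` is hypothetical. It relies on the GCC/Clang `vector_size` extension and assumes `amount` is below 16.

```c
#include <stdint.h>
#include <string.h>

/* 16 lanes of uint16_t, using the GCC/Clang vector extension. */
typedef uint16_t u16v16 __attribute__((vector_size(32)));

/* Shift 16 consecutive uint16_t values left by `amount` (assumed < 16).
   memcpy is used so the buffer needs no particular alignment. */
void lshift_buffer(uint16_t *x, uint8_t amount)
{
    u16v16 v;
    memcpy(&v, x, sizeof v);
    v <<= amount;            /* the scalar is applied to every lane */
    memcpy(x, &v, sizeof v);
}
```

Which instructions the compiler emits for this form depends on the compiler version and target flags.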

[Bug tree-optimization/119103] Very suboptimal AVX2 code generation of simple shift loop

2025-03-03 Thread gcc at haasn dot dev via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119103

--- Comment #1 from Niklas Haas ---
Clang's output for comparison:

lshift:
        vmovdqu ymm0, ymmword ptr [rdi]
        vmovd   xmm1, esi
        vpsllw  ymm0, ymm0, xmm1
        vmovdqu ymmword ptr [rdi], ymm0
        vzeroupper
        ret