On Fri, Jul 4, 2025 at 4:22 PM Richard Biener
<richard.guent...@gmail.com> wrote:
>
> On Fri, Jul 4, 2025 at 10:11 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> >
> > On Fri, Jul 4, 2025 at 4:09 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> > >
> > > On Fri, Jul 4, 2025 at 4:02 PM Richard Biener
> > > <richard.guent...@gmail.com> wrote:
> > > >
> > > > On Fri, Jul 4, 2025 at 9:46 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> > > > >
> > > > > On Fri, Jul 4, 2025 at 3:42 PM Richard Biener
> > > > > <richard.guent...@gmail.com> wrote:
> > > > > >
> > > > > > On Fri, Jul 4, 2025 at 9:33 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> > > > > > >
> > > > > > > On Fri, Jul 4, 2025 at 2:37 PM Richard Sandiford
> > > > > > > <richard.sandif...@arm.com> wrote:
> > > > > > > >
> > > > > > > > "H.J. Lu" <hjl.to...@gmail.com> writes:
> > > > > > > > > On Thu, Jul 3, 2025 at 11:02 PM Richard Sandiford
> > > > > > > > > <richard.sandif...@arm.com> wrote:
> > > > > > > > >>
> > > > > > > > >> "H.J. Lu" <hjl.to...@gmail.com> writes:
> > > > > > > > >> > Since a backend may ignore user type alignment for 
> > > > > > > > >> > arguments passed on
> > > > > > > > >> > stack, update alignment for arguments passed on stack when 
> > > > > > > > >> > copying MEM's
> > > > > > > > >> > memory attributes.
> > > > > > > > >> >
> > > > > > > > >> > gcc/
> > > > > > > > >> >
> > > > > > > > >> > PR target/120839
> > > > > > > > >> > * emit-rtl.cc (set_mem_attrs): Update alignment for 
> > > > > > > > >> > argument on
> > > > > > > > >> > stack.
> > > > > > > > >> >
> > > > > > > > >> > gcc/testsuite/
> > > > > > > > >> >
> > > > > > > > >> > PR target/120839
> > > > > > > > >> > * gcc.target/i386/pr120839-1.c: New test.
> > > > > > > > >> > * gcc.target/i386/pr120839-2.c: Likewise.
> > > > > > > > >> >
> > > > > > > > >> >
> > > > > > > > >> > --
> > > > > > > > >> > H.J.
> > > > > > > > >> >
> > > > > > > > >> > From 3f8a9bfb4beae47bfc0da20b517a5b3b06a1cbcc Mon Sep 17 
> > > > > > > > >> > 00:00:00 2001
> > > > > > > > >> > From: "H.J. Lu" <hjl.to...@gmail.com>
> > > > > > > > >> > Date: Sat, 28 Jun 2025 06:27:25 +0800
> > > > > > > > >> > Subject: [PATCH] Update alignment for argument on stack
> > > > > > > > >> >
> > > > > > > > >> > Since a backend may ignore user type alignment for 
> > > > > > > > >> > arguments passed on
> > > > > > > > >> > stack, update alignment for arguments passed on stack when 
> > > > > > > > >> > copying MEM's
> > > > > > > > >> > memory attributes.
> > > > > > > > >> >
> > > > > > > > >> > gcc/
> > > > > > > > >> >
> > > > > > > > >> >       PR target/120839
> > > > > > > > >> >       * emit-rtl.cc (set_mem_attrs): Update alignment for 
> > > > > > > > >> > argument on
> > > > > > > > >> >       stack.
> > > > > > > > >> >
> > > > > > > > >> > gcc/testsuite/
> > > > > > > > >> >
> > > > > > > > >> >       PR target/120839
> > > > > > > > >> >       * gcc.target/i386/pr120839-1.c: New test.
> > > > > > > > >> >       * gcc.target/i386/pr120839-2.c: Likewise.
> > > > > > > > >> >
> > > > > > > > >> > Signed-off-by: H.J. Lu <hjl.to...@gmail.com>
> > > > > > > > >> > ---
> > > > > > > > >> >  gcc/emit-rtl.cc                            | 14 
> > > > > > > > >> > ++++++++++++++
> > > > > > > > >> >  gcc/testsuite/gcc.target/i386/pr120839-1.c | 14 
> > > > > > > > >> > ++++++++++++++
> > > > > > > > >> >  gcc/testsuite/gcc.target/i386/pr120839-2.c | 19 
> > > > > > > > >> > +++++++++++++++++++
> > > > > > > > >> >  3 files changed, 47 insertions(+)
> > > > > > > > >> >  create mode 100644 
> > > > > > > > >> > gcc/testsuite/gcc.target/i386/pr120839-1.c
> > > > > > > > >> >  create mode 100644 
> > > > > > > > >> > gcc/testsuite/gcc.target/i386/pr120839-2.c
> > > > > > > > >> >
> > > > > > > > >> > diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
> > > > > > > > >> > index f4fc92bb37a..0d1616361ca 100644
> > > > > > > > >> > --- a/gcc/emit-rtl.cc
> > > > > > > > >> > +++ b/gcc/emit-rtl.cc
> > > > > > > > >> > @@ -389,6 +389,20 @@ set_mem_attrs (rtx mem, mem_attrs 
> > > > > > > > >> > *attrs)
> > > > > > > > >> >      {
> > > > > > > > >> >        MEM_ATTRS (mem) = ggc_alloc<mem_attrs> ();
> > > > > > > > >> >        memcpy (MEM_ATTRS (mem), attrs, sizeof (mem_attrs));
> > > > > > > > >> > +      if (MEM_EXPR (mem))
> > > > > > > > >> > +     {
> > > > > > > > >> > +       tree base_address = get_base_address (MEM_EXPR 
> > > > > > > > >> > (mem));
> > > > > > > > >> > +       if (base_address && TREE_CODE (base_address) == 
> > > > > > > > >> > PARM_DECL)
> > > > > > > > >> > +         {
> > > > > > > > >> > +           /* User alignment on type may be ignored for 
> > > > > > > > >> > parameter
> > > > > > > > >> > +              passed on stack.  */
> > > > > > > > >> > +           tree type = TREE_TYPE (base_address);
> > > > > > > > >> > +           unsigned int alignment
> > > > > > > > >> > +             = targetm.calls.function_arg_boundary 
> > > > > > > > >> > (TYPE_MODE (type),
> > > > > > > > >> > +                                                    type);
> > > > > > > > >> > +           set_mem_align (mem, alignment);
> > > > > > > > >> > +         }
> > > > > > > > >> > +     }
> > > > > > > > >> >      }
> > > > > > > > >> >  }
> > > > > > > > >>
> > > > > > > > >> This doesn't feel like the right place to address this.  
> > > > > > > > >> set_mem_attrs
> > > > > > > > >> is just supposed to install the attributes that it has been 
> > > > > > > > >> given,
> > > > > > > > >> without second-guessing the contents.
> > > > > > > > >>
> > > > > > > > >> Where does the incorrect alignment ultimately come from?  
> > > > > > > > >> (As in,
> > > > > > > > >> which piece of code creates the MEM and fails to give it the 
> > > > > > > > >> correct
> > > > > > > > >> alignment?)
> > > > > > > > >
> > > > > > > > > x86 has
> > > > > > > > >
> > > > > > > > > static unsigned int
> > > > > > > > > ix86_function_arg_boundary (machine_mode mode, const_tree 
> > > > > > > > > type)
> > > > > > > > > {
> > > > > > > > >   unsigned int align;
> > > > > > > > >   if (type)
> > > > > > > > >     {
> > > > > > > > >       /* Since the main variant type is used for call, we 
> > > > > > > > > convert it to
> > > > > > > > >          the main variant type.  */
> > > > > > > > >       type = TYPE_MAIN_VARIANT (type);
> > > > > > > > >       align = TYPE_ALIGN (type);
> > > > > > > > >       if (TYPE_EMPTY_P (type))
> > > > > > > > >         return PARM_BOUNDARY;
> > > > > > > > >     }
> > > > > > > > >
> > > > > > > > > Because of
> > > > > > > > >
> > > > > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120900
> > > > > > > > >
> > > > > > > > > for C,
> > > > > > > > >
> > > > > > > > > typedef struct {
> > > > > > > > >   long double a, b
> > > > > > > > > } c __attribute__((aligned(32)));
> > > > > > > > >
> > > > > > > > > when such a variable is passed on stack, its alignment
> > > > > > > > > on stack is 16, not 32.  However, when set_mem_attrs
> > > > > > > >                                     ^^^^^^^^^^^^^^^^^^
> > > > > > > > > is called to set MEM_ALIGN for such a variable on stack,
> > > > > > > >   ^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > > > > > > it sets MEM_ALIGN to 32, instead of 16.
> > > > > > > >
> > > > > > > > But what I meant was: which piece of code calls set_mem_attrs
> > > > > > > > to set MEM_ALIGN in this case?  The bug seems further up the
> > > > > > > > call stack to me.
> > > > > > > >
> > > > > > >
> > > > > > > Breakpoint 1, set_mem_attrs (mem=0x7fffe9605510, 
> > > > > > > attrs=0x7fffffffccd0)
> > > > > > >     at 
> > > > > > > /export/gnu/import/git/gitlab/x86-gcc-test/gcc/emit-rtl.cc:381
> > > > > > > 381   if (mem_attrs_eq_p (attrs, mode_mem_attrs[(int) GET_MODE 
> > > > > > > (mem)]))
> > > > > > > (gdb) bt
> > > > > > > #0  set_mem_attrs (mem=0x7fffe9605510, attrs=0x7fffffffccd0)
> > > > > > >     at 
> > > > > > > /export/gnu/import/git/gitlab/x86-gcc-test/gcc/emit-rtl.cc:381
> > > > > > > #1  0x00000000008096d3 in set_mem_attributes_minus_bitpos 
> > > > > > > (ref=0x7fffe9605510,
> > > > > > >     t=0x7fffe9810bb0, objectp=1, bitpos=...)
> > > > > > >     at 
> > > > > > > /export/gnu/import/git/gitlab/x86-gcc-test/gcc/emit-rtl.cc:2191
> > > > > > > #2  0x0000000000809729 in set_mem_attributes (ref=0x7fffe9605510,
> > > > > > >     t=0x7fffe9810bb0, objectp=1)
> > > > > > >     at 
> > > > > > > /export/gnu/import/git/gitlab/x86-gcc-test/gcc/emit-rtl.cc:2197
> > > > > > > #3  0x000000000090d1a6 in assign_parm_find_stack_rtl 
> > > > > > > (parm=0x7fffe9810bb0,
> > > > > > >     data=0x7fffffffcee0)
> > > > > > >     at 
> > > > > > > /export/gnu/import/git/gitlab/x86-gcc-test/gcc/function.cc:2681
> > > > > > > #4  0x0000000000911278 in assign_parms (fndecl=0x7fffe99d2900)
> > > > > > >     at 
> > > > > > > /export/gnu/import/git/gitlab/x86-gcc-test/gcc/function.cc:3691
> > > > > > > #5  0x0000000000915be8 in expand_function_start 
> > > > > > > (subr=0x7fffe99d2900)
> > > > > > >     at 
> > > > > > > /export/gnu/import/git/gitlab/x86-gcc-test/gcc/function.cc:5201
> > > > > > > #6  0x00000000006c5b6b in (anonymous 
> > > > > > > namespace)::pass_expand::execute (
> > > > > > >     this=0x486c0e0, fun=0x7ffff7fbc190)
> > > > > > >     at 
> > > > > > > /export/gnu/import/git/gitlab/x86-gcc-test/gcc/cfgexpand.cc:7168
> > > > > > > #7  0x0000000000d02703 in execute_one_pass (pass=0x486c0e0)
> > > > > > >     at 
> > > > > > > /export/gnu/import/git/gitlab/x86-gcc-test/gcc/passes.cc:2648
> > > > > > > #8  0x0000000000d02afb in execute_pass_list_1 (pass=0x486c0e0)
> > > > > > >     at 
> > > > > > > /export/gnu/import/git/gitlab/x86-gcc-test/gcc/passes.cc:2757
> > > > > > > ...
> > > > > > > (gdb) call debug (mem)
> > > > > > > (mem/c:BLK (reg/f:DI 92 virtual-incoming-args) [0  A8])
> > > > > > > (gdb) f 1
> > > > > > > #1  0x00000000008096d3 in set_mem_attributes_minus_bitpos 
> > > > > > > (ref=0x7fffe9605510,
> > > > > > >     t=0x7fffe9810bb0, objectp=1, bitpos=...)
> > > > > > >     at 
> > > > > > > /export/gnu/import/git/gitlab/x86-gcc-test/gcc/emit-rtl.cc:2191
> > > > > > > 2191   set_mem_attrs (ref, &attrs);
> > > > > > > (gdb) p attrs
> > > > > > > $2 = {expr = 0x7fffe9810bb0, offset = {coeffs = {0}}, size = 
> > > > > > > {coeffs = {32}},
> > > > > > >   alias = 2, align = 256, addrspace = 0 '\000', offset_known_p = 
> > > > > > > true,
> > > > > > >   size_known_p = true}
> > > > > > > (gdb)
> > > > > > > Breakpoint 2, set_mem_attributes_minus_bitpos (ref=0x7fffe9605510,
> > > > > > >     t=0x7fffe9810bb0, objectp=1, bitpos=...)
> > > > > > >     at 
> > > > > > > /export/gnu/import/git/gitlab/x86-gcc-test/gcc/emit-rtl.cc:2163
> > > > > > > 2163       get_object_alignment_1 (t, &obj_align, &obj_bitpos);
> > > > > > > (gdb) next
> > > > > > > 2164       unsigned int diff_align = known_alignment (obj_bitpos 
> > > > > > > - bitpos);
> > > > > > > (gdb)
> > > > > > > 2165       if (diff_align != 0)
> > > > > > > (gdb) p obj_align
> > > > > > > $3 = 256
> > > > > > > (gdb)
> > > > > > > (gdb) next
> > > > > > > 2167       attrs.align = MAX (attrs.align, obj_align);
> > > > > > > (gdb) p attrs.align
> > > > > > > $4 = 256
> > > > > > > (gdb)
> > > > > > >
> > > > > > > 256 is wrong only because
> > > > > > >
> > > > > > > typedef struct {
> > > > > > >   long double a;
> > > > > > >   long double b;
> > > > > > > } c __attribute__((aligned(32)));
> > > > > > >
> > > > > > > is passed on stack.
> > > > > > >
> > > > > > > The problems are in get_object_alignment_2 and 
> > > > > > > set_mem_attributes_minus_bitpos.
> > > > > >
> > > > > > I think the problem is that the IL doesn't reflect reality.  When 
> > > > > > the PARM_DECL
> > > > > > (or its type) has the specified alignment that has to reflect 
> > > > > > reality.  Consider
> > > > > >
> > > > > > void foo (c x)
> > > > > > {
> > > > > >    __typeof (x) *tem = &x;
> > > > > >    x->a;
> > > > > > }
> > > > > >
> > > > > > when the target doesn't honor the alignment of 'x' then the accesses
> > > > > > based off 'tem'
> > > > > > will be wrong.
> > > > > >
> > > > > > The middle-end has, in similar cases, the "fix" for such bad 
> > > > > > backends
> > > > > > to generate a
> > > > > > callee copy for the incoming argument (with insufficient alignment) 
> > > > > > to
> > > > > > a local stack
> > > > > > variable with (sufficient alignment).  Why is that not triggering 
> > > > > > here?
> > > > >
> > > > > Where should I look for such codes in the middle-end?
> > > >
> > > > It must be in the maze of assign_parms, assign_parm_adjust_stack_rtl
> > > > looks like what I remember:
> > > >
> > > >   /* If we can't trust the parm stack slot to be aligned enough for its
> > > >      ultimate type, don't use that slot after entry.  We'll make another
> > > >      stack slot, if we need one.  */
> > > >   if (stack_parm
> > > >
> > >
> > >   /* If we can't trust the parm stack slot to be aligned enough for its
> > >      ultimate type, don't use that slot after entry.  We'll make another
> > >      stack slot, if we need one.  */
> > >   if (stack_parm
> > >       && ((GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN 
> > > (stack_parm)
> > >            && ((optab_handler (movmisalign_optab, data->nominal_mode)
> > >                 != CODE_FOR_nothing)
> > >                || targetm.slow_unaligned_access (data->nominal_mode,
> > >                                                  MEM_ALIGN (stack_parm))))
> > >           || (data->nominal_type
> > >               && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
> > >               && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
> > >     stack_parm = NULL;
> > >
> > > (gdb) call debug (stack_parm)
> > > (mem/c:BLK (reg/f:DI 92 virtual-incoming-args) [2 f+0 S32 A128])
> > > (gdb) call debug (data->nominal_type)
> > >  <record_type 0x7fffe99cc348 c sizes-gimplified BLK
> > >     size <integer_cst 0x7fffe98242d0 type <integer_type 0x7fffe98220a8
> > > bitsizetype> constant 256>
> > >     unit-size <integer_cst 0x7fffe98243a8 type <integer_type
> > > 0x7fffe9822000 sizetype> constant 32>
> > >     user align:256 warn_if_not_align:0 symtab:0 alias-set -1
> > > canonical-type 0x7fffe99cc2a0
> > >     fields <field_decl 0x7fffe98318c0 a
> > >         type <real_type 0x7fffe982a3f0 long double sizes-gimplified XF
> > >             size <integer_cst 0x7fffe9802fa8 constant 128>
> > >             unit-size <integer_cst 0x7fffe9802fc0 constant 16>
> > >             align:128 warn_if_not_align:0 symtab:0 alias-set 1
> > > canonical-type 0x7fffe982a3f0 precision:80
> > >             pointer_to_this <pointer_type 0x7fffe982aa80>>
> > >         XF z.c:2:15 size <integer_cst 0x7fffe9802fa8 128> unit-size
> > > <integer_cst 0x7fffe9802fc0 16>
> > >         align:128 warn_if_not_align:0 offset_align 128 
> > > decl_not_flexarray: 1
> > >         offset <integer_cst 0x7fffe9802f90 constant 0>
> > >         bit-offset <integer_cst 0x7fffe9802fd8 constant 0> context
> > > <record_type 0x7fffe99cc2a0>
> > >         chain <field_decl 0x7fffe9831960 b type <real_type
> > > 0x7fffe982a3f0 long double>
> > >             XF z.c:3:15 size <integer_cst 0x7fffe9802fa8 128>
> > > unit-size <integer_cst 0x7fffe9802fc0 16>
> > >             align:128 warn_if_not_align:0 offset_align 128
> > > decl_not_flexarray: 1 offset <integer_cst 0x7fffe9802fc0 16>
> > > bit-offset <integer_cst 0x7fffe9802fd8 0> context <record_type
> > > 0x7fffe99cc2a0>>>
> > >     pointer_to_this <pointer_type 0x7fffe99cc150>>
> > > (gdb)
> > >
> > > It doesn't check DECL_USER_ALIGN.
> > >
> > > --
> > > H.J.
> >
> > Nowhere in function.cc DECL_USER_ALIGN is checked.
>
> I would have expected for
>
> typedef struct {
>   long double a;
>   long double b;
> } c __attribute__((aligned(32)));
>
> void foo (c x) {}
>
> that the type of 'x' has TYPE_ALIGN of 32 bytes.  You say
> only the PARM_DECL has DECL_ALIGN?  I might have expected

Correct for C, incorrect for C++:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120900

> this for
>
> void foo (int x __attribute__((aligned(32))) {}
>
> that is, when the alignment is specified on the formal argument
> (and in this case I'd simply ignore it for efficiency reason).
>
> That said, given we don't seem to consistently create a type variant
> with alignment matching that of the decl (I'd consider this most clean,
> but then it would make the alignment specifier on the decl redundant),
> the code above should probably check the decl for alignment, not
> the type.  This is, after all, what is taken as fact when visible.
>
> Test coverge for a case w/o a typedef and the alignment on the struct
> type plus the case where the alignment is on the formal argument
> would be useful of course.

For C, it depends on where the alignment attribute is placed:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120900#c1

> Richard.
>
>
> > --
> > H.J.



-- 
H.J.

Reply via email to