On Fri, Jul 4, 2025 at 10:11 AM H.J. Lu <hjl.to...@gmail.com> wrote: > > On Fri, Jul 4, 2025 at 4:09 PM H.J. Lu <hjl.to...@gmail.com> wrote: > > > > On Fri, Jul 4, 2025 at 4:02 PM Richard Biener > > <richard.guent...@gmail.com> wrote: > > > > > > On Fri, Jul 4, 2025 at 9:46 AM H.J. Lu <hjl.to...@gmail.com> wrote: > > > > > > > > On Fri, Jul 4, 2025 at 3:42 PM Richard Biener > > > > <richard.guent...@gmail.com> wrote: > > > > > > > > > > On Fri, Jul 4, 2025 at 9:33 AM H.J. Lu <hjl.to...@gmail.com> wrote: > > > > > > > > > > > > On Fri, Jul 4, 2025 at 2:37 PM Richard Sandiford > > > > > > <richard.sandif...@arm.com> wrote: > > > > > > > > > > > > > > "H.J. Lu" <hjl.to...@gmail.com> writes: > > > > > > > > On Thu, Jul 3, 2025 at 11:02 PM Richard Sandiford > > > > > > > > <richard.sandif...@arm.com> wrote: > > > > > > > >> > > > > > > > >> "H.J. Lu" <hjl.to...@gmail.com> writes: > > > > > > > >> > Since a backend may ignore user type alignment for arguments > > > > > > > >> > passed on > > > > > > > >> > stack, update alignment for arguments passed on stack when > > > > > > > >> > copying MEM's > > > > > > > >> > memory attributes. > > > > > > > >> > > > > > > > > >> > gcc/ > > > > > > > >> > > > > > > > > >> > PR target/120839 > > > > > > > >> > * emit-rtl.cc (set_mem_attrs): Update alignment for argument > > > > > > > >> > on > > > > > > > >> > stack. > > > > > > > >> > > > > > > > > >> > gcc/testsuite/ > > > > > > > >> > > > > > > > > >> > PR target/120839 > > > > > > > >> > * gcc.target/i386/pr120839-1.c: New test. > > > > > > > >> > * gcc.target/i386/pr120839-2.c: Likewise. > > > > > > > >> > > > > > > > > >> > > > > > > > > >> > -- > > > > > > > >> > H.J. > > > > > > > >> > > > > > > > > >> > From 3f8a9bfb4beae47bfc0da20b517a5b3b06a1cbcc Mon Sep 17 > > > > > > > >> > 00:00:00 2001 > > > > > > > >> > From: "H.J. Lu" <hjl.to...@gmail.com> > > > > > > > >> > Date: Sat, 28 Jun 2025 06:27:25 +0800 > > > > > > > >> > Subject: [PATCH] Update alignment for argument on stack > > > > > > > >> > > > > > > > > >> > Since a backend may ignore user type alignment for arguments > > > > > > > >> > passed on > > > > > > > >> > stack, update alignment for arguments passed on stack when > > > > > > > >> > copying MEM's > > > > > > > >> > memory attributes. > > > > > > > >> > > > > > > > > >> > gcc/ > > > > > > > >> > > > > > > > > >> > PR target/120839 > > > > > > > >> > * emit-rtl.cc (set_mem_attrs): Update alignment for > > > > > > > >> > argument on > > > > > > > >> > stack. > > > > > > > >> > > > > > > > > >> > gcc/testsuite/ > > > > > > > >> > > > > > > > > >> > PR target/120839 > > > > > > > >> > * gcc.target/i386/pr120839-1.c: New test. > > > > > > > >> > * gcc.target/i386/pr120839-2.c: Likewise. > > > > > > > >> > > > > > > > > >> > Signed-off-by: H.J. Lu <hjl.to...@gmail.com> > > > > > > > >> > --- > > > > > > > >> > gcc/emit-rtl.cc | 14 > > > > > > > >> > ++++++++++++++ > > > > > > > >> > gcc/testsuite/gcc.target/i386/pr120839-1.c | 14 > > > > > > > >> > ++++++++++++++ > > > > > > > >> > gcc/testsuite/gcc.target/i386/pr120839-2.c | 19 > > > > > > > >> > +++++++++++++++++++ > > > > > > > >> > 3 files changed, 47 insertions(+) > > > > > > > >> > create mode 100644 > > > > > > > >> > gcc/testsuite/gcc.target/i386/pr120839-1.c > > > > > > > >> > create mode 100644 > > > > > > > >> > gcc/testsuite/gcc.target/i386/pr120839-2.c > > > > > > > >> > > > > > > > > >> > diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc > > > > > > > >> > index f4fc92bb37a..0d1616361ca 100644 > > > > > > > >> > --- a/gcc/emit-rtl.cc > > > > > > > >> > +++ b/gcc/emit-rtl.cc > > > > > > > >> > @@ -389,6 +389,20 @@ set_mem_attrs (rtx mem, mem_attrs > > > > > > > >> > *attrs) > > > > > > > >> > { > > > > > > > >> > MEM_ATTRS (mem) = ggc_alloc<mem_attrs> (); > > > > > > > >> > memcpy (MEM_ATTRS (mem), attrs, sizeof (mem_attrs)); > > > > > > > >> > + if (MEM_EXPR (mem)) > > > > > > > >> > + { > > > > > > > >> > + tree base_address = get_base_address (MEM_EXPR > > > > > > > >> > (mem)); > > > > > > > >> > + if (base_address && TREE_CODE (base_address) == > > > > > > > >> > PARM_DECL) > > > > > > > >> > + { > > > > > > > >> > + /* User alignment on type may be ignored for > > > > > > > >> > parameter > > > > > > > >> > + passed on stack. */ > > > > > > > >> > + tree type = TREE_TYPE (base_address); > > > > > > > >> > + unsigned int alignment > > > > > > > >> > + = targetm.calls.function_arg_boundary > > > > > > > >> > (TYPE_MODE (type), > > > > > > > >> > + type); > > > > > > > >> > + set_mem_align (mem, alignment); > > > > > > > >> > + } > > > > > > > >> > + } > > > > > > > >> > } > > > > > > > >> > } > > > > > > > >> > > > > > > > >> This doesn't feel like the right place to address this. > > > > > > > >> set_mem_attrs > > > > > > > >> is just supposed to install the attributes that it has been > > > > > > > >> given, > > > > > > > >> without second-guessing the contents. > > > > > > > >> > > > > > > > >> Where does the incorrect alignment ultimately come from? (As > > > > > > > >> in, > > > > > > > >> which piece of code creates the MEM and fails to give it the > > > > > > > >> correct > > > > > > > >> alignment?) > > > > > > > > > > > > > > > > x86 has > > > > > > > > > > > > > > > > static unsigned int > > > > > > > > ix86_function_arg_boundary (machine_mode mode, const_tree type) > > > > > > > > { > > > > > > > > unsigned int align; > > > > > > > > if (type) > > > > > > > > { > > > > > > > > /* Since the main variant type is used for call, we > > > > > > > > convert it to > > > > > > > > the main variant type. */ > > > > > > > > type = TYPE_MAIN_VARIANT (type); > > > > > > > > align = TYPE_ALIGN (type); > > > > > > > > if (TYPE_EMPTY_P (type)) > > > > > > > > return PARM_BOUNDARY; > > > > > > > > } > > > > > > > > > > > > > > > > Because of > > > > > > > > > > > > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120900 > > > > > > > > > > > > > > > > for C, > > > > > > > > > > > > > > > > typedef struct { > > > > > > > > long double a, b > > > > > > > > } c __attribute__((aligned(32))); > > > > > > > > > > > > > > > > when such a variable is passed on stack, its alignment > > > > > > > > on stack is 16, not 32. However, when set_mem_attrs > > > > > > > ^^^^^^^^^^^^^^^^^^ > > > > > > > > is called to set MEM_ALIGN for such a variable on stack, > > > > > > > ^^^^^^^^^^^^^^^^^^^^^^^^^^ > > > > > > > > it sets MEM_ALIGN to 32, instead of 16. > > > > > > > > > > > > > > But what I meant was: which piece of code calls set_mem_attrs > > > > > > > to set MEM_ALIGN in this case? The bug seems further up the > > > > > > > call stack to me. > > > > > > > > > > > > > > > > > > > Breakpoint 1, set_mem_attrs (mem=0x7fffe9605510, > > > > > > attrs=0x7fffffffccd0) > > > > > > at > > > > > > /export/gnu/import/git/gitlab/x86-gcc-test/gcc/emit-rtl.cc:381 > > > > > > 381 if (mem_attrs_eq_p (attrs, mode_mem_attrs[(int) GET_MODE > > > > > > (mem)])) > > > > > > (gdb) bt > > > > > > #0 set_mem_attrs (mem=0x7fffe9605510, attrs=0x7fffffffccd0) > > > > > > at > > > > > > /export/gnu/import/git/gitlab/x86-gcc-test/gcc/emit-rtl.cc:381 > > > > > > #1 0x00000000008096d3 in set_mem_attributes_minus_bitpos > > > > > > (ref=0x7fffe9605510, > > > > > > t=0x7fffe9810bb0, objectp=1, bitpos=...) > > > > > > at > > > > > > /export/gnu/import/git/gitlab/x86-gcc-test/gcc/emit-rtl.cc:2191 > > > > > > #2 0x0000000000809729 in set_mem_attributes (ref=0x7fffe9605510, > > > > > > t=0x7fffe9810bb0, objectp=1) > > > > > > at > > > > > > /export/gnu/import/git/gitlab/x86-gcc-test/gcc/emit-rtl.cc:2197 > > > > > > #3 0x000000000090d1a6 in assign_parm_find_stack_rtl > > > > > > (parm=0x7fffe9810bb0, > > > > > > data=0x7fffffffcee0) > > > > > > at > > > > > > /export/gnu/import/git/gitlab/x86-gcc-test/gcc/function.cc:2681 > > > > > > #4 0x0000000000911278 in assign_parms (fndecl=0x7fffe99d2900) > > > > > > at > > > > > > /export/gnu/import/git/gitlab/x86-gcc-test/gcc/function.cc:3691 > > > > > > #5 0x0000000000915be8 in expand_function_start > > > > > > (subr=0x7fffe99d2900) > > > > > > at > > > > > > /export/gnu/import/git/gitlab/x86-gcc-test/gcc/function.cc:5201 > > > > > > #6 0x00000000006c5b6b in (anonymous > > > > > > namespace)::pass_expand::execute ( > > > > > > this=0x486c0e0, fun=0x7ffff7fbc190) > > > > > > at > > > > > > /export/gnu/import/git/gitlab/x86-gcc-test/gcc/cfgexpand.cc:7168 > > > > > > #7 0x0000000000d02703 in execute_one_pass (pass=0x486c0e0) > > > > > > at /export/gnu/import/git/gitlab/x86-gcc-test/gcc/passes.cc:2648 > > > > > > #8 0x0000000000d02afb in execute_pass_list_1 (pass=0x486c0e0) > > > > > > at /export/gnu/import/git/gitlab/x86-gcc-test/gcc/passes.cc:2757 > > > > > > ... > > > > > > (gdb) call debug (mem) > > > > > > (mem/c:BLK (reg/f:DI 92 virtual-incoming-args) [0 A8]) > > > > > > (gdb) f 1 > > > > > > #1 0x00000000008096d3 in set_mem_attributes_minus_bitpos > > > > > > (ref=0x7fffe9605510, > > > > > > t=0x7fffe9810bb0, objectp=1, bitpos=...) > > > > > > at > > > > > > /export/gnu/import/git/gitlab/x86-gcc-test/gcc/emit-rtl.cc:2191 > > > > > > 2191 set_mem_attrs (ref, &attrs); > > > > > > (gdb) p attrs > > > > > > $2 = {expr = 0x7fffe9810bb0, offset = {coeffs = {0}}, size = > > > > > > {coeffs = {32}}, > > > > > > alias = 2, align = 256, addrspace = 0 '\000', offset_known_p = > > > > > > true, > > > > > > size_known_p = true} > > > > > > (gdb) > > > > > > Breakpoint 2, set_mem_attributes_minus_bitpos (ref=0x7fffe9605510, > > > > > > t=0x7fffe9810bb0, objectp=1, bitpos=...) > > > > > > at > > > > > > /export/gnu/import/git/gitlab/x86-gcc-test/gcc/emit-rtl.cc:2163 > > > > > > 2163 get_object_alignment_1 (t, &obj_align, &obj_bitpos); > > > > > > (gdb) next > > > > > > 2164 unsigned int diff_align = known_alignment (obj_bitpos - > > > > > > bitpos); > > > > > > (gdb) > > > > > > 2165 if (diff_align != 0) > > > > > > (gdb) p obj_align > > > > > > $3 = 256 > > > > > > (gdb) > > > > > > (gdb) next > > > > > > 2167 attrs.align = MAX (attrs.align, obj_align); > > > > > > (gdb) p attrs.align > > > > > > $4 = 256 > > > > > > (gdb) > > > > > > > > > > > > 256 is wrong only because > > > > > > > > > > > > typedef struct { > > > > > > long double a; > > > > > > long double b; > > > > > > } c __attribute__((aligned(32))); > > > > > > > > > > > > is passed on stack. > > > > > > > > > > > > The problems are in get_object_alignment_2 and > > > > > > set_mem_attributes_minus_bitpos. > > > > > > > > > > I think the problem is that the IL doesn't reflect reality. When the > > > > > PARM_DECL > > > > > (or its type) has the specified alignment that has to reflect > > > > > reality. Consider > > > > > > > > > > void foo (c x) > > > > > { > > > > > __typeof (x) *tem = &x; > > > > > x->a; > > > > > } > > > > > > > > > > when the target doesn't honor the alignment of 'x' then the accesses > > > > > based off 'tem' > > > > > will be wrong. > > > > > > > > > > The middle-end has, in similar cases, the "fix" for such bad backends > > > > > to generate a > > > > > callee copy for the incoming argument (with insufficient alignment) to > > > > > a local stack > > > > > variable with (sufficient alignment). Why is that not triggering > > > > > here? > > > > > > > > Where should I look for such codes in the middle-end? > > > > > > It must be in the maze of assign_parms, assign_parm_adjust_stack_rtl > > > looks like what I remember: > > > > > > /* If we can't trust the parm stack slot to be aligned enough for its > > > ultimate type, don't use that slot after entry. We'll make another > > > stack slot, if we need one. */ > > > if (stack_parm > > > > > > > /* If we can't trust the parm stack slot to be aligned enough for its > > ultimate type, don't use that slot after entry. We'll make another > > stack slot, if we need one. */ > > if (stack_parm > > && ((GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm) > > && ((optab_handler (movmisalign_optab, data->nominal_mode) > > != CODE_FOR_nothing) > > || targetm.slow_unaligned_access (data->nominal_mode, > > MEM_ALIGN (stack_parm)))) > > || (data->nominal_type > > && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm) > > && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY))) > > stack_parm = NULL; > > > > (gdb) call debug (stack_parm) > > (mem/c:BLK (reg/f:DI 92 virtual-incoming-args) [2 f+0 S32 A128]) > > (gdb) call debug (data->nominal_type) > > <record_type 0x7fffe99cc348 c sizes-gimplified BLK > > size <integer_cst 0x7fffe98242d0 type <integer_type 0x7fffe98220a8 > > bitsizetype> constant 256> > > unit-size <integer_cst 0x7fffe98243a8 type <integer_type > > 0x7fffe9822000 sizetype> constant 32> > > user align:256 warn_if_not_align:0 symtab:0 alias-set -1 > > canonical-type 0x7fffe99cc2a0 > > fields <field_decl 0x7fffe98318c0 a > > type <real_type 0x7fffe982a3f0 long double sizes-gimplified XF > > size <integer_cst 0x7fffe9802fa8 constant 128> > > unit-size <integer_cst 0x7fffe9802fc0 constant 16> > > align:128 warn_if_not_align:0 symtab:0 alias-set 1 > > canonical-type 0x7fffe982a3f0 precision:80 > > pointer_to_this <pointer_type 0x7fffe982aa80>> > > XF z.c:2:15 size <integer_cst 0x7fffe9802fa8 128> unit-size > > <integer_cst 0x7fffe9802fc0 16> > > align:128 warn_if_not_align:0 offset_align 128 decl_not_flexarray: 1 > > offset <integer_cst 0x7fffe9802f90 constant 0> > > bit-offset <integer_cst 0x7fffe9802fd8 constant 0> context > > <record_type 0x7fffe99cc2a0> > > chain <field_decl 0x7fffe9831960 b type <real_type > > 0x7fffe982a3f0 long double> > > XF z.c:3:15 size <integer_cst 0x7fffe9802fa8 128> > > unit-size <integer_cst 0x7fffe9802fc0 16> > > align:128 warn_if_not_align:0 offset_align 128 > > decl_not_flexarray: 1 offset <integer_cst 0x7fffe9802fc0 16> > > bit-offset <integer_cst 0x7fffe9802fd8 0> context <record_type > > 0x7fffe99cc2a0>>> > > pointer_to_this <pointer_type 0x7fffe99cc150>> > > (gdb) > > > > It doesn't check DECL_USER_ALIGN. > > > > -- > > H.J. > > Nowhere in function.cc DECL_USER_ALIGN is checked.
I would have expected for typedef struct { long double a; long double b; } c __attribute__((aligned(32))); void foo (c x) {} that the type of 'x' has TYPE_ALIGN of 32 bytes. You say only the PARM_DECL has DECL_ALIGN? I might have expected this for void foo (int x __attribute__((aligned(32))) {} that is, when the alignment is specified on the formal argument (and in this case I'd simply ignore it for efficiency reason). That said, given we don't seem to consistently create a type variant with alignment matching that of the decl (I'd consider this most clean, but then it would make the alignment specifier on the decl redundant), the code above should probably check the decl for alignment, not the type. This is, after all, what is taken as fact when visible. Test coverge for a case w/o a typedef and the alignment on the struct type plus the case where the alignment is on the formal argument would be useful of course. Richard. > -- > H.J.