On Fri, May 2, 2025 at 2:33 AM H.J. Lu <hjl.to...@gmail.com> wrote:
>
> On Wed, Apr 30, 2025 at 7:40 PM Uros Bizjak <ubiz...@gmail.com> wrote:
> >
> > On Tue, Apr 29, 2025 at 12:22 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> > >
> > > On Tue, Apr 29, 2025 at 5:30 PM Uros Bizjak <ubiz...@gmail.com> wrote:
> > > >
> > > > On Tue, Apr 29, 2025 at 9:56 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> > > > >
> > > > > Don't expand UNSPEC_TLS_LD_BASE to a call so that the RTL local copy
> > > > > propagation pass can eliminate multiple __tls_get_addr calls.
> > > >
> > > > __tls_get_addr needs to be called with a 16-byte aligned stack. I don't
> > > > think the compiler will correctly handle the required call alignment if
> > > > you emit the call without emit_libcall_block.
> > >
> > > ix86_split_tls_local_dynamic_base_64 generates the same sequence
> > > as emit_libcall_block. Stack alignment is handled by
> > >
> > > (define_expand "@tls_local_dynamic_base_64_<mode>"
> > >   [(set (match_operand:P 0 "register_operand")
> > >         (unspec:P
> > >          [(match_operand 1 "constant_call_address_operand")
> > >           (reg:P SP_REG)]
> > >          UNSPEC_TLS_LD_BASE))]
> > >   "TARGET_64BIT"
> > >   "ix86_tls_descriptor_calls_expanded_in_cfun = true;")
> >
> > The above is to align the initial %rsp at the beginning of the
> > function. When PUSH instructions in the function misalign %rsp, there
> > will be nothing to keep %rsp aligned before the call to
> > __tls_get_addr.
> >
> > We have been bitten by this in the past.
> >
>
> True:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066
>
> which was fixed by
>
> commit 272325bd6abba598a8f125dab36b626acb648b03
> Author: Wei Mi <w...@google.com>
> Date:   Thu May 8 16:44:52 2014 +0000
>
>     re PR target/58066 (__tls_get_addr is called with misaligned stack on
>     x86-64)

It was finally fixed by the following commit (also, please see the number
of patches it took to get this in order):

Author: uros
Date: Wed Jul 15 07:39:30 2015
New Revision: 225807

URL: https://gcc.gnu.org/viewcvs?rev=225807&root=gcc&view=rev
Log:
PR rtl-optimization/58066
* calls.c (expand_call): Precompute register parameters before stack
alignment is performed.

> __tls_get_addr doesn't take an argument and my patch still aligns the stack
> properly.

Can you perhaps get an ack from the middle-end maintainer that the
patch does the right thing?

Thanks,
Uros.