Hi! On Tue, Feb 02, 2016 at 05:09:34PM +0300, Ilya Enkovich wrote: > And it's too late to do it after STV pass and therefore we disable it > when stack is not properly aligned. I think this argumentation goes in > a loop.
This is a P1 that needs to be fixed, so that we don't defer this forever, what about the following patch? As neither stv nor preferred-stack-boundary nor incoming-stack-boundary are allowed target attribute/GCC target pragma switches, I can't find a problem with that. We don't track at expansion time whether the function is leaf or not, so the patch pessimistically assumes that the function might be leaf and check both incoming and preferred stack boundaries. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2016-02-03 Jakub Jelinek <ja...@redhat.com> Ilya Enkovich <enkovich....@gmail.com> H.J. Lu <hongjiu...@intel.com> PR target/69454 * config/i386/i386.c (convert_scalars_to_vector): Remove stack alignment fixes. (ix86_option_override_internal): Disable TARGET_STV if stack might not be aligned enough. (ix86_minimum_alignment): Assert that TARGET_STV is false. * gcc.target/i386/pr69454-1.c: New test. * gcc.target/i386/pr69454-2.c: New test. --- gcc/config/i386/i386.c.jj 2016-02-02 20:42:01.024287587 +0100 +++ gcc/config/i386/i386.c 2016-02-03 18:45:43.271997909 +0100 @@ -3588,16 +3588,6 @@ convert_scalars_to_vector () bitmap_obstack_release (NULL); df_process_deferred_rescans (); - /* Conversion means we may have 128bit register spills/fills - which require aligned stack. */ - if (converted_insns) - { - if (crtl->stack_alignment_needed < 128) - crtl->stack_alignment_needed = 128; - if (crtl->stack_alignment_estimated < 128) - crtl->stack_alignment_estimated = 128; - } - return 0; } @@ -5453,6 +5443,13 @@ ix86_option_override_internal (bool main opts->x_target_flags |= MASK_VZEROUPPER; if (!(opts_set->x_target_flags & MASK_STV)) opts->x_target_flags |= MASK_STV; + /* Disable STV if -mpreferred-stack-boundary=2 or + -mincoming-stack-boundary=2 - the needed + stack realignment will be extra cost the pass doesn't take into + account and the pass can't realign the stack. */ + if (ix86_preferred_stack_boundary < 64 + || ix86_incoming_stack_boundary < 64) + opts->x_target_flags &= ~MASK_STV; if (!ix86_tune_features[X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL] && !(opts_set->x_target_flags & MASK_AVX256_SPLIT_UNALIGNED_LOAD)) opts->x_target_flags |= MASK_AVX256_SPLIT_UNALIGNED_LOAD; @@ -29323,7 +29320,10 @@ ix86_minimum_alignment (tree exp, machin if ((mode == DImode || (type && TYPE_MODE (type) == DImode)) && (!type || !TYPE_USER_ALIGN (type)) && (!decl || !DECL_USER_ALIGN (decl))) - return 32; + { + gcc_checking_assert (!TARGET_STV); + return 32; + } return align; } --- gcc/testsuite/gcc.target/i386/pr69454-1.c.jj 2016-02-03 18:44:17.642175753 +0100 +++ gcc/testsuite/gcc.target/i386/pr69454-1.c 2016-02-03 18:44:17.642175753 +0100 @@ -0,0 +1,11 @@ +/* { dg-do compile { target { ia32 } } } */ +/* { dg-options "-O2 -msse2 -mno-accumulate-outgoing-args -mpreferred-stack-boundary=2" } */ + +typedef struct { long long w64[2]; } V128; +extern V128* fn2(void); +long long a; +V128 b; +void fn1() { + V128 *c = fn2(); + c->w64[0] = a ^ b.w64[0]; +} --- gcc/testsuite/gcc.target/i386/pr69454-2.c.jj 2016-02-03 18:44:17.655175574 +0100 +++ gcc/testsuite/gcc.target/i386/pr69454-2.c 2016-02-03 18:44:17.655175574 +0100 @@ -0,0 +1,13 @@ +/* { dg-do compile { target { ia32 } } } */ +/* { dg-options "-O2 -mpreferred-stack-boundary=2" } */ + +extern void fn2 (); +long long a, b; + +void fn1 () +{ + long long c = a; + a = b ^ a; + fn2 (); + a = c; +} Jakub