Re: [PATCH] Extend -falign-FOO=N to N[,M]: the second number is max padding

Andrew Pinski Thu, 11 Aug 2016 17:35:03 -0700

On Thu, Aug 11, 2016 at 5:19 PM, Denys Vlasenko <dvlas...@redhat.com> wrote:
>
>
> On 08/11/2016 10:59 PM, Andrew Pinski wrote:
>>
>> On Thu, Aug 11, 2016 at 1:49 PM, Denys Vlasenko <dvlas...@redhat.com>
>> wrote:
>>>
>>> falign-functions=N is too simplistic.
>>>
>>> Ingo Molnar ran some tests and it looks on latest x86 CPUs, 64-byte
>>> alignment
>>> runs fastest (he tried many other possibilites).
>>>
>>> However, developers are less than thrilled by the idea of a slam-dunk
>>> 64-byte
>>> aligning everything. Too much waste:
>>>         On 05/20/2015 02:47 AM, Linus Torvalds wrote:
>>>         > At the same time, I have to admit that I abhor a 64-byte
>>> function
>>>         > alignment, when we have a fair number of functions that are
>>> (much)
>>>         > smaller than that.
>>>         >
>>>         > Is there some way to get gcc to take the size of the function
>>> into
>>>         > account? Because aligning a 16-byte or 32-byte function on a
>>> 64-byte
>>>         > alignment is just criminally nasty and wasteful.
>>>
>>> This change makes it possible to align function to 64-byte boundaries
>>> *IF*
>>> this does not introduce huge amount of padding.
>>>
>>> Patch drops forced alignment to 8 if requested alignment is higher than
>>> 8:
>>> before the patch, -falign-functions=9 was generating
>>>
>>>         .p2align 4,,8
>>>         .p2align 3
>>>
>>> which means: "align to 16 if the skip is 8 bytes or less; else align to
>>> 8".
>>> After this change, ".p2align 3" is not emitted.
>>>
>>> It is dropped because I ultimately want to do something
>>> like -falign-functions=64,8 - IOW, I want to align functions to 64 bytes,
>>> but only if that generates padding of less than 8 bytes - otherwise I
>>> want
>>> *no alignment at all*. The forced ".p2align 3" interferes with that
>>> intention.
>>>
>>> Testing:
>>> tested that with -falign-functions=N (tried 8, 15, 16, 17...) the
>>> alignment
>>> directives are the same before and after the patch.
>>> Tested that -falign-functions=N,N (two equal paramenters) works exactly
>>> like -falign-functions=N.
>>
>>
>> Two things I noticed you missed:
>> 1) ChangeLog entries (this is required as per the GNU coding style)
>
>
> Is this form okay? -


This is getting there but needs a little improvement.

>
> ...
> Testing:
> tested that with -falign-functions=N (tried 8, 15, 16, 17...) the alignment
> directives are the same before and after the patch.
> Tested that -falign-functions=N,N (two equal paramenters) works exactly
> like -falign-functions=N.

Oh also maybe adding testcases would be useful, just scanning the
outputted assembly should be good enough.

>
> 2016-08-11  Denys Vlasenko  <dvlas...@redhat.com>
>
>     * common.opt: Change definitions so that -falign_FOO= accepts a string.

I think this should be:
* common.opt (-falign-functions): Accept a string instead of an integer.
(falign-loops): Likewise.
....

>     * config/i386/freebsd.h: Remove "If N is large, do at least 8 byte
>     alignment" code.
This should be something more like:
* config/i386/freebsd.h (ASM_OUTPUT_MAX_SKIP_ALIGN): Remove "If N is
large, do at least 8 byte alignment" code.

>     * config/i386/gnu-user.h: Likewise.
* config/i386/gnu-user.h (ASM_OUTPUT_MAX_SKIP_ALIGN): Likewise.

>     * config/i386/iamcu.h: Likewise.
>     * config/i386/openbsdelf.h: Likewise.
>     * config/i386/x86-64.h: Likewise.
>     * flags.h (struct target_flag_state): Add x_align_functions_max_skip
>     member.
>     * toplev.c (parse_N_M): New function.
>     (init_alignments): Set align_FOO_log, align_FOO, align_FOO_max_skip
>     from specified --align_FOO=N{,M} option
>     * varasm.c (assemble_start_function): Use align_functions_max_skip
>     instead of align_functions - 1.
>
> Index: gcc/common.opt
> ===================================================================
> ....
>
>
>> 2) Documentation changes.

You did not add this or you doing this in a separate patch now?
The current documentation is done in doc/invoke.texi:
@item -falign-functions
@itemx -falign-functions=@var{n}
@opindex falign-functions
Align the start of functions to the next power-of-two greater than
@var{n}, skipping up to @var{n} bytes.  For instance,
@option{-falign-functions=32} aligns functions to the next 32-byte
boundary, but @option{-falign-functions=24} aligns to the next
32-byte boundary only if this can be done by skipping 23 bytes or less.

@option{-fno-align-functions} and @option{-falign-functions=1} are
equivalent and mean that functions are not aligned.

Some assemblers only support this flag when @var{n} is a power of two;
in that case, it is rounded up.

If @var{n} is not specified or is zero, use a machine-dependent default.

---- CUT ----
You should add your documentation there (and the rest of the -falign-* options.


>>
>>>
>>> Index: gcc/common.opt
>>> ===================================================================
>
> ...
>
>>>  #define ASM_OUTPUT_MAX_SKIP_ALIGN(FILE,LOG,MAX_SKIP)                   \
>>>    do {                                                                 \
>>>      if ((LOG) != 0) {                                                  \
>>> -      if ((MAX_SKIP) == 0) fprintf ((FILE), "\t.p2align %d\n", (LOG)); \
>>> -      else {                                                           \
>>> +      if ((MAX_SKIP) == 0 || (MAX_SKIP) >= (1<<(LOG)))                 \
>>> +        fprintf ((FILE), "\t.p2align %d\n", (LOG));                    \
>>> +      else                                                             \
>>>         fprintf ((FILE), "\t.p2align %d,,%d\n", (LOG), (MAX_SKIP));     \
>>> -       /* Make sure that we have at least 8 byte alignment if > 8 byte \
>>> -          alignment is preferred.  */                                  \
>>> -       if ((LOG) > 3                                                   \
>>> -           && (1 << (LOG)) > ((MAX_SKIP) + 1)                          \
>>> -           && (MAX_SKIP) >= 7)                                         \
>>> -         fputs ("\t.p2align 3\n", (FILE));                             \
>>> -      }
>>> \
>>>      }                                                                  \
>>>    } while (0)
>>>  #undef  ASM_OUTPUT_MAX_SKIP_PAD
>>
>>
>> These looks like a bug fix, it most likely should be sent separately;
>> maybe even first.
>
>
> ok
>
>> Also can you make sure the generic ASM_OUTPUT_MAX_SKIP_ALIGN is done
>> correctly?
>
>
> If by generic you mean "non-arch-specific", the situation is as follows:
> if arch does not define ASM_OUTPUT_MAX_SKIP_ALIGN(file, align, max_skip),
> then
> ASM_OUTPUT_ALIGN_WITH_NOP(file, align) or
> ASM_OUTPUT_ALIGN (file, align) macros are used.
> As you se from their arguments, those macros don't take "max_skip" argument.
> IOW: even in existing code, they ignore max_skip, and always pad to the
> specified alignment.
> Whether this is considered "correct" or not, is up to you. The patch
> changes nothing for those arches.
>
> I checked all arches which do have ASM_OUTPUT_MAX_SKIP_ALIGN.
>
> visium has the same "align at least to 8" logic like many x86 flavors,
> do you think it should be removed?
>
> rs6000, aarch64, arm use ".p2align %d,,%d" logic with no special cases,
> look good to me.
>
> rx uses ".balign %d,3,%d" and special-cases TARGET_AS100_SYNTAX,
> this arch is probably fine too.

Re: [PATCH] Extend -falign-FOO=N to N[,M]: the second number is max padding

Reply via email to