Evgeny Karpov <[email protected]> writes:
> Tuesday, October 29, 2024
> Richard Sandiford <[email protected]> wrote:
>
>> Hmm, I see. I think this is surprising enough that it would be worth
>> a comment. How about:
>>
>> /* Since the assembly directive only specifies a size, and not an
>> alignment, we need to follow the default ASM_OUTPUT_LOCAL behavior
>> and round the size up to at least a multiple of BIGGEST_ALIGNMENT bits,
>> so that each uninitialized object starts on such a boundary.
>> However, we also want to allow the alignment (and thus minimum size)
>> to exceed BIGGEST_ALIGNMENT. */
>
> Thanks for the suggestion. It will be included in the next version of the
> patch.
>
>> But how does using a larger size force the linker to assign a larger
>> alignment than BIGGEST_ALIGNMENT? Is there a second limit in play?
>>
>> Or does this patch not guarantee that the ffmpeg variable gets the
>> alignment it wants? Is it just about suppresing the error?
>>
>> If it's just about suppressing the error without guaranteeing the
>> requested alignment, then, yeah, I think patching ffmpeg would
>> be better. If the patch does guarantee the alignment, then the
>> patch seems ok, but I think the comment should explain how, and
>> explain why BIGGEST_ALIGNMENT isn't larger.
>
> It looks like it generates the expected assembly code for the alignments
> and the correct object file, and it should be the expected code for FFmpeg.
>
> The alignment cannot be larger than 8192, otherwise, it will generate an
> error.
>
> error: requested alignment ‘16384’ exceeds object file maximum 8192
> 16 | float __attribute__((aligned (1 << 14))) large_aligned_array10[3];
OK, thanks. But...
> Here an example:
>
> float large_aligned_array[3];
> float __attribute__((aligned (8))) large_aligned_array2[3];
> float __attribute__((aligned (16))) large_aligned_array3[3];
> float __attribute__((aligned (32))) large_aligned_array4[3];
> float __attribute__((aligned (64))) large_aligned_array5[3];
> float __attribute__((aligned (128))) large_aligned_array6[3];
> float __attribute__((aligned (256))) large_aligned_array7[3];
> float __attribute__((aligned (512))) large_aligned_array8[3];
> float __attribute__((aligned (1024))) large_aligned_array9[3];
>
>
> .align 3
> .def large_aligned_array; .scl 3; .type 0; .endef
> large_aligned_array:
> .space 12 // skip
Is this the construct used by the patch? The original patch was:
+#define ASM_OUTPUT_ALIGNED_LOCAL(FILE, NAME, SIZE, ALIGNMENT) \
+ { \
+ unsigned HOST_WIDE_INT rounded = MAX ((SIZE), 1); \
+ unsigned HOST_WIDE_INT alignment = MAX ((ALIGNMENT), BIGGEST_ALIGNMENT); \
+ rounded += (alignment / BITS_PER_UNIT) - 1; \
+ rounded = (rounded / (alignment / BITS_PER_UNIT) \
+ * (alignment / BITS_PER_UNIT)); \
+ ASM_OUTPUT_LOCAL (FILE, NAME, SIZE, rounded); \
+ }
with patch 5 defining ASM_OUTPUT_LOCAL as:
+#define ASM_OUTPUT_LOCAL(FILE, NAME, SIZE, ROUNDED) \
+( fputs (".lcomm ", (FILE)), \
+ assemble_name ((FILE), (NAME)), \
+ fprintf ((FILE), ",%u\n", (int)(ROUNDED)))
So for the change in question, I was expecting, say, a 1024-byte-aligned
float[3] to be defined using:
.lcomm array, 1024
If we have access to .align, couldn't we define ASM_OUTPUT_ALIGNED_LOCAL
to use that, using the style you quoted above? Or is the .lcomm approach
needed to work with -fcommon?
Thanks,
Richard
> .global large_aligned_array2
> .align 3
> .def large_aligned_array2; .scl 3; .type 0; .endef
> large_aligned_array2:
> .space 12 // skip
>
> .global large_aligned_array3
> .align 4
> .def large_aligned_array3; .scl 3; .type 0; .endef
> large_aligned_array3:
> .space 12 // skip
>
> .global large_aligned_array4
> .align 5
> .def large_aligned_array4; .scl 3; .type 0; .endef
> large_aligned_array4:
> .space 12 // skip
>
> .global large_aligned_array5
> .align 6
> .def large_aligned_array5; .scl 3; .type 0; .endef
> large_aligned_array5:
> .space 12 // skip
>
> .global large_aligned_array6
> .align 7
> .def large_aligned_array6; .scl 3; .type 0; .endef
> large_aligned_array6:
> .space 12 // skip
>
> .global large_aligned_array7
> .align 8
> .def large_aligned_array7; .scl 3; .type 0; .endef
> large_aligned_array7:
> .space 12 // skip
>
> .global large_aligned_array8
> .align 9
> .def large_aligned_array8; .scl 3; .type 0; .endef
> large_aligned_array8:
> .space 12 // skip
>
> .global large_aligned_array9
> .align 10
> .def large_aligned_array9; .scl 3; .type 0; .endef
> large_aligned_array9:
> .space 12 // skip
>
>
> Symbols in the object file also look good.
>
> 015 00000000 SECT2 notype External | large_aligned_array
> 016 00000010 SECT2 notype External | large_aligned_array2
> 017 00000020 SECT2 notype External | large_aligned_array3
> 018 00000040 SECT2 notype External | large_aligned_array4
> 019 00000080 SECT2 notype External | large_aligned_array5
> 01A 00000100 SECT2 notype External | large_aligned_array6
> 01B 00000200 SECT2 notype External | large_aligned_array7
> 01C 00000400 SECT2 notype External | large_aligned_array8
> 01D 00000800 SECT2 notype External | large_aligned_array9