On 14.11.2024 22:01, Dimitry Andric wrote:
On 14 Nov 2024, at 13:44, Michal Meloun <m...@freebsd.org> wrote:
While searching for the cause of armv7 kernel corruption after updating to
llvm19 lld, I came across an interesting problem.
- The linker script does not list all generated sections. Specifically, the
data sections created by linker sets are not listed.
- The linker can place these orphaned sections in any location (OK, with some
restrictions). See
https://maskray.me/blog/2024-06-02-understanding-orphan-sections.
- Creating symbols outside a section is fragile and subject to error; the
linker may place an orphaned section between the symbol definition and the
following section.
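To make the failure mode in the last point concrete, here is a toy model in C. It is only a sketch: it does not reproduce lld's real placement rules, and every name in it (`foo_start`, `.foo`, `set_demo_set`, the sizes) is made up for illustration. It walks a list of script statements in order and shows that a start symbol assigned outside the section no longer matches the section start once an orphan is placed between them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: one entry per linker-script statement, laid out in order. */
struct item {
    const char *name;  /* symbol name or output section name */
    size_t size;       /* 0 for a symbol assignment such as "sym = .;" */
};

/*
 * Return the location-counter value at which `name` is placed,
 * or (size_t)-1 if it does not appear in the layout.
 */
static size_t address_of(const struct item *items, size_t n, const char *name)
{
    size_t dot = 0;

    assert(items != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(items[i].name, name) == 0)
            return dot;
        dot += items[i].size;
    }
    return (size_t)-1;
}

/* Layout as the script author intended: the symbol directly precedes .foo. */
static const struct item intended[] = {
    { "foo_start", 0 },
    { ".foo", 16 },
};

/* Same script, but a hypothetical 8-byte orphan was placed in between. */
static const struct item with_orphan[] = {
    { "foo_start", 0 },
    { "set_demo_set", 8 },
    { ".foo", 16 },
};

/* Distance between the start symbol and the section it is meant to mark. */
size_t gap_intended(void)
{
    return address_of(intended, 2, ".foo") -
        address_of(intended, 2, "foo_start");
}

size_t gap_with_orphan(void)
{
    return address_of(with_orphan, 3, ".foo") -
        address_of(with_orphan, 3, "foo_start");
}
```

In this model the gap is zero for the intended layout and equals the orphan's size in the second one, so any code that treats `foo_start` as the beginning of `.foo` would read the orphan's contents instead.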
We ran into this problem many years ago, see
https://github.com/freebsd/freebsd-src/commit/6e764e36da019837d90e3b4b712871ee4442637a.
Unfortunately, we didn't fix it completely then, and we have to address the
same corruption again.
I think we should be strict in this area and use '--orphan-handling=error' for
kernel linking. However, I'm not sure we can handle linker sets gracefully.
Any comments, contrary opinions or better solutions? Does anyone know how to properly
list all linker sets (mainly, but not only, 'set_<foo>_set') in the linker script, and
which section is appropriate for them? .rodata?
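For readers who have not looked at linker sets before, a minimal sketch of the mechanism may help explain where the 'set_<foo>_set' sections come from. It assumes an ELF target with GCC or Clang and relies on GNU ld and lld defining `__start_<name>` and `__stop_<name>` symbols for output sections whose name is a valid C identifier. The macro and type names below (`DEMO_SET_ADD`, `struct demo_entry`, `set_demo_set`) are hypothetical and are not the macros from the FreeBSD headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical entry type, used only for this illustration. */
struct demo_entry {
    const char *name;
    int value;
};

/*
 * Emit a pointer to `sym` into the section "set_demo_set". The `used`
 * attribute keeps the compiler from dropping the otherwise unreferenced
 * pointer variable.
 */
#define DEMO_SET_ADD(sym)                                        \
    static const struct demo_entry *const demo_ptr_##sym         \
        __attribute__((section("set_demo_set"), used)) = &(sym)

static const struct demo_entry alpha = { "alpha", 1 };
static const struct demo_entry beta = { "beta", 2 };
DEMO_SET_ADD(alpha);
DEMO_SET_ADD(beta);

/* Defined by the linker because "set_demo_set" is a valid C identifier. */
extern const struct demo_entry *const __start_set_demo_set[];
extern const struct demo_entry *const __stop_set_demo_set[];

/* Number of pointers the linker collected into the set. */
size_t demo_set_count(void)
{
    return (size_t)(__stop_set_demo_set - __start_set_demo_set);
}

/* Walk the set from its start symbol to its stop symbol. */
int demo_set_sum(void)
{
    int sum = 0;
    const struct demo_entry *const *p;

    for (p = __start_set_demo_set; p < __stop_set_demo_set; p++) {
        assert(*p != NULL);
        sum += (*p)->value;
    }
    return sum;
}
```

Each object file contributes its own input section with the same name, and the linker concatenates them into a single output section. Since nothing in the linker script mentions that name, the output section is an orphan, which is why it shows up in the question above.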
I tried adding --orphan-handling=error, and on buildkernel (even for amd64) I
pretty soon get:
--- all_subdir_accf_data ---
ld: error: accf_data.o:(.data) is being placed in '.data'
ld: error: accf_data.o:(set_modmetadata_set) is being placed in 'set_modmetadata_set'
ld: error: accf_data.o:(set_sysinit_set) is being placed in 'set_sysinit_set'
ld: error: accf_data.o:(.debug_loc) is being placed in '.debug_loc'
ld: error: accf_data.o:(.debug_abbrev) is being placed in '.debug_abbrev'
ld: error: accf_data.o:(.debug_info) is being placed in '.debug_info'
ld: error: accf_data.o:(.debug_ranges) is being placed in '.debug_ranges'
ld: error: accf_data.o:(.debug_str) is being placed in '.debug_str'
ld: error: accf_data.o:(.comment) is being placed in '.comment'
ld: error: accf_data.o:(.debug_frame) is being placed in '.debug_frame'
ld: error: accf_data.o:(.debug_line) is being placed in '.debug_line'
ld: error: accf_data.o:(.llvm_addrsig) is being placed in '.llvm_addrsig'
ld: error: accf_data.o:(.SUNW_ctf) is being placed in '.SUNW_ctf'
ld: error: <internal>:(.note.gnu.build-id) is being placed in '.note.gnu.build-id'
ld: error: <internal>:(.note.GNU-stack) is being placed in '.note.GNU-stack'
ld: error: <internal>:(.symtab) is being placed in '.symtab'
ld: error: <internal>:(.shstrtab) is being placed in '.shstrtab'
ld: error: <internal>:(.strtab) is being placed in '.strtab'
--- all_subdir_aic7xxx ---
--- all_subdir_aic7xxx/ahc ---
--- machine ---
machine -> /home/dim/src/freebsd/src/sys/amd64/include
--- all_subdir_accf_data ---
*** [accf_data.ko.full] Error code 1
Not sure if those are all really orphaned, though?
-Dimitry
Most of them are not orphaned, and I think they should be explicitly
placed. Annoying as it is, we should probably keep a list of sections
used in the kernel (one list is sufficient for all architectures) and include
it in the ldscripts for each architecture (it's about 24 lines now).
After discussion with jrtc27 (thanks a lot for your patience), I think
we have only three options besides explicitly listing all kernel sections:
1) Leave the ldscripts as they are, but prefix each <foo>_start symbol
with a guard, i.e. an explicit assignment to the location counter ('.=.' or
ALIGN()).
2) Move all <foo>_start/end symbols that are defined outside a section into
the appropriate sections.
3) Pass '--orphan-handling=error' to the linker and declare/discard all
compiler-generated sections.
I definitely don't like option 1. It's too fragile and depends on poorly
defined linker behavior.
Option 2 is easy and robust and, together with the explicit placement of all
kernel sections, seems sufficient.
In my opinion, we can combine options 2 and 3 to get the most
robust solution.
Another problem is that an explicit list of kernel sections could
make out-of-tree modules interact badly with the linker script.
What is your preference?
Michal