[Bug ld/32067] New: ld crash in _bfd_elf_link_keep_memory when building statifier-1.7.4
https://sourceware.org/bugzilla/show_bug.cgi?id=32067

            Bug ID: 32067
           Summary: ld crash in _bfd_elf_link_keep_memory when building
                    statifier-1.7.4
           Product: binutils
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: ld
          Assignee: unassigned at sourceware dot org
          Reporter: sam at gentoo dot org
  Target Milestone: ---

ld crashes when building statifier-1.7.4, as reported downstream at
https://bugs.gentoo.org/937627:
```
Thread 3.1 "ld" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x774f8740 (LWP 3092614)]
0x7793aec0 in _bfd_elf_link_keep_memory (info=0x55842e80 ) at /usr/src/debug/sys-devel/binutils-/binutils/bfd/elflink.c:67
67        if (bed->use_mmap)
(gdb) bt
#0  0x7793aec0 in _bfd_elf_link_keep_memory (info=0x55842e80 ) at /usr/src/debug/sys-devel/binutils-/binutils/bfd/elflink.c:67
#1  elf_link_add_object_symbols (abfd=, info=) at /usr/src/debug/sys-devel/binutils-/binutils/bfd/elflink.c:5727
#2  bfd_elf_link_add_symbols (abfd=, info=) at /usr/src/debug/sys-devel/binutils-/binutils/bfd/elflink.c:6363
#3  0x55692ad5 in load_symbols (entry=entry@entry=0x55848740, place=place@entry=0x7fffdc30) at /usr/src/debug/sys-devel/binutils-/binutils/ld/ldlang.c:3130
#4  0x556847fd in open_input_bfds (s=0x55848740, os=0x55849cc0, mode=OPEN_BFD_NORMAL) at /usr/src/debug/sys-devel/binutils-/binutils/ld/ldlang.c:3622
#5  0x55682209 in lang_process () at /usr/src/debug/sys-devel/binutils-/binutils/ld/ldlang.c:8194
#6  0x5568c234 in main (argc=, argv=) at /usr/src/debug/sys-devel/binutils-/binutils/ld/ldmain.c:529
(gdb) p bed
$1 = (const struct elf_backend_data *) 0x0
```

toralf downstream hit this w/ 2.43, I've hit it on recent (few days old)
trunk:
```
$ ld --version
GNU ld (Gentoo p1) 2.43.50.20240806
Copyright (C) 2024 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) a later version.
This program has absolutely no warranty.
```

Testcase coming.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
[Bug ld/32067] ld crash in _bfd_elf_link_keep_memory when building statifier-1.7.4
https://sourceware.org/bugzilla/show_bug.cgi?id=32067

--- Comment #1 from Sam James ---
Created attachment 15665
  --> https://sourceware.org/bugzilla/attachment.cgi?id=15665&action=edit
processor.S

```
$ gcc -nostdinc -m32 processor.S -c -o dl-var.o && gcc -m32 -o dl-var dl-var.o -Wl,--oformat,binary,--entry=0x0
collect2: fatal error: ld terminated with signal 11 [Segmentation fault], core dumped
compilation terminated.
```
[Bug ld/32067] ld crash in _bfd_elf_link_keep_memory when building statifier-1.7.4
https://sourceware.org/bugzilla/show_bug.cgi?id=32067

--- Comment #2 from Sam James ---
(gdb) p info->output_bfd
$1 = (bfd *) 0x556bca80
(gdb) p *info->output_bfd
$2 = {filename = 0x556bcbd0 "dl-var", xvec = 0x77fb1c60, iostream = 0x5569cee0,
  iovec = 0x77fa9080, lru_prev = 0x556c8ac0, lru_next = 0x556c8ac0, where = 0,
  mtime = 0, id = 0, flags = 384, format = bfd_object,
  direction = write_direction, last_io = bfd_io_seek, cacheable = 1,
  target_defaulted = 0, opened_once = 1, mtime_set = 0, no_export = 0,
  output_has_begun = 0, has_armap = 0, is_thin_archive = 0,
  no_element_cache = 0, selective_search = 0, is_linker_output = 1,
  is_linker_input = 0, plugin_format = bfd_plugin_unknown, lto_output = 0,
  read_only = 0, lto_type = lto_non_object, in_format_matches = 0,
  plugin_dummy_bfd = 0x0, origin = 0, proxy_origin = 0,
  section_htab = {table = 0x556bdbc0, newfunc = 0x77ecaafe,
    memory = 0x556bca20, size = 13, count = 0, entsize = 304, frozen = 0},
  sections = 0x0, section_last = 0x0, section_count = 0,
  archive_plugin_fd = -1, archive_plugin_fd_open_count = 0, archive_pass = 0,
  alloc_size = 7, start_address = 0, outsymbols = 0x0, symcount = 0,
  dynsymcount = 0, arch_info = 0x77fbb8c0, size = 0, arelt_data = 0x0,
  my_archive = 0x0, archive_next = 0x0, archive_head = 0x0,
  nested_archives = 0x0, link = {next = 0x556beba0, hash = 0x556beba0},
  tdata = {aout_data = 0x0, aout_ar_data = 0x0, coff_obj_data = 0x0,
    pe_obj_data = 0x0, xcoff_obj_data = 0x0, ecoff_obj_data = 0x0,
    srec_data = 0x0, verilog_data = 0x0, ihex_data = 0x0, tekhex_data = 0x0,
    elf_obj_data = 0x0, mmo_data = 0x0, trad_core_data = 0x0, som_data = 0x0,
    hpux_core_data = 0x0, hppabsd_core_data = 0x0, sgi_core_data = 0x0,
    lynx_core_data = 0x0, osf_core_data = 0x0, cisco_core_data = 0x0,
    netbsd_core_data = 0x0, mach_o_data = 0x0, mach_o_fat_data = 0x0,
    plugin_data = 0x0, pef_data = 0x0, pef_xlib_data = 0x0, sym_data = 0x0,
    any = 0x0}, usrdata = 0x0, memory = 0x556bca00, build_id = 0x0,
  mmapped = 0x0}
(gdb)

and then

(gdb) p *info->output_bfd.xvec
$10 = {name = 0x77f7349c "binary", flavour = bfd_target_unknown_flavour,
  byteorder = BFD_ENDIAN_UNKNOWN, header_byteorder = BFD_ENDIAN_UNKNOWN,
  object_flags = 2, section_flags = 379, symbol_leading_char = 0 '\000',
  ar_pad_char = 32 ' ', ar_max_namelen = 16 '\020',
  match_priority = 255 '\377', keep_unused_section_symbols = true, [...]

so we call get_elf_backend_data even though the format is 'binary'. Oops.
[Bug ld/32067] ld crash in _bfd_elf_link_keep_memory when building statifier-1.7.4
https://sourceware.org/bugzilla/show_bug.cgi?id=32067

Sam James changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hjl.tools at gmail dot com
[Bug ld/32067] ld crash in _bfd_elf_link_keep_memory when building statifier-1.7.4 with -Wl,--oformat,binary
https://sourceware.org/bugzilla/show_bug.cgi?id=32067

Sam James changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|ld crash in                 |ld crash in
                   |_bfd_elf_link_keep_memory   |_bfd_elf_link_keep_memory
                   |when building               |when building
                   |statifier-1.7.4             |statifier-1.7.4 with
                   |                            |-Wl,--oformat,binary
Questions Regarding Initial Input Seeds Used for Fuzzing Binutils via OSS-Fuzz
Dear Binutils Team,

I hope this message finds you well. My name is Federico Cernera, and I am a
PhD student at Sapienza University of Rome, currently collaborating with
Vrije Universiteit Amsterdam. I have been exploring the Binutils library and
its integration with OSS-Fuzz for fuzzing purposes. I have questions
regarding the initial input seeds used to fuzz the library and would greatly
appreciate your assistance.

1. Are the current initial input seeds used to fuzz the library via OSS-Fuzz
   human-made, or are they generated due to the fuzzing iterations?
2. Were the initial input seeds used for the first fuzzing campaign with
   OSS-Fuzz human-made?
3. Can I access the initial input seeds used during the first fuzzing
   campaign with OSS-Fuzz?

Thank you for your time and consideration. I look forward to your response.

Best regards,
Federico Cernera
[Bug ld/32003] Specifying --package-metadata might not be possible and is too fragile
https://sourceware.org/bugzilla/show_bug.cgi?id=32003

--- Comment #23 from Benjamin Drung ---
(In reply to H.J. Lu from comment #14)
> (In reply to Benjamin Drung from comment #13)
>
> > > Will "%[string]" escape work?
> >
> > Like this?
> >
> > -Wl,--encoded-package-metadata={%[quot]type%[quot]:
> > %[quot]deb%[quot]%[comma]%[quot]os%[quot]:
> > %[quot]ubuntu%[quot]%[comma]%[quot]name%[quot]:
> > %[quot]dpkg%[quot]%[comma]%[quot]version%[quot]:%[quot]1.22.
> > 6ubuntu15%[quot]%[comma]%[quot]architecture%[quot]:%[quot]amd64%[quot]}
>
> It should be %[quote]".

You suggested to borrow from HTML's Named Character References and
https://dev.w3.org/html5/spec-LC/named-character-references.html says that
U+00022 has the name "quot" (not "quote").

> Will adding support for "%[string]" to existing
> --package-metadata option break anything?

It might theoretically break existing use cases.

--package-metadata='{"version":"1.0%2"}'

The only safe option that I could come up with is to use a marker that would
be invalid JSON. For example: If the string starts with a percent character,
decode it:

--package-metadata='%{"foo":"bar"%[comma]"baz":42}'

would be invalid JSON and decode to:

--package-metadata='{"foo":"bar","baz":42}'
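The marker idea in comment #23 can be sketched in C roughly as follows. This is only an illustration of the proposal, not code from ld: the function name `decode_marked_metadata`, the name table, and the error handling are all hypothetical, and only the two names that appear in the thread's examples (`quot`, `comma`) are recognised.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name table: only the names used in the thread's examples.  */
static const struct { const char *name; char ch; } named_chars[] = {
  { "quot", '"' },
  { "comma", ',' },
};

/* Return the character for NAME (of length LEN), or 0 if it is unknown.  */
static char
lookup_named_char (const char *name, size_t len)
{
  for (size_t i = 0; i < sizeof named_chars / sizeof named_chars[0]; i++)
    if (strlen (named_chars[i].name) == len
        && memcmp (named_chars[i].name, name, len) == 0)
      return named_chars[i].ch;
  return 0;
}

/* Copy IN to OUT (capacity OUTSZ, including the terminating NUL).
   Assumption: input that does not start with '%' is copied unchanged,
   so existing plain-JSON arguments keep their meaning.  Otherwise the
   leading '%' is dropped and each "%[name]" is replaced.
   Returns 0 on success, -1 on an unknown name, an unterminated
   escape, or insufficient space.  */
int
decode_marked_metadata (const char *in, char *out, size_t outsz)
{
  size_t o = 0;
  int marked = in[0] == '%';

  if (outsz == 0)
    return -1;
  if (marked)
    in++;                       /* Drop the leading marker.  */

  while (*in != '\0')
    {
      char c = *in;
      size_t step = 1;

      if (marked && c == '%' && in[1] == '[')
        {
          const char *end = strchr (in + 2, ']');
          if (end == NULL)
            return -1;
          c = lookup_named_char (in + 2, (size_t) (end - (in + 2)));
          if (c == 0)
            return -1;
          step = (size_t) (end - in) + 1;
        }
      if (o + 1 >= outsz)
        return -1;
      out[o++] = c;
      in += step;
    }
  out[o] = '\0';
  return 0;
}
```

With this sketch, the marked example from the comment decodes to the plain JSON shown there, while an unmarked argument such as `{"version":"1.0%2"}` passes through untouched.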
[Bug ld/32003] Specifying --package-metadata might not be possible and is too fragile
https://sourceware.org/bugzilla/show_bug.cgi?id=32003

--- Comment #24 from Luca Boccassi ---
(In reply to Benjamin Drung from comment #23)
> (In reply to H.J. Lu from comment #14)
> > (In reply to Benjamin Drung from comment #13)
> >
> > > > Will "%[string]" escape work?
> > >
> > > Like this?
> > >
> > > -Wl,--encoded-package-metadata={%[quot]type%[quot]:
> > > %[quot]deb%[quot]%[comma]%[quot]os%[quot]:
> > > %[quot]ubuntu%[quot]%[comma]%[quot]name%[quot]:
> > > %[quot]dpkg%[quot]%[comma]%[quot]version%[quot]:%[quot]1.22.
> > > 6ubuntu15%[quot]%[comma]%[quot]architecture%[quot]:%[quot]amd64%[quot]}
> >
> > It should be %[quote]".
>
> You suggested to borrow from HTML's Named Character References and
> https://dev.w3.org/html5/spec-LC/named-character-references.html says that
> U+00022 has the name "quot" (not "quote").
>
> > Will adding support for "%[string]" to existing
> > --package-metadata option break anything?
>
> It might theoretically break existing use cases.
>
> --package-metadata='{"version":"1.0%2"}'

Are there distros where '%' is an allowed character in a version string or a
package name? I care about backward compatibility, but we can be sensible
about it, and if in practice it's not a problem, then it's fine to do such a
change.
[Bug ld/32003] Specifying --package-metadata might not be possible and is too fragile
https://sourceware.org/bugzilla/show_bug.cgi?id=32003

--- Comment #25 from Benjamin Drung ---
(In reply to Luca Boccassi from comment #24)
> (In reply to Benjamin Drung from comment #23)
> > (In reply to H.J. Lu from comment #14)
> > > (In reply to Benjamin Drung from comment #13)
> > >
> > > Will adding support for "%[string]" to existing
> > > --package-metadata option break anything?
> >
> > It might theoretically break existing use cases.
> >
> > --package-metadata='{"version":"1.0%2"}'
>
> Are there distros where '%' is an allowed character in a version string or a
> package name? I care about backward compatibility, but we can be sensible
> about it, and if in practice it's not a problem, then it's fine to do such a
> change

For all Debian-based distros: % is neither allowed in the package name nor
in the package version. Who wants to check the other 400 distributions?
[Bug ld/32003] Specifying --package-metadata might not be possible and is too fragile
https://sourceware.org/bugzilla/show_bug.cgi?id=32003

--- Comment #26 from Luca Boccassi ---
(In reply to Benjamin Drung from comment #25)
> (In reply to Luca Boccassi from comment #24)
> > (In reply to Benjamin Drung from comment #23)
> > > (In reply to H.J. Lu from comment #14)
> > > > (In reply to Benjamin Drung from comment #13)
> > > >
> > > > Will adding support for "%[string]" to existing
> > > > --package-metadata option break anything?
> > >
> > > It might theoretically break existing use cases.
> > >
> > > --package-metadata='{"version":"1.0%2"}'
> >
> > Are there distros where '%' is an allowed character in a version string or a
> > package name? I care about backward compatibility, but we can be sensible
> > about it, and if in practice it's not a problem, then it's fine to do such a
> > change
>
> For all Debian-based distros: % is neither allowed in the package name nor
> in the package version. Who wants to check the other 400 distributions?

We don't need to check all of them individually fortunately, the package
format is enough - deb is out, I don't think it's allowed in rpm? So if Arch
doesn't allow it either, I'd say we are good? I'm the most invested in
retaining backward compat, but in a pragmatic way.
[Bug ld/32003] Specifying --package-metadata might not be possible and is too fragile
https://sourceware.org/bugzilla/show_bug.cgi?id=32003

--- Comment #27 from Benjamin Drung ---
Taking all comments into account, here are my implementation proposals:

Encoding schema
===

Option 1: Support percent-encoding of the JSON data. Percent-encoding is
widely used and supported (for example, Python provides urllib.parse.unquote
for decoding and urllib.parse.quote for encoding).

Example encoded JSON: '{%22foo%22:%22bar%22}' or
'%7B%22foo%22%3A%22bar%22%7D'

This option has the benefit of being easy to implement. Either the encoded
string can be read directly (I can spot package names and version in there)
or decoded using Python's urllib.parse.unquote or online tools.

Option 2: Support percent-encoding and %[string] (where string refers to the
name in HTML's Named Character References) for the JSON data.

Example encoded JSON: '{%[quot]foo%[quot]:%[quot]bar%[quot]}' or
'{%22foo%22:%22bar%22}'

This option allows people to use %[string] encoding in case they dislike
percent-encoding. The drawback is that it is more work to implement since
there must be a list of names. To make the code simpler, the list of allowed
names might be restricted to, e.g., quot, comma, lbrace, rbrace and maybe
add space.

These are the two options that I would be happy to implement. Supporting
only %[string] would not satisfy me. Using base64 encoding would make the
string shorter, but would not be human readable. Quoted-printable would be
an alternative, but the problematic characters like comma and quotes would
not be encoded by quoted-printable.

Parameter usage
===

Option A: Introduce a new --encoded-package-metadata parameter that takes
the encoded string.

Option B: Extend --package-metadata to always decode the given string. As
previously discussed, package names and version should not contain percent
characters. So this change should not break backward compatibility.

Which of those proposals do you want to see implemented? My initial
implementation is option 1A.
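For reference, percent-decoding as described under option 1 could look roughly like the C sketch below. It is not the implementation mentioned in the comment; the function name `percent_decode` and the choice to reject malformed escapes are assumptions made for illustration.

```c
#include <stddef.h>

/* Return the value of hex digit C, or -1 if C is not a hex digit.  */
static int
hex_value (char c)
{
  if (c >= '0' && c <= '9')
    return c - '0';
  if (c >= 'a' && c <= 'f')
    return c - 'a' + 10;
  if (c >= 'A' && c <= 'F')
    return c - 'A' + 10;
  return -1;
}

/* Decode percent-encoded IN into OUT (capacity OUTSZ, including the
   terminating NUL).  Assumption: a '%' that is not followed by two hex
   digits is treated as an error rather than copied literally.
   Returns 0 on success, -1 on a malformed escape or insufficient space.  */
int
percent_decode (const char *in, char *out, size_t outsz)
{
  size_t o = 0;

  if (outsz == 0)
    return -1;

  while (*in != '\0')
    {
      char c = *in++;

      if (c == '%')
        {
          int hi = hex_value (in[0]);
          /* Only look at the second digit if the first one was valid,
             so we never read past the terminating NUL.  */
          int lo = hi < 0 ? -1 : hex_value (in[1]);
          if (lo < 0)
            return -1;
          c = (char) (hi * 16 + lo);
          in += 2;
        }
      if (o + 1 >= outsz)
        return -1;
      out[o++] = c;
    }
  out[o] = '\0';
  return 0;
}
```

Both encoded examples from the comment decode to `{"foo":"bar"}` with this sketch. Under the strict handling assumed here, an argument such as `{"version":"1.0%2"}` is rejected, which is the backward-compatibility case raised earlier in the thread for option B.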
[Bug ld/32067] ld crash in _bfd_elf_link_keep_memory when building statifier-1.7.4 with -Wl,--oformat,binary
https://sourceware.org/bugzilla/show_bug.cgi?id=32067

Alan Modra changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at sourceware    |amodra at gmail dot com
                   |dot org                     |
             Status|NEW                         |ASSIGNED
[Bug ld/32067] ld crash in _bfd_elf_link_keep_memory when building statifier-1.7.4 with -Wl,--oformat,binary
https://sourceware.org/bugzilla/show_bug.cgi?id=32067

--- Comment #3 from Sourceware Commits ---
The master branch has been updated by Alan Modra:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=ec8f5671b4e70806fe3053636426a8d179dfef55

commit ec8f5671b4e70806fe3053636426a8d179dfef55
Author: Alan Modra
Date:   Sat Aug 10 08:41:16 2024 +0930

    PR32067, ld -Wl,--oformat,binary crash in _bfd_elf_link_keep_memory

    The direct fix for this segfault is to test for a non-NULL bed in
    _bfd_elf_link_keep_memory, but also there isn't much point in running
    code for LTO if the output is binary.

            PR 32067
            * elflink.c (_bfd_elf_link_keep_memory): Test for non-NULL bed.
            (elf_link_add_object_symbols): Don't run the loop setting
            non_ir_ref_regular if the output hash table is not ELF.
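The shape of the "test for a non-NULL bed" guard can be illustrated with a small stand-alone C sketch. This is not the binutils code: every `mock_*` type and function below is a hypothetical stand-in, and the return values are assumptions chosen only to show the pattern of checking the backend pointer before dereferencing it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the ELF backend data.  */
struct mock_backend_data
{
  bool use_mmap;
};

/* Hypothetical stand-in for an output BFD.  BED is NULL when the output
   target is not ELF, as with --oformat binary in the report.  */
struct mock_output
{
  const struct mock_backend_data *bed;
};

/* Hypothetical stand-in for the link info.  */
struct mock_link_info
{
  const struct mock_output *output;
  bool keep_memory;
};

/* Sketch of the guarded check.  Without the "bed != NULL" test, a
   non-ELF output would dereference a NULL pointer, which matches the
   "p bed" result shown in the backtrace.  */
bool
mock_link_keep_memory (const struct mock_link_info *info)
{
  const struct mock_backend_data *bed = info->output->bed;

  if (bed != NULL && bed->use_mmap)
    return false;               /* Assumed result for this sketch.  */

  return info->keep_memory;
}
```

In this sketch a non-ELF output simply falls through to the caller's `keep_memory` setting instead of crashing.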