https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95275
            Bug ID: 95275
           Summary: Possible performance regression in libasan with
                    detect_stack_use_after_return=1
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: sanitizer
          Assignee: unassigned at gcc dot gnu.org
          Reporter: frantisek at sumsal dot cz
                CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
                    jakub at gcc dot gnu.org, kcc at gcc dot gnu.org,
                    marxin at gcc dot gnu.org
  Target Milestone: ---

Hello,

This appears to be part #2 of https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91101, as we managed to hit the issue once again[0], but in a different code path.

Reproducer:

$ git clone https://github.com/systemd/systemd
$ cd systemd
$ git fetch -fu origin refs/pull/15886/merge:pr
$ git checkout pr
$ meson build-gcc -Db_sanitize=address,undefined -Dfuzz-tests=true --optimization=1
$ ninja -C build-gcc
$ export ASAN_OPTIONS=strict_string_checks=1:detect_stack_use_after_return=1:check_initialization_order=1:strict_init_order=1
$ export UBSAN_OPTIONS=print_stacktrace=1:print_summary=1:halt_on_error=1
$ time build-gcc/fuzz-unit-file\:address\,undefined test/fuzz/fuzz-unit-file/oss-fuzz-11569

Results:

### gcc (GCC) 10.0.1 20200328 (Red Hat 10.0.1-0.11)

$ time build-gcc/fuzz-unit-file\:address\,undefined test/fuzz/fuzz-unit-file/oss-fuzz-11569
test/fuzz/fuzz-unit-file/oss-fuzz-11569... ok

real    3m22.804s
user    3m18.725s
sys     0m0.245s

### gcc (GCC) 10.0.1 20200328 (Red Hat 10.0.1-0.11) with detect_stack_use_after_return=0

$ export ASAN_OPTIONS=strict_string_checks=1:detect_stack_use_after_return=0:check_initialization_order=1:strict_init_order=1
$ time build-gcc/fuzz-unit-file\:address\,undefined test/fuzz/fuzz-unit-file/oss-fuzz-11569
test/fuzz/fuzz-unit-file/oss-fuzz-11569... ok

real    0m2.803s
user    0m2.731s
sys     0m0.060s

### clang version 10.0.0 (Fedora 10.0.0-0.3.rc4.fc32) for comparison

$ time build-clang/fuzz-unit-file\:address\,undefined test/fuzz/fuzz-unit-file/oss-fuzz-11569
test/fuzz/fuzz-unit-file/oss-fuzz-11569... ok

real    0m3.222s
user    0m3.104s
sys     0m0.089s

### perf

# Samples: 862K of event 'cycles:u'
# Event count (approx.): 873078595486
#
# Overhead  Command          Shared Object             Symbol
# ........  ...............  ........................  .........................................
#
    95.14%  fuzz-unit-file:  libasan.so.6.0.0          [.] __asan_stack_malloc_0
     3.13%  fuzz-unit-file:  libasan.so.6.0.0          [.] __asan_stack_malloc_1
     0.18%  fuzz-unit-file:  libc-2.31.so              [.] __strlen_avx2
     0.17%  fuzz-unit-file:  libasan.so.6.0.0          [.] __asan_stack_malloc_2
     0.15%  fuzz-unit-file:  libasan.so.6.0.0          [.] __asan_region_is_poisoned
     0.07%  fuzz-unit-file:  libc-2.31.so              [.] __strchr_avx2
     0.06%  fuzz-unit-file:  libsystemd-shared-245.so  [.] utf8_encoded_valid_unichar
...

### perf (call graph)

# Samples: 804K of event 'cycles:u'
# Event count (approx.): 811179153673
#
# Children      Self  Command          Shared Object             Symbol
# ........  ........  ...............  ........................  .........................................
#
    99.97%     0.00%  fuzz-unit-file:  fuzz-unit-file            [.] LLVMFuzzerTestOneInput
            |
            ---LLVMFuzzerTestOneInput
               |
                --99.57%--config_parse
                          |
                           --99.56%--parse_line
                                     |
                                     |--93.39%--next_assignment (inlined)
                                     |          |
                                     |           --93.39%--config_parse_unit_requires_mounts_for
                                     |                     |
                                     |                     |--88.89%--path_simplify_and_warn
                                     |                     |          |
                                     |                     |           --88.87%--utf8_is_valid
                                     |                     |                     |
                                     |                     |                      --88.85%--utf8_encoded_valid_unichar
                                     |                     |                                |
                                     |                     |                                 --88.79%--__asan_stack_malloc_0
                                     |                     |
                                     |                      --4.09%--unit_full_printf
                                     |                               specifier_printf
                                     |                               |
                                     |                                --3.96%--specifier_cgroup
                                     |                                         |
                                     |                                          --3.96%--unit_default_cgroup_path
                                     |                                                   |
                                     |                                                   |--2.10%--cg_slice_to_path
                                     |                                                   |         |
                                     |                                                   |         |--0.76%--strextend_with_separator
                                     |                                                   |         |         |
                                     |                                                   |         |          --0.73%--__asan_stack_malloc_1
                                     |                                                   |         |
                                     |                                                   |          --0.73%--cg_escape
                                     |                                                   |                   |
                                     |                                                   |                    --0.68%--__asan_stack_malloc_1
                                     |                                                   |
                                     |                                                    --0.74%--unit_has_name
                                     |                                                             |
                                     |                                                              --0.74%--set_contains (inlined)
                                     |                                                                       |
                                     |                                                                        --0.74%--internal_hashmap_contains
                                     |                                                                                 |
                                     |                                                                                  --0.74%--base_bucket_hash
                                     |                                                                                            |
                                     |                                                                                             --0.70%--__asan_stack_malloc_1
                                     |
                                      --6.15%--utf8_is_valid
                                               |
                                                --6.15%--utf8_encoded_valid_unichar
                                                          |
                                                           --6.15%--__asan_stack_malloc_0

As mentioned in [0], compiling with -O2 makes no difference.

I'd attach the full perf reports, but they're quite large (33M for the 'base' one, ~6.5G for the call graph).

[0] https://github.com/systemd/systemd/pull/15886#issuecomment-632689604
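
To put the figures above into a single pair of numbers, here is a small sketch (Python, not part of the original report) that adds up the flat-profile share of the `__asan_stack_malloc_*` symbols and computes the wall-clock ratio between the two GCC runs. The values are copied by hand from the perf output and the `time` results in this report; the names `FLAT_PROFILE`, `fake_stack_share` and `to_seconds` are made up for illustration.

```python
import re

# (overhead %, symbol) pairs copied from the flat perf report in this bug.
FLAT_PROFILE = [
    (95.14, "__asan_stack_malloc_0"),
    (3.13, "__asan_stack_malloc_1"),
    (0.18, "__strlen_avx2"),
    (0.17, "__asan_stack_malloc_2"),
    (0.15, "__asan_region_is_poisoned"),
    (0.07, "__strchr_avx2"),
    (0.06, "utf8_encoded_valid_unichar"),
]


def fake_stack_share(profile):
    """Sum the overhead of every __asan_stack_malloc_N entry."""
    return sum(
        pct for pct, sym in profile if sym.startswith("__asan_stack_malloc_")
    )


def to_seconds(stamp):
    """Convert a shell `time` value such as '3m22.804s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unexpected time format: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


if __name__ == "__main__":
    share = fake_stack_share(FLAT_PROFILE)
    # 'real' times of the GCC runs with the option set to 1 and to 0.
    slowdown = to_seconds("3m22.804s") / to_seconds("0m2.803s")
    print(f"__asan_stack_malloc_*: {share:.2f}% of sampled cycles")
    print(f"slowdown with detect_stack_use_after_return=1: {slowdown:.1f}x")
```

With the numbers from this report, that comes to about 98.4% of the sampled cycles spent in `__asan_stack_malloc_*` and roughly a 72x slowdown in real time.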