Am Dienstag, dem 23.09.2025 um 08:57 +0200 schrieb Richard Biener:
> On Tue, 23 Sep 2025, Martin Uecker wrote:
>
> > Am Montag, dem 22.09.2025 um 11:00 +0200 schrieb Richard Biener:
> > > On Sat, 20 Sep 2025, Martin Uecker wrote:
> > >
> > > > Bootstrapped and regression tested on x86_64.
> > >
> > > How's it useful when all UBSAN generated traps get a compile-time
> > > diagnostic? Specifically for the testcase you add, how is this
> > > diagnostig useful?
> >
> > My intention was that this works for all traps the compiler
> > generates, so this would natually include UBsan generated
> > traps. I think it would be rather strange to exclude them. The
> > testcase is not mean to illustrate a use case.
> >
> > I personally find it useful to better understand what UBsan
> > does to the code. Clearly, if you simply apply it to a regular
> > C program compiled with UBsan this will give you a lot of warnings,
> > and this will not be too useful except maybe for statistics and
> > overall impact. But in more specific cases it is useful to
> > better understand whether some error scenario is instrumented and
> > whether it can actually occur or not.
>
> Hmm, but this then sounds like a debugging feature, not something
> for a diagnostic? What's wrong with notes in dump files? It
> _might_ also fit -fopt-info (though it's not really an "optimization").
Often I am interested in what I need to fix in the source code to avoid
the trap (because this may not be acceptable, e.g. when controlling some
hardware or because I want to remove performance impact of having the
conditional check). So, I like a diagnostic because it ties it back to
the source code and existing tools can handle this nicely. For example,
consider this example:
maybe(int) safe_divide(int a, int b)
{
if (b == 0)
return maybe_nothing(int);
return maybe_just(int, a / b);
}
I would expect that compiling with
-fsanitize=undefined -fsanitize-trap=all -O2
comes up clean because I tested for zero and I expect the later trap to
be removed by the optimizer. But with the warning it tells me:
div.c: In function ‘safe_divide’:
div.c:13:34: warning: trap generated: "builtin" [-Wtrap]
13 | return maybe_just(int, a / b);
| ^
When I also add the missing check for overflow there is no trap
anymore and the code then does what I expect it to do. This is
why I think a diagnostic is the right tool.
https://godbolt.org/z/1cajbs481
Before even running the code, this would give me useful information
about potential issues. If I want to find such issues only using
run-time testing with the sanitizer, my best shot would be to use
a fuzzer, but this still misses many cases. People have been looking
at the assembly output instead, but this is not very nice and misses
too much context (it would be nice though to propagate the reason
string also into the assembly, but I do not know enough how this
can be done.)
One could consider this a misuse of the sanitizer with optimization,
but it in practice this works very well as GCC is very good in
removing unnecessary sanitzer checks. (Most instrumentation is
done early, so sometimes I also get better code with the sanitizer
because it injects additional information the optimizer can then
exploit.)
>
> I'd find it equally useful, in the UBSAN context, when the compiler
> was able to (fully) elide a UBSAN instrumentation.
You would get this with -O0, but I am not sure what your usecase is.
In WG14 we discuss having an optional safety mode where one would get
a warning about potentially unsafe code. But this would be a FE thing.
Martin
>
> Richard.
>
> > >
> > > I'd instead have expected path isolation to diagnose inserted traps.
> >
> > It will be diagnosed by the patch, if I am not missing anything.
> > But yes, this should get its own reason string as well.
> >
> > Martin
> >
> >
> > >
> > > Richard.
> > >
> > > >
> > > > Add reason string to compiler emitted traps.
> > > >
> > > > For traps generated by the compiler, add a short string stating
> > > > the reason. This string will be shown with -Wtrap when the trap
> > > > is actually emitted.
> > > >
> > > > gcc/ChangeLog:
> > > > * builtins.cc (expand_builtin_trap): Add string argument.
> > > > (expand_builtin): Add reasons for traps.
> > > > (expand_builtin_object_size): Don't emit message after
> > > > error.
> > > > * expr.cc (expand_assignment): Add reason for trap.
> > > > * stmt.cc (expand_sjlj_dispatch_table): Add reason for trap.
> > > > * ubsan.cc (ubsan_build_string): New function.
> > > > (ubsan_build_string_ptr): New function.
> > > > (ubsan_trap): New functions.
> > > > (ubsan_source_location): Use ubsan_build_string_ptr.
> > > > (ubsan_type_descriptor): Use ubsan_build_string.
> > > > (ubsan_expand_bounds_ifn): Use ubsan_trap.
> > > > (ubsan_expand_bounds_null): Dito.
> > > > (ubsan_expand_objsize_ifn): Dito.
> > > > (ubsan_expand_ptr_ifn): Dito.
> > > > (instrument_bool_enum_load): Dito.
> > > > (instrument_nonnull_arg): Dito.
> > > > (instrument_nonnull_return): Dito.
> > > > (instrument_builtin): Dito.
> > > > (ubsan_build_overflow_builtin): Add message.
> > > > (ubsan_instrument_float_case): Dito.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > > * gcc.dg/Wtrap-2.c: New test.
> > > >
> > > > diff --git a/gcc/builtins.cc b/gcc/builtins.cc
> > > > index de5c96f3a25..2e3ac9df86d 100644
> > > > --- a/gcc/builtins.cc
> > > > +++ b/gcc/builtins.cc
> > > > @@ -5968,10 +5968,10 @@ expand_builtin_trap_no_msg (void)
> > > > /* Expand a call to __builtin_trap. */
> > > >
> > > > void
> > > > -expand_builtin_trap ()
> > > > +expand_builtin_trap (const char *reason)
> > > > {
> > > > if (warn_trap)
> > > > - warning_at (input_location, OPT_Wtrap, "trap generated");
> > > > + warning_at (input_location, OPT_Wtrap, "trap generated:
> > > > \"%.99s\"", reason);
> > > >
> > > > return expand_builtin_trap_no_msg ();
> > > > }
> > > > @@ -8448,8 +8448,11 @@ expand_builtin (tree exp, rtx target, rtx
> > > > subtarget, machine_mode mode,
> > > > return const0_rtx;
> > > >
> > > > case BUILT_IN_TRAP:
> > > > + expand_builtin_trap ("builtin");
> > > > + return const0_rtx;
> > > > +
> > > > case BUILT_IN_UNREACHABLE_TRAP:
> > > > - expand_builtin_trap ();
> > > > + expand_builtin_trap ("unreachable");
> > > > return const0_rtx;
> > > >
> > > > case BUILT_IN_UNREACHABLE:
> > > > @@ -11554,7 +11557,7 @@ expand_builtin_object_size (tree exp)
> > > > {
> > > > error ("first argument of %qD must be a pointer, second integer
> > > > constant",
> > > > fndecl);
> > > > - expand_builtin_trap ();
> > > > + expand_builtin_trap_no_msg ();
> > > > return const0_rtx;
> > > > }
> > > >
> > > > @@ -11567,7 +11570,7 @@ expand_builtin_object_size (tree exp)
> > > > {
> > > > error ("last argument of %qD is not integer constant between 0
> > > > and 3",
> > > > fndecl);
> > > > - expand_builtin_trap ();
> > > > + expand_builtin_trap_no_msg ();
> > > > return const0_rtx;
> > > > }
> > > >
> > > > diff --git a/gcc/builtins.h b/gcc/builtins.h
> > > > index 5a553a9c836..ae4ba4a9b96 100644
> > > > --- a/gcc/builtins.h
> > > > +++ b/gcc/builtins.h
> > > > @@ -129,7 +129,7 @@ extern tree std_build_builtin_va_list (void);
> > > > extern tree std_fn_abi_va_list (tree);
> > > > extern tree std_canonical_va_list_type (tree);
> > > > extern void std_expand_builtin_va_start (tree, rtx);
> > > > -extern void expand_builtin_trap (void);
> > > > +extern void expand_builtin_trap (const char *);
> > > > extern void expand_ifn_atomic_bit_test_and (gcall *);
> > > > extern void expand_ifn_atomic_compare_exchange (gcall *);
> > > > extern void expand_ifn_atomic_op_fetch_cmp_0 (gcall *);
> > > > diff --git a/gcc/expr.cc b/gcc/expr.cc
> > > > index 4a699101bb5..c6fc3b25f42 100644
> > > > --- a/gcc/expr.cc
> > > > +++ b/gcc/expr.cc
> > > > @@ -6102,7 +6102,7 @@ expand_assignment (tree to, tree from, bool
> > > > nontemporal)
> > > > user code. Translate this to a trap instead of
> > > > ICEing. */
> > > > if (TREE_CODE (offset) == INTEGER_CST)
> > > > {
> > > > - expand_builtin_trap ();
> > > > + expand_builtin_trap ("invalid offset");
> > > > to_rtx = gen_rtx_MEM (BLKmode, const0_rtx);
> > > > }
> > > > /* Else spill for variable offset to the destination. We
> > > > expect
> > > > diff --git a/gcc/stmt.cc b/gcc/stmt.cc
> > > > index 7942aa3e484..2e95b207603 100644
> > > > --- a/gcc/stmt.cc
> > > > +++ b/gcc/stmt.cc
> > > > @@ -1393,7 +1393,7 @@ expand_sjlj_dispatch_table (rtx dispatch_index,
> > > > }
> > > >
> > > > /* Dispatching something not handled? Trap! */
> > > > - expand_builtin_trap ();
> > > > + expand_builtin_trap ("dispatch not handled");
> > > >
> > > > reorder_insns (NEXT_INSN (before_case), get_last_insn (),
> > > > before_case);
> > > >
> > > > diff --git a/gcc/testsuite/gcc.dg/Wtrap-2.c
> > > > b/gcc/testsuite/gcc.dg/Wtrap-2.c
> > > > new file mode 100644
> > > > index 00000000000..ce2e596d947
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/Wtrap-2.c
> > > > @@ -0,0 +1,13 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-Wtrap -fsanitize=bounds -fsanitize-trap=bounds" } */
> > > > +
> > > > +void foo(int n, char (*buf)[n])
> > > > +{
> > > > + (*buf)[10] = 1; /* { dg-warning "trap
> > > > generated: \"bounds (sanitizer)\"" } */
> > > > +}
> > > > +
> > > > +void bar()
> > > > +{
> > > > + __builtin_unreachable(); /* { dg-warning "trap
> > > > generated: \"unreachable\"" } */
> > > > +}
> > > > +
> > > > diff --git a/gcc/ubsan.cc b/gcc/ubsan.cc
> > > > index 6d748258b1e..801f4c7830e 100644
> > > > --- a/gcc/ubsan.cc
> > > > +++ b/gcc/ubsan.cc
> > > > @@ -303,6 +303,35 @@ ubsan_get_source_location_type (void)
> > > > return ret;
> > > > }
> > > >
> > > > +
> > > > +static tree
> > > > +ubsan_build_string (const char *cstr)
> > > > +{
> > > > + size_t len = strlen (cstr) + 1;
> > > > + tree str = build_string (len, cstr);
> > > > + TREE_TYPE (str) = build_array_type_nelts (char_type_node, len);
> > > > + TREE_READONLY (str) = 1;
> > > > + TREE_STATIC (str) = 1;
> > > > + return str;
> > > > +}
> > > > +
> > > > +static tree
> > > > +ubsan_build_string_ptr (const char *cstr)
> > > > +{
> > > > + return build_fold_addr_expr (ubsan_build_string (cstr));
> > > > +}
> > > > +
> > > > +/* Helper routine that expands a trap with a message. */
> > > > +
> > > > +static gimple *
> > > > +ubsan_trap (const char *cstr)
> > > > +{
> > > > + return gimple_build_call (builtin_decl_explicit (BUILT_IN_TRAP_MSG),
> > > > + 1, ubsan_build_string_ptr (cstr));
> > > > +}
> > > > +
> > > > +
> > > > +
> > > > /* Helper routine that returns a CONSTRUCTOR of __ubsan_source_location
> > > > type with its fields filled from a location_t LOC. */
> > > >
> > > > @@ -323,12 +352,7 @@ ubsan_source_location (location_t loc)
> > > > else
> > > > {
> > > > /* Fill in the values from LOC. */
> > > > - size_t len = strlen (xloc.file) + 1;
> > > > - str = build_string (len, xloc.file);
> > > > - TREE_TYPE (str) = build_array_type_nelts (char_type_node, len);
> > > > - TREE_READONLY (str) = 1;
> > > > - TREE_STATIC (str) = 1;
> > > > - str = build_fold_addr_expr (str);
> > > > + str = ubsan_build_string_ptr (xloc.file);
> > > > }
> > > > tree ctor = build_constructor_va (type, 3, NULL_TREE, str, NULL_TREE,
> > > > build_int_cst (unsigned_type_node,
> > > > @@ -542,11 +566,7 @@ ubsan_type_descriptor (tree type, enum
> > > > ubsan_print_style pstyle)
> > > >
> > > > /* Create a new VAR_DECL of type descriptor. */
> > > > const char *tmp = pp_formatted_text (&pretty_name);
> > > > - size_t len = strlen (tmp) + 1;
> > > > - tree str = build_string (len, tmp);
> > > > - TREE_TYPE (str) = build_array_type_nelts (char_type_node, len);
> > > > - TREE_READONLY (str) = 1;
> > > > - TREE_STATIC (str) = 1;
> > > > + tree str = ubsan_build_string (tmp);
> > > >
> > > > decl = build_decl (UNKNOWN_LOCATION, VAR_DECL,
> > > > generate_internal_label ("Lubsan_type"), dtype);
> > > > @@ -786,7 +806,7 @@ ubsan_expand_bounds_ifn (gimple_stmt_iterator *gsi)
> > > > /* Generate __ubsan_handle_out_of_bounds call. */
> > > > *gsi = gsi_after_labels (then_bb);
> > > > if (flag_sanitize_trap & SANITIZE_BOUNDS)
> > > > - g = gimple_build_call (builtin_decl_explicit (BUILT_IN_TRAP), 0);
> > > > + g = ubsan_trap ("bounds (sanitizer)");
> > > > else
> > > > {
> > > > tree data
> > > > @@ -902,7 +922,7 @@ ubsan_expand_null_ifn (gimple_stmt_iterator *gsip)
> > > > /* Put the ubsan builtin call into the newly created BB. */
> > > > if (flag_sanitize_trap & ((check_align ? SANITIZE_ALIGNMENT + 0 : 0)
> > > > | (check_null ? SANITIZE_NULL + 0 : 0)))
> > > > - g = gimple_build_call (builtin_decl_implicit (BUILT_IN_TRAP), 0);
> > > > + g = ubsan_trap ("alignment / null (sanitizer)");
> > > > else
> > > > {
> > > > enum built_in_function bcode
> > > > @@ -1072,7 +1092,7 @@ ubsan_expand_objsize_ifn (gimple_stmt_iterator
> > > > *gsi)
> > > >
> > > > /* Generate __ubsan_handle_type_mismatch call. */
> > > > if (flag_sanitize_trap & SANITIZE_OBJECT_SIZE)
> > > > - g = gimple_build_call (builtin_decl_explicit (BUILT_IN_TRAP),
> > > > 0);
> > > > + g = ubsan_trap ("object size (sanitizer)");
> > > > else
> > > > {
> > > > tree data
> > > > @@ -1218,7 +1238,7 @@ ubsan_expand_ptr_ifn (gimple_stmt_iterator *gsip)
> > > >
> > > > /* Put the ubsan builtin call into the newly created BB. */
> > > > if (flag_sanitize_trap & SANITIZE_POINTER_OVERFLOW)
> > > > - g = gimple_build_call (builtin_decl_implicit (BUILT_IN_TRAP), 0);
> > > > + g = ubsan_trap ("pointer overflow (sanitizer)");
> > > > else
> > > > {
> > > > enum built_in_function bcode
> > > > @@ -1601,8 +1621,11 @@ ubsan_build_overflow_builtin (tree_code code,
> > > > location_t loc, tree lhstype,
> > > > tree op0, tree op1, tree *datap)
> > > > {
> > > > if (flag_sanitize_trap & SANITIZE_SI_OVERFLOW)
> > > > - return build_call_expr_loc (loc, builtin_decl_explicit
> > > > (BUILT_IN_TRAP), 0);
> > > > -
> > > > + {
> > > > + tree reason = ubsan_build_string_ptr ("signed overflow
> > > > (sanitizer)");
> > > > + return build_call_expr_loc (loc, builtin_decl_explicit
> > > > (BUILT_IN_TRAP_MSG),
> > > > + 1, reason);
> > > > + }
> > > > tree data;
> > > > if (datap && *datap)
> > > > data = *datap;
> > > > @@ -1830,7 +1853,7 @@ instrument_bool_enum_load (gimple_stmt_iterator
> > > > *gsi)
> > > > gsi2 = gsi_after_labels (then_bb);
> > > > if (flag_sanitize_trap & (TREE_CODE (type) == BOOLEAN_TYPE
> > > > ? SANITIZE_BOOL : SANITIZE_ENUM))
> > > > - g = gimple_build_call (builtin_decl_explicit (BUILT_IN_TRAP), 0);
> > > > + g = ubsan_trap ("bool / enum (sanitizer)");
> > > > else
> > > > {
> > > > tree data = ubsan_create_data ("__ubsan_invalid_value_data", 1,
> > > > &loc,
> > > > @@ -1991,7 +2014,11 @@ ubsan_instrument_float_cast (location_t loc,
> > > > tree type, tree expr)
> > > > return NULL_TREE;
> > > >
> > > > if (flag_sanitize_trap & SANITIZE_FLOAT_CAST)
> > > > - fn = build_call_expr_loc (loc, builtin_decl_explicit
> > > > (BUILT_IN_TRAP), 0);
> > > > + {
> > > > + tree reason = ubsan_build_string_ptr ("float cast (sanitizer)");
> > > > + fn = build_call_expr_loc (loc, builtin_decl_explicit
> > > > (BUILT_IN_TRAP_MSG),
> > > > + 1, reason);
> > > > + }
> > > > else
> > > > {
> > > > location_t *loc_ptr = NULL;
> > > > @@ -2102,7 +2129,7 @@ instrument_nonnull_arg (gimple_stmt_iterator *gsi)
> > > > *gsi = gsi_after_labels (then_bb);
> > > > }
> > > > if (flag_sanitize_trap & SANITIZE_NONNULL_ATTRIBUTE)
> > > > - g = gimple_build_call (builtin_decl_explicit
> > > > (BUILT_IN_TRAP), 0);
> > > > + g = ubsan_trap ("nonnull (sanitizer)");
> > > > else
> > > > {
> > > > tree data = ubsan_create_data ("__ubsan_nonnull_arg_data",
> > > > @@ -2158,7 +2185,7 @@ instrument_nonnull_return (gimple_stmt_iterator
> > > > *gsi)
> > > >
> > > > *gsi = gsi_after_labels (then_bb);
> > > > if (flag_sanitize_trap & SANITIZE_RETURNS_NONNULL_ATTRIBUTE)
> > > > - g = gimple_build_call (builtin_decl_explicit (BUILT_IN_TRAP),
> > > > 0);
> > > > + g = ubsan_trap ("returns nonnull (sanitizer)");
> > > > else
> > > > {
> > > > tree data = ubsan_create_data ("__ubsan_nonnull_return_data",
> > > > @@ -2404,7 +2431,7 @@ instrument_builtin (gimple_stmt_iterator *gsi)
> > > >
> > > > *gsi = gsi_after_labels (then_bb);
> > > > if (flag_sanitize_trap & SANITIZE_BUILTIN)
> > > > - g = gimple_build_call (builtin_decl_explicit
> > > > (BUILT_IN_TRAP), 0);
> > > > + g = ubsan_trap ("builtin (sanitizer)");
> > > > else
> > > > {
> > > > tree t = build_int_cst (unsigned_char_type_node, kind);
> > > >
> >