Matthew Malcomson <matthew.malcom...@arm.com> writes:
> @@ -133,6 +137,13 @@ enum asan_mark_flags
>  #undef DEF
>  };
>  
> +enum hwasan_mark_flags
> +{
> +#define DEF(X) HWASAN_MARK_##X
> +  IFN_ASAN_MARK_FLAGS
> +#undef DEF
> +};
Are these used anywhere?  It looks like expand_HWASAN_MARK uses the
plain asan versions.

> @@ -640,6 +684,85 @@ handle_builtin_alloca (gcall *call, gimple_stmt_iterator *iter)
>      = DECL_FUNCTION_CODE (callee) == BUILT_IN_ALLOCA
>        ? 0 : tree_to_uhwi (gimple_call_arg (call, 1));
>  
> +  if (hwasan_sanitize_allocas_p ())
> +    {
> +      /*
> +         HWASAN needs a different expansion.
> +
> +         addr = __builtin_alloca (size, align);
> +
> +         should be replaced by
> +
> +         new_size = size rounded up to HWASAN_TAG_GRANULE_SIZE byte alignment;
> +         untagged_addr = __builtin_alloca (new_size, align);
> +         tag = __hwasan_choose_alloca_tag ();
> +         addr = __hwasan_tag_pointer (untagged_addr, tag);
> +         __hwasan_tag_memory (untagged_addr, tag, new_size);
> +      */
> +      /* Ensure alignment at least HWASAN_TAG_GRANULE_SIZE bytes so we start on
> +         a tag granule.  */
> +      align = align > HWASAN_TAG_GRANULE_SIZE ? align : HWASAN_TAG_GRANULE_SIZE;
> +
> +      uint8_t tg_mask = HWASAN_TAG_GRANULE_SIZE - 1;
> +      /* tree new_size = (old_size + tg_mask) & ~tg_mask; */
> +      tree old_size = gimple_call_arg (call, 0);
> +      tree tree_mask = build_int_cst (size_type_node, tg_mask);
> +      g = gimple_build_assign (make_ssa_name (size_type_node), PLUS_EXPR,
> +                               old_size, tree_mask);
> +      gsi_insert_before (iter, g, GSI_SAME_STMT);
> +      tree oversize = gimple_assign_lhs (g);
> +
> +      g = gimple_build_assign (make_ssa_name (size_type_node), BIT_NOT_EXPR,
> +                               tree_mask);
> +      tree mask = gimple_assign_lhs (g);
> +      gsi_insert_before (iter, g, GSI_SAME_STMT);

Seems simpler to use:

  tree mask = build_int_cst (size_type_node, -HWASAN_TAG_GRANULE_SIZE);

> +
> +      g = gimple_build_assign (make_ssa_name (size_type_node), BIT_AND_EXPR,
> +                               oversize, mask);
> +      gsi_insert_before (iter, g, GSI_SAME_STMT);
> +      tree new_size = gimple_assign_lhs (g);
> +
> +      /* emit the alloca call */
> +      tree fn = builtin_decl_implicit (BUILT_IN_ALLOCA_WITH_ALIGN);
> +      gg = gimple_build_call (fn, 2, new_size,
> +                              build_int_cst (size_type_node, align));
> +      tree untagged_addr = make_ssa_name (ptr_type, gg);
> +      gimple_call_set_lhs (gg, untagged_addr);
> +      gsi_insert_before (iter, gg, GSI_SAME_STMT);
> +
> +      /* Insert code choosing the tag.
> +         Here we use an internal function so we can choose the tag at expand
> +         time.  We need the decision to be made after stack variables have been
> +         assigned their tag (i.e. once the tag_offset variable has been set to
> +         one after the last stack variables tag).  */
> +
> +      gg = gimple_build_call_internal (IFN_HWASAN_CHOOSE_TAG, 0);
> +      tree tag = make_ssa_name (unsigned_char_type_node, gg);
> +      gimple_call_set_lhs (gg, tag);
> +      gsi_insert_before (iter, gg, GSI_SAME_STMT);
> +
> +      /* Insert code adding tag to pointer.  */
> +      fn = builtin_decl_implicit (BUILT_IN_HWASAN_TAG_PTR);
> +      gg = gimple_build_call (fn, 2, untagged_addr, tag);
> +      tree addr = make_ssa_name (ptr_type, gg);
> +      gimple_call_set_lhs (gg, addr);
> +      gsi_insert_before (iter, gg, GSI_SAME_STMT);
> +
> +      /* Insert code tagging shadow memory.
> +         NOTE: require using `untagged_addr` here for libhwasan API.  */
> +      fn = builtin_decl_implicit (BUILT_IN_HWASAN_TAG_MEM);
> +      gg = gimple_build_call (fn, 3, untagged_addr, tag, new_size);
> +      gsi_insert_before (iter, gg, GSI_SAME_STMT);
> +
> +      /* Finally, replace old alloca ptr with NEW_ALLOCA.  */
> +      replace_call_with_value (iter, addr);
> +      return;
> +    }
> +
> +  tree last_alloca = get_last_alloca_addr ();
> +  const HOST_WIDE_INT redzone_mask = ASAN_RED_ZONE_SIZE - 1;
> +
> +
>    /* If ALIGN > ASAN_RED_ZONE_SIZE, we embed left redzone into first ALIGN
>       bytes of allocated space.  Otherwise, align alloca to ASAN_RED_ZONE_SIZE
>       manually.  */
> @@ -792,6 +915,31 @@ get_mem_refs_of_builtin_call (gcall *call,
>        break;
>  
>      case BUILT_IN_STRLEN:
> +      /* Special case strlen here since its length is taken from its return
> +         value.
> +
> +         The approach taken by the sanitizers is to check a memory access
> +         before it's taken.  For ASAN strlen is intercepted by libasan, so no
> +         check is inserted by the compiler.
> +
> +         This function still returns `true` and provides a length to the rest
> +         of the ASAN pass in order to record what areas have been checked,
> +         avoiding superfluous checks later on.
> +
> +         HWASAN does not intercept any of these internal functions.
> +         This means that checks for memory accesses must be inserted by the
> +         compiler.
> +         strlen is a special case, because we can tell the length from the
> +         return of the function, but that is not known until after the function
> +         has returned.
> +
> +         Hence we can't check the memory access before it happens.
> +         We could check the memory access after it has already happened, but
> +         for now I'm choosing to just ignore `strlen` calls.
> +         This decision was simply made because that means the special case is
> +         limited to this one case of this one function.  */
> +      if (hwasan_sanitize_p ())
> +        return false;

What does LLVM do here?

s/I'm choosing/we choose/

> @@ -1361,6 +1509,149 @@ asan_redzone_buffer::flush_if_full (void)
>      flush_redzone_payload ();
>  }
>  
> +
> +/* HWAddressSanitizer (hwasan) is a probabilistic method for detecting
> +   out-of-bounds and use-after-free bugs.
> +   Read more:
> +   http://code.google.com/p/address-sanitizer/
> +
> +   Similar to AddressSanitizer (asan) it consists of two parts: the
> +   instrumentation module in this file, and a run-time library.
> +
> +   The instrumentation module adds a run-time check before every memory insn in
> +   the same manner as asan (see the block comment for AddressSanitizer above).
> +   Currently, hwasan only adds out-of-line instrumentation, where each check is
> +   implemented as a function call to the run-time library.  Hence a check for a
> +   load of N bytes from address X would be implemented with a function call to
> +   __hwasan_loadN(X), and checking a store of N bytes from address X would be
> +   implemented with a function call to __hwasan_storeN(X).
> +
> +   The main difference between hwasan and asan is in the information stored to
> +   help this checking.  Both sanitizers use a shadow memory area which stores
> +   data recording the state of main memory at a corresponding address.
> +
> +   For hwasan, each 16 byte granule in main memory has a corresponding 1 byte
> +   in shadow memory.  This shadow address can be calculated with equation:
> +     (addr >> HWASAN_TAG_SHIFT_SIZE) + __hwasan_shadow_memory_dynamic_address;
> +   The conversion between real and shadow memory for asan is given in the block
> +   comment at the top of this file.
> +   The description of how this shadow memory is laid out for asan is in the

Maybe “is also in”?

> +   block comment at the top of this file, here we describe how this shadow
> +   memory is used for hwasan.
> +
> +   For hwasan, each variable is assigned a byte-sized 'tag'.  The extent of
> +   the shadow memory for that variable is filled with the assigned tag, and
> +   every pointer referencing that variable has its top byte set to the same
> +   tag.  The run-time library redefines malloc so that every allocation returns
> +   a tagged pointer and tags the corresponding shadow memory with the same tag.
> +
> +   On each pointer dereference the tag found in the pointer is compared to the
> +   tag found in the shadow memory corresponding to the accessed memory address.
> +   If these tags are found to differ then this memory access is judged to be
> +   invalid and a report is generated.
> +
> +   This method of bug detection is not perfect -- it can not catch every bad
> +   access -- but catches them probabilistically instead.  There is always the
> +   possibility that an invalid memory access will happen to access memory
> +   tagged with the same tag as the pointer that this access used.
> +   The chances of this are approx. 0.4% for any two uncorrelated objects.
> +
> +   Random tag generation can mitigate this problem by decreasing the
> +   probability that an invalid access will be missed in the same manner over
> +   multiple runs.  i.e. if two objects are tagged the same in one run of the
> +   binary they are unlikely to be tagged the same in the next run.
> +   Both heap and stack allocated objects have random tags by default.
> +
> +   [16 byte granule implications]
> +   Since the shadow memory only has a resolution on real memory of 16 bytes,
> +   invalid accesses that are within the same 16 byte granule as a valid
> +   address will not be caught.
> +
> +   There is a "short-granule" feature in the runtime library which does catch
> +   such accesses, but this feature is not implemented for stack objects (since
> +   stack objects are allocated and tagged by compiler instrumentation, and
> +   this feature has not yet been implemented in GCC instrumentation).
> +
> +   Another outcome of this 16 byte resolution is that each tagged object must
> +   be 16 byte aligned.  If two objects were to share any 16 byte granule in
> +   memory, then they both would have to be given the same tag, and invalid
> +   accesses to one using a pointer to the other would be undetectable.
> +
> +   [Compiler instrumentation]
> +   Compiler instrumentation ensures that two adjacent buffers on the stack are
> +   given different tags, this means an access to one buffer using a pointer
> +   generated from the other (e.g. through buffer overrun) will have mismatched
> +   tags and be caught by hwasan.
> +
> +   We don't randomly tag every object on the stack, since that would require
> +   keeping many registers to record each tag.  Instead we randomly generate a
> +   tag for each function frame, and each new stack object uses a tag offset
> +   from that frame tag.
> +   i.e. each object is tagged as RFT + offset, where RFT is the "random frame
> +   tag" generated for this frame.

So randomisation isn't effective at perturbing the pairs of mismatching
tags within a frame, but that is mitigated by objects with the same tag
within a frame being very far apart.  Is that right?
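For what it's worth, this is the model I have in mind (purely an
illustrative sketch with made-up names, not code from the patch):

  /* Per-frame stack tagging as I understand it; random_byte and the
     object numbering are hypothetical.  */
  uint8_t rft = random_byte ();  /* "random frame tag", chosen per frame */
  uint8_t tag_a = rft + 1;       /* first stack object in the frame */
  uint8_t tag_b = rft + 2;       /* second stack object in the frame */

So the difference between the tags of two objects in the same frame is
fixed across runs; randomisation only moves the base value, and two
objects in one frame can only share a tag if their offsets coincide
modulo 256 (for a byte-sized tag).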
> +   As a demonstration, using the same example program as in the asan block
> +   comment above:
> +
> +     int
> +     foo ()
> +     {
> +       char a[23] = {0};
> +       int b[2] = {0};
> +
> +       a[5] = 1;
> +       b[1] = 2;
> +
> +       return a[5] + b[1];
> +     }
> +
> +   On AArch64 the stack will be ordered as follows for the above function:
> +
> +     Slot 1/ [24 bytes for variable 'a']
> +     Slot 2/ [8 bytes padding for alignment]
> +     Slot 3/ [8 bytes for variable 'b']
> +     Slot 4/ [8 bytes padding for alignment]
> +
> +   (The padding is there to ensure 16 byte alignment as described in the 16
> +   byte granule implications).
> +
> +   While the shadow memory will be ordered as follows:
> +
> +     - 2 bytes (representing 32 bytes in real memory) tagged with RFT + 1.
> +     - 1 byte (representing 16 bytes in real memory) tagged with RFT + 2.
> +
> +   And any pointer to "a" will have the tag RFT + 1, and any pointer to "b"
> +   will have the tag RFT + 2.
> +
> +   [Top Byte Ignore requirements]
> +   Hwasan requires the ability to store an 8 bit tag in every pointer.  There
> +   is no instrumentation done to remove this tag from pointers before
> +   dereferencing, which means the hardware must ignore this tag during memory
> +   accesses.
> +
> +   One architecture that provides this feature is the AArch64 architecture.
> +   This is the only architecture that hwasan is currently implemented for.

This could easily get out of date.  Maybe it would be better to refer
to the target hook instead.

> +
> +   [Stack requires cleanup on unwinding]
> +   During normal operation of a hwasan sanitized program more space in the
> +   shadow memory becomes tagged as the stack grows.  As the stack shrinks this
> +   shadow memory space must become untagged.  If it is not untagged then when
> +   the stack grows again (during other function calls later on in the program)
> +   objects on the stack that are usually not tagged (e.g. parameters passed on
> +   the stack) can be placed in memory whose shadow space is tagged with
> +   something else, and accesses can cause false positive reports.
> +
> +   Hence we place untagging code on every epilogue of functions which tag some
> +   stack objects.
> +
> +   Moreover, the run-time library intercepts longjmp & setjmp to uncolour

uncolor

> +   when the stack is unwound this way.
> +
> +   C++ exceptions are not yet handled, which means this sanitizer can not
> +   handle C++ code that throws exceptions -- it will give false positives
> +   after an exception has been thrown.  */

Is this a limitation in the library as well as the compiler side?
Might be worth clarifying in the comment.

But thanks for this, it's a really nice write-up.

> @@ -2301,7 +2620,8 @@ instrument_derefs (gimple_stmt_iterator *iter, tree t,
>      {
>        if (DECL_THREAD_LOCAL_P (inner))
>          return;
> -      if (!param_asan_globals && is_global_var (inner))
> +      if ((hwasan_sanitize_p () || !param_asan_globals)
> +          && is_global_var (inner))
>          return;
>        if (!TREE_STATIC (inner))
>          {

Could you expand on this?

> @@ -2530,10 +2850,13 @@ maybe_instrument_call (gimple_stmt_iterator *iter)
>              break;
>            }
>        }
> -      tree decl = builtin_decl_implicit (BUILT_IN_ASAN_HANDLE_NO_RETURN);
> -      gimple *g = gimple_build_call (decl, 0);
> -      gimple_set_location (g, gimple_location (stmt));
> -      gsi_insert_before (iter, g, GSI_SAME_STMT);
> +      if (! hwasan_sanitize_p ())
> +        {
> +          tree decl = builtin_decl_implicit (BUILT_IN_ASAN_HANDLE_NO_RETURN);
> +          gimple *g = gimple_build_call (decl, 0);
> +          gimple_set_location (g, gimple_location (stmt));
> +          gsi_insert_before (iter, g, GSI_SAME_STMT);
> +        }

Why is this only needed for asan?  (Just curious.)

> @@ -3255,18 +3581,34 @@ asan_expand_mark_ifn (gimple_stmt_iterator *iter)
>  
>    gcc_checking_assert (TREE_CODE (decl) == VAR_DECL);
>  
> +  tree len = gimple_call_arg (g, 2);
> +  gcc_assert (tree_fits_shwi_p (len));
> +  unsigned HOST_WIDE_INT size_in_bytes = tree_to_shwi (len);
> +  gcc_assert (size_in_bytes);
> +
> +  if (hwasan_sanitize_p ())
> +    {
> +      gcc_assert (param_hwasan_instrument_stack);
> +      /* Here we swap ASAN_MARK calls for HWASAN_MARK.
> +         This is because we are using the approach of using ASAN_MARK as a
> +         synonym until here.
> +         That approach means we don't yet have to duplicate all the special
> +         cases for ASAN_MARK and ASAN_POISON with the exact same handling but
> +         called HWASAN_MARK etc.  */

Definitely in favour of that :-)

> +      gimple *hw_poison_call
> +        = gimple_build_call_internal (IFN_HWASAN_MARK, 3,
> +                                      gimple_call_arg (g, 0),
> +                                      base, len);
> +      gsi_replace (iter, hw_poison_call, false);
> +      return false;
> +    }
> +
>    if (is_poison)
>      {
>        if (asan_handled_variables == NULL)
>          asan_handled_variables = new hash_set<tree> (16);
>        asan_handled_variables->add (decl);
>      }
> -  tree len = gimple_call_arg (g, 2);
> -
> -  gcc_assert (tree_fits_shwi_p (len));
> -  unsigned HOST_WIDE_INT size_in_bytes = tree_to_shwi (len);
> -  gcc_assert (size_in_bytes);
> -
>    g = gimple_build_assign (make_ssa_name (pointer_sized_int_node),
>                             NOP_EXPR, base);
>    gimple_set_location (g, loc);
> @@ -3329,6 +3671,7 @@ asan_expand_mark_ifn (gimple_stmt_iterator *iter)
>  bool
>  asan_expand_check_ifn (gimple_stmt_iterator *iter, bool use_calls)
>  {
> +  gcc_assert (!hwasan_sanitize_p ());
>    gimple *g = gsi_stmt (*iter);
>    location_t loc = gimple_location (g);
>    bool recover_p;
> @@ -3602,11 +3945,66 @@ asan_expand_poison_ifn (gimple_stmt_iterator *iter,
>  
>    int nargs;
>    bool store_p = gimple_call_internal_p (use, IFN_ASAN_POISON_USE);
> -  tree fun = report_error_func (store_p, recover_p, tree_to_uhwi (size),
> -                                &nargs);
> -
> -  gcall *call = gimple_build_call (fun, 1,
> -                                   build_fold_addr_expr (shadow_var));
> +  gcall *call;
> +  if (hwasan_sanitize_p ())
> +    {
> +      tree fun = builtin_decl_implicit (BUILT_IN_HWASAN_TAG_MISMATCH4);
> +      /* NOTE: hwasan has no __hwasan_report_* functions like asan does.
> +         We use __hwasan_tag_mismatch4 with arguments that tell it the
> +         size of access and load to report all tag mismatches.
> +
> +         The arguments to this function are:
> +           Address of invalid access.
> +           Bitfield containing information about the access
> +             (access_info)
> +           Pointer to a frame of registers
> +             (for use in printing the contents of registers in a dump)
> +             Not used yet -- to be used by inline instrumentation.
> +           Size of access (only if size is not representable in the
> +           access_info argument).
> +
> +         The access_info bitfield encodes the following pieces of
> +         information:
> +           - Is this a store or load?
> +             access_info & 0x10  =>  store
> +           - Should the program continue after reporting the error?
> +             access_info & 0x20  =>  recover
> +           - What size access is this (if size is less than 16 bytes)
> +             if (access_info & 0xf == 0xf)
> +               size is taken from last argument.
> +             else
> +               size == 1 << (access_info & 0xf)
> +
> +         The last argument contains the size of the access iff the
> +         access was of a size greater than or equal to 16 bytes.
> +
> +         See the function definition `__hwasan_tag_mismatch4` in
> +         libsanitizer/hwasan for the full definition.
> +      */
> +      unsigned HOST_WIDE_INT size_in_bytes = tree_to_uhwi (size);
> +      unsigned size_indicator = (size_in_bytes > 16)
> +        ? 0xf
> +        : exact_log2 (size_in_bytes);

The parts about “16 bytes” don't seem to match the equation; seems like
it's a power of 2 in the range [1, 16384].

Is there any benefit to using the size field encoded in the access_info
here?  AIUI it was retained for compatibility with
pre-__hwasan_tag_mismatch4 versions of the library.  Seems like we
could always use 0xf instead.

> +      unsigned access_info = (0x20 * recover_p)
> +        + (0x10 * store_p)
> +        + (size_indicator);
> +      tree long_pointer_type
> +        = build_pointer_type (long_unsigned_type_node);

pointer_sized_int_node seems more appropriate, given that the second
argument and the target of the third argument are uptrs.

> +      call = gimple_build_call (fun, 3,
> +                                build_fold_addr_expr (shadow_var),
> +                                build_int_cst (long_unsigned_type_node,
> +                                               access_info),
> +                                build_int_cst (long_pointer_type,
> +                                               0),
> +                                size);
> +    }
> +  else
> +    {
> +      tree fun = report_error_func (store_p, recover_p, tree_to_uhwi (size),
> +                                    &nargs);
> +      call = gimple_build_call (fun, 1,
> +                                build_fold_addr_expr (shadow_var));
> +    }
>    gimple_set_location (call, gimple_location (use));
>    gimple *call_to_insert = call;
>  
> @@ -4005,4 +4425,216 @@ hwasan_finish_file (void)
>    flag_sanitize |= SANITIZE_HWADDRESS;
>  }
>  
> +/* Construct a function tree for __hwasan_{load,store}{1,2,4,8,16,_n}.
> +   IS_STORE is either 1 (for a store) or 0 (for a load).  */
> +static tree
> +hwasan_check_func (bool is_store, bool recover_p, HOST_WIDE_INT size_in_bytes,
> +                   int *nargs)
> +{
> +  static enum built_in_function check[2][2][6]
> +    = { { { BUILT_IN_HWASAN_LOAD1, BUILT_IN_HWASAN_LOAD2,
> +            BUILT_IN_HWASAN_LOAD4, BUILT_IN_HWASAN_LOAD8,
> +            BUILT_IN_HWASAN_LOAD16, BUILT_IN_HWASAN_LOADN },
> +          { BUILT_IN_HWASAN_STORE1, BUILT_IN_HWASAN_STORE2,
> +            BUILT_IN_HWASAN_STORE4, BUILT_IN_HWASAN_STORE8,
> +            BUILT_IN_HWASAN_STORE16, BUILT_IN_HWASAN_STOREN } },
> +        { { BUILT_IN_HWASAN_LOAD1_NOABORT,
> +            BUILT_IN_HWASAN_LOAD2_NOABORT,
> +            BUILT_IN_HWASAN_LOAD4_NOABORT,
> +            BUILT_IN_HWASAN_LOAD8_NOABORT,
> +            BUILT_IN_HWASAN_LOAD16_NOABORT,
> +            BUILT_IN_HWASAN_LOADN_NOABORT },
> +          { BUILT_IN_HWASAN_STORE1_NOABORT,
> +            BUILT_IN_HWASAN_STORE2_NOABORT,
> +            BUILT_IN_HWASAN_STORE4_NOABORT,
> +            BUILT_IN_HWASAN_STORE8_NOABORT,
> +            BUILT_IN_HWASAN_STORE16_NOABORT,
> +            BUILT_IN_HWASAN_STOREN_NOABORT } } };
> +  if (size_in_bytes == -1)
> +    {
> +      *nargs = 2;
> +      return builtin_decl_implicit (check[recover_p][is_store][5]);
> +    }
> +  *nargs = 1;
> +  int size_log2 = exact_log2 (size_in_bytes);
> +  return builtin_decl_implicit (check[recover_p][is_store][size_log2]);

I don't follow the logic here.  Why is it the case that we have to
handle size_in_bytes == -1, but are otherwise guaranteed that
size_in_bytes is in [1, 2, 4, 8, 16]?  I.e. why is the tree_fits_shwi_p
call needed here:

> +  unsigned HOST_WIDE_INT size_in_bytes
> +    = is_scalar_access && tree_fits_shwi_p (len) ? tree_to_shwi (len) : -1;

?  If is_scalar_access guarantees a valid value then I think we should
drop the tree_fits_shwi_p check above and instead assert that size_log2
is in range.
If is_scalar_access doesn't guarantee a valid value then I think we
should check size_log2 and set *nargs to 2 for out-of-range values
(there's a rough sketch of what I mean at the end of this quoted block).

> +}
> +
> +/* Expand the HWASAN_{LOAD,STORE} builtins.  */
> +bool
> +hwasan_expand_check_ifn (gimple_stmt_iterator *iter, bool)
> +{
> +  gimple *g = gsi_stmt (*iter);
> +  location_t loc = gimple_location (g);
> +  bool recover_p;
> +  if (flag_sanitize & SANITIZE_USER_HWADDRESS)
> +    recover_p = (flag_sanitize_recover & SANITIZE_USER_HWADDRESS) != 0;
> +  else
> +    recover_p = (flag_sanitize_recover & SANITIZE_KERNEL_HWADDRESS) != 0;
> +
> +  HOST_WIDE_INT flags = tree_to_shwi (gimple_call_arg (g, 0));
> +  gcc_assert (flags < ASAN_CHECK_LAST);
> +  bool is_scalar_access = (flags & ASAN_CHECK_SCALAR_ACCESS) != 0;
> +  bool is_store = (flags & ASAN_CHECK_STORE) != 0;
> +  bool is_non_zero_len = (flags & ASAN_CHECK_NON_ZERO_LEN) != 0;
> +
> +  tree base = gimple_call_arg (g, 1);
> +  tree len = gimple_call_arg (g, 2);
> +
> +  /* `align` is unused for HWASAN_CHECK, but I pass the argument anyway

s/I/we/.

> +     since that way the arguments match ASAN_CHECK.  */
> +  /* HOST_WIDE_INT align = tree_to_shwi (gimple_call_arg (g, 3));  */
> +
> +  unsigned HOST_WIDE_INT size_in_bytes
> +    = is_scalar_access && tree_fits_shwi_p (len) ? tree_to_shwi (len) : -1;
> +
> +  gimple_stmt_iterator gsi = *iter;
> +
> +  if (!is_non_zero_len)
> +    {
> +      /* So, the length of the memory area to hwasan-protect is
> +         non-constant.  Let's guard the generated instrumentation code
> +         like:
> +
> +         if (len != 0)
> +           {
> +             // hwasan instrumentation code goes here.
> +           }
> +         // falltrough instructions, starting with *ITER.  */
> +
> +      g = gimple_build_cond (NE_EXPR,
> +                             len,
> +                             build_int_cst (TREE_TYPE (len), 0),
> +                             NULL_TREE, NULL_TREE);
> +      gimple_set_location (g, loc);
> +
> +      basic_block then_bb, fallthrough_bb;
> +      insert_if_then_before_iter (as_a <gcond *> (g), iter,
> +                                  /*then_more_likely_p=*/true,
> +                                  &then_bb, &fallthrough_bb);
> +      /* Note that fallthrough_bb starts with the statement that was
> +         pointed to by ITER.  */
> +
> +      /* The 'then block' of the 'if (len != 0) condition is where
> +         we'll generate the hwasan instrumentation code now.  */
> +      gsi = gsi_last_bb (then_bb);
> +    }
> +
> +  g = gimple_build_assign (make_ssa_name (pointer_sized_int_node),
> +                           NOP_EXPR, base);
> +  gimple_set_location (g, loc);
> +  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
> +  tree base_addr = gimple_assign_lhs (g);
> +
> +  int nargs = 0;
> +  tree fun = hwasan_check_func (is_store, recover_p, size_in_bytes, &nargs);
> +  if (nargs == 1)
> +    g = gimple_build_call (fun, 1, base_addr);
> +  else
> +    {
> +      gcc_assert (nargs == 2);
> +      g = gimple_build_assign (make_ssa_name (pointer_sized_int_node),
> +                               NOP_EXPR, len);
> +      gimple_set_location (g, loc);
> +      gsi_insert_after (&gsi, g, GSI_NEW_STMT);
> +      tree sz_arg = gimple_assign_lhs (g);
> +      g = gimple_build_call (fun, nargs, base_addr, sz_arg);
> +    }
> +
> +  gimple_set_location (g, loc);
> +  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
> +  gsi_remove (iter, true);
> +  *iter = gsi;
> +  return false;
> +}
> +
> +/* For stack tagging:
> +     Dummy: the HWASAN_MARK internal function should only ever be in the code
> +     after the sanopt pass.  */

Same nit as the previous part about this style of comment formatting.
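Going back to hwasan_check_func: in case it helps, this untested sketch
is the kind of out-of-range fallback I was thinking of (reusing the
patch's own "check" table and exact_log2):

  if (size_in_bytes == -1
      || !IN_RANGE (exact_log2 (size_in_bytes), 0, 4))
    {
      /* Sizes other than 1, 2, 4, 8 and 16 bytes go through the
         __hwasan_{load,store}n entry points, which take the size as a
         second argument.  */
      *nargs = 2;
      return builtin_decl_implicit (check[recover_p][is_store][5]);
    }
  *nargs = 1;
  int size_log2 = exact_log2 (size_in_bytes);
  return builtin_decl_implicit (check[recover_p][is_store][size_log2]);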
> +bool
> +hwasan_expand_mark_ifn (gimple_stmt_iterator *)
> +{
> +  gcc_unreachable ();
> +}
> +
> +bool
> +gate_hwasan ()
> +{
> +  return hwasan_sanitize_p ();
> +}
> +
> +namespace {
> +
> +const pass_data pass_data_hwasan =
> +{
> +  GIMPLE_PASS, /* type */
> +  "hwasan", /* name */
> +  OPTGROUP_NONE, /* optinfo_flags */
> +  TV_NONE, /* tv_id */
> +  ( PROP_ssa | PROP_cfg | PROP_gimple_leh ), /* properties_required */
> +  0, /* properties_provided */
> +  0, /* properties_destroyed */
> +  0, /* todo_flags_start */
> +  TODO_update_ssa, /* todo_flags_finish */
> +};
> +
> +class pass_hwasan : public gimple_opt_pass
> +{
> +public:
> +  pass_hwasan (gcc::context *ctxt)
> +    : gimple_opt_pass (pass_data_hwasan, ctxt)
> +  {}
> +
> +  /* opt_pass methods: */
> +  opt_pass * clone () { return new pass_hwasan (m_ctxt); }
> +  virtual bool gate (function *) { return gate_hwasan (); }
> +  virtual unsigned int execute (function *) { return hwasan_instrument (); }
> +
> +}; /* class pass_hwasan */
> +
> +} /* anon namespace */
> +
> +gimple_opt_pass *
> +make_pass_hwasan (gcc::context *ctxt)
> +{
> +  return new pass_hwasan (ctxt);
> +}
> +
> +namespace {
> +
> +const pass_data pass_data_hwasan_O0 =
> +{
> +  GIMPLE_PASS, /* type */
> +  "hwasan_O0", /* name */
> +  OPTGROUP_NONE, /* optinfo_flags */
> +  TV_NONE, /* tv_id */
> +  ( PROP_ssa | PROP_cfg | PROP_gimple_leh ), /* properties_required */
> +  0, /* properties_provided */
> +  0, /* properties_destroyed */
> +  0, /* todo_flags_start */
> +  TODO_update_ssa, /* todo_flags_finish */
> +};
> +
> +class pass_hwasan_O0 : public gimple_opt_pass
> +{
> +public:
> +  pass_hwasan_O0 (gcc::context *ctxt)
> +    : gimple_opt_pass (pass_data_hwasan_O0, ctxt)
> +  {}
> +
> +  /* opt_pass methods: */
> +  opt_pass * clone () { return new pass_hwasan_O0 (m_ctxt); }
> +  virtual bool gate (function *) { return !optimize && gate_hwasan (); }
> +  virtual unsigned int execute (function *) { return hwasan_instrument (); }
> +
> +}; /* class pass_hwasan */
> +
> +} /* anon namespace */
> +
> +gimple_opt_pass *
> +make_pass_hwasan_O0 (gcc::context *ctxt)
> +{
> +  return new pass_hwasan_O0 (ctxt);
> +}

Is it worth creating separate passes for this?  Seems like we could
just fork the implementation of the asan ones based on whether we're
using hwasan or not.

> diff --git a/gcc/gimplify.c b/gcc/gimplify.c
> index 15dfee903ab298f6a02e45d1affcc2260f3c911d..24ebedd490aea4ad634a92aa5742a83b1b0c0bb7 100644
> --- a/gcc/gimplify.c
> +++ b/gcc/gimplify.c
> @@ -1234,8 +1234,11 @@ asan_poison_variable (tree decl, bool poison, gimple_stmt_iterator *it,
>  
>    /* It's necessary to have all stack variables aligned to ASAN granularity
>       bytes.  */
> -  if (DECL_ALIGN_UNIT (decl) <= ASAN_SHADOW_GRANULARITY)
> -    SET_DECL_ALIGN (decl, BITS_PER_UNIT * ASAN_SHADOW_GRANULARITY);
> +  gcc_assert (!hwasan_sanitize_p () || hwasan_sanitize_stack_p ());
> +  unsigned shadow_granularity
> +    = hwasan_sanitize_p () ? HWASAN_TAG_GRANULE_SIZE : ASAN_SHADOW_GRANULARITY;
> +  if (DECL_ALIGN_UNIT (decl) <= shadow_granularity)
> +    SET_DECL_ALIGN (decl, BITS_PER_UNIT * shadow_granularity);
>  
>    HOST_WIDE_INT flags = poison ? ASAN_MARK_POISON : ASAN_MARK_UNPOISON;
>  
> @@ -15023,7 +15026,7 @@ gimplify_function_tree (tree fndecl)
>       if necessary.  */
>    cfun->curr_properties |= PROP_gimple_lva;
>  
> -  if (asan_sanitize_use_after_scope () && sanitize_flags_p (SANITIZE_ADDRESS))
> +  if (asan_sanitize_use_after_scope ())
>      asan_poisoned_variables = new hash_set<tree> ();
>    bind = gimplify_body (fndecl, true);
>    if (asan_poisoned_variables)
> diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
> index 8efc77d986b927dc4a37e396e6c710ffeda663ff..3397fa355f373f3ead5a02a07838a467f08f03c4 100644
> --- a/gcc/internal-fn.c
> +++ b/gcc/internal-fn.c
> @@ -467,6 +467,105 @@ expand_UBSAN_OBJECT_SIZE (internal_fn, gcall *)
>  /* This should get expanded in the sanopt pass.  */
>  
>  static void
> +expand_HWASAN_CHECK (internal_fn, gcall *)
> +{
> +  gcc_unreachable ();
> +}
> +
> +/* For hwasan stack tagging:
> +     Clear tags on the dynamically allocated space.
> +     For use after an object dynamically allocated on the stack goes out of
> +     scope.  */
> +static void
> +expand_HWASAN_ALLOCA_UNPOISON (internal_fn, gcall *gc)
> +{
> +  tree restored_position = gimple_call_arg (gc, 0);
> +  rtx restored_rtx = expand_expr (restored_position, NULL_RTX, VOIDmode,
> +                                  EXPAND_NORMAL);
> +  rtx func = init_one_libfunc ("__hwasan_tag_memory");
> +  rtx off = expand_simple_binop (ptr_mode, MINUS, restored_rtx,
> +                                 stack_pointer_rtx, NULL_RTX, 0,
> +                                 OPTAB_LIB_WIDEN);
> +  rtx dynamic = convert_memory_address (ptr_mode, virtual_stack_dynamic_rtx);
> +  emit_library_call_value (func, NULL_RTX, LCT_NORMAL, VOIDmode,
> +                           dynamic, ptr_mode,
> +                           const0_rtx, QImode,
> +                           off, ptr_mode);
> +}
> +
> +/* For hwasan stack tagging:
> +     Return a tag to be used for a dynamic allocation.  */
> +static void
> +expand_HWASAN_CHOOSE_TAG (internal_fn, gcall *gc)
> +{
> +  tree tag = gimple_call_lhs (gc);
> +  rtx target = expand_expr (tag, NULL_RTX, VOIDmode, EXPAND_NORMAL);
> +  machine_mode mode = GET_MODE (target);
> +  gcc_assert (mode == QImode);
> +
> +  rtx base_tag = hwasan_extract_tag (hwasan_base ());
> +  gcc_assert (base_tag);
> +  rtx tag_offset = GEN_INT (hwasan_current_tag ());
> +  rtx chosen_tag = expand_simple_binop (QImode, PLUS, base_tag, tag_offset,
> +                                        target, /* unsignedp = */1,
> +                                        OPTAB_WIDEN);
> +
> +  gcc_assert (chosen_tag);
> +  /* Really need to put the tag into the `target` RTX.  */
> +  if (chosen_tag != target)
> +    {
> +      rtx temp = chosen_tag;
> +      machine_mode ret_mode = GET_MODE (chosen_tag);
> +      if (ret_mode != mode)
> +        temp = simplify_gen_unary (TRUNCATE, mode, chosen_tag, ret_mode);

Is this truncation necessary?  I'd have expected:

  expand_simple_binop (QImode, …)

to always return a QImode value, even for the widening path.

> +/* For hwasan stack tagging:
> +     Tag a region of space in the shadow stack according to the base pointer of
> +     an object on the stack.  */
> +static void
> +expand_HWASAN_MARK (internal_fn, gcall *gc)
> +{
> +  HOST_WIDE_INT flag = tree_to_shwi (gimple_call_arg (gc, 0));
> +  bool is_poison = ((asan_mark_flags)flag) == ASAN_MARK_POISON;
> +
> +  tree base = gimple_call_arg (gc, 1);
> +  gcc_checking_assert (TREE_CODE (base) == ADDR_EXPR);
> +  /* base is a pointer argument, hence in ptr_mode.
> +     We convert to Pmode for use in the hwasan_extract_tag and
> +     hwasan_create_untagged_base functions.
> +     We then convert the result to ptr_mode for the emit_library_call.  */

How does this ptr_mode<->Pmode stuff work for ILP32?  Wouldn't the tag
get lost by the Pmode->ptr_mode conversion?  Or does something ensure
that the tag is moved from bits 24+ to bits 56+ and back?
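To spell the worry out with hypothetical numbers (an illustration only,
not GCC code), assuming the tag lives in bits 56-63 of the Pmode value:

  /* ILP32-style setup: Pmode is 64 bits, ptr_mode is 32 bits.  */
  uint64_t pmode_val = ((uint64_t) 0x2a << 56) | 0x12345678; /* tagged */
  uint32_t ptrmode_val = (uint32_t) pmode_val; /* 0x12345678: tag lost */

so a plain truncation back to ptr_mode would silently drop the tag
unless something has first moved it down into bits 24-31.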
> +  rtx base_rtx = convert_memory_address (Pmode, expand_normal (base));
> +
> +  rtx tag = is_poison ? const0_rtx : hwasan_extract_tag (base_rtx);
> +  rtx address = hwasan_create_untagged_base (base_rtx);
> +  address = convert_memory_address (ptr_mode, address);
> +
> +  tree len = gimple_call_arg (gc, 2);
> +  gcc_assert (tree_fits_shwi_p (len));

Why's this guaranteed to be a SHWI?  For example, I couldn't see
anything that explicitly stopped calls to build_asan_poison_call_expr
with poly_int- or variable-sized decls.

Could we expand this to an aligned size in gimple, when generating the
HWASAN_MARK call?  E.g. we could reuse the alloca code for computing an
aligned size.  It'd probably be worth using the gimple_build routines
in gimple-fold.h to build the calculation, instead of calling
gimple_build_assign directly.  The operations will then get folded on
the fly.

> +  unsigned HOST_WIDE_INT size_in_bytes = tree_to_shwi (len);
> +  uint8_t tg_mask = HWASAN_TAG_GRANULE_SIZE - 1;
> +  gcc_assert (size_in_bytes);
> +  size_in_bytes = (size_in_bytes + tg_mask) & ~tg_mask;
> +  rtx size = gen_int_mode (size_in_bytes, ptr_mode);
> +
> +  rtx func = init_one_libfunc ("__hwasan_tag_memory");
> +  emit_library_call (func,
> +                     LCT_NORMAL,
> +                     VOIDmode,
> +                     address, ptr_mode,
> +                     tag, QImode,
> +                     size, ptr_mode);

Formatting nit: should be indented under “func”.  Seems at least
LCT_NORMAL and VOIDmode could go on the same line as “func”.

Generally looks really good though (this and previous patches).

Thanks,
Richard