On Mon, Jan 28, 2013 at 10:32:43PM +0100, Dodji Seketeli wrote: Thanks for working on this.
> @@ -212,6 +213,159 @@ alias_set_type asan_shadow_set = -1; > alias set is used for all shadow memory accesses. */ > static GTY(()) tree shadow_ptr_types[2]; > > +/* Hashtable support for memory references used by gimple > + statements. */ > + > +/* This type represents a reference to a memory region. */ > +struct __attribute__ ((visibility ("hidden"))) mem_ref This is wrong, you aren't checking whether visibility or even hidden visibility is supposed by the compiler. The C++ way I think would be to use anonymous namespace, but I don't see anything like that used in gcc/*.c yet, so perhaps just name it asan_mem_ref and be done with that. > +{ > + /* The expression of the begining of the memory region. */ > + tree start; > + /* The expression representing the length of the region. */ > + tree len; Do you really need to store len as tree? Wouldn't it be better just to store the access size in bytes as integer (1, 2, 4, 8 or 16 bytes) into say unsigned char; variable? > +struct __attribute__((visibility ("hidden"))) mem_ref_hasher > + : typed_delete_remove <mem_ref> Likewise. > +static hash_table <mem_ref_hasher>& Not sure about agreed formatting for references, but grep '>&' doesn't show anything, while '> &' shows some hits. > +get_mem_ref_hash_table () > +{ > + static hash_table <mem_ref_hasher> ht; > + > + if (!ht.is_created ()) > + ht.create (10); > + > + return ht; The above is all indented too much. > + hash_table <mem_ref_hasher> ht = get_mem_ref_hash_table (); > + mem_ref **slot = ht.find_slot (&ref, INSERT); > + gcc_assert (*slot == NULL); > + *slot = new mem_ref (ref); Wouldn't it be better to use pool_alloc/pool_free here instead of new/delete? > case BUILT_IN_STRLEN: > - return instrument_strlen_call (iter); > + source0 = gimple_call_arg (call, 0); Reminds me that we should replace all uses of is_gimple_builtin_call in asan.c/tsan.c with gimple_call_builtin_p (stmt, BUILT_IN_NORMAL), and probably nuke is_gimple_builtin_call, otherwise no verification whether K&R unprototyped size_t strlen (); etc. aren't used say with x = strlen (); . Right now gimple_call_arg here is unsafe, we haven't checked, whether it has at least one argument. But if we ignore calls which don't match the builtin prototype, we'd be safe. > +static int > +test1 () > +{ > + /*__builtin___asan_report_store1 called 1 time here to instrument > + the initialization. */ > + char foo[4] = {1}; > + > + /*__builtin___asan_report_store1 called 2 times here to instrument > + the store to the memory region of tab. */ > + __builtin_memset (tab, 3, sizeof (tab)); > + > + /* There is no instrumentation for the two memset calls below. */ > + __builtin_memset (tab, 4, sizeof (tab)); > + __builtin_memset (tab, 5, sizeof (tab)); If you don't instrument the above two, it shows something bad. I'd say for calls which test two bytes (first and last) you actually should enter into the hash table two records, one that &tab[0] with size 1 byte has been instrumented, another one that &tab[3] (resp. 4, resp. 5) with size 1 byte has been instrumented, and e.g. if the first access has been instrumented already, but second access hasn't (or vice versa), just emit instrumentation for the one which is missing. For future improvements (but 4.9 material likely, after we lower each instrumentation just to a single builtin first), different access sizes and/or different is_store should just mean we (at least in the same bb and with no intervening calls that could never return) could upgrade the the previous instrumentation to cover a wider access size, or load into store (or just ignore loads vs. stores?). > +/* I wouldn't start the comment on different line. > + { dg-final { scan-tree-dump-times "__builtin___asan_report_store1" 7 > "instrumented stores"} } Space between " and }, only one space between } }. > + > + { dg-final { scan-tree-dump-times "__builtin___asan_report_load" 4 > "instrumented loads"} } and just close */ at the end of the above line (or use /* */ on each line separately. Jakub