On 6/10/19 8:21 PM, Jason Merrill wrote: > On Mon, Jun 10, 2019 at 3:08 AM Martin Liška <mli...@suse.cz> wrote: >> On 6/7/19 11:43 PM, Jason Merrill wrote: >>> On Fri, Jun 7, 2019 at 8:14 AM Martin Liška <mli...@suse.cz> wrote: >>>> On 6/7/19 2:09 PM, Richard Biener wrote: >>>>> On Fri, Jun 7, 2019 at 2:03 PM Martin Liška <mli...@suse.cz> wrote: >>>>>> On 6/7/19 10:57 AM, Richard Biener wrote: >>>>>>> On Mon, Jun 3, 2019 at 3:35 PM Martin Liška <mli...@suse.cz> wrote: >>>>>>>> On 6/1/19 12:06 AM, Jeff Law wrote: >>>>>>>>> On 5/22/19 3:13 AM, Martin Liška wrote: >>>>>>>>>> On 5/21/19 1:51 PM, Richard Biener wrote: >>>>>>>>>>> On Tue, May 21, 2019 at 1:02 PM Martin Liška <mli...@suse.cz> wrote: >>>>>>>>>>>> >>>>>>>>>>>> On 5/21/19 11:38 AM, Richard Biener wrote: >>>>>>>>>>>>> On Tue, May 21, 2019 at 12:07 AM Jeff Law <l...@redhat.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 5/13/19 1:41 AM, Martin Liška wrote: >>>>>>>>>>>>>>> On 11/8/18 9:56 AM, Martin Liška wrote: >>>>>>>>>>>>>>>> On 11/7/18 11:23 PM, Jeff Law wrote: >>>>>>>>>>>>>>>>> On 10/30/18 6:28 AM, Martin Liška wrote: >>>>>>>>>>>>>>>>>> On 10/30/18 11:03 AM, Jakub Jelinek wrote: >>>>>>>>>>>>>>>>>>> On Mon, Oct 29, 2018 at 04:14:21PM +0100, Martin Liška >>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>> +hashtab_chk_error () >>>>>>>>>>>>>>>>>>>> +{ >>>>>>>>>>>>>>>>>>>> + fprintf (stderr, "hash table checking failed: " >>>>>>>>>>>>>>>>>>>> + "equal operator returns true for a pair " >>>>>>>>>>>>>>>>>>>> + "of values with a different hash value"); >>>>>>>>>>>>>>>>>>> BTW, either use internal_error here, or at least if using >>>>>>>>>>>>>>>>>>> fprintf >>>>>>>>>>>>>>>>>>> terminate with \n, in your recent mail I saw: >>>>>>>>>>>>>>>>>>> ...different hash valueduring RTL pass: vartrack >>>>>>>>>>>>>>>>>>> ^^^^^^ >>>>>>>>>>>>>>>>>> Sure, fixed in attached patch. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Martin >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> + gcc_unreachable (); >>>>>>>>>>>>>>>>>>>> +} >>>>>>>>>>>>>>>>>>> Jakub >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 0001-Sanitize-equals-and-hash-functions-in-hash-tables.patch >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> From 0d9c979c845580a98767b83c099053d36eb49bb9 Mon Sep 17 >>>>>>>>>>>>>>>>>> 00:00:00 2001 >>>>>>>>>>>>>>>>>> From: marxin <mli...@suse.cz> >>>>>>>>>>>>>>>>>> Date: Mon, 29 Oct 2018 09:38:21 +0100 >>>>>>>>>>>>>>>>>> Subject: [PATCH] Sanitize equals and hash functions in >>>>>>>>>>>>>>>>>> hash-tables. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> --- >>>>>>>>>>>>>>>>>> gcc/hash-table.h | 40 >>>>>>>>>>>>>>>>>> +++++++++++++++++++++++++++++++++++++++- >>>>>>>>>>>>>>>>>> 1 file changed, 39 insertions(+), 1 deletion(-) >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> diff --git a/gcc/hash-table.h b/gcc/hash-table.h >>>>>>>>>>>>>>>>>> index bd83345c7b8..694eedfc4be 100644 >>>>>>>>>>>>>>>>>> --- a/gcc/hash-table.h >>>>>>>>>>>>>>>>>> +++ b/gcc/hash-table.h >>>>>>>>>>>>>>>>>> @@ -503,6 +503,7 @@ private: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> value_type *alloc_entries (size_t n CXX_MEM_STAT_INFO) >>>>>>>>>>>>>>>>>> const; >>>>>>>>>>>>>>>>>> value_type *find_empty_slot_for_expand (hashval_t); >>>>>>>>>>>>>>>>>> + void verify (const compare_type &comparable, hashval_t >>>>>>>>>>>>>>>>>> hash); >>>>>>>>>>>>>>>>>> bool too_empty_p (unsigned int); >>>>>>>>>>>>>>>>>> void expand (); >>>>>>>>>>>>>>>>>> static bool is_deleted (value_type &v) >>>>>>>>>>>>>>>>>> @@ -882,8 +883,12 @@ hash_table<Descriptor, Allocator> >>>>>>>>>>>>>>>>>> if (insert == INSERT && m_size * 3 <= m_n_elements * 4) >>>>>>>>>>>>>>>>>> expand (); >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> - m_searches++; >>>>>>>>>>>>>>>>>> +#if ENABLE_EXTRA_CHECKING >>>>>>>>>>>>>>>>>> + if (insert == INSERT) >>>>>>>>>>>>>>>>>> + verify (comparable, hash); >>>>>>>>>>>>>>>>>> +#endif >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> + m_searches++; >>>>>>>>>>>>>>>>>> value_type *first_deleted_slot = NULL; >>>>>>>>>>>>>>>>>> hashval_t index = hash_table_mod1 (hash, >>>>>>>>>>>>>>>>>> m_size_prime_index); >>>>>>>>>>>>>>>>>> hashval_t hash2 = hash_table_mod2 (hash, >>>>>>>>>>>>>>>>>> m_size_prime_index); >>>>>>>>>>>>>>>>>> @@ -930,6 +935,39 @@ hash_table<Descriptor, Allocator> >>>>>>>>>>>>>>>>>> return &m_entries[index]; >>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> +#if ENABLE_EXTRA_CHECKING >>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>> +/* Report a hash table checking error. */ >>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>> +ATTRIBUTE_NORETURN ATTRIBUTE_COLD >>>>>>>>>>>>>>>>>> +static void >>>>>>>>>>>>>>>>>> +hashtab_chk_error () >>>>>>>>>>>>>>>>>> +{ >>>>>>>>>>>>>>>>>> + fprintf (stderr, "hash table checking failed: " >>>>>>>>>>>>>>>>>> + "equal operator returns true for a pair " >>>>>>>>>>>>>>>>>> + "of values with a different hash value\n"); >>>>>>>>>>>>>>>>>> + gcc_unreachable (); >>>>>>>>>>>>>>>>>> +} >>>>>>>>>>>>>>>>> I think an internal_error here is probably still better than >>>>>>>>>>>>>>>>> a simple >>>>>>>>>>>>>>>>> fprintf, even if the fprintf is terminated with a \n :-) >>>>>>>>>>>>>>>> Fully agree with that, but I see a lot of build errors when >>>>>>>>>>>>>>>> using internal_error. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The question then becomes can we bootstrap with this stuff >>>>>>>>>>>>>>>>> enabled and >>>>>>>>>>>>>>>>> if not, are we likely to soon? It'd be a shame to put it into >>>>>>>>>>>>>>>>> EXTRA_CHECKING, but then not be able to really use >>>>>>>>>>>>>>>>> EXTRA_CHECKING >>>>>>>>>>>>>>>>> because we've got too many bugs to fix. >>>>>>>>>>>>>>>> Unfortunately it's blocked with these 2 PRs: >>>>>>>>>>>>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87845 >>>>>>>>>>>>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87847 >>>>>>>>>>>>>>> Hi. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I've just added one more PR: >>>>>>>>>>>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90450 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I'm sending updated version of the patch that provides a >>>>>>>>>>>>>>> disablement for the 3 PRs >>>>>>>>>>>>>>> with a new function disable_sanitize_eq_and_hash. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> With that I can bootstrap and finish tests. However, I've done >>>>>>>>>>>>>>> that with a patch >>>>>>>>>>>>>>> limits maximal number of checks: >>>>>>>>>>>>>> So rather than call the disable_sanitize_eq_and_hash, can you >>>>>>>>>>>>>> have its >>>>>>>>>>>>>> state set up when you instantiate the object? It's not a huge >>>>>>>>>>>>>> deal, >>>>>>>>>>>>>> just thinking about loud. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> So how do we want to go forward, particularly the EXTRA_EXTRA >>>>>>>>>>>>>> checking >>>>>>>>>>>>>> issue :-) >>>>>>>>>>>>> >>>>>>>>>>>>> There is at least one PR where we have a table where elements >>>>>>>>>>>>> _in_ the >>>>>>>>>>>>> table are never compared against each other but always against >>>>>>>>>>>>> another >>>>>>>>>>>>> object (I guess that's usual even), but the setup is in a way >>>>>>>>>>>>> that the >>>>>>>>>>>>> comparison function only works with those. With the patch we >>>>>>>>>>>>> verify >>>>>>>>>>>>> hashing/comparison for something that is never used. >>>>>>>>>>>>> >>>>>>>>>>>>> So - wouldn't it be more "correct" to only verify >>>>>>>>>>>>> comparison/hashing >>>>>>>>>>>>> at lookup time, using the object from the lookup and verify that >>>>>>>>>>>>> against >>>>>>>>>>>>> all other elements? >>>>>>>>>>>> >>>>>>>>>>>> I don't a have problem with that. Apparently this changes fixes >>>>>>>>>>>> PR90450 and PR87847. >>>>>>>>>>>> >>>>>>>>>>>> Changes from previous version: >>>>>>>>>>>> - verification happens only when an element is searched (not >>>>>>>>>>>> inserted) >>>>>>>>>>>> - new argument 'sanitize_eq_and_hash' added for >>>>>>>>>>>> hash_table::hash_table >>>>>>>>>>>> - new param has been introduced hash-table-verification-limit in >>>>>>>>>>>> order >>>>>>>>>>>> to limit number of elements that are compared within a table >>>>>>>>>>>> - verification happens only with flag_checking >= 2 >>>>>>>>>>>> >>>>>>>>>>>> I've been bootstrapping and testing the patch right now. >>>>>>>>>>> >>>>>>>>>>> Looks like I misremembered the original patch. The issue isn't >>>>>>>>>>> comparing random two elements in the table. >>>>>>>>>>> >>>>>>>>>>> That it fixes PR90450 is because LIM never calls find_slot_with_hash >>>>>>>>>>> without INSERTing. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> There's updated version of the patch where I check all find >>>>>>>>>> operations >>>>>>>>>> (both w/ and w/o insertion). >>>>>>>>>> >>>>>>>>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests >>>>>>>>>> except for: >>>>>>>>>> >>>>>>>>>> $ ./xgcc -B. >>>>>>>>>> /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/torture/pr63941.c >>>>>>>>>> -O2 -c >>>>>>>>>> hash table checking failed: equal operator returns true for a pair >>>>>>>>>> of values with a different hash value >>>>>>>>>> during GIMPLE pass: lim >>>>>>>>>> /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/torture/pr63941.c: >>>>>>>>>> In function ‘fn1’: >>>>>>>>>> /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/torture/pr63941.c:6:1: >>>>>>>>>> internal compiler error: in hashtab_chk_error, at hash-table.h:1019 >>>>>>>>>> 6 | fn1 () >>>>>>>>>> | ^~~ >>>>>>>>>> 0x6c5725 hashtab_chk_error >>>>>>>>>> /home/marxin/Programming/gcc/gcc/hash-table.h:1019 >>>>>>>>>> 0xe504ea hash_table<mem_ref_hasher, false, >>>>>>>>>> xcallocator>::verify(ao_ref* const&, unsigned int) >>>>>>>>>> /home/marxin/Programming/gcc/gcc/hash-table.h:1040 >>>>>>>>>> 0xe504ea hash_table<mem_ref_hasher, false, >>>>>>>>>> xcallocator>::find_slot_with_hash(ao_ref* const&, unsigned int, >>>>>>>>>> insert_option) >>>>>>>>>> /home/marxin/Programming/gcc/gcc/hash-table.h:960 >>>>>>>>>> 0xe504ea gather_mem_refs_stmt >>>>>>>>>> /home/marxin/Programming/gcc/gcc/tree-ssa-loop-im.c:1501 >>>>>>>>>> 0xe504ea analyze_memory_references >>>>>>>>>> /home/marxin/Programming/gcc/gcc/tree-ssa-loop-im.c:1625 >>>>>>>>>> 0xe504ea tree_ssa_lim >>>>>>>>>> /home/marxin/Programming/gcc/gcc/tree-ssa-loop-im.c:2646 >>>>>>>>>> 0xe504ea execute >>>>>>>>>> /home/marxin/Programming/gcc/gcc/tree-ssa-loop-im.c:2708 >>>>>>>>>> >>>>>>>>>> Richi: it's after your recent patch. >>>>>>>>>> >>>>>>>>>> For some reason I don't see PR87847 issue any longer. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> May I install the patch with disabled sanitization in >>>>>>>>>> tree-ssa-loop-im.c ? >>>>>>>>> Don't we still need to deal with the naked fprintf when there's a >>>>>>>>> failure. ie, shouldn't we be raising it with a gcc_assert or >>>>>>>>> somesuch? >>>>>>>> >>>>>>>> Good point, I've just adjusted that. >>>>>>>> >>>>>>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests. >>>>>>>> >>>>>>>> Ready to be installed? >>>>>>> >>>>>>> Ugh, the cselib one is really bad. But I don't hold my breath for >>>>>>> anyone >>>>>>> fixing it ... >>>>>> >>>>>> Yes :D It's been some time and there's no interest in the PR. >>>>>> >>>>>>> >>>>>>> One question - there's unconditional >>>>>>> >>>>>>> + if (m_sanitize_eq_and_hash) >>>>>>> + verify (comparable, hash); >>>>>>> >>>>>>> which will read a global variable and have (possibly not inline) call >>>>>>> to verify on a common path even with checking disabled. So I think >>>>>>> we want to compile this checking feature out for !CHECKING_P >>>>>>> or at least make the if __builtin_expect (..., 0), ::verify not >>>>>>> inlined and marked pure () (thus, !CHECKING_P is simplest ;)). >>>>>> >>>>>> Fixed. May I install the patch? The cselib issue can be solved later.. >>>>> >>>>> You missed the second occurance >>>>> >>>>> - m_searches++; >>>>> + if (m_sanitize_eq_and_hash) >>>>> + verify (comparable, hash); >>>> >>>> Yep ;) I've just install the patch. >>> >>> This is breaking my build: >>> >>> /home/jason/gt/gcc/hash-map.h:123:71: error: no matching function for >>> call to ‘hash_table<hash_map<mem_alloc_d\ >>> escription<mem_usage>::mem_location_hash, mem_usage*, >>> simple_hashmap_traits<default_hash_traits<mem_alloc_desc\ >>> ription<mem_usage>::mem_location_hash>, mem_usage*> >::hash_entry, >>> false, xcallocator>::hash_table(size_t&, bo\ >>> ol&, bool&, mem_alloc_origin, const char*&, int&, const char*&)’ >>> : m_table (n, ggc, gather_mem_stats, HASH_MAP_ORIGIN PASS_MEM_STAT) {} >>> >>> Looks like this needs to be updated to pass an argument to the new >>> sanitize_eq_and_hash parameter. >>> >>> Jason >> >> Sorry for the breakage, I've just fixed that in r272104. > > Thanks. I'm also seeing a massive compile time hit from this: A > constexpr testcase that I've been looking at went from compiling in 13 > seconds to 78 seconds, 6 times as long. I would expect template-heavy > code to see similar problems when sanitization is enabled for those > hash tables. Could we keep the parameter low or 0 by default, and > just do occasional sanitize runs with it explicitly enabled?
Makes sense to me. Can you please provide a test-case which I can measure? Martin > > Jason >