> On 15.02.2024 at 18:06, Richard Sandiford <richard.sandif...@arm.com> wrote:
>
> Richard Biener <rguent...@suse.de> writes:
>>> On Wed, 14 Feb 2024, Richard Biener wrote:
>>>
>>> For the testcase in PR113910 we spend a lot of time in PTA comparing
>>> bitmaps for looking up equivalence class members. This points to
>>> the very weak bitmap_hash function, which effectively hashes the set
>>> bits and only a subset of the unset bits. The following improves it by mixing
>>> that weak result with the population count of the bitmap, reducing
>>> the number of collisions significantly. It's still by no means
>>> a good hash function.
>>>
>>> One major problem with it was that it simply truncated the
>>> BITMAP_WORD sized intermediate hash to hashval_t which is
>>> unsigned int, effectively not hashing half of the bits. That solves
>>> most of the slowness. Mixing in the population count improves
>>> compile-time by another 30% though.
>>>
>>> This reduces the compile-time for the testcase from tens of minutes
>>> to 30 seconds and PTA time from 99% to 25%. bitmap_equal_p is gone
>>> from the profile.
>>>
>>> Bootstrap and regtest running on x86_64-unknown-linux-gnu, will
>>> push to trunk and branches.
>>
>> Ha, and it breaks bootstrap because I misunderstood
>> bitmap_count_bits_in_word (should be word_s_). After fixing this it
>> turns out that hashing the population count doesn't help at all,
>> so I'm re-testing the following simpler variant, giving up on the
>> cheap last 25% but solving the regression as well.
>>
>> Richard.
>>
>> From a76aebfdc4b6247db6a061e6395fd088a5694122 Mon Sep 17 00:00:00 2001
>> From: Richard Biener <rguent...@suse.de>
>> Date: Wed, 14 Feb 2024 12:33:13 +0100
>> Subject: [PATCH] tree-optimization/113910 - huge compile time during PTA
>> To: gcc-patches@gcc.gnu.org
>>
>> For the testcase in PR113910 we spend a lot of time in PTA comparing
>> bitmaps for looking up equivalence class members. This points to
>> the very weak bitmap_hash function, which effectively hashes the set
>> bits and only a subset of the unset bits.
>>
>> The major problem with it is that it simply truncates the
>> BITMAP_WORD sized intermediate hash to hashval_t which is
>> unsigned int, effectively not hashing half of the bits.
>>
>> This reduces the compile-time for the testcase from tens of minutes
>> to 42 seconds and PTA time from 99% to 46%.
>>
>> PR tree-optimization/113910
>> * bitmap.cc (bitmap_hash): Mix the full element "hash" to
>> the hashval_t hash.
>> ---
>> gcc/bitmap.cc | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/gcc/bitmap.cc b/gcc/bitmap.cc
>> index 6cf326bca5a..459e32c1ad1 100644
>> --- a/gcc/bitmap.cc
>> +++ b/gcc/bitmap.cc
>> @@ -2706,7 +2706,7 @@ bitmap_hash (const_bitmap head)
>> for (ix = 0; ix != BITMAP_ELEMENT_WORDS; ix++)
>> hash ^= ptr->bits[ix];
>> }
>> - return (hashval_t)hash;
>> + return iterative_hash (&hash, sizeof (hash), 0);
>> }
>>
>>
>
> LGTM FWIW, but just curious: does using the iterative hash routines for
> each update (instead of ^) help too, or is it too slow?
It helps but is too slow.
> Or maybe do an
> iterative hash for the idx part and keep ^ for the bits accumulation?
> Also wonder whether using + rather than ^ for the bits accumulation
> would help...
I have a patch that picks up the new BFD string hash for this, which is fast.
I’ll do this for stage1, aiming to replace our generic hash function.
Richard
> Thanks,
> Richard