> On 15.02.2024 at 18:06, Richard Sandiford <richard.sandif...@arm.com> wrote:
> 
> Richard Biener <rguent...@suse.de> writes:
>>> On Wed, 14 Feb 2024, Richard Biener wrote:
>>> 
>>> For the testcase in PR113910 we spend a lot of time in PTA comparing
>>> bitmaps when looking up equivalence class members.  This points to
>>> the very weak bitmap_hash function, which effectively hashes the set
>>> bits and only a subset of the unset bits.  The following improves it
>>> by mixing that weak result with the population count of the bitmap,
>>> reducing the number of collisions significantly.  It's still by no
>>> means a good hash function.
>>> 
>>> One major problem with it was that it simply truncated the
>>> BITMAP_WORD-sized intermediate hash to hashval_t, which is
>>> unsigned int, effectively leaving half of the bits unhashed.
>>> Fixing that removes most of the slowness.  Mixing in the population
>>> count improves compile-time by another 30% on top of that.
>>> 
>>> This reduces the compile-time for the testcase from tens of minutes
>>> to 30 seconds and PTA time from 99% to 25%.  bitmap_equal_p is gone
>>> from the profile.
>>> 
>>> Bootstrap and regtest running on x86_64-unknown-linux-gnu, will
>>> push to trunk and branches.
>> 
>> Ha, and it breaks bootstrap because I misunderstood
>> bitmap_count_bits_in_word (should be the ..._words variant).  After
>> fixing that, it turns out that hashing the population count doesn't
>> help at all, so I'm re-testing the following simpler variant, giving
>> up on the cheap last 25% but fixing the bootstrap regression as well.
>> 
>> Richard.
>> 
>> From a76aebfdc4b6247db6a061e6395fd088a5694122 Mon Sep 17 00:00:00 2001
>> From: Richard Biener <rguent...@suse.de>
>> Date: Wed, 14 Feb 2024 12:33:13 +0100
>> Subject: [PATCH] tree-optimization/113910 - huge compile time during PTA
>> To: gcc-patches@gcc.gnu.org
>> 
>> For the testcase in PR113910 we spend a lot of time in PTA comparing
>> bitmaps when looking up equivalence class members.  This points to
>> the very weak bitmap_hash function, which effectively hashes the set
>> bits and only a subset of the unset bits.
>> 
>> The major problem with it is that it simply truncates the
>> BITMAP_WORD-sized intermediate hash to hashval_t, which is
>> unsigned int, effectively leaving half of the bits unhashed.
>> 
>> This reduces the compile-time for the testcase from tens of minutes
>> to 42 seconds and PTA time from 99% to 46%.
>> 
>>    PR tree-optimization/113910
>>    * bitmap.cc (bitmap_hash): Mix the full element "hash" into
>>    the hashval_t hash.
>> ---
>> gcc/bitmap.cc | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/gcc/bitmap.cc b/gcc/bitmap.cc
>> index 6cf326bca5a..459e32c1ad1 100644
>> --- a/gcc/bitmap.cc
>> +++ b/gcc/bitmap.cc
>> @@ -2706,7 +2706,7 @@ bitmap_hash (const_bitmap head)
>>       for (ix = 0; ix != BITMAP_ELEMENT_WORDS; ix++)
>>    hash ^= ptr->bits[ix];
>>     }
>> -  return (hashval_t)hash;
>> +  return iterative_hash (&hash, sizeof (hash), 0);
>> }
>> 
>> 
> 
> LGTM FWIW, but just curious: does using the iterative hash routines for
> each update (instead of ^) help too, or is it too slow?

It helps but is too slow.

>  Or maybe do an
> iterative hash for the idx part and keep ^ for the bits accumulation?
> Also wonder whether using + rather than ^ for the bits accumulation
> would help...

I have a patch that picks up the new BFD string hash for this, which is
fast.  I'll do this for stage1, trying to replace our generic hash
function with it.

Richard 

> Thanks,
> Richard
> 
> 
