On Tue, Feb 28, 2017 at 3:08 PM, Martin Jambor <mjam...@suse.cz> wrote:
> Hi,
>
> On Mon, Feb 27, 2017 at 10:34:38AM +0100, Richard Biener wrote:
>> On Wed, Feb 22, 2017 at 11:11 AM, Martin Jambor <mjam...@suse.cz> wrote:
>> > Hello,
>> >
>> > this is a fix for PR 78140 which is about LTO WPA of Firefox taking
>> > 1GB memory more than gcc 6.
>> >
>> > It works by reusing the ipa_bits and value_range that we previously
>> > had directly in jump functions and which are just too big to be
>> > allocated for each actual argument in all of Firefox.  Reusing is
>> > achieved by two hash table traits derived from ggc_cache_remove which
>> > apparently has been created just for this purpose and once I
>> > understood them they made my life a lot easier.  In future, I will
>> > have a look at applying this method to other parts of jump functions
>> > as well.
>> >
>> > According to my measurements, the patch saves about 1.2 GB of memory.
>> > The problem is that some change last week (between revision 245382 and
>> > 245595) has more than invalidated this:
>> >
>> >   | compiler            | WPA mem (GB) |
>> >   |---------------------+--------------|
>> >   | gcc 6 branch        |         3.86 |
>> >   | trunk rev. 245382   |         5.21 |
>> >   | patched rev. 245382 |         4.06 |
>> >   | trunk rev. 245595   |         6.59 |
>> >   | patched rev. 245595 |         5.25 |
>> >
>> > (I have verified this by Martin's way of measuring things.)  I will
>> > try to bisect what commit has caused the increase.  Still, the patch
>> > helps a lot.
>> >
>> > There is one thing in the patch that intrigues me, I do not understand
>> > why I had to mark value_range with GTY((for_user)) - as opposed to
>> > just GTY(()) that was there before - whereas ipa_bits does not need
>> > it.  If anyone could enlighten me, that would be great.  But I suppose
>> > this is not an indication of anything being wrong under the hood.
>> >
>> > I have bootstrapped and LTO-bootstrapped the patch on x86_64-linux and
>> > also bootstrapped (C, C++ and Fortran) on an aarch64 and i686 cfarm
>> > machine.  I have also LTO-built Firefox with the patch and used it to
>> > browse for a while and it seemed fine.
>> >
>> > OK for trunk?
>>
>> The idea looks good to me.  I wonder what a statistic over ranges
>> would look like (do they mostly look useful?).
>>
>
> So, at the jump function level (on trunk from last week), we have:
>
>   no. of callsites:         1064109
>   no. of actual arguments:  2465511 (of all types)
>   no. of unknown VRs:       1628727 (not too bad, considering that we only
>                                      track them for integers and non-NULL
>                                      for pointers)
>   no. of known VRs:          836784
>   no. of distinct VRs:         1746
>   the 20 most popular VRs with their frequencies are:
>
>    706245 VR  ~[0, 0]
>     59691 VR  [0, 1]
>     32660 VR  [0, -1]
>     14039 VR  [0, 4294967295]
>      1607 VR  [0, 255]
>      1351 VR  [0, 2147483647]
>      1350 VR  ~[2147483648, -2147483649]
>      1285 VR  [0, 65535]
>      1259 VR  [1, 4294967296]
>      1241 VR  [0, 31]
>       903 VR  [-2147483648, 2147483647]
>       853 VR  [-32768, 32767]
>       827 VR  [1, -1]
>       806 VR  [0, -2]
>       794 VR  [1, -2]
>       696 VR  [-128, 127]
>       662 VR  [0, 7]
>       654 VR  [0, 4294967294]
>       601 VR  [0, 15]
>       475 VR  [0, 4611686018427387903]
>
>
> At the other end of the propagation we store value ranges of 165010
> formal parameters out of the total of 678762 (but again, of all
> types).
> The 20 most popular ones are:
>
>    119319 Storing VR  ~[0, 0]
>     13169 Storing VR  [0, -1]
>      8781 Storing VR  [0, 0]
>      3181 Storing VR  [1, 1]
>      3081 Storing VR  [0, 4294967295]
>      2089 Storing VR  [0, 1]
>       918 Storing VR  [-1, -1]
>       870 Storing VR  [2147483647, 2147483647]
>       697 Storing VR  [2, 2]
>       554 Storing VR  [0, 2]
>       527 Storing VR  [1, -1]
>       491 Storing VR  [0, 3]
>       361 Storing VR  [1, 2]
>       350 Storing VR  [0, 255]
>       323 Storing VR  [0, 31]
>       300 Storing VR  [-32768, 32767]
>       285 Storing VR  [0, 2147483647]
>       260 Storing VR  [0, 65535]
>       240 Storing VR  [5, 5]
>       220 Storing VR  [8, 8]
>
> I haven't had a look at how this translated to the final code, but it
> is safe to say that the propagation itself does something.

Nice.

Thanks,
Richard.

> Martin
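
For readers following the thread: the sharing idea described above (836784 known VRs at the jump function level, but only 1746 distinct ones) can be illustrated with a small standalone interning pool. This is only a sketch under stated assumptions, not the code from the patch. `Range`, `RangeHash` and `RangePool` are hypothetical names, it uses `std::unordered_set` rather than GCC's own hash tables, and it does not model the garbage-collector interaction that the `ggc_cache_remove`-derived traits handle.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_set>

// Hypothetical stand-in for a value range; this is NOT GCC's value_range.
struct Range {
    bool anti;          // true for an anti-range such as ~[0, 0]
    std::int64_t min;
    std::int64_t max;

    bool operator==(const Range &o) const {
        return anti == o.anti && min == o.min && max == o.max;
    }
};

// Simple hash that combines the three fields.
struct RangeHash {
    std::size_t operator()(const Range &r) const {
        std::size_t h = std::hash<std::int64_t>()(r.min);
        h = h * 31 + std::hash<std::int64_t>()(r.max);
        return h * 31 + (r.anti ? 1 : 0);
    }
};

// Interning pool: every distinct range is stored exactly once and users
// keep a pointer to the shared copy instead of embedding their own.
class RangePool {
public:
    const Range *intern(const Range &r) {
        // insert() leaves the set unchanged if an equal range is present
        // and returns an iterator to the stored element either way.
        auto it = pool_.insert(r).first;
        assert(*it == r);
        // Pointers to unordered_set elements stay valid across rehashing.
        return &*it;
    }

    std::size_t distinct() const { return pool_.size(); }

private:
    std::unordered_set<Range, RangeHash> pool_;
};
```

With a pool like this, each actual argument would hold one pointer rather than a full range object, so a very common range such as ~[0, 0] is allocated once no matter how many arguments share it.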