On Tue, Oct 4, 2022 at 4:34 PM Richard Biener <richard.guent...@gmail.com> wrote: > > > > > Am 04.10.2022 um 16:30 schrieb Aldy Hernandez <al...@redhat.com>: > > > > On Tue, Oct 4, 2022 at 3:27 PM Andrew MacLeod <amacl...@redhat.com> wrote: > >> > >> > >>> On 10/4/22 08:13, Aldy Hernandez via Gcc-patches wrote: > >>>> On Tue, Oct 4, 2022, 13:28 Aldy Hernandez <al...@redhat.com> wrote: > >>> > >>>> On Tue, Oct 4, 2022 at 9:55 AM Richard Biener > >>>> <richard.guent...@gmail.com> wrote: > >>>>> > >>>>> Am 04.10.2022 um 09:36 schrieb Aldy Hernandez via Gcc-patches < > >>>> gcc-patches@gcc.gnu.org>: > >>>>>> The reason the nonzero mask was kept in a tree was basically inertia, > >>>>>> as everything in irange is a tree. However, there's no need to keep > >>>>>> it in a tree, as the conversions to and from wide ints are very > >>>>>> annoying. That, plus special casing NULL masks to be -1 is prone > >>>>>> to error. > >>>>>> > >>>>>> I have not only rewritten all the uses to assume a wide int, but > >>>>>> have corrected a few places where we weren't propagating the masks, or > >>>>>> rather pessimizing them to -1. This will become more important in > >>>>>> upcoming patches where we make better use of the masks. > >>>>>> > >>>>>> Performance testing shows a trivial improvement in VRP, as things like > >>>>>> irange::contains_p() are tied to a tree. Ughh, can't wait for trees in > >>>>>> iranges to go away. > >>>>> You want trailing wide int storage though. A wide_int is quite large. > >>>> Absolutely, this is only for short term storage. Any time we need > >>>> long term storage, say global ranges in SSA_NAME_RANGE_INFO, we go > >>>> through vrange_storage which will stream things in a more memory > >>>> efficient manner. For irange, vrange_storage will stream all the > >>>> sub-ranges, including the nonzero bitmask which is the first entry in > >>>> such storage, as trailing_wide_ints. > >>>> > >>>> See irange_storage_slot to see how it lives in GC memory. > >>>> > >>> That being said, the ranger's internal cache uses iranges, albeit with a > >>> squished down number of subranges (the minimum amount to represent the > >>> range). So each cache entry will now be bigger by the difference between > >>> one tree and one wide int. > >>> > >>> I wonder if we should change the cache to use vrange_storage. If not now, > >>> then when we convert all the subranges to wide ints. > >>> > >>> Of course, the memory pressure of the cache is not nearly as problematic > >>> as > >>> SSA_NAME_RANGE_INFO. The cache only stores names it cares about. > >> > >> Rangers cache can be a memory bottleneck in pathological cases.. > >> Certainly not as bad as it use to be, but I'm sure it can still be > >> problematic. Its suppose to be a memory efficient representation > >> because of that. The cache can have an entry for any live ssa-name > >> (which means all of them at some point in the IL) multiplied by a factor > >> involving the number of dominator blocks and outgoing edges ranges are > >> calculated on. So while SSA_NAME_RANGE_INFO is a linear thing, the > >> cache lies somewhere between a logarithmic and exponential factor based > >> on the CFG size. > > > > Hmmm, perhaps the ultimate goal here should be to convert the cache to > > use vrange_storage, which uses trailing wide ints for all of the end > > points plus the masks (known_ones included for the next release). > > > >> > >> if you are growing the common cases of 1 to 2 endpoints to more than > >> double in size (and most of the time not be needed), that would not be > >> very appealing :-P If we have any wide-ints, they would need to be a > >> memory efficient version. The Cache uses an irange_allocator, which is > >> suppose to provide a memory efficient objects.. hence why it trims the > >> number of ranges down to only what is needed. It seems like a trailing > >> wide-Int might be in order based on that.. > >> > >> Andrew > >> > >> > >> PS. which will be more problematic if you eventually introduce a > >> known_ones wide_int. I thought the mask tracking was/could be > >> something simple like HOST_WIDE_INT.. then you only tracks masks in > >> types up to the size of a HOST_WIDE_INT. then storage and masking is > >> all trivial without going thru a wide_int. Is that not so/possible? > > > > That's certainly easy and cheaper to do. The hard part was fixing all > > the places where we weren't keeping the masks up to date, and that's > > done (sans any bugs ;-)). > > > > Can we get consensus here on only tracking masks for type sizes less > > than HOST_WIDE_INT? I'd hate to do all the work only to realize we > > need to track 512 bit masks on a 32-bit host cross :-). > > 64bits are not enough, 128 might be. But there’s trailing wide int storage > so I don’t see the point in restricting ourselves?
Fair enough. Perhaps we should bite the bullet and convert the cache to vrange_storage which is all set up for streaming irange's with trailing_wide_ints. No changes should be necessary for irange, since we never have more than 3-4 live at any one time. It's the cache that needs twiddling. Aldy