On Wed, Apr 16, 2025 at 10:55 PM Andrew MacLeod <amacl...@redhat.com> wrote: > > This was a fun one! An actual bug, and it took a while to sort out. > After chasing down some red herrings, this turns out to be an issue of > interaction between the range and value masks and intervening calculations. > > The original patch from 11/2023 adjusts intersection so that it can > enhance subranges based on the value mask. ie in this testcase > > [irange] int [-INF, 2147483644] MASK 0xfffffffc VALUE 0x1 > > If adjust_range() were called on this, it would eliminate the trailing > mask/value bit ranges that are invalid and turn it into : > > [-INF, -3][1, 1][4, 2147483626] MASK 0xfffffffc VALUE 0x1 > > reflecting the lower bits into the range. The problem develops because > we only apply adjust_range () during intersection in an attempt to > avoid expensive work when it isnt needed. > > Unfortunately, that is what triggers this infinite loop. Rangers cache > propagates changes, and the algorithm is designed to always improve the > range. In this case, the first iteration through, _11 receives the > above value, [irange] int [-INF, 2147483644] MASK 0xfffffffc VALUE 0x1 > which via the mask, excludes 0, 2 and 3. > > The ensuing calculations in block 7 do not trigger a successful > intersection operation, and thus the range pairs are never expanded to > eliminate the lower ranges, and it triggers another change in values > which leads to the next iteration being less precise, but not obviously > so. [irange] int [-INF, 2147483644] MASK 0xfffffffd VALUE 0x0 is a > result of the calculation. As ranges as suppose to always get better > with this algorithm, we simply compare for difference.. and this range > is different, and thus we replace it. It only excludes 2 and 3. > > Next iteration through the less precise range DOES trigger an > intersection operation in block 7, and when that is expanded to [irange] > int [-INF, 1][4, 2147483644] MASK 0xfffffffd VALUE 0x0 using that we can > again create the more precise range for _11 that started the cycle. and > we go on and on and on. > > If we fix this so that we always expand subranges to reflect the lower > bits in a bitmask, the initial value starts with > > [irange] int [-INF, -3][1, 1][4, 2147483644] MASK 0xfffffffc VALUE 0x1 > > And everything falls into place as it should. The fix is to be > consistent about expanding those lower subranges. > > I also added a couple of minor performance tweaks to avoid unnecessary > work, along with removing adjust_range () directly into > set_range_from_bitmask () . > > I started at a 0.2% overall compilation increase (1.8% in VRP). In the > end, this patch is down to 0.6% in VRP, and only 0.08% overall, so > manageable for all the extra work. > > It also causes a few ripples in the testsuite so 3 test cases also > needed adjustment: > > * gcc.dg/pr83072-2.c : With the addition of the expanded ranges, CCP > use to export a global: > Global Exported: c_3 = [irange] int [-INF, +INF] MASK 0xfffffffe > VALUE 0x1 > and now > Global Exported: c_3 = [irange] int [-INF, -1][1, +INF] MASK > 0xfffffffe VALUE 0x1 > Which in turn enables forwprop to collapse part of the testcase much > earlier. So I turned off forwprop for the testcase > > * gcc.dg/tree-ssa/phi-opt-value-5.c : WIth the expanded ranges, CCP2 > pass use to export: > Global Exported: d_3 = [irange] int [-INF, +INF] MASK 0xfffffffe > VALUE 0x1 > and now > Global Exported: d_3 = [irange] int [-INF, -1][1, +INF] MASK > 0xfffffffe VALUE 0x1 > which in turn makes the following comment obsolete as the optimization > does happen earlier.: > /* fdiv1 requires until later than phiopt2 to be able to detect that > d is non-zero. to be able to remove the conditional. */ > Adjusted the testcase to expect everything to be taken care of by > phi-opt2 pass. > > * gcc.dg/tree-ssa/vrp122.c : Previously, CCP exported: > Global Exported: g_4 = [irange] unsigned int [0, +INF] MASK > 0xfffffff0 VALUE 0x0 > and then EVRP refined that and stored it, then the testcase tested for: > Global Exported: g_4 = [irange] unsigned int [0, 0][16, +INF] MASK > 0xfffffff0 VALUE 0x0 > Now, CCP itself exported the expanded range, so there is nothing for VRP > to do. > adjusted the testcase to look for the expanded range in CCP. > > Now we never get into this situation where the bitmask is explicitly > applied in some places and not others. > > Bootstraps on x86_64-pc-linux-gnu with no regressions. Finally. Is > this OK for trunk, or should I hold off a little bit?
Please wait a little bit until after 15.1 is out. It's then OK for trunk in stage1 and backports when no issues show up. Thanks, Richard. > > Andrew