On Wed, Apr 16, 2025 at 10:55 PM Andrew MacLeod <amacl...@redhat.com> wrote:
>
> This was a fun one!   An actual bug, and it took a while to sort out.
> After chasing down some red herrings, this turns out to be an issue of
> interaction between the range and value masks and intervening calculations.
>
> The original patch from 11/2023 adjusts intersection so that it can
> enhance subranges based on the value mask.  ie in this testcase
>
> [irange] int [-INF, 2147483644] MASK 0xfffffffc VALUE 0x1
>
>   If adjust_range() were called on this, it would eliminate the trailing
> mask/value bit ranges that are invalid and turn it into :
>
> [-INF, -3][1, 1][4, 2147483626] MASK 0xfffffffc VALUE 0x1
>
> reflecting the lower bits into the range.   The problem develops because
> we only apply adjust_range ()  during intersection in an attempt to
> avoid expensive work when it isnt needed.
>
> Unfortunately, that is what triggers this infinite loop.  Rangers cache
> propagates changes, and the algorithm is designed to always improve the
> range.  In this case, the first iteration through, _11 receives the
> above value, [irange] int [-INF, 2147483644] MASK 0xfffffffc VALUE 0x1
> which via the mask, excludes 0, 2 and 3.
>
> The ensuing calculations in block 7 do not trigger a successful
> intersection operation, and thus the range pairs are never expanded to
> eliminate the lower ranges, and it triggers another change in values
> which leads to the next iteration being less precise, but not obviously
> so. [irange] int [-INF, 2147483644] MASK 0xfffffffd VALUE 0x0 is a
> result of the calculation.   As ranges as suppose to always get better
> with this algorithm, we simply compare for difference.. and this range
> is different, and thus we replace it. It only excludes 2 and 3.
>
> Next iteration through the less precise range DOES trigger an
> intersection operation in block 7, and when that is expanded to [irange]
> int [-INF, 1][4, 2147483644] MASK 0xfffffffd VALUE 0x0 using that we can
> again create the more precise range for _11 that started the cycle. and
> we go on and on and on.
>
> If we fix this so that we always expand subranges to reflect the lower
> bits in a bitmask, the initial value starts with
>
> [irange] int [-INF, -3][1, 1][4, 2147483644] MASK 0xfffffffc VALUE 0x1
>
> And everything falls into place as it should.  The fix is to be
> consistent about expanding those lower subranges.
>
> I also added a couple of minor performance tweaks to avoid unnecessary
> work, along with removing adjust_range () directly into
> set_range_from_bitmask () .
>
> I started at a 0.2% overall compilation increase (1.8% in VRP). In the
> end, this patch is down to 0.6% in VRP, and only 0.08% overall, so
> manageable for all the extra work.
>
> It also causes a few ripples in the testsuite so 3 test cases also
> needed adjustment:
>
>   * gcc.dg/pr83072-2.c :  With the addition of the expanded ranges, CCP
> use to export a global:
>      Global Exported: c_3 = [irange] int [-INF, +INF] MASK 0xfffffffe
> VALUE 0x1
> and now
>     Global Exported: c_3 = [irange] int [-INF, -1][1, +INF] MASK
> 0xfffffffe VALUE 0x1
> Which in turn enables forwprop to collapse part of the testcase much
> earlier.     So I turned off forwprop for the testcase
>
> * gcc.dg/tree-ssa/phi-opt-value-5.c  : WIth the expanded ranges, CCP2
> pass use to export:
>     Global Exported: d_3 = [irange] int [-INF, +INF] MASK 0xfffffffe
> VALUE 0x1
> and now
>     Global Exported: d_3 = [irange] int [-INF, -1][1, +INF] MASK
> 0xfffffffe VALUE 0x1
> which in turn makes the following comment obsolete as the optimization
> does happen earlier.:
> /* fdiv1 requires until later than phiopt2 to be able to detect that
>     d is non-zero. to be able to remove the conditional.  */
> Adjusted the testcase to expect everything to be taken care of by
> phi-opt2 pass.
>
>   * gcc.dg/tree-ssa/vrp122.c : Previously, CCP exported:
>     Global Exported: g_4 = [irange] unsigned int [0, +INF] MASK
> 0xfffffff0 VALUE 0x0
> and then EVRP refined that and stored it, then the testcase tested for:
>     Global Exported: g_4 = [irange] unsigned int [0, 0][16, +INF] MASK
> 0xfffffff0 VALUE 0x0
> Now, CCP itself exported the expanded range, so there is nothing for VRP
> to do.
> adjusted the testcase to look for the expanded range in CCP.
>
> Now we never get into this situation where the bitmask is explicitly
> applied in some places and not others.
>
> Bootstraps on x86_64-pc-linux-gnu with no regressions. Finally.   Is
> this OK for trunk, or should I hold off a little bit?

Please wait a little bit until after 15.1 is out.  It's then OK for trunk in
stage1 and backports when no issues show up.

Thanks,
Richard.

>
> Andrew

Reply via email to