Hi Marek,

On 23 Jan 2025, at 16:45, Marek Polacek wrote:

> On Thu, Jan 23, 2025 at 02:40:09PM +0000, Simon Martin wrote:
>> Hi Jason,
>>
>> On 20 Jan 2025, at 22:49, Jason Merrill wrote:
>>
>>> On 1/20/25 2:11 PM, Simon Martin wrote:
>>>> Hi Jason,
>>>>
>>>> On 15 Jan 2025, at 22:42, Jason Merrill wrote:
>>>>
>>>>> On 12/12/24 2:07 PM, Simon Martin wrote:
>>>>>> We currently ICE upon the following valid (under -Wno-vla) code
>>>>>>
>>>>>> === cut here ===
>>>>>> void f(int c) {
>>>>>>     constexpr int r = 4;
>>>>>>     [&](auto) { int t[r * c]; }(0);
>>>>>> }
>>>>>> === cut here ===
>>>>>>
>>>>>> The problem is that when parsing the lambda body, and more
>>>>>> specifically
>>>>>> the multiplication, we mark the lambda as
>>>>>> LAMBDA_EXPR_CAPTURE_OPTIMIZED
>>>>>> even though the replacement of r by 4 is "undone" by the call to
>>>>>> build_min_non_dep in build_x_binary_op. This makes
>>>>>> prune_lambda_captures
>>>>>> remove the proxy declaration while it should not, and we trip on 
>>>>>> an
>>
>>>>>> assert at instantiation time.
>>>>>
>>>>> Why does prune_lambda_captures remove the declaration if it's 
>>>>> still
>>>>> used in the function body?  Setting LAMBDA_EXPR_CAPTURE_OPTIMIZED
>>>>> just
>>>>> means "we might have optimized away some captures", the tree walk
>>>>> should have found the use put back by build_x_binary_op.
>>>> I think that this is due to a bug in mark_const_cap_r, that’s 
>>>> been
>>>> here since the beginning, i.e. r8-7213-g1577f10a637352: it decides
>>
>>>> NOT
>>>> to walk sub-trees when encountering a DECL_EXPR for a constant
>>>> capture
>>>> proxy (at lambda.cc:1769). I don’t understand why we wouldn’t
>>>> want
>>>> to continue.
>>>
>>> Because that's the declaration of the capture proxy, not a use of 
>>> it.
>> Indeed, thanks for clarifying.
>>
>>> Why aren't we finding the use in the declaration of t?
>> After further investigation, the problem is rather that neither
>> walk_tree_1 nor cp_walk_subtrees walk the dimensions of array types, 
>> so
>> we miss the uses.
>>
>>>> Removing that line fixes the PR, but breaks 3 existing tests
>>>> (lambda-capture1.C, lambda-const11.C and lambda-const11a.C, that 
>>>> all
>>>> assert that we optimise out the capture); that’s why I did not
>>>> pursue
>>>> it in the first place.
>>>
>>> Right.
>>>
>>>> But taking another look, it might not be that big a deal that we
>>>> don’t
>>>> optimise those out: as soon as we use -O1 or above, the assignment 
>>>> to
>>
>>>> the closure field actually disappears.
>>>
>>> Completely breaking this optimization is a big deal, particularly
>>> since it affects the layout of closure objects.  We can't always
>>> optimize everything away.
>> ACK.
>>
>> The attached updated patch makes cp_walk_subtrees walk array type
>> dimensions, which fixes the initial PR without those 3 regressions.
>>
>> Successfully tested on x86_64-pc-linux-gnu. Is it OK?
>>
>> Simon
>>
>> PS: In theory I think it’d make sense to do do this change in
>>
>> walk_tree_1 since C also supports VLAs, but doing so breaks some OMP
>> tests. OMP can do interesting things with array bounds (see
>> r9-5354-gda972c05f48637), and fixing those would require teaching
>> walk_tree_1 about the “omp dummy var” array bounds, which I think
>> would be a bit ugly. And I’m not aware of any C case that would be
>> improved/fixed by doing this change, so we’re probably fine not 
>> doing
>> it.
>
>> From e19bb6c943a422b1201c5c9a2a1d4f32141baf84 Mon Sep 17 00:00:00 
>> 2001
>> From: Simon Martin <si...@nasilyan.com>
>> Date: Wed, 22 Jan 2025 16:19:47 +0100
>> Subject: [PATCH] c++: Don't prune constant capture proxies only used 
>> in array
>>  dimensions [PR114292]
>>
>> We currently ICE upon the following valid (under -Wno-vla) code
>>
>> === cut here ===
>> void f(int c) {
>>   constexpr int r = 4;
>>   [&](auto) { int t[r * c]; }(0);
>> }
>> === cut here ===
>>
>> When parsing the lambda body, and more specifically the 
>> multiplication,
>> we mark the lambda as LAMBDA_EXPR_CAPTURE_OPTIMIZED, which indicates 
>> to
>> prune_lambda_captures that it might be possible to optimize out some
>> captures.
>>
>> The problem is that prune_lambda_captures then misses the use of the 
>> r
>> capture (because neither walk_tree_1 nor cp_walk_subtrees walks the
>> dimensions of array types - here "r * c"), hence believes the capture
>> can be pruned... and we trip on an assert when instantiating the 
>> lambda.
>>
>> This patch changes cp_walk_subtrees so that when walking a 
>> declaration
>> with an array type, it walks that array type's dimensions. Since C 
>> also
>> supports VLAs, I thought it'd make sense to do this in walk_tree_1, 
>> but
>> this breaks some omp-low related test cases (because OMP can do funky
>> things with array bounds when adjust_array_error_bounds is set), and 
>> I
>> don't really want to add code to walk_tree_1 to skips arrays whose
>> dimension is a temporary variable with the "omp dummy var" attribute.
>>
>> Successfully tested on x86_64-pc-linux-gnu.
>>
>>      PR c++/114292
>>
>> gcc/cp/ChangeLog:
>>
>>      * tree.cc (cp_walk_subtrees): Walk array type dimensions.
>>
>> gcc/testsuite/ChangeLog:
>>
>>      * g++.dg/cpp1y/lambda-ice4.C: New test.
>>
>> ---
>>  gcc/cp/tree.cc                           | 11 ++++++
>>  gcc/testsuite/g++.dg/cpp1y/lambda-ice4.C | 44 
>> ++++++++++++++++++++++++
>>  2 files changed, 55 insertions(+)
>>  create mode 100644 gcc/testsuite/g++.dg/cpp1y/lambda-ice4.C
>>
>> diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
>> index 36581865a17..fc9a2fbff32 100644
>> --- a/gcc/cp/tree.cc
>> +++ b/gcc/cp/tree.cc
>> @@ -5844,6 +5844,17 @@ cp_walk_subtrees (tree *tp, int 
>> *walk_subtrees_p, walk_tree_fn func,
>>        break;
>>
>>      default:
>> +      /* If this is an array, walk its dimensions.  */
>> +      if (DECL_P (t) && TREE_TYPE (t)
>> +      && TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE)
>> +    {
>> +      tree domain = TYPE_DOMAIN (TREE_TYPE (t));
>> +      if (domain)
>> +        {
>> +          WALK_SUBTREE (TYPE_MIN_VALUE (domain));
>> +          WALK_SUBTREE (TYPE_MAX_VALUE (domain));
>> +        }
>> +    }
>>        return NULL_TREE;
>>      }
>
> Is there a reason I'm missing not to put this into the TYPE_P block 
> above?
Do you mean put it in that block as well, or instead of where I put it? 
If the latter, we don’t see any TYPE_P node for “int[r*n]”, so we 
don’t go into that block, and continue ICE’ing without the change in 
the “default:”.

Anyway it’s a very good call (thanks!), because it got me to check 
what we get if we change the lambda body to just do “typedef int 
MyT[c*r];”, and we still ICE. And from a quick look, doing something 
similar in the TYPE_P block does not fix it.

I’ll work something out and report back.

Thanks again,
   Simon

Reply via email to