Hi Marek, On 23 Jan 2025, at 16:45, Marek Polacek wrote:
> On Thu, Jan 23, 2025 at 02:40:09PM +0000, Simon Martin wrote: >> Hi Jason, >> >> On 20 Jan 2025, at 22:49, Jason Merrill wrote: >> >>> On 1/20/25 2:11 PM, Simon Martin wrote: >>>> Hi Jason, >>>> >>>> On 15 Jan 2025, at 22:42, Jason Merrill wrote: >>>> >>>>> On 12/12/24 2:07 PM, Simon Martin wrote: >>>>>> We currently ICE upon the following valid (under -Wno-vla) code >>>>>> >>>>>> === cut here === >>>>>> void f(int c) { >>>>>> constexpr int r = 4; >>>>>> [&](auto) { int t[r * c]; }(0); >>>>>> } >>>>>> === cut here === >>>>>> >>>>>> The problem is that when parsing the lambda body, and more >>>>>> specifically >>>>>> the multiplication, we mark the lambda as >>>>>> LAMBDA_EXPR_CAPTURE_OPTIMIZED >>>>>> even though the replacement of r by 4 is "undone" by the call to >>>>>> build_min_non_dep in build_x_binary_op. This makes >>>>>> prune_lambda_captures >>>>>> remove the proxy declaration while it should not, and we trip on >>>>>> an >> >>>>>> assert at instantiation time. >>>>> >>>>> Why does prune_lambda_captures remove the declaration if it's >>>>> still >>>>> used in the function body? Setting LAMBDA_EXPR_CAPTURE_OPTIMIZED >>>>> just >>>>> means "we might have optimized away some captures", the tree walk >>>>> should have found the use put back by build_x_binary_op. >>>> I think that this is due to a bug in mark_const_cap_r, that’s >>>> been >>>> here since the beginning, i.e. r8-7213-g1577f10a637352: it decides >> >>>> NOT >>>> to walk sub-trees when encountering a DECL_EXPR for a constant >>>> capture >>>> proxy (at lambda.cc:1769). I don’t understand why we wouldn’t >>>> want >>>> to continue. >>> >>> Because that's the declaration of the capture proxy, not a use of >>> it. >> Indeed, thanks for clarifying. >> >>> Why aren't we finding the use in the declaration of t? >> After further investigation, the problem is rather that neither >> walk_tree_1 nor cp_walk_subtrees walk the dimensions of array types, >> so >> we miss the uses. >> >>>> Removing that line fixes the PR, but breaks 3 existing tests >>>> (lambda-capture1.C, lambda-const11.C and lambda-const11a.C, that >>>> all >>>> assert that we optimise out the capture); that’s why I did not >>>> pursue >>>> it in the first place. >>> >>> Right. >>> >>>> But taking another look, it might not be that big a deal that we >>>> don’t >>>> optimise those out: as soon as we use -O1 or above, the assignment >>>> to >> >>>> the closure field actually disappears. >>> >>> Completely breaking this optimization is a big deal, particularly >>> since it affects the layout of closure objects. We can't always >>> optimize everything away. >> ACK. >> >> The attached updated patch makes cp_walk_subtrees walk array type >> dimensions, which fixes the initial PR without those 3 regressions. >> >> Successfully tested on x86_64-pc-linux-gnu. Is it OK? >> >> Simon >> >> PS: In theory I think it’d make sense to do do this change in >> >> walk_tree_1 since C also supports VLAs, but doing so breaks some OMP >> tests. OMP can do interesting things with array bounds (see >> r9-5354-gda972c05f48637), and fixing those would require teaching >> walk_tree_1 about the “omp dummy var” array bounds, which I think >> would be a bit ugly. And I’m not aware of any C case that would be >> improved/fixed by doing this change, so we’re probably fine not >> doing >> it. > >> From e19bb6c943a422b1201c5c9a2a1d4f32141baf84 Mon Sep 17 00:00:00 >> 2001 >> From: Simon Martin <si...@nasilyan.com> >> Date: Wed, 22 Jan 2025 16:19:47 +0100 >> Subject: [PATCH] c++: Don't prune constant capture proxies only used >> in array >> dimensions [PR114292] >> >> We currently ICE upon the following valid (under -Wno-vla) code >> >> === cut here === >> void f(int c) { >> constexpr int r = 4; >> [&](auto) { int t[r * c]; }(0); >> } >> === cut here === >> >> When parsing the lambda body, and more specifically the >> multiplication, >> we mark the lambda as LAMBDA_EXPR_CAPTURE_OPTIMIZED, which indicates >> to >> prune_lambda_captures that it might be possible to optimize out some >> captures. >> >> The problem is that prune_lambda_captures then misses the use of the >> r >> capture (because neither walk_tree_1 nor cp_walk_subtrees walks the >> dimensions of array types - here "r * c"), hence believes the capture >> can be pruned... and we trip on an assert when instantiating the >> lambda. >> >> This patch changes cp_walk_subtrees so that when walking a >> declaration >> with an array type, it walks that array type's dimensions. Since C >> also >> supports VLAs, I thought it'd make sense to do this in walk_tree_1, >> but >> this breaks some omp-low related test cases (because OMP can do funky >> things with array bounds when adjust_array_error_bounds is set), and >> I >> don't really want to add code to walk_tree_1 to skips arrays whose >> dimension is a temporary variable with the "omp dummy var" attribute. >> >> Successfully tested on x86_64-pc-linux-gnu. >> >> PR c++/114292 >> >> gcc/cp/ChangeLog: >> >> * tree.cc (cp_walk_subtrees): Walk array type dimensions. >> >> gcc/testsuite/ChangeLog: >> >> * g++.dg/cpp1y/lambda-ice4.C: New test. >> >> --- >> gcc/cp/tree.cc | 11 ++++++ >> gcc/testsuite/g++.dg/cpp1y/lambda-ice4.C | 44 >> ++++++++++++++++++++++++ >> 2 files changed, 55 insertions(+) >> create mode 100644 gcc/testsuite/g++.dg/cpp1y/lambda-ice4.C >> >> diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc >> index 36581865a17..fc9a2fbff32 100644 >> --- a/gcc/cp/tree.cc >> +++ b/gcc/cp/tree.cc >> @@ -5844,6 +5844,17 @@ cp_walk_subtrees (tree *tp, int >> *walk_subtrees_p, walk_tree_fn func, >> break; >> >> default: >> + /* If this is an array, walk its dimensions. */ >> + if (DECL_P (t) && TREE_TYPE (t) >> + && TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE) >> + { >> + tree domain = TYPE_DOMAIN (TREE_TYPE (t)); >> + if (domain) >> + { >> + WALK_SUBTREE (TYPE_MIN_VALUE (domain)); >> + WALK_SUBTREE (TYPE_MAX_VALUE (domain)); >> + } >> + } >> return NULL_TREE; >> } > > Is there a reason I'm missing not to put this into the TYPE_P block > above? Do you mean put it in that block as well, or instead of where I put it? If the latter, we don’t see any TYPE_P node for “int[r*n]”, so we don’t go into that block, and continue ICE’ing without the change in the “default:”. Anyway it’s a very good call (thanks!), because it got me to check what we get if we change the lambda body to just do “typedef int MyT[c*r];”, and we still ICE. And from a quick look, doing something similar in the TYPE_P block does not fix it. I’ll work something out and report back. Thanks again, Simon