On Mon, Jun 30, 2025 at 05:14:06PM +0200, Jakub Jelinek wrote: > 1) for idx being INTEGER_CST (case for which I believe the change has been > added) it keeps doing what it did > 2) if both op1 and op2 are arrays with invariant address or pointers to > something with invariant address, it uses > SAVE_EXPR <op0>, SAVE_EXPR <idx> SAVE_EXPR <op0> > as COND_EXPR condition and otherwise what it did > 3) otherwise punt > > Now, perhaps a predicate (but how to call it) whether idx doesn't > contain any SAVE_EXPRs/TARGET_EXPRs/BIND_EXPRs and isn't really large > expression (but how to measure it; perhaps also no CALL_EXPRs, and the > question if it should allow say multiplication/division/modulo, those might > need larger code as well) could be used instead of the INTEGER_CST check, > either immediately or incrementally. > But guess I should first try to reproduce the ICE you've mentioned.
So, the thing with the ICE is that the parsing gives us a COND_EXPR of ARRAY_TYPE with ARRAY_TYPE operands. So, when we punt (either always by removing the COND_EXPR cp_build_array_ref optimization altogether or conditionally), we risk the bogus COND_EXPR with ARRAY_TYPE remains in the IL. Now, with the v2 patch I've posted, neither extern int a1[]; extern int a2[]; void foo(int p) { int x = (p ? a1 : a2)[1]; } nor void bar(int p, int q) { int x = (p ? a1 : a2)[q]; } actually ICE, because in the first case idx is INTEGER_CST and in the second case a1 and a2 are invariant address arrays. But e.g. extern int a1[], a2[], a3[], a4[]; int foo (int p) { return (p ? a1 : a2)[1]; } int bar (int p, int q) { return (p ? a1 : a2)[q]; } int baz (int p, int q) { return (p ? q ? a1 : a2 : q ? a3 : a4)[1]; } int qux (int p, int q, int r) { return (p ? q ? a1 : a2 : q ? a3 : a4)[r]; } already ICEs with my patch in baz and qux, because the gimplifier can't handle ARRAY_TYPE COND_EXPRs with unknown bounds for obvious reasons. So, what about the following patch then (or its variant with some predicate of not large expression without SAVE_EXPR/TARGET_EXPR/BIND_EXPR/CALL_EXPR/STATEMENT_EXPR/LABEL_EXPR))? 2025-06-30 Jakub Jelinek <ja...@redhat.com> PR c++/120471 * typeck.cc (cp_build_array_ref) <case COND_EXPR>: If idx is not INTEGER_CST, don't optimize the case (but cp_default_conversion on array early if it has ARRAY_TYPE) or use SAVE_EXPR <op0>, SAVE_EXPR <idx>, SAVE_EXPR <op0> as new op0 depending on flag_strong_eval_order and whether op1 and op2 are arrays with invariant address or tree invariant pointers. Formatting fixes. * g++.dg/ubsan/pr120471.C: New test. * g++.dg/parse/pr120471.C: New test. --- gcc/cp/typeck.cc.jj 2025-06-30 12:19:16.398242029 +0200 +++ gcc/cp/typeck.cc 2025-06-30 18:59:51.289774002 +0200 @@ -4001,13 +4001,91 @@ cp_build_array_ref (location_t loc, tree } case COND_EXPR: - ret = build_conditional_expr - (loc, TREE_OPERAND (array, 0), - cp_build_array_ref (loc, TREE_OPERAND (array, 1), idx, - complain), - cp_build_array_ref (loc, TREE_OPERAND (array, 2), idx, - complain), - complain); + tree op0, op1, op2; + op0 = TREE_OPERAND (array, 0); + op1 = TREE_OPERAND (array, 1); + op2 = TREE_OPERAND (array, 1); + if (TREE_CODE (idx) != INTEGER_CST) + { + /* If idx could possibly have some SAVE_EXPRs, turning + (op0 ? op1 : op2)[idx] into + op0 ? op1[idx] : op2[idx] can lead into temporaries + initialized in one conditional path and uninitialized + uses of them in the other path. + And if idx is a really large expression, evaluating it + twice is also not optimal. + On the other side, op0 must be sequenced before evaluation + of op1 and op2 and for C++17 op0, op1 and op2 must be + sequenced before idx. + If idx is INTEGER_CST, we can just do the optimization + without any SAVE_EXPRs, if op1 and op2 are both ARRAY_TYPE + VAR_DECLs or COMPONENT_REFs thereof (so their address + is constant or relative to frame), optimize into + (SAVE_EXPR <op0>, SAVE_EXPR <idx>, SAVE_EXPR <op0>) + ? op1[SAVE_EXPR <idx>] : op2[SAVE_EXPR <idx>] + Otherwise avoid this optimization. */ + if (flag_strong_eval_order == 2) + { + if (TREE_CODE (TREE_TYPE (op1)) == ARRAY_TYPE) + { + tree xop1 = op1; + while (TREE_CODE (xop1) == COMPONENT_REF) + xop1 = TREE_OPERAND (xop1, 0); + STRIP_ANY_LOCATION_WRAPPER (xop1); + if (!decl_address_invariant_p (xop1)) + { + /* Force default conversion on array if + we can't optimize this and array has ARRAY_TYPE + COND_EXPR, we can't leave COND_EXPRs with + ARRAY_TYPE in the IL. */ + if (TREE_CODE (TREE_TYPE (array)) == ARRAY_TYPE) + { + array = cp_default_conversion (array, complain); + if (error_operand_p (array)) + return error_mark_node; + } + break; + } + } + else if (!POINTER_TYPE_P (TREE_TYPE (op1)) + || !tree_invariant_p (op1)) + break; + if (TREE_CODE (TREE_TYPE (op2)) == ARRAY_TYPE) + { + tree xop2 = op2; + while (TREE_CODE (xop2) == COMPONENT_REF) + xop2 = TREE_OPERAND (xop2, 0); + STRIP_ANY_LOCATION_WRAPPER (xop2); + if (!decl_address_invariant_p (xop2)) + { + /* Force default conversion on array if + we can't optimize this and array has ARRAY_TYPE + COND_EXPR, we can't leave COND_EXPRs with + ARRAY_TYPE in the IL. */ + if (TREE_CODE (TREE_TYPE (array)) == ARRAY_TYPE) + { + array = cp_default_conversion (array, complain); + if (error_operand_p (array)) + return error_mark_node; + } + break; + } + } + else if (!POINTER_TYPE_P (TREE_TYPE (op2)) + || !tree_invariant_p (op2)) + break; + } + if (TREE_SIDE_EFFECTS (idx)) + { + idx = save_expr (idx); + op0 = save_expr (op0); + tree tem = build_compound_expr (loc, op0, idx); + op0 = build_compound_expr (loc, tem, op0); + } + } + op1 = cp_build_array_ref (loc, op1, idx, complain); + op2 = cp_build_array_ref (loc, op2, idx, complain); + ret = build_conditional_expr (loc, op0, op1, op2, complain); protected_set_expr_location (ret, loc); return ret; --- gcc/testsuite/g++.dg/ubsan/pr120471.C.jj 2025-06-30 18:19:57.290356305 +0200 +++ gcc/testsuite/g++.dg/ubsan/pr120471.C 2025-06-30 18:19:57.290356305 +0200 @@ -0,0 +1,21 @@ +// PR c++/120471 +// { dg-do run } +// { dg-options "-fsanitize=undefined" } + +volatile int b[1], a[1]; + +void +foo (int x) +{ + volatile int c = 21; + volatile int v = (x % 2 ? b : a)[c % 3]; + if (v != 0) + __builtin_abort (); +} + +int +main () +{ + foo (1); + foo (2); +} --- gcc/testsuite/g++.dg/parse/pr120471.C.jj 2025-06-30 19:01:55.875225849 +0200 +++ gcc/testsuite/g++.dg/parse/pr120471.C 2025-06-30 19:03:08.289326001 +0200 @@ -0,0 +1,28 @@ +// PR c++/120471 +// { dg-do compile } + +extern int a1[], a2[], a3[], a4[]; + +int +foo (int p) +{ + return (p ? a1 : a2)[1]; +} + +int +bar (int p, int q) +{ + return (p ? a1 : a2)[q]; +} + +int +baz (int p, int q) +{ + return (p ? q ? a1 : a2 : q ? a3 : a4)[1]; +} + +int +qux (int p, int q, int r) +{ + return (p ? q ? a1 : a2 : q ? a3 : a4)[r]; +} Jakub