https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101925
--- Comment #8 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 16 Aug 2021, ebotcazou at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101925
>
> --- Comment #7 from Eric Botcazou <ebotcazou at gcc dot gnu.org> ---
> > Disabling SRA fixes it also, and I think that SRA drops the rev storage order
> > access attribute.  Oddly enough for be_ip6_addr I see the rc.u.addr8[]
> > accesses do _not_ result in reverse_storage_order_for_component_p being
> > true.
> > Why's that so?  How should I detect this is subject to re-ordering?
>
> Because semantically this does not change anything, but I agree that the flag
> should be set in this case too for the sake of the optimizer.  Please remove
> the test on QImode on line 8844 in c/c-decl.c.

OK, so that fixes it with the vectorizer patch.  But then I have
no idea why it breaks in the first place.  With -fdbg-cnt=4:4 (the
only "single" vect case that breaks) I see

   <bb 2> [local count: 1073741824]:
   ip$is_v4_20 = ip.is_v4;
-  rc = {};
+  MEM <char[3]> [(struct _be_net_addr *)&rc + 1B] = {};
   if (ip$is_v4_20 != 0)

(huh, guess DSE goes bonkers)

-  _85 = MEM <unsigned int> [(union *)&ip + 4B];
+  vect_ip6_u_addr8_0_21.88_86 = MEM <const vector(4) char> [(union *)&ip + 4B];
...
-  MEM <unsigned int> [(union *)&rc + 4B] = _85;
-  rc$u$addr_41 ={rev} MEM <int32_t> [(union *)&rc + 4B];
+  _39 = VIEW_CONVERT_EXPR<int>(vect_ip6_u_addr8_0_21.88_86);

looks like bogus FRE here.  So both not really vectorization but
followup opt issues.  -fdisable-tree-fre5 makes the testcase work.

We're copying REF_REVERSE_STORAGE_ORDER in copy_reference_ops_from_ref
but vn_reference_eq doesn't compare .reverse, it only has some
early complete out on V_C_E with reverse (aka storage order barriers -
whatever those exactly are).  Also vn_reference_lookup_3 uses
contains_storage_order_barrier_p in two places but that again only
checks for the V_C_E barrier, not for reverse accesses.
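
For readers less familiar with reverse storage order, here is a minimal standalone sketch of why replacing a {rev} load with a plain VIEW_CONVERT_EXPR of the stored bytes changes the value. It is not GCC code: load_native and load_reversed are hypothetical helpers that only model the two kinds of access on a 4-byte object.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret four bytes in the host's native byte order.  This models
   what a plain VIEW_CONVERT_EXPR<int> of a vector(4) char yields.  */
static uint32_t
load_native (const unsigned char *p)
{
  uint32_t v;
  memcpy (&v, p, sizeof v);
  return v;
}

/* Read the same four bytes with the opposite byte order.  This models
   a scalar load marked {rev} in the dump (an assumption made for
   illustration, not how GCC implements it).  */
static uint32_t
load_reversed (const unsigned char *p)
{
  unsigned char tmp[4] = { p[3], p[2], p[1], p[0] };
  return load_native (tmp);
}
```

For any byte pattern that is not a palindrome, the two helpers disagree on both little-endian and big-endian hosts, so forwarding the stored bytes to a {rev} load without a byte swap yields the wrong scalar.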
So we're running into

  /* 4) Assignment from an SSA name which definition we may be able
     to access pieces from or we can combine to a larger entity.  */
  else if (known_eq (ref->size, maxsize)
           && is_gimple_reg_type (vr->type)
           && !contains_storage_order_barrier_p (vr->operands)
           && gimple_assign_single_p (def_stmt)
           && TREE_CODE (gimple_assign_rhs1 (def_stmt)) == SSA_NAME)
    {
...
      reverse = reverse_storage_order_for_component_p (lhs);
      tree def_rhs = gimple_assign_rhs1 (def_stmt);
      if (!reverse
          && !storage_order_barrier_p (lhs)
          && known_size_p (maxsize2)
          && known_eq (maxsize2, size2)
          && adjust_offsets_for_equal_base_address (base, &offset,
                                                    base2, &offset2))
        {

since LHS is

  MEM <vector(4) char> [(union *)&rc + 4B] = vect_ip6_u_addr8_0_21.88_86;

that doesn't trigger anything.  But the ref we're looking at (vr) is from

  rc$u$addr_41 ={rev} MEM <int32_t> [(union *)&rc + 4B];

but we don't check that for reverse_storage_order_for_component_p.
We do have the contains_storage_order_barrier_p check, which easily(?)
could at least look for other .reverse == 1 components, but
reverse_storage_order_for_component_p is a bit more complicated due to
how it handles array and component refs based on the
TYPE_REVERSE_STORAGE_ORDER of their aggregate base.  I suppose we could
set .reverse for those in copy_reference_ops_from_ref as well.

That fixes it.
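
The shape of the checks discussed above can be sketched as a toy model. This is not the actual patch and does not use GCC's data structures: struct toy_ref_op, toy_contains_reverse_p and toy_ref_eq are hypothetical stand-ins for the operand vector, a reverse-aware variant of contains_storage_order_barrier_p, and a vn_reference_eq that also compares .reverse.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for one operand of a
   memory reference.  GCC's real operand record has many more fields.  */
struct toy_ref_op
{
  int opcode;    /* kind of access, an arbitrary tag in this model */
  long offset;   /* byte offset from the base */
  bool reverse;  /* access uses reverse storage order */
};

/* Return true if any operand is a reverse-storage-order access, so a
   caller can refuse to forward a stored value to such a reference.  */
static bool
toy_contains_reverse_p (const struct toy_ref_op *ops, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (ops[i].reverse)
      return true;
  return false;
}

/* Compare two operand lists of equal length.  Two references that
   differ only in storage order are treated as different.  */
static bool
toy_ref_eq (const struct toy_ref_op *a, const struct toy_ref_op *b,
            size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (a[i].opcode != b[i].opcode
        || a[i].offset != b[i].offset
        || a[i].reverse != b[i].reverse)
      return false;
  return true;
}
```

In this model, the {rev} load of rc at offset 4 and the vector store to the same location have different reverse flags, so they neither compare equal nor pass a "no reverse access" guard.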