Hi,

As PR107338 shows, with the use of widening loads the container_type can become a wider type, which causes us to compute a wrong shift_n since the BIT_FIELD_REF offset actually becomes bigger on BE.  Taking the case in PR107338 as an example: initially the container type is short, the BIT_FIELD_REF offset is 8 and the size is 4.  After unpacking to the wider type int, the high 16 bits are zero, so when viewing the container as type int the offset actually becomes 24.  The shift_n should therefore be 4 (32 - 24 - 4) instead of 20 (32 - 8 - 4).
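To make the arithmetic concrete, here is a minimal standalone sketch; the be_shift helper is made up purely for illustration and is not code from GCC:

/* Illustration only: on big endian, the right-shift amount for a
   bit-field at OFFSET with SIZE bits in a PREC-bit container is
   PREC - OFFSET - SIZE.  */
#include <cassert>

static unsigned
be_shift (unsigned prec, unsigned offset, unsigned size)
{
  return prec - offset - size;
}

int
main ()
{
  /* PR107338 case: short container (16 bits), offset 8, size 4.  */
  unsigned shift_orig = be_shift (16, 8, 4);       /* 4, correct.  */
  /* After zero-extending to int (32 bits), the field's BE offset
     becomes 16 + 8 = 24, so the shift stays the same.  */
  unsigned shift_widened = be_shift (32, 16 + 8, 4);   /* 4, correct.  */
  /* The bug: widened precision combined with the original offset.  */
  unsigned shift_buggy = be_shift (32, 8, 4);      /* 20, wrong.  */
  assert (shift_orig == 4 && shift_widened == 4);
  assert (shift_buggy == 20);
  return 0;
}

Computing the shift against the original 16-bit container gives the same result as computing it against the widened 32-bit view with the adjusted offset, which is why doing the calculation before the widening adjustments works.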
I noticed that if we move the shift_n calculation earlier, before the adjustments for widening loads (the container type change), it is based entirely on the original container, so the shift_n calculated there is exactly what we want and is independent of the widening.  Besides, I add a prec adjustment together with the current adjustments for widening loads; although prec's subsequent uses don't require this change for now, since the container type gets changed we should keep the corresponding prec consistent.

Bootstrapped and regtested on x86_64-redhat-linux, aarch64-linux-gnu, powerpc64-linux-gnu P7 and P8, and powerpc64le-linux-gnu P9 and P10.

Is it ok for trunk?

BR,
Kewen
-----
	PR tree-optimization/107338

gcc/ChangeLog:

	* tree-vect-patterns.cc (vect_recog_bitfield_ref_pattern): Move
	shift_n calculation before the adjustments for widening loads.
---
 gcc/tree-vect-patterns.cc | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 777ba2f5903..01094e8cb86 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -1925,6 +1925,16 @@ vect_recog_bitfield_ref_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
   tree container_type = TREE_TYPE (container);
   tree vectype = get_vectype_for_scalar_type (vinfo, container_type);
 
+  /* Calculate shift_n before the adjustments for widening loads, otherwise
+     the container may change and we have to consider offset change for
+     widening loads on big endianness.  The shift_n calculated here can be
+     independent of widening.  */
+  unsigned HOST_WIDE_INT shift_n = bit_field_offset (bf_ref).to_constant ();
+  unsigned HOST_WIDE_INT mask_width = bit_field_size (bf_ref).to_constant ();
+  unsigned HOST_WIDE_INT prec = tree_to_uhwi (TYPE_SIZE (container_type));
+  if (BYTES_BIG_ENDIAN)
+    shift_n = prec - shift_n - mask_width;
+
   /* We move the conversion earlier if the loaded type is smaller than the
      return type to enable the use of widening loads.  */
   if (TYPE_PRECISION (TREE_TYPE (container)) < TYPE_PRECISION (ret_type)
@@ -1935,6 +1945,7 @@ vect_recog_bitfield_ref_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
 				    NOP_EXPR, container);
       container = gimple_get_lhs (pattern_stmt);
       container_type = TREE_TYPE (container);
+      prec = tree_to_uhwi (TYPE_SIZE (container_type));
       vectype = get_vectype_for_scalar_type (vinfo, container_type);
       append_pattern_def_seq (vinfo, stmt_info, pattern_stmt, vectype);
     }
@@ -1953,12 +1964,6 @@ vect_recog_bitfield_ref_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
       shift_first = false;
     }
 
-  unsigned HOST_WIDE_INT shift_n = bit_field_offset (bf_ref).to_constant ();
-  unsigned HOST_WIDE_INT mask_width = bit_field_size (bf_ref).to_constant ();
-  unsigned HOST_WIDE_INT prec = tree_to_uhwi (TYPE_SIZE (container_type));
-  if (BYTES_BIG_ENDIAN)
-    shift_n = prec - shift_n - mask_width;
-
   /* If we don't have to shift we only generate the mask, so just fix the
      code-path to shift_first.  */
   if (shift_n == 0)
--
2.35.4