https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114966
--- Comment #6 from rguenther at suse dot de <rguenther at suse dot de> ---
On Wed, 17 Jul 2024, liuhongt at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114966
>
> --- Comment #5 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
> I saw pass_eras optimize BIT_FIELD_REF of big memory into load from small
> memory
>
> Created a replacement for D.161366 offset: 0, size: 64: SR.20D.170101
> Created a replacement for D.161366 offset: 64, size: 64: SR.21D.170102
> Created a replacement for D.161366 offset: 128, size: 64: SR.22D.170103
> Created a replacement for D.161547 offset: 0, size: 256: SR.23D.170104
>
> _8 = BIT_FIELD_REF <MEM[(const struct _SimdWrapper *)&D.159286].D.158970._M_data, 64, 0>;
> _9 = BIT_FIELD_REF <MEM[(const struct _SimdWrapper *)&D.159286].D.158970._M_data, 64, 64>;
> _10 = BIT_FIELD_REF <MEM[(const struct _SimdWrapper *)&D.159286].D.158970._M_data, 64, 128>;
> _11 = {0, _8, _9, _10};
>
> to
>
> SR.20_3 = MEM <const long unsigned int> [(struct simd *)&data];
> SR.21_13 = MEM <const long unsigned int> [(struct simd *)&data + 8B];
> SR.22_14 = MEM <const long unsigned int> [(struct simd *)&data + 16B];
> _7 = SR.20_3;
> _8 = SR.21_13;
> _9 = SR.22_14;
> _10 = {0, _7, _8, _9};
>
> So I guess for the later GCC somehow can't be sure the whole 256-bit memory is
> valid and fail to optimize it with vec_perm_expr?

I think the above would be a candidate for SLP vectorization of the
vector CTOR.  A specific example we don't handle right now of course.
Or alternatively by instruction combination in simplify_vector_constructor.
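
For illustration only, the two shapes under discussion can be written by hand with the generic vector extensions, roughly as below. This is a sketch and not the reproducer from this PR: the typedef `v4du` and the function names `shift_in_zero_ctor` and `shift_in_zero_perm` are made up here, and the permute variant simply assumes that all 32 bytes at `data` are readable, which is exactly the property the compiler may not be able to prove in the case above.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 256-bit vector of four 64-bit lanes (generic vector extension). */
typedef uint64_t v4du __attribute__((vector_size(32)));

/* CTOR form, mirroring the GIMPLE after SRA: three scalar 64-bit loads
   feeding a vector constructor {0, a, b, c}.  Reads only data[0..2].  */
void shift_in_zero_ctor(const uint64_t *data, uint64_t out[4])
{
    uint64_t a = data[0];
    uint64_t b = data[1];
    uint64_t c = data[2];
    v4du v = {0, a, b, c};
    memcpy(out, &v, sizeof v);
}

/* Permute form: one full 256-bit load followed by a two-input shuffle
   with a zero vector.  ASSUMPTION: data[3] is readable as well.  */
void shift_in_zero_perm(const uint64_t *data, uint64_t out[4])
{
    v4du v;
    const v4du zero = {0, 0, 0, 0};
    memcpy(&v, data, sizeof v);          /* loads data[0..3] */
#if defined(__clang__)
    /* Indices 0..3 select from v, 4..7 select from zero. */
    v4du r = __builtin_shufflevector(v, zero, 4, 0, 1, 2);
#else
    /* GCC: mask lanes 0..3 select from v, 4..7 select from zero. */
    const v4du mask = {4, 0, 1, 2};
    v4du r = __builtin_shuffle(v, zero, mask);
#endif
    memcpy(out, &r, sizeof r);
}
```

Both functions produce the same four lanes whenever the fourth element is readable; the difference is only in how much memory is touched, which is why turning the first form into the second needs either SLP on the CTOR or a proof that the wider load is safe.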