https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79745
            Bug ID: 79745
           Summary: vec_init<> expander misses V2TImode with AVX and V2OImode
                    and V2TImode with AVX512
           Product: gcc
           Version: 7.0.1
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
            Blocks: 65832
  Target Milestone: ---
            Target: x86_64-*-*, i?86-*-*

The AVX case pessimizes CPUv6 x264 when vectorized with 256bit vectors.  The
vectorizer tries to load the two 128bit halves and build a 256bit vector
(the halves are separated by a gap) via

              /* Avoid emitting a constructor of vector elements by performing
                 the loads using an integer type of the same size,
                 constructing a vector of those and then re-interpreting it
                 as the original vector type.  This works around the fact
                 that the vec_init optab was only designed for scalar
                 element modes and thus expansion goes through memory.
                 This avoids a huge runtime penalty due to the general
                 inability to perform store forwarding from smaller stores
                 to a larger load.  */
              unsigned lsize
                = group_size * TYPE_PRECISION (TREE_TYPE (vectype));
              enum machine_mode elmode = mode_for_size (lsize, MODE_INT, 0);
              enum machine_mode vmode = mode_for_vector (elmode,
                                                         nunits / group_size);
              /* If we can't construct such a vector fall back to
                 element loads of the original vector type.  */
              if (VECTOR_MODE_P (vmode)
                  && optab_handler (vec_init_optab, vmode) != CODE_FOR_nothing)
                {
                  nloads = nunits / group_size;
                  lnel = group_size;
                  ltype = build_nonstandard_integer_type (lsize, 1);
                  lvectype = build_vector_type (ltype, nloads);
                }

See also PR65832 which is broader.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65832
[Bug 65832] Inefficient vector construction
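
For illustration only, the construction described in this report (two 128bit halves separated by a gap, combined into one 256bit vector) can be modeled in plain C. This is a scalar sketch, not GCC internals and not the x264 source: the function names, the byte element type, and the stride parameter are all hypothetical. The first function corresponds to the fallback of building the vector element by element; the second corresponds to moving each half as a single 16-byte unit, which is what a V2TImode vec_init would allow.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sizes in bytes of one 128bit half and of the full 256bit vector. */
enum { HALF_BYTES = 16, VEC_BYTES = 32 };

/* Hypothetical model of the fallback path: the 256bit value is built
   from 32 separate byte-sized element loads and stores.  */
void
build_by_elements (unsigned char dst[VEC_BYTES], const unsigned char *src,
                   size_t stride)
{
  assert (dst != NULL && src != NULL);
  for (size_t half = 0; half < 2; half++)
    for (size_t i = 0; i < HALF_BYTES; i++)
      dst[half * HALF_BYTES + i] = src[half * stride + i];
}

/* Hypothetical model of the desired path: each 128bit half is moved as
   one 16-byte unit, so only two loads feed the 256bit value.  The second
   half starts STRIDE bytes after the first; STRIDE > 16 leaves a gap.  */
void
build_by_halves (unsigned char dst[VEC_BYTES], const unsigned char *src,
                 size_t stride)
{
  assert (dst != NULL && src != NULL);
  memcpy (dst, src, HALF_BYTES);
  memcpy (dst + HALF_BYTES, src + stride, HALF_BYTES);
}
```

Both functions produce the same 32 bytes; they differ only in how many memory operations are needed, which is the cost this report is about.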