Richard Guenther wrote:
> On Thu, 23 Feb 2012, Ulrich Weigand wrote:
> > The assert in question looks like:
> >
> >           if (nested_in_vect_loop
> >               && (TREE_INT_CST_LOW (STMT_VINFO_DR_STEP (stmt_info))
> >                   % GET_MODE_SIZE (TYPE_MODE (vectype)) != 0))
> >             {
> >               gcc_assert (alignment_support_scheme !=
> >                           dr_explicit_realign_optimized);
> >               compute_in_loop = true;
> >             }
> >
> > where your patch changed DR_STEP to STMT_VINFO_DR_STEP (reverting just this
> > one change makes the ICEs go away).
> >
> > However, at the place where the decision to use the
> > dr_explicit_realign_optimized strategy is made
> > (tree-vect-data-refs.c:vect_supportable_dr_alignment), we still have:
> >
> >           if ((nested_in_vect_loop
> >                && (TREE_INT_CST_LOW (DR_STEP (dr))
> >                    != GET_MODE_SIZE (TYPE_MODE (vectype))))
> >               || !loop_vinfo)
> >             return dr_explicit_realign;
> >           else
> >             return dr_explicit_realign_optimized;
> >
> > Should this now also use STMT_VINFO_DR_STEP?
>
> Yes, I think so.

Hmmm.  Reading the comment in vect_supportable_dr_alignment:

     However, in the case of outer-loop vectorization, when vectorizing a
     memory access in the inner-loop nested within the LOOP that is now being
     vectorized, while it is guaranteed that the misalignment of the
     vectorized memory access will remain the same in different outer-loop
     iterations, it is *not* guaranteed that is will remain the same throughout
     the execution of the inner-loop.  This is because the inner-loop advances
     with the original scalar step (and not in steps of VS).  If the inner-loop
     step happens to be a multiple of VS, then the misalignment remains fixed
     and we can use the optimized realignment scheme.

it would appear that in this case, checking the inner-loop step is deliberate.

Given the comment in vectorizable_load:

          /* If the misalignment remains the same throughout the execution of the
             loop, we can create the init_addr and permutation mask at the loop
             preheader.  Otherwise, it needs to be created inside the loop.
             This can only occur when vectorizing memory accesses in the inner-loop
             nested within an outer-loop that is being vectorized.  */

this looks to me that, since the check is intended to verify that
"misalignment remains the same throughout the execution of the loop",
we actually want to check the inner-loop step here as well, i.e.
revert this chunk of your patch ...

Bye,
Ulrich

--
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com
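
P.S.  To make the argument above concrete, here is a small standalone sketch
(not GCC code) of the arithmetic behind the two checks.  All names in it
(misalignment, misalignment_is_fixed, choose_scheme, needs_compute_in_loop,
and the enum) are hypothetical stand-ins; VS is modelled as a plain byte count
and the step as a plain unsigned integer, which is an assumption made only for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two realignment strategies discussed
   above; this is not the GCC enum.  */
enum realign_scheme { EXPLICIT_REALIGN, EXPLICIT_REALIGN_OPTIMIZED };

/* Misalignment, relative to a vector size of VS bytes, of the address
   reached after I inner-loop iterations of STEP bytes from BASE.  */
unsigned
misalignment (unsigned base, unsigned step, unsigned i, unsigned vs)
{
  return (base + i * step) % vs;
}

/* True if the misalignment is the same in each of the first N
   inner-loop iterations.  */
bool
misalignment_is_fixed (unsigned base, unsigned step, unsigned vs, unsigned n)
{
  for (unsigned i = 1; i < n; i++)
    if (misalignment (base, step, i, vs) != misalignment (base, step, 0, vs))
      return false;
  return true;
}

/* Models the decision quoted from vect_supportable_dr_alignment for the
   nested case (assuming loop_vinfo is set): the optimized scheme is
   chosen only when the step equals VS.  */
enum realign_scheme
choose_scheme (unsigned step, unsigned vs)
{
  return step != vs ? EXPLICIT_REALIGN : EXPLICIT_REALIGN_OPTIMIZED;
}

/* Models the condition quoted from vectorizable_load: the realignment
   data must be computed inside the loop when the step is not a
   multiple of VS.  */
bool
needs_compute_in_loop (unsigned step, unsigned vs)
{
  return step % vs != 0;
}
```

As long as both helpers are given the same step, "step % vs != 0" implies
"step != vs", so the optimized scheme can never be selected in a case where
needs_compute_in_loop is true.  Feed them two different steps, say 16 to
choose_scheme and 4 to needs_compute_in_loop with vs = 16, and that
implication no longer holds, which is the situation the gcc_assert guards
against.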