Resolving an issue of bootstrap failure from vectorization.
Hi My name is Cong Hou, and I am a Noogler working in the compiler optimization team at Google. When we were trying moving the vectorization from O3 to O2 in GCC 4.9, we met a bootstrap failure from comparison between stage 2&3. This failure is caused by a potential bug in GCC as stated below. In the file tree-vect-data-refs.c, there is a qsort() function call which sorts a group of data references using a comparison function called "dr_group_sort_cmp()". In this function, the iterative hash values of tree nodes are used for comparisons. For a declaration tree node, its UID participates in the calculation of the hash value. However, a specific declaration may have different UIDs whether the debug information is switched on/off. In consequence, the results of comparisons may vary in stage 2&3 during bootstrapping. As a solution, I think we need a function to compare two tree nodes that does not rely on UIDs. An apparent idea is comparing the tree code first, and if they have the same tree code, then compare their subnodes recursively. To resolve the issue we have now, I just composed a very basic comparison function which only compares tree codes, and this patch can make the bootstrap get passed. I have attached the patch here. I am wondering if this is a valid solution and appreciate if someone could give me any feedback. Thank you! Cong patch.diff Description: Binary data
[RFC] GCC vectorizer misses an opportunity to hoist loop invariant load after loop versioning.
First, look as the code below. void foo(int* a, int* b, int n) { int i; for (i = 0; i < n; ++i) a[i] = *b; } This loop contains possible aliasing between a[i] and *b, and in order to vectorize this loop, GCC produces two versions of the loop, and only vectorizes the one in which there is no aliasing between a[i] and *b. In this version we can assert *b is a loop variant and thereby can hoist the load and shuffle operations outside of the loop. But the current implementation does not do this. If we replace *b by a stack variable then during the vectorization pass the load and shuffle are already hoisted. So I think we can just do it during the vectorization pass without passing additional information to other passes (e.g. pass_lim). Is it safe for us to assume that there is no aliasing between a variable accessed via an address which is a loop invariant and any other variable modified in the loop (the version to be vectorized)? thanks, Cong
Re: [RFC] GCC vectorizer misses an opportunity to hoist loop invariant load after loop versioning.
I have submitted a bugreport through GCC Bugzilla. The link is http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58508. thanks, Cong On Fri, Sep 20, 2013 at 12:57 AM, Richard Biener wrote: > On Wed, Sep 18, 2013 at 10:59 PM, Xinliang David Li > wrote: >> On Wed, Sep 18, 2013 at 1:23 PM, Cong Hou wrote: >>> First, look as the code below. >>> >>> >>> void foo(int* a, int* b, int n) { >>> int i; >>> for (i = 0; i < n; ++i) >>> a[i] = *b; >>> } >>> >>> >>> This loop contains possible aliasing between a[i] and *b, and in order >>> to vectorize this loop, GCC produces two versions of the loop, and >>> only vectorizes the one in which there is no aliasing between a[i] and >>> *b. In this version we can assert *b is a loop variant and thereby can >>> hoist the load and shuffle operations outside of the loop. But the >>> current implementation does not do this. >>> >>> If we replace *b by a stack variable then during the vectorization >>> pass the load and shuffle are already hoisted. So I think we can just >>> do it during the vectorization pass without passing additional >>> information to other passes (e.g. pass_lim). Is it safe for us to >>> assume that there is no aliasing between a variable accessed via an >>> address which is a loop invariant and any other variable modified in >>> the loop (the version to be vectorized)? >>> >> >> For 0-stride read accesses, it should be safe to do. If vectorizer >> does not already do this hoisting for non-aliased scalars, it might be >> tricky to pass the info to the aliaser or the following LIM pass. > > The vectorizer already considers *b not aliasing a[i] in the vectorized > body (I've added this check) - do you say I forgot to handle it in > versioning for alias? Yes, I didn't bother implementing the hoisting, > it seemed less important than getting the vectorization itself done ;) > (happens frequently with fortran and scalars passed by reference). > > Please file missed-optimization bugreports. > > Richard. > >> David >> >>> >>> thanks, >>> Cong