[Bug tree-optimization/32309] New: Unnecessary conversion from short to unsigned short breaks vectorization
void Sub(short * __restrict src1row, short * __restrict src2row, int num_in_row)
{
  for(int i=num_in_row; i--;) {
    *src1row -= *src2row;
    ++src1row;
    ++src2row;
  }
}

In the test case above, GCC inserts several explicit conversions soon after
the gimple transformation stage and gets:

  D.2097 = *src1row;
  D.2098 = (short unsigned int) D.2097;
  D.2099 = *src2row;
  D.2100 = (short unsigned int) D.2099;
  D.2101 = D.2098 - D.2100;
  D.2102 = (short int) D.2101;

These conversions break vectorization, and GCC reports:

/* i686-unknown-linux-gnu-gcc -O3 -ftree-vectorize -ftree-vectorizer-verbose=5
   -march=nocona -fno-strict-aliasing -c test.cc */
..
test.cc:2: note: not vectorized: relevant stmt not supported: D.2430_11 = (short unsigned int) D.2429_10
test.cc:1: note: vectorized 0 loops in function.

--
           Summary: Unnecessary conversion from short to unsigned short
                    breaks vectorization
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: gangren at google dot com
 GCC build triplet: i686-unknown-linux-gnu-gcc
  GCC host triplet: i686-unknown-linux-gnu-gcc
GCC target triplet: i686-unknown-linux-gnu-gcc

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32309
[Bug tree-optimization/32309] Unnecessary conversion from short to unsigned short breaks vectorization
--- Comment #2 from gangren at google dot com 2007-06-12 17:28 ---
(In reply to comment #1)
> The conversions are not Unnecessary, they are necessary because
> short_var+short_var when that would overflow the range of short is still
> defined.

Do you mean that short_var + short_var is defined as
(short)((unsigned short)short_var + (unsigned short)short_var)?

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32309
[Bug tree-optimization/32309] Unnecessary conversion from short to unsigned short breaks vectorization
--- Comment #5 from gangren at google dot com 2007-06-12 17:53 ---
(In reply to comment #3)
> > Do you mean that short_var + short_var is defined as
> > (short)((unsigned short)short_var + (unsigned short)short_var)?
>
> Kinda, because it is really defined by the C standard as:
> (short)((int)short_var + (int)short_var)
> And then GCC's middle-end optimizes it to:
> (short)((unsigned short)short_var + (unsigned short)short_var)
>
> *** This bug has been marked as a duplicate of 26128 ***

I'm aware of integral promotion, but I do not quite understand why we can
optimize
  (short)((int)short_var + (int)short_var)
to
  (short)((unsigned short)short_var + (unsigned short)short_var)
but not to
  (short)((short)short_var + (short)short_var)
Is it because unsigned short has different overflow handling?

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32309
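The two forms discussed in this comment can be compared directly in source
code. The following is only a sketch, not GCC's internal representation: the
function names are hypothetical, it assumes a 16-bit short and a wider int,
and it relies on out-of-range conversion to short wrapping modulo 2^16 (what
GCC documents; implementation-defined in the standards before C++20). Note
that in source code the sum of two unsigned short values is itself promoted
to int, so the sketch narrows it back explicitly to imitate a 16-bit add.

```cpp
// Form given by the language rules: promote both operands to int, add,
// then narrow the result back to short.
constexpr short add_promoted(short a, short b) {
  return static_cast<short>(static_cast<int>(a) + static_cast<int>(b));
}

// Form attributed to the middle-end: a wrapping 16-bit unsigned add.
// The inner sum is computed in int, so it is narrowed to unsigned short
// first (well-defined, modulo 2^16) and only then converted to short.
constexpr short add_wrapping_u16(short a, short b) {
  return static_cast<short>(static_cast<unsigned short>(
      static_cast<unsigned short>(a) + static_cast<unsigned short>(b)));
}
```

Under those assumptions both functions return the same value for inputs
whose mathematical sum does not fit in short, for example 32767 + 1.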
[Bug tree-optimization/32309] Unnecessary conversion from short to unsigned short breaks vectorization
--- Comment #7 from gangren at google dot com 2007-06-12 18:10 ---
(In reply to comment #6)
> Subject: Re: Unnecessary conversion from short to unsigend short breaks
> vectorization
>
> On 12 Jun 2007 17:53:19 -0000, gangren at google dot com
> <[EMAIL PROTECTED]> wrote:
>
> > I'm aware of integral promotion. But not quite understand why we can
> > optimize (short)((int)short_var + (int)short_var) to
> > (short)((unsigned short)short_var + (unsigned short)short_var), but not
> > to (short)((short)short_var + (short)short_var)? Is it because unsigned
> > short has different overflow handling?
>
> Yes, signed short has undefined overflow, while unsigned is defined as
> wrapping.
>
> --Pinski

Thanks. So even if the underlying architecture does not trap on signed short
overflow (like AltiVec, if I remember correctly), we still need to have such
conversions? In addition, does "undefined overflow" include "no overflow"?

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32309
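The distinction drawn in the quoted reply can be illustrated at the source
level with int and unsigned int, since short operands never overflow in
source code (they are promoted first). This is a sketch with hypothetical
helper names. Unsigned addition is defined to wrap, so it can simply be
performed. Signed overflow is undefined, meaning the language places no
requirement on the result and a program cannot rely on any particular
outcome, so the sketch tests for it beforehand rather than executing it.

```cpp
#include <climits>

// Unsigned arithmetic is defined to wrap modulo 2^N, so this is always valid.
constexpr unsigned int wrapping_add(unsigned int a, unsigned int b) {
  return a + b;
}

// Signed overflow is undefined behaviour, so a + b must not be evaluated
// when it would overflow. This check uses only in-range arithmetic.
constexpr bool signed_add_would_overflow(int a, int b) {
  return (b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b);
}
```

For example, wrapping_add(UINT_MAX, 1u) yields 0, while
signed_add_would_overflow(INT_MAX, 1) reports that the signed addition must
not be performed.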
[Bug tree-optimization/32309] Unnecessary conversion from short to unsigned short breaks vectorization
--- Comment #9 from gangren at google dot com 2007-06-12 18:58 ---
(In reply to comment #8)
> if later compilation passes could prove that the computation
> overflowed in short, then the result would be different than if the
> computation were done in int.

The result could be different. But in some cases, such as this example, the
result (variable) would be the same. In general, might integral promotion be
unnecessary when both the destination and the sources are short integers?

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32309
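For the loop in this report, the claim that the stored result is the same can
be checked with a small experiment. The sketch below keeps Sub from the
report and adds two hypothetical helpers, SubViaUnsigned and
CheckVariantsAgree, which are not part of the original test case. It assumes
a 16-bit short and modulo-2^16 conversion to short, as GCC provides. It says
nothing about whether either loop is vectorized; it only compares the values
stored.

```cpp
#include <cassert>

// The loop from the report, unchanged.
void Sub(short * __restrict src1row, short * __restrict src2row, int num_in_row)
{
  for (int i = num_in_row; i--;) {
    *src1row -= *src2row;
    ++src1row;
    ++src2row;
  }
}

// Hypothetical variant that spells out the conversions from the GIMPLE dump.
void SubViaUnsigned(short * __restrict src1row, short * __restrict src2row,
                    int num_in_row)
{
  for (int i = num_in_row; i--;) {
    unsigned short a = static_cast<unsigned short>(*src1row);
    unsigned short b = static_cast<unsigned short>(*src2row);
    // a - b is computed in int; narrowing to unsigned short wraps mod 2^16.
    unsigned short d = static_cast<unsigned short>(a - b);
    *src1row = static_cast<short>(d);
    ++src1row;
    ++src2row;
  }
}

// Runs both loops on the same boundary-heavy inputs and compares the results.
void CheckVariantsAgree()
{
  short lhs1[] = {0, 1, -1, 32767, -32768, 32767, -32768, 12345};
  short lhs2[] = {0, 1, -1, 32767, -32768, 32767, -32768, 12345};
  short rhs[]  = {0, -1, 1, -1, 1, -32768, 32767, -12345};
  const int n = 8;

  Sub(lhs1, rhs, n);
  SubViaUnsigned(lhs2, rhs, n);

  for (int i = 0; i < n; ++i)
    assert(lhs1[i] == lhs2[i]);
}
```

Calling CheckVariantsAgree() from main exercises both loops; an assertion
failure would indicate an input on which the two forms store different
values.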