http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58493
Bug ID: 58493
Summary: loop is not correctly optimized with O3 and AVX
Product: gcc
Version: 4.8.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: vgrebinski at gmail dot com
Host: Debian 3.9.6-1 x86_64 GNU/Linux; Intel(R) Xeon(R) CPU E5-2687W
Target: amd64/sandy bridge

The following example (simplified from an issue in a proprietary numerical quant library) shows that the loop optimizer miscompiles the code below when AVX is enabled, with both gcc-4.8.1 and gcc-4.7.3: the first eight weights[2*i+1] values are wrong (0 instead of 5).

This happens only when AVX instructions are available (i.e. "-O3 -mavx"). It does not happen with "-O2" or with SSE2. The interleaving of the points[] and weights[] assignments is important (there is no bug otherwise). Obviously, the example has to be run on a system that supports AVX.

///////////////////////// start
#include <iostream>
#include <vector>
using namespace std;

void omb(size_t n, vector<double>& points, vector<double>& weights)
{
    points.resize(n);
    weights.resize(n);
    for(int i=0;i<n/2;i++)
    {
        points[2*i] = .7;
        weights[2*i]= 5.;
        points[2*i+1] = -.7;
        weights[2*i+1]=weights[2*i]; // mis-compiled
    }
}

int main()
{
    vector<double> p,w;
    omb(18, p, w);
    for(size_t i=0; i!= p.size(); ++i)
        cout << "i= " << i << " p= " << p[i] << " w= " << w[i] << endl;
}
/////////////////////////////// end

With gcc-4.7.3:

> g++-4.7 --version && g++-4.7 -O3 -mavx bug.cc && ./a.out
g++-4.7 (Debian 4.7.3-6) 4.7.3
Copyright (C) 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
i= 0 p= 0.7 w= 5
i= 1 p= -0.7 w= 0 ## BUG, should be 5
i= 2 p= 0.7 w= 5
i= 3 p= -0.7 w= 0 ## BUG
i= 4 p= 0.7 w= 5
i= 5 p= -0.7 w= 0 ## BUG
i= 6 p= 0.7 w= 5
i= 7 p= -0.7 w= 0 ## BUG
i= 8 p= 0.7 w= 5
i= 9 p= -0.7 w= 0 ## BUG
i= 10 p= 0.7 w= 5
i= 11 p= -0.7 w= 0 ## BUG
i= 12 p= 0.7 w= 5
i= 13 p= -0.7 w= 0 ## BUG
i= 14 p= 0.7 w= 5
i= 15 p= -0.7 w= 0 ## BUG
i= 16 p= 0.7 w= 5
i= 17 p= -0.7 w= 5 ## right value

With gcc-4.8:

> g++-4.8 --version && g++-4.8 -O3 -mavx bug.cc && ./a.out
g++-4.8 (Debian 4.8.1-10) 4.8.1
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
i= 0 p= 0.7 w= 5
i= 1 p= -0.7 w= 0 # BUG, as well as the next 7 odd-numbered entries
i= 2 p= 0.7 w= 5
i= 3 p= -0.7 w= 0
i= 4 p= 0.7 w= 5
i= 5 p= -0.7 w= 0
i= 6 p= 0.7 w= 5
i= 7 p= -0.7 w= 0
i= 8 p= 0.7 w= 5
i= 9 p= -0.7 w= 0
i= 10 p= 0.7 w= 5
i= 11 p= -0.7 w= 0
i= 12 p= 0.7 w= 5
i= 13 p= -0.7 w= 0
i= 14 p= 0.7 w= 5
i= 15 p= -0.7 w= 0
i= 16 p= 0.7 w= 5
i= 17 p= -0.7 w= 5

Same, but SSE2, not AVX: initialized correctly.

> g++-4.8 --version && g++-4.8 -O3 -msse2 bug.cc && ./a.out
g++-4.8 (Debian 4.8.1-10) 4.8.1
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
i= 0 p= 0.7 w= 5
i= 1 p= -0.7 w= 5 # OK
i= 2 p= 0.7 w= 5
i= 3 p= -0.7 w= 5
i= 4 p= 0.7 w= 5
i= 5 p= -0.7 w= 5
i= 6 p= 0.7 w= 5
i= 7 p= -0.7 w= 5
i= 8 p= 0.7 w= 5
i= 9 p= -0.7 w= 5
i= 10 p= 0.7 w= 5
i= 11 p= -0.7 w= 5
i= 12 p= 0.7 w= 5
i= 13 p= -0.7 w= 5
i= 14 p= 0.7 w= 5
i= 15 p= -0.7 w= 5
i= 16 p= 0.7 w= 5
i= 17 p= -0.7 w= 5

Now gcc-4.6 & AVX: no bug

> g++-4.6 --version && g++-4.6 -O3 -mavx bug.cc && ./a.out
g++-4.6 (Debian 4.6.4-4) 4.6.4
Copyright (C) 2011 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
i= 0 p= 0.7 w= 5
i= 1 p= -0.7 w= 5 # OK
i= 2 p= 0.7 w= 5
i= 3 p= -0.7 w= 5
i= 4 p= 0.7 w= 5
i= 5 p= -0.7 w= 5
i= 6 p= 0.7 w= 5
i= 7 p= -0.7 w= 5
i= 8 p= 0.7 w= 5
i= 9 p= -0.7 w= 5
i= 10 p= 0.7 w= 5
i= 11 p= -0.7 w= 5
i= 12 p= 0.7 w= 5
i= 13 p= -0.7 w= 5
i= 14 p= 0.7 w= 5
i= 15 p= -0.7 w= 5
i= 16 p= 0.7 w= 5
i= 17 p= -0.7 w= 5

Regards,
Vladimir