The following short testcase is vectorized by 4.1.1 but not by 4.2.0 revision 114610:
============================================================
template <class T>
class vec
{
public:
  vec(unsigned int n) : size_(n)
  {
    data_ = new T[n];
  }

  vec& multiply(const vec& other)
  {
    const T* op=other.data_;
    for (unsigned int i=0; i<size_; ++i)
    {
      data_[i] *= op[i];
    }
    return *this;
  }

private:
  unsigned int size_;
  T* data_;
};

template class vec<float>;
============================================================

/usr/local/4.2/bin/g++4.2.0 -O3 -ftree-vectorize -ftree-vectorizer-verbose=7 -march=pentium-m -c vectorizer.cpp

vectorizer.cpp:16: note: ===== analyze_loop_nest =====
vectorizer.cpp:16: note: === vect_analyze_loop_form ===
vectorizer.cpp:16: note: split exit edge.
vectorizer.cpp:16: note: === get_loop_niters ===
vectorizer.cpp:16: note: ==> get_loop_niters:D.2376_16
vectorizer.cpp:16: note: Symbolic number of iterations is D.2376_16
vectorizer.cpp:16: note: === vect_analyze_data_refs ===
vectorizer.cpp:16: note: get vectype with 4 units of type float
vectorizer.cpp:16: note: vectype: vector float
vectorizer.cpp:16: note: get vectype with 4 units of type const float
vectorizer.cpp:16: note: vectype: const vector float
vectorizer.cpp:16: note: get vectype with 4 units of type float
vectorizer.cpp:16: note: vectype: vector float
vectorizer.cpp:16: note: === vect_analyze_scalar_cycles ===
vectorizer.cpp:16: note: Analyze phi: SMT.6_28 = PHI <SMT.6_27(5), SMT.6_26(3)>;
vectorizer.cpp:16: note: virtual phi. skip.
vectorizer.cpp:16: note: Analyze phi: i_4 = PHI <i_23(5), 0(3)>;
vectorizer.cpp:16: note: Access function of PHI: {0, +, 1}_1
vectorizer.cpp:16: note: step: 1, init: 0
vectorizer.cpp:16: note: Detected induction.
vectorizer.cpp:16: note: === vect_pattern_recog ===
vectorizer.cpp:16: note: === vect_mark_stmts_to_be_vectorized ===
vectorizer.cpp:16: note: init: phi relevant? SMT.6_28 = PHI <SMT.6_27(5), SMT.6_26(3)>;
vectorizer.cpp:16: note: init: phi relevant? i_4 = PHI <i_23(5), 0(3)>;
vectorizer.cpp:16: note: init: stmt relevant? <L0>:
vectorizer.cpp:16: note: init: stmt relevant? D.2378_9 = pretmp.24_1
vectorizer.cpp:16: note: init: stmt relevant? D.2379_10 = i_4 * 4
vectorizer.cpp:16: note: init: stmt relevant? D.2380_11 = (float *) D.2379_10
vectorizer.cpp:16: note: init: stmt relevant? D.2381_12 = pretmp.24_1 + D.2380_11
vectorizer.cpp:16: note: init: stmt relevant? D.2382_17 = *D.2381_12
vectorizer.cpp:16: note: init: stmt relevant? D.2383_19 = (const float *) D.2379_10
vectorizer.cpp:16: note: init: stmt relevant? D.2384_20 = D.2383_19 + op_3
vectorizer.cpp:16: note: init: stmt relevant? D.2385_21 = *D.2384_20
vectorizer.cpp:16: note: init: stmt relevant? D.2386_22 = D.2382_17 * D.2385_21
vectorizer.cpp:16: note: init: stmt relevant? *D.2381_12 = D.2386_22
vectorizer.cpp:16: note: vec_stmt_relevant_p: stmt has vdefs.
vectorizer.cpp:16: note: mark relevant 1, live 0.
vectorizer.cpp:16: note: init: stmt relevant? i_23 = i_4 + 1
vectorizer.cpp:16: note: init: stmt relevant? if (D.2376_16 > i_23) goto <L9>; else goto <L12>;
vectorizer.cpp:16: note: init: stmt relevant? <L9>:
vectorizer.cpp:16: note: worklist: examine stmt: *D.2381_12 = D.2386_22
vectorizer.cpp:16: note: vect_is_simple_use: operand D.2386_22
vectorizer.cpp:16: note: def_stmt: D.2386_22 = D.2382_17 * D.2385_21
vectorizer.cpp:16: note: type of def: 2.
vectorizer.cpp:16: note: worklist: examine use 2: D.2386_22
vectorizer.cpp:16: note: mark relevant 1, live 0.
vectorizer.cpp:16: note: worklist: examine stmt: D.2386_22 = D.2382_17 * D.2385_21
vectorizer.cpp:16: note: vect_is_simple_use: operand D.2382_17
vectorizer.cpp:16: note: def_stmt: D.2382_17 = *D.2381_12
vectorizer.cpp:16: note: type of def: 2.
vectorizer.cpp:16: note: worklist: examine use 2: D.2382_17
vectorizer.cpp:16: note: mark relevant 1, live 0.
vectorizer.cpp:16: note: vect_is_simple_use: operand D.2385_21
vectorizer.cpp:16: note: def_stmt: D.2385_21 = *D.2384_20
vectorizer.cpp:16: note: type of def: 2.
vectorizer.cpp:16: note: worklist: examine use 2: D.2385_21
vectorizer.cpp:16: note: mark relevant 1, live 0.
vectorizer.cpp:16: note: worklist: examine stmt: D.2385_21 = *D.2384_20
vectorizer.cpp:16: note: worklist: examine stmt: D.2382_17 = *D.2381_12
vectorizer.cpp:16: note: === vect_analyze_data_refs_alignment ===
vectorizer.cpp:16: note: vect_compute_data_ref_alignment:
vectorizer.cpp:16: note: Unknown alignment for access: *pretmp.24_1
vectorizer.cpp:16: note: vect_compute_data_ref_alignment:
vectorizer.cpp:16: note: Unknown alignment for access: *op_3
vectorizer.cpp:16: note: vect_compute_data_ref_alignment:
vectorizer.cpp:16: note: Unknown alignment for access: *pretmp.24_1
vectorizer.cpp:16: note: === vect_determine_vectorization_factor ===
vectorizer.cpp:16: note: ==> examining statement: <L0>:
vectorizer.cpp:16: note: skip.
vectorizer.cpp:16: note: ==> examining statement: D.2378_9 = pretmp.24_1
vectorizer.cpp:16: note: skip.
vectorizer.cpp:16: note: ==> examining statement: D.2379_10 = i_4 * 4
vectorizer.cpp:16: note: skip.
vectorizer.cpp:16: note: ==> examining statement: D.2380_11 = (float *) D.2379_10
vectorizer.cpp:16: note: skip.
vectorizer.cpp:16: note: ==> examining statement: D.2381_12 = pretmp.24_1 + D.2380_11
vectorizer.cpp:16: note: skip.
vectorizer.cpp:16: note: ==> examining statement: D.2382_17 = *D.2381_12
vectorizer.cpp:16: note: vectype: vector float
vectorizer.cpp:16: note: nunits = 4
vectorizer.cpp:16: note: ==> examining statement: D.2383_19 = (const float *) D.2379_10
vectorizer.cpp:16: note: skip.
vectorizer.cpp:16: note: ==> examining statement: D.2384_20 = D.2383_19 + op_3
vectorizer.cpp:16: note: skip.
vectorizer.cpp:16: note: ==> examining statement: D.2385_21 = *D.2384_20
vectorizer.cpp:16: note: vectype: const vector float
vectorizer.cpp:16: note: nunits = 4
vectorizer.cpp:16: note: ==> examining statement: D.2386_22 = D.2382_17 * D.2385_21
vectorizer.cpp:16: note: get vectype for scalar type: float
vectorizer.cpp:16: note: get vectype with 4 units of type float
vectorizer.cpp:16: note: vectype: vector float
vectorizer.cpp:16: note: vectype: vector float
vectorizer.cpp:16: note: nunits = 4
vectorizer.cpp:16: note: ==> examining statement: *D.2381_12 = D.2386_22
vectorizer.cpp:16: note: vectype: vector float
vectorizer.cpp:16: note: nunits = 4
vectorizer.cpp:16: note: ==> examining statement: i_23 = i_4 + 1
vectorizer.cpp:16: note: skip.
vectorizer.cpp:16: note: ==> examining statement: if (D.2376_16 > i_23) goto <L9>; else goto <L12>;
vectorizer.cpp:16: note: skip.
vectorizer.cpp:16: note: ==> examining statement: <L9>:
vectorizer.cpp:16: note: skip.
vectorizer.cpp:16: note: === vect_analyze_dependences ===
vectorizer.cpp:16: note: dependence distance = 0.
vectorizer.cpp:16: note: accesses have the same alignment.
vectorizer.cpp:16: note: dependence distance modulo vf == 0 between *D.2381_12 and *D.2381_12
vectorizer.cpp:16: note: not vectorized: can't determine dependence between *D.2384_20 and *D.2381_12
vectorizer.cpp:16: note: bad data dependence.
vectorizer.cpp:16: note: vectorized 0 loops in function.

The workaround with "op" is not needed with the current autovect-branch BTW.


-- 
           Summary: [4.2 regression] missed optimization with -ftree-vectorize
           Product: gcc
           Version: tree-ssa
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: gcc at pdoerfler dot com
 GCC build triplet: i686-pc-linux-gnu
  GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28029
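
The failing note, "can't determine dependence between *D.2384_20 and *D.2381_12", concerns the load through op and the store through data_. One way to probe whether possible aliasing between the two arrays is what stops the vectorizer might be a reduced free-function form of the loop, once as-is and once with GCC's __restrict__ extension. This is only a sketch: the names multiply_plain, multiply_restrict and self_check are hypothetical, and it has not been checked against revision 114610. Note that __restrict__ promises the two arrays do not overlap, so this variant does not cover a call like v.multiply(v).

```cpp
#include <cassert>

// Hypothetical reduced form of vec<float>::multiply from the testcase:
// the same loop, with the two data pointers passed in directly.
void multiply_plain(float* dst, const float* src, unsigned int n)
{
    for (unsigned int i = 0; i < n; ++i) {
        dst[i] *= src[i];
    }
}

// Same loop with __restrict__ (a GCC extension). The qualifier asserts
// that dst and src do not overlap, so dst == src must not be passed here.
void multiply_restrict(float* __restrict__ dst,
                       const float* __restrict__ src,
                       unsigned int n)
{
    for (unsigned int i = 0; i < n; ++i) {
        dst[i] *= src[i];
    }
}

// Runs both variants on small, non-overlapping inputs and asserts that
// they produce identical results.
void self_check()
{
    float a[8], b[8], src[8];
    for (unsigned int i = 0; i < 8; ++i) {
        a[i] = b[i] = static_cast<float>(i + 1);
        src[i] = 0.5f;
    }
    multiply_plain(a, src, 8);
    multiply_restrict(b, src, 8);
    for (unsigned int i = 0; i < 8; ++i) {
        assert(a[i] == b[i]);
    }
}
```

Compiling such a file with the flags from the report (-O3 -ftree-vectorize -ftree-vectorizer-verbose=7 -march=pentium-m -c) and comparing the notes for the two loops would show whether the dependence test behaves differently once the pointers are declared non-overlapping.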