https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108487
Bug ID: 108487
Summary: ~20-30x slowdown in populating std::vector from std::ranges::iota_view
Product: gcc
Version: 12.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: Mark_B53 at yahoo dot com
Target Milestone: ---

Using -std=c++20 -O3, comparing gcc 12.2 vs. gcc 10.3:
* fn2 is 20-30x slower on gcc 12.2 (i.e. 2000-3000% more)
* fn1 is ~20% slower on gcc 12.2

This test was run on a 52-core Intel Xeon Gold 6278C CPU. Tests on www.godbolt.org directionally align with these findings. It seems the slowdown was introduced in 10.4 & 11.1. Trunk has identical performance to 12.2.

#include <vector>
#include <ranges>
#include <ctime>
#include <iostream>

__attribute__((noinline))
std::vector<int> fn1(int n)
{
    auto v = std::vector<int>(n);
    for(int i = 0; i != n; ++i)
        v[i] = i;
    return v;
}

__attribute__((noinline))
std::vector<int> fn2(int n)
{
    auto rng = std::ranges::iota_view{0, n};
    return std::vector<int>{rng.begin(), rng.end()};
}

int main()
{
    int n = 100'000;
    int times = 100'000;

    auto t0 = std::clock();
    for (int i = 0; i < times; ++i)
        fn1(n);
    auto t1 = std::clock();
    for (int i = 0; i < times; ++i)
        fn2(n);
    auto t2 = std::clock();

    std::cout << t1 - t0 << '\n';
    std::cout << t2 - t1 << '\n';
    return 0;
}

P.S. A 20% slowdown for a common vector population pattern is still significant IMO, but I am not sure it qualifies as a bug, so I have not filed a separate report for the 'fn1' slowdown.