https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44794
Richard Biener changed:
What|Removed |Added
Assignee|rguenth at gcc dot gnu.org |unassigned at gcc dot gnu.org
--- Comment #7 from Richard Biener ---
A heuristic to avoid unrolling loops that contain prefetches is still missing.
The aprefetch pass could set ->unroll to 1 in the loop structure.
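A rough way to picture the suggestion above is the toy model below. It is not GCC code: the `Loop` class, `mark_prefetch_loops`, `chosen_unroll_factor`, and the default factor of 8 are all hypothetical names and values made up for illustration. The only detail taken from the comment is that a prefetch pass would set the loop's `unroll` field to 1; the reading that 1 means "do not unroll" and 0 means "no explicit request" is an assumption stated in the code.

```python
from dataclasses import dataclass

# Assumed conventions for the per-loop hint (illustrative only):
NO_HINT = 0        # no explicit request; fall back to the default heuristic
DO_NOT_UNROLL = 1  # an unroll factor of 1 leaves the loop as it is


@dataclass
class Loop:
    """Hypothetical stand-in for a compiler's loop descriptor."""
    name: str
    has_prefetch: bool = False
    unroll: int = NO_HINT


def mark_prefetch_loops(loops):
    """Model of a prefetch pass pinning the unroll hint to 1
    for every loop it inserted prefetches into."""
    for loop in loops:
        if loop.has_prefetch:
            loop.unroll = DO_NOT_UNROLL


def chosen_unroll_factor(loop, default_factor=8):
    """Model of a later unroller: honour an explicit hint,
    otherwise use the (hypothetical) default factor."""
    if loop.unroll == NO_HINT:
        return default_factor
    return loop.unroll
```

In this model, a loop marked by `mark_prefetch_loops` gets a factor of 1 from `chosen_unroll_factor`, while unmarked loops still get the default.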
--- Comment #6 from CVS Commits ---
The master branch has been updated by Richard Biener:
https://gcc.gnu.org/g:a243ce2a52a6c62bc0d6be0b756a85dd9c1bceb7
commit r14-71-ga243ce2a52a6c62bc0d6be0b756a85dd9c1bceb7
Author: Richard Biener
Date: Th
Richard Biener changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
--- Comment #4 from changpeng dot fang at amd dot com 2010-07-15 01:50 ---
Created an attachment (id=21205)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21205&action=view)
Do not unroll pre and post loops
I did a quick test on polyhedron before and after applying the preliminary
--- Comment #3 from changpeng dot fang at amd dot com 2010-07-06 18:35 ---
Here is the impact of loop unrolling on the compilation time and code size
on polyhedron test_fpu.f90:
-O3 -ftree-vectorize -fno-prefetch-loop-arrays -fno-unroll-loops:
timing: 12.62s, size: 67069 bytes
-O3 -f
--- Comment #2 from changpeng dot fang at amd dot com 2010-07-06 17:58 ---
We also need to handle the post loop of unrolling. Suppose the unroll_factor
is 16, then the post-loop should have up to 15 iterations.
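The arithmetic behind that bound can be checked with a small sketch. It is only an illustration of how an unrolled loop splits its iterations, not how GCC implements unrolling; `split_iterations` and `sum_unrolled` are hypothetical helper names. With an unroll factor of U, the unrolled body covers the largest multiple of U, and the post-loop picks up the remainder, which is at most U - 1 iterations.

```python
def split_iterations(n_iters, unroll_factor):
    """Return (unrolled_body_runs, post_loop_iters) for n_iters iterations."""
    if unroll_factor < 1:
        raise ValueError("unroll_factor must be at least 1")
    # The unrolled body runs floor(n / U) times; the rest goes to the post-loop.
    return divmod(n_iters, unroll_factor)


def sum_unrolled(values, unroll_factor):
    """Sum a list the way an unrolled loop plus post-loop would walk it."""
    main_runs, post_iters = split_iterations(len(values), unroll_factor)
    total = 0
    i = 0
    for _ in range(main_runs):
        # Stands in for unroll_factor copies of the original loop body.
        for k in range(unroll_factor):
            total += values[i + k]
        i += unroll_factor
    # Post-loop: the leftover iterations, fewer than unroll_factor of them.
    for _ in range(post_iters):
        total += values[i]
        i += 1
    return total
```

For example, 100 iterations with a factor of 16 give 6 runs of the unrolled body and 4 post-loop iterations, and no trip count leaves more than 15 for the post-loop.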
--
--- Comment #1 from rguenth at gcc dot gnu dot org 2010-07-03 10:48 ---
It would be interesting to know why/if number-of-iteration analysis fails
and if the code the vectorizer emits can be adjusted to fix that.
--
rguenth at gcc dot gnu dot org changed:
What|Removed