--- Comment #33 from eyal at geomage dot com 2008-02-13 16:06 ---
Hi All,
I've made some changes that should keep memory from being a performance
bottleneck, and I see a perf gain of ~10%. However, the compiler still
gives me the warnings from comment #19:
Test.cpp:24:
--- Comment #32 from eyal at geomage dot com 2008-02-12 11:28 ---
(In reply to comment #31)
> > I would appreciate, however, a further explanation of this issue.
> The explanation has to deal with CPU architecture and is not related to
> compilers. In case of cache mis
--- Comment #30 from eyal at geomage dot com 2008-02-12 08:43 ---
Hi,
Thanks a lot for the input about a potential memory bottleneck. I was indeed
under the impression that once I got the loop vectorized, I'd immediately see a
performance boost.
I would appreciate, howev
--- Comment #27 from eyal at geomage dot com 2008-02-11 14:00 ---
Hi,
I am a bit lost and would appreciate your guidance. Up till now, after all those
emails, I still have no clue as to why such a simple test case doesn't work. As
far as I understood, the vectorization should have shown
--- Comment #23 from eyal at geomage dot com 2008-02-10 15:47 ---
(In reply to comment #22)
> 1. It looks like the vectorizer was enabled in both cases, since -O3 enables
> the vectorizer by default. You need to add -fno-tree-vectorize to disable it
> explicitly.
> 2. T
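
A minimal sketch of how the comparison described in point 1 could be set up. The kernel, the function name `scale`, and the file name `kernel.cpp` are hypothetical and are not taken from the reporter's test case; only the flags `-O3` and `-fno-tree-vectorize` come from the quoted reply.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical kernel used only to compare the two builds.
// Build it twice and time both binaries:
//   g++ -O3 -c kernel.cpp                       (vectorizer on, the -O3 default)
//   g++ -O3 -fno-tree-vectorize -c kernel.cpp   (vectorizer explicitly off)
void scale(float *data, float k, std::size_t n)
{
    assert(data != 0 || n == 0);   // an empty range may pass a null pointer
    for (std::size_t i = 0; i < n; ++i)
        data[i] *= k;              // one load, one multiply, one store per element
}
```

If both builds run at the same speed, the loop is probably limited by something other than arithmetic throughput.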
--- Comment #21 from eyal at geomage dot com 2008-02-10 13:48 ---
(In reply to comment #14)
> Giving it another thought, this is not necessarily an alias analysis issue,
> even though it fails to tell that the pointers do not alias. Since in this
> case the pointers do differ, the run
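
The "versioning for alias checks" mentioned in the compiler notes can be pictured as a hand-written run-time test that picks between two copies of the loop. The sketch below is an illustration of that idea only, not the code GCC generates; `ranges_disjoint` and `scale_add` are hypothetical names, and `__restrict__` is a GCC extension.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// True when [p, p+n) and [q, q+n) share no element. std::less_equal is
// used because it gives a total order even for unrelated pointers.
bool ranges_disjoint(const float *p, const float *q, std::size_t n)
{
    std::less_equal<const float *> le;
    return le(p + n, q) || le(q + n, p);
}

// dst[i] += k * src[i], with a run-time choice between two loop versions.
void scale_add(float *dst, const float *src, float k, std::size_t n)
{
    assert(dst != 0 && src != 0);
    if (ranges_disjoint(dst, src, n)) {
        // No overlap: safe to promise the compiler independent arrays,
        // so this copy of the loop can be vectorized freely.
        float *__restrict__ d = dst;
        const float *__restrict__ s = src;
        for (std::size_t i = 0; i < n; ++i)
            d[i] += k * s[i];
    } else {
        // Possible overlap: keep strict element-by-element order.
        for (std::size_t i = 0; i < n; ++i)
            dst[i] += k * src[i];
    }
}
```

When the pointers really do differ, the check passes and the first loop runs, so the cost of the check itself is a handful of comparisons per call.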
--- Comment #20 from eyal at geomage dot com 2008-02-10 07:56 ---
Hi,
I've tried moving the loop to be vectorized into a separate method, and the
compiler output looks better, but the performance is still the same as the
non-vectorized code.
#include
#include
#include
ty
--- Comment #19 from eyal at geomage dot com 2008-02-10 07:42 ---
Hi,
This is the simplest test I have.
#include
#include
#include
typedef float ARRTYPE;
int main ( int argc, char *argv[] )
{
int m_nSamples = atoi( argv[1] );
int itBegin = atoi( argv[2
--- Comment #17 from eyal at geomage dot com 2008-02-08 08:58 ---
> Using malloc instead of new does generate better code and improves performance
> slightly for me, admittedly not as much as we would like; the kernel becomes:
> (using only -O3 -S -m64 -maltivec)
> .L29:
--- Comment #16 from eyal at geomage dot com 2008-02-08 08:55 ---
Thanks a lot, Ira, I appreciate it.
If you need the full test code with the .vect file and makefiles, please let me
know.
thanks,
eyal
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35117
--- Comment #12 from eyal at geomage dot com 2008-02-07 13:07 ---
(In reply to comment #11)
> (In reply to comment #10)
> > Is there some pragma or a coding convention I can use to make the compiler
> > understand those pointers have nothing to do with each ot
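
One coding convention that addresses the question quoted above is GCC's `__restrict__` qualifier. The sketch below assumes a simple three-array kernel; `add_arrays` and its parameters are hypothetical and are not the reporter's code, and whether the qualifier removes the run-time checks in this particular bug is not confirmed by the thread.

```cpp
#include <cassert>
#include <cstddef>

// __restrict__ is a GCC extension (also accepted by Clang). It is a
// promise from the programmer that dst, a and b never overlap, so the
// compiler does not have to guard the loop with run-time alias checks.
// Breaking that promise is undefined behaviour.
void add_arrays(float *__restrict__ dst,
                const float *__restrict__ a,
                const float *__restrict__ b,
                std::size_t n)
{
    assert(dst != 0 && a != 0 && b != 0);
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

The qualifier only helps if every caller really passes non-overlapping buffers, for example three separate allocations.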
--- Comment #10 from eyal at geomage dot com 2008-02-07 12:58 ---
(In reply to comment #9)
> (In reply to comment #8)
> > {
> > float *pTempSumPhase_Temp_cre_angle = (float*) malloc (sizeof(float)
> > *m_nSamples);
> > float *pTempSum2Ph
--- Comment #8 from eyal at geomage dot com 2008-02-07 12:16 ---
Hi Ira,
Here is the compiler output for the real code.
Crs/CEE_CRE_2DSearch.cpp:1285: note: create runtime check for data references
*D.86651_134 and *D.8_160
Crs/CEE_CRE_2DSearch.cpp:1285: note: create runtime check
--- Comment #7 from eyal at geomage dot com 2008-02-07 11:06 ---
(In reply to comment #6)
> (In reply to comment #2)
> > Yes the loop is vectorized.
> ...
> > Eyal.cpp:34: note: created 9 versioning for alias checks.
> > Eyal.cpp:34: note: LOOP VECTORIZED.
--- Comment #5 from eyal at geomage dot com 2008-02-07 10:43 ---
(In reply to comment #3)
> I think this is a dup of another bug I filed with respect to the builtin
> operator new getting the malloc attribute.
Are you referring to using malloc instead of new?
using malloc
--- Comment #2 from eyal at geomage dot com 2008-02-07 10:36 ---
Yes, the loop is vectorized. What do you mean by memory bound? Don't you think
that vectorization can help here? I see around 20% performance gain in the real
application.
Below is the compiler output:
Eyal.cpp:34: note
4 -o TestVec
Test.o
--
Summary: Vectorization on power PC
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Severity: major
Priority: P3
Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: eyal at g