Re: [Numpy-discussion] OT: performance in C extension; OpenMP, or SSE ?

2011-02-19 Thread Sturla Molden
On 19.02.2011 18:13, Sebastian Haase wrote: > Can one assume that the cache line is always a few megabytes? Don't confuse the size of a cache with the size of a cache line. A "cache line" (which is the unit that gets marked dirty) is typically 8-512 bytes. Make sure your OpenMP threads stay
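
To make the distinction between cache size and cache line size concrete, here is a minimal C sketch. It is not taken from the thread. The helper names `cache_line_bytes` and `lines_spanned` and the 64-byte fallback are assumptions made for illustration; `_SC_LEVEL1_DCACHE_LINESIZE` is a glibc extension to `sysconf` and may be missing or report 0 on other systems, hence the fallback.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Assumption: 64 bytes is a common line size on x86-64; used only as a fallback. */
#define FALLBACK_LINE_BYTES 64

/* Hypothetical helper: L1 data cache line size in bytes. */
long cache_line_bytes(void)
{
#ifdef _SC_LEVEL1_DCACHE_LINESIZE
    long n = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);  /* glibc extension */
    if (n > 0)
        return n;
#endif
    return FALLBACK_LINE_BYTES;
}

/* Hypothetical helper: number of cache lines needed to hold nbytes,
   assuming the buffer starts on a line boundary (ceiling division). */
size_t lines_spanned(size_t nbytes, size_t line_bytes)
{
    assert(line_bytes > 0);
    return (nbytes + line_bytes - 1) / line_bytes;
}
```

For example, under the assumption of a 32 KiB cache with 64-byte lines, `lines_spanned(32 * 1024, 64)` gives 512: the cache is large, but each line in it is small.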

Re: [Numpy-discussion] OT: performance in C extension; OpenMP, or SSE ?

2011-02-19 Thread Matthieu Brucher
Write misses indicate that data had to be brought into L1 before it could be written. Unfortunately, I don't know whether valgrind can detect false sharing. That's why I suggested you use a multiple of the cache line size so that false sharing does not occur. Matthieu 2011/2/19 Sebastian H
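
The suggestion to use a multiple of the cache line size can be sketched in C as follows. This is an illustration under stated assumptions, not code from the thread: `CACHE_LINE_BYTES` (64), `MAX_THREADS`, `padded_double`, and `padded_sum` are all hypothetical names and values. Each thread accumulates into its own slot, and every slot is aligned to (and therefore occupies) a full assumed cache line, so two threads never write to the same line. The code needs C11 for `_Alignas`; without `-fopenmp` the pragma is ignored and the loop runs serially with the same result.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE_BYTES 64  /* assumption: typical x86-64 line size */
#define MAX_THREADS 64       /* hypothetical upper bound for this sketch */

/* One accumulator per cache line: the alignment forces sizeof to 64. */
typedef struct {
    _Alignas(CACHE_LINE_BYTES) double value;
} padded_double;

/* Hypothetical example: sum x[0..n) using one padded slot per thread. */
double padded_sum(const double *x, size_t n, int n_threads)
{
    assert(n_threads >= 1 && n_threads <= MAX_THREADS);

    padded_double partial[MAX_THREADS];
    for (int t = 0; t < n_threads; ++t)
        partial[t].value = 0.0;

    /* One loop iteration per slot; each works on a contiguous chunk of x. */
    #pragma omp parallel for num_threads(n_threads) schedule(static)
    for (int t = 0; t < n_threads; ++t) {
        size_t begin = n * (size_t)t / (size_t)n_threads;
        size_t end = n * ((size_t)t + 1) / (size_t)n_threads;
        for (size_t i = begin; i < end; ++i)
            partial[t].value += x[i];  /* writes stay within this slot's line */
    }

    /* Combine the per-thread partial sums serially. */
    double total = 0.0;
    for (int t = 0; t < n_threads; ++t)
        total += partial[t].value;
    return total;
}
```

For a plain sum, an OpenMP `reduction(+:total)` clause would normally be simpler; the explicit padded array is used here only to show the layout that avoids false sharing.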

Re: [Numpy-discussion] OT: performance in C extension; OpenMP, or SSE ?

2011-02-19 Thread Pauli Virtanen
On Sat, 19 Feb 2011 18:13:44 +0100, Sebastian Haase wrote: > Thanks a lot. Very informative. I guess what you say about "cache line > is dirtied" is related to the info I got with valgrind (see my email in > this thread: L1 Data Write Miss 3636). Can one assume that the cache > line is always a few

Re: [Numpy-discussion] OT: performance in C extension; OpenMP, or SSE ?

2011-02-19 Thread Sebastian Haase
Thanks a lot. Very informative. I guess what you say about "cache line is dirtied" is related to the info I got with valgrind (see my email in this thread: L1 Data Write Miss 3636). Can one assume that the cache line is always a few megabytes? Thanks, Sebastian On Sat, Feb 19, 2011 at 12:40 AM,

Re: [Numpy-discussion] How to tell if I succeeded to build numpy with amd, umfpack and lapack

2011-02-19 Thread Samuel John
Thanks Robin, that makes sense and explains why I could not find any reference. Perhaps the scipy.org wiki and install instructions should be updated. I mean how many people try to compile amd and umfpack, because they think it's good for numpy to have them, because the site.cfg contains those en