I don't have the slightest idea what I'm doing, but
file name - the_lib.c
___
/* Include names were stripped in the archive; math.h and omp.h are what the code needs. */
#include <math.h>
#include <omp.h>
/* All pairwise Euclidean distances between na 2-d points in a_ps and nb 2-d points in b_ps,
   written row-major into dist (na * nb doubles). */
void dists2d( double *a_ps, int na,
              double *b_ps, int nb,
              double *dist, int num_threads)
{
    int i, j;
    int dynamic=0;
    /* The archived message is cut off here; the body below is a reconstruction. */
    omp_set_dynamic(dynamic);
    omp_set_num_threads(num_threads);
#pragma omp parallel for private(j)
    for (i = 0; i < na; i++) {
        for (j = 0; j < nb; j++) {
            double dx = a_ps[2*i]   - b_ps[2*j];
            double dy = a_ps[2*i+1] - b_ps[2*j+1];
            dist[i*nb + j] = sqrt(dx*dx + dy*dy);
        }
    }
}
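(For illustration only, not the SWIG wrapper actually used in the thread: a minimal ctypes sketch of calling dists2d from NumPy, assuming the_lib.c has been compiled to a hypothetical shared library the_lib.so, e.g. with gcc -O2 -fopenmp -shared -fPIC the_lib.c -o the_lib.so.)

import ctypes
import numpy as np

lib = ctypes.CDLL("./the_lib.so")        # hypothetical library name
arr2d = np.ctypeslib.ndpointer(dtype=np.float64, flags="C_CONTIGUOUS")
lib.dists2d.argtypes = [arr2d, ctypes.c_int,
                        arr2d, ctypes.c_int,
                        arr2d, ctypes.c_int]
lib.dists2d.restype = None

a = np.random.random((340, 2))           # na x 2 points
b = np.random.random((329, 2))           # nb x 2 points
dist = np.empty((a.shape[0], b.shape[0]))  # preallocated na x nb result, filled in place
lib.dists2d(a, a.shape[0], b, b.shape[0], dist, 4)   # last argument: number of threads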
Take a look at a nice project coming out of my department:
http://code.google.com/p/cudamat/
Best,
Jon.
On Tue, Feb 15, 2011 at 11:33 AM, Sebastian Haase wrote:
> Wes,
> I think I should have a couple of GPUs. I would be ready for anything
> ... if you think that I could do some easy(!) CUDA programming here, maybe you could guide me into the right direction...
The `cdist` function in `scipy.spatial.distance` does what you want, and takes ~1 ms on
my machine.
In [1]: import numpy as np
In [2]: from scipy.spatial.distance import cdist
In [3]: a = np.random.random((340, 2))
In [4]: b = np.random.random((329, 2))
In [5]: c = cdist(a, b)
In [6]: c.shape
Out[6]: (340, 329)
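As a quick sanity check (a sketch along the same lines, not part of the quoted session): every entry of c is the plain Euclidean distance between one point of a and one point of b, so cdist really does compute the full 340 x 329 matrix of all pairs.

import numpy as np
from scipy.spatial.distance import cdist

a = np.random.random((340, 2))
b = np.random.random((329, 2))
c = cdist(a, b)                    # full 340 x 329 distance matrix
# each entry is the Euclidean distance between one point of a and one point of b
print(np.allclose(c[3, 5], np.linalg.norm(a[3] - b[5])))   # True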
Hi,
I'm trying to get started with f2py on a Windows 7 environment using the
Python(x,y) v 2.6.5.6 distribution.
I'm following the introductory example of the f2py userguide and try to wrap
the file FIB1.F using the command:
f2py.py -c fib1.f -m fib1
from the windows command line. I get the foll
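For reference, once the build issue is sorted out, the wrapped module is used roughly like this (a sketch assuming fib1.f is the user-guide example, i.e. a subroutine FIB(A, N) that fills A with the first N Fibonacci numbers):

import numpy as np
import fib1                        # the extension module produced by f2py

a = np.zeros(8, dtype=np.float64)
fib1.fib(a)                        # n is optional and defaults to len(a)
print(a)                           # [ 0.  1.  1.  2.  3.  5.  8. 13.]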
Hi Eat,
I will surely try these routines tomorrow,
but I still think that neither scipy function does the complete
distance calculation of all possible pairs as done by my C code.
For 2 arrays, X and Y, of nX and nY 2d coordinates respectively, I
need to get nX times nY distances computed.
From th
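To make that requirement concrete, here is a small NumPy sketch (not the C code itself) of the complete nX times nY calculation, together with a check that cdist returns exactly this matrix:

import numpy as np
from scipy.spatial.distance import cdist

nX, nY = 340, 329
X = np.random.random((nX, 2))
Y = np.random.random((nY, 2))

# all nX * nY pairwise distances via broadcasting; result has shape (nX, nY)
diff = X[:, None, :] - Y[None, :, :]        # (nX, nY, 2)
D = np.sqrt((diff * diff).sum(axis=-1))     # (nX, nY)

print(D.shape)                              # (340, 329)
print(np.allclose(D, cdist(X, Y)))          # True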
Hi,
On Tue, Feb 15, 2011 at 5:50 PM, Sebastian Haase wrote:
> Hi,
> I assume that someone here could maybe help me, and I'm hoping it's
> not too much off topic.
> I have 2 arrays of 2d point coordinates and would like to calculate
> all pairwise distances as fast as possible.
> Going from Python/Numpy to a (Swigged) C extension already gave me a 55x speedup.
I'm sorry that I don't have some example code for you, but you
probably need to break down the problem if you can't fit it into
memory: http://en.wikipedia.org/wiki/Overlap-add_method
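A minimal sketch of the overlap-add idea, assuming scipy.signal.fftconvolve is available and that one of the two sequences (the kernel h) is much shorter than the other; block_size is just a tunable choice:

import numpy as np
from scipy.signal import fftconvolve

def overlap_add_convolve(x, h, block_size=2**22):
    """Linear ('full') convolution of a long signal x with a kernel h,
    processing x one block at a time instead of transforming it whole."""
    n, m = len(x), len(h)
    out = np.zeros(n + m - 1)
    for start in range(0, n, block_size):
        block = x[start:start + block_size]
        seg = fftconvolve(block, h)            # length len(block) + m - 1
        out[start:start + len(seg)] += seg     # add the overlapping tails
    return out

The full-length output still has to be held (or written out block by block); the saving is that no FFT of the entire signal is ever taken.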
Jonathan
On Tue, Feb 15, 2011 at 10:27 AM, wrote:
> On Tue, Feb 15, 2011 at 11:42 AM, Davide Cittaro
> wrote:
On Tue, Feb 15, 2011 at 11:42 AM, Davide Cittaro
wrote:
> Hi all,
> I have to work with huge numpy.array (i.e. up to 250 M long) and I have to
> perform either np.correlate or np.convolve between those.
> The process can only work on big memory machines but it takes ages. I'm
> writing to get some hint on how to speed up things (at cost of precision, maybe...)
On Tue, Feb 15, 2011 at 11:42 AM, Davide Cittaro
wrote:
> Hi all,
> I have to work with huge numpy.array (i.e. up to 250 M long) and I have to
> perform either np.correlate or np.convolve between those.
> The process can only work on big memory machines but it takes ages. I'm
> writing to get some hint on how to speed up things (at cost of precision, maybe...)
On Tue, Feb 15, 2011 at 11:33 AM, Sebastian Haase wrote:
> Wes,
> I think I should have a couple of GPUs. I would be ready for anything
> ... if you think that I could do some easy(!) CUDA programming here,
> maybe you could guide me into the right direction...
> Thanks,
> Sebastian.
>
>
> On Tue,
Hi all,
I have to work with huge numpy.array (i.e. up to 250 M long) and I have to
perform either np.correlate or np.convolve between those.
The process can only work on big memory machines but it takes ages. I'm writing
to get some hint on how to speed up things (at cost of precision, maybe...)
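One concrete option in that direction (a sketch, not a complete answer for the 250 M case): scipy.signal.fftconvolve computes the same 'full' convolution as np.convolve but via FFTs, which is far faster for long inputs at the cost of some floating-point round-off.

import numpy as np
from scipy.signal import fftconvolve

x = np.random.random(1000000)
y = np.random.random(1000000)

c = fftconvolve(x, y)              # same values as np.convolve(x, y), up to round-off
# for correlation of real signals, flip the second input:
#   np.correlate(x, y, 'full') == np.convolve(x, y[::-1], 'full')
print(c.shape)                     # (1999999,)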
Wes,
I think I should have a couple of GPUs. I would be ready for anything
... if you think that I could do some easy(!) CUDA programming here,
maybe you could guide me into the right direction...
Thanks,
Sebastian.
On Tue, Feb 15, 2011 at 5:26 PM, Wes McKinney wrote:
> On Tue, Feb 15, 2011 at 1
On Tue, Feb 15, 2011 at 11:25 AM, Matthieu Brucher
wrote:
> Use restrict directly in C99 mode (__restrict does not have exactly the same
> semantics).
> For a valgrind profile, you can check my blog
> (http://matt.eifelle.com/2009/04/07/profiling-with-valgrind/)
> Basically, if you have a python script, you can valgrind --optionsinmyblog
> python myscript.py
Use restrict directly in C99 mode (__restrict does not have exactly the same
semantics).
For a valgrind profile, you can check my blog (
http://matt.eifelle.com/2009/04/07/profiling-with-valgrind/)
Basically, if you have a python script, you can valgrind --optionsinmyblog
python myscript.py
For PAPI
Thanks Matthieu,
using __restrict__ with g++ did not change anything. How do I use
valgrind with C extensions?
I don't know what a "PAPI profile" is ...?
-Sebastian
On Tue, Feb 15, 2011 at 4:54 PM, Matthieu Brucher
wrote:
> Hi,
> My first move would be to add a restrict keyword to dist (i.e. dist is the only pointer to the specific memory location)
Hi,
My first move would be to add a restrict keyword to dist (i.e. dist is the
only pointer to the specific memory location), and then declare dist_ inside
the first loop also with a restrict.
Then, I would run valgrind or a PAPI profile on your code to see what causes
the issue (false sharing, ...
Hi,
I assume that someone here could maybe help me, and I'm hoping it's
not too much off topic.
I have 2 arrays of 2d point coordinates and would like to calculate
all pairwise distances as fast as possible.
Going from Python/Numpy to a (Swigged) C extension already gave me a
55x speedup.
(.9ms vs.