On Wed, 30 May 2012 21:58:13 -0700, Bryan Catanzaro <[email protected]> wrote:
> Why should the overhead be measured separately? For users of these
> systems, the Python overhead is unavoidable. The time spent running
> on the GPU alone is an important implementation detail for people
> improving systems like PyCUDA, but users of these systems see overhead
> costs exposed in their overall application performance, and so I don't
> see how the overhead can be ignored.
Because whether the overhead matters depends on data size. Since the overhead is constant across all data sizes, it is mostly irrelevant for big data, whereas for tiny data it may well be a dealbreaker. That's why I think a single number doesn't cut it.

In addition, there's an underlying assumption that you'll keep the GPU busy for a while, i.e. keep the GPU queue saturated. If you do that (the ability to do so being related, again, to data size), then on top of that, anything Python does runs in parallel with the GPU, and your net run time will be exactly the same as if the overhead had never happened.

Andreas
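The amortization and overlap arguments can be sketched with a toy cost model. All numbers below are made up for illustration, not measurements of PyCUDA or any real GPU:

```python
# Toy cost model (all numbers hypothetical, not measured) illustrating why a
# constant per-launch overhead matters for tiny data but not for big data.

OVERHEAD_S = 20e-6    # assumed fixed Python/launch overhead per kernel call
THROUGHPUT = 100e9    # assumed GPU processing rate, in elements per second

def total_time(n):
    """Wall-clock time for a single synchronous launch over n elements."""
    return OVERHEAD_S + n / THROUGHPUT

def overhead_fraction(n):
    """Share of the total run time eaten by the constant overhead."""
    return OVERHEAD_S / total_time(n)

def pipelined_time(n_launches, python_s, gpu_s):
    """Steady-state time with an asynchronous, saturated GPU queue:
    each launch's Python-side work overlaps the previous launch's GPU
    execution, so the per-launch cost is the slower of the two."""
    return n_launches * max(python_s, gpu_s)

# Tiny data: the constant overhead dominates; big data: it is noise.
print(f"1e3 elements: {overhead_fraction(1_000):.1%} overhead")
print(f"1e9 elements: {overhead_fraction(1_000_000_000):.1%} overhead")
```

With the queue saturated and `python_s <= gpu_s`, `pipelined_time` collapses to pure GPU time, which is the "overhead never happened" case described above.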
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
