Sounds like the thing to do is to use thread.join() as the synchronization point, rather than an event. It should work out pretty much the same. Sound about right? I don't have any strict need to use events; I'm just trying to figure out a decent way to spread the work over multiple GPUs without having to do a lot of bookkeeping.
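Concretely, the pattern I have in mind looks something like this. It's a minimal sketch using plain threading: the pycuda context/kernel calls are shown only as comments, and gpu_worker and its placeholder result are mine, not real API.

```python
import threading

def gpu_worker(device_id, results):
    """One worker thread per GPU; all CUDA state lives in this thread."""
    # ctx = pycuda.driver.Device(device_id).make_context()
    # ... upload data, launch kernel ...
    # ctx.synchronize()  # block here until the kernel finishes
    results[device_id] = device_id * 2  # placeholder for real kernel output
    # ctx.pop()

def run_on_gpus(n_gpus):
    results = [None] * n_gpus
    threads = [threading.Thread(target=gpu_worker, args=(i, results))
               for i in range(n_gpus)]
    for t in threads:
        t.start()
    # join() is the synchronization point: once it returns, that thread's
    # kernel has finished and its context has already been popped.
    for t in threads:
        t.join()
    return results
```

No events, no cross-thread context handoff; the main thread just joins and reads the results.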
We actually started this project with OpenCL, but ran into too many bugs in the implementation we were using (Apple's) and decided it wasn't stable enough for our use (it also didn't seem like there was good profiler / debugger support at the time; not sure if that's changed).

Throwing around all of our numpy data across processes doesn't sound like any less of a headache than threads. :/ Maybe MPC's gotten a lot better since I looked at it last? Threads should get the job done, though...

Thanks,
Eli

On Thu, Mar 8, 2012 at 11:08 AM, Andreas Kloeckner <[email protected]> wrote:
> Hi Eli,
>
> On Thu, 8 Mar 2012 09:33:21 -0800, "Eli Stevens (Gmail)"
> <[email protected]> wrote:
>> I was wondering if the following will work:
>>
>> - Main thread spins up thread B.
>> - Thread B creates a context, invokes a kernel, and creates an event.
>> - Event is saved.
>> - Thread B pops the context (kernel is still running at this point)
>>   and finishes.
>> - Main thread join()s B and grabs the event.
>> - Main thread does other stuff and eventually calls .synchronize()
>>
>> Does that work? Or will trying to use an event after popping the
>> associated context (and from a different thread) cause problems? My
>> actual use case involves a thread C that's doing other things on a
>> second GPU. Maybe instead of an event, I should just have the threads
>> block and then use the join to indicate when the kernel is done? Any
>> advice appreciated. :)
>
> This should work as well as it does in the underlying CUDA
> implementation. The problem is that there is always a question of what
> context is active where, and there's an intricate dance that has to be
> performed of one thread having to release the context and another one
> grabbing it (as you describe). To me, this seems not worth the
> headache. OpenCL is cleaner in this respect, if that's an option for
> you.
> Failing that, keeping all CUDA objects associated with a context
> within a thread *will* make your life easier (especially with respect to
> garbage collection). If you can, processes make all of this even less
> brittle (and more concurrent).
>
> HTH,
> Andreas

_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
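Along the lines of Andreas's process suggestion, here is a hedged sketch of the one-process-per-GPU layout: each process owns exactly one context for its whole lifetime, so there is no cross-thread context push/pop to get wrong. The pycuda calls are commented out, and gpu_process / the arithmetic result are illustrative placeholders, not real API.

```python
import multiprocessing as mp

def gpu_process(device_id, queue):
    """One process per GPU; the context never leaves this process."""
    # import pycuda.driver as cuda; cuda.init()
    # ctx = cuda.Device(device_id).make_context()
    # ... launch kernel, download result ...
    result = device_id + 100  # placeholder for the kernel's output
    queue.put((device_id, result))
    # ctx.detach()

def run_in_processes(n_gpus):
    ctx = mp.get_context("fork")  # fork keeps this sketch self-contained
    queue = ctx.Queue()
    procs = [ctx.Process(target=gpu_process, args=(i, queue))
             for i in range(n_gpus)]
    for p in procs:
        p.start()
    # Drain the queue before joining so no child blocks on a full pipe.
    results = dict(queue.get() for _ in procs)
    for p in procs:
        p.join()
    return results
```

The cost, as noted above, is shipping numpy data across process boundaries; the benefit is that context lifetime and garbage collection stay entirely local to one process.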
