Ah, I see. I think you might be trying to fit a round peg into a square hole, so to speak.
This will work in some form or other, but I'd bet $100 that you could write a simple C function that would traverse your original Python list of lists and give you the index that you want, and that it would be much faster than converting the data to Numpy, then sending the data structure to the video card and hammering on it with CUDA. Couple that with the fact that you'll probably write your C function correctly in less time than the necessary CUDA kernel, and you save on development time as well as execution time.

Unfortunately, I do not know anything about what you can do on Numpy objects or Python objects from CUDA itself. Anybody else know the answer to that or have a suggestion?

David

On Wed, Aug 24, 2011 at 9:45 AM, Francis <[email protected]> wrote:
> Hi David,
>
> Thanks for the reply. My data is in the host and will get moved to the
> device. They are of the form:
>
> [ [ ' string ', ' string ', ' string ' ], [ ' string ' ], .... , [ ' string ', ' string ' ] ]
>
> Which is in Python and which I'm planning on turning into a numpy array. I
> don't really mind what the values of the strings are, just their count for
> each sub-list. I suppose I could just have one thread per sub-list work on
> the length of that entire sub-list though I've yet to try that out. :)
>
> Best regards,
>
> ./francis
>
>
> 2011/8/24 David Mertens <[email protected]>
>>
>> Sorry, meant to copy the list on this.
>>
>> ---------- Forwarded message ----------
>> From: "David Mertens" <[email protected]>
>> Date: Aug 24, 2011 8:18 AM
>> Subject: Re: [PyCUDA] Get sublist with largest length
>> To: "Francis" <[email protected]>
>>
>> Francis,
>>
>> The answer to your question depends on the form of your data structure.
>> Are you working with a Python array of arrays, a C array of arrays, or
>> something else? Is this data already on the GPU, or were you planning on
>> copying it over for this calculation only?
>>
>> There are fast CUDA methods for computing the min or max of a set of
>> numbers, but assembling that set of numbers for your data set may take so
>> much time that the speedup from using CUDA doesn't really matter. That's why
>> I'm asking about the form of the data first.
>>
>> David
>>
>> On Aug 23, 2011 10:07 AM, "Francis" <[email protected]> wrote:
>>
>> _______________________________________________
>> PyCUDA mailing list
>> [email protected]
>> http://lists.tiker.net/listinfo/pycuda
>>
>
>

--
Sent via my carrier pigeon.

_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
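
For reference, a minimal sketch of the host-side traversal discussed in this thread, written in plain Python rather than C for brevity. The function name `longest_sublist_index` and the sample data are illustrative assumptions and do not come from the thread; this is not a benchmark and makes no claim about how it compares with a CUDA kernel.

```python
def longest_sublist_index(lists):
    """Return the index of the longest sub-list (the first one wins on ties)."""
    if not lists:
        raise ValueError("expected at least one sub-list")
    # Only the length of each sub-list matters, not the strings it holds.
    return max(range(len(lists)), key=lambda i: len(lists[i]))


# Hypothetical data in the shape described in the thread.
data = [["a", "b", "c"], ["d"], ["e", "f"]]
print(longest_sublist_index(data))  # 0
```

Since only the sub-list lengths are needed, `[len(s) for s in data]` is also the only thing that would have to be copied to the device if a GPU reduction were still wanted.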
