Well, there were no responses to my earlier email proposing changes to numpy.bincount() to make it faster and more flexible. Does this mean noone is using bincount? :-)
Anyway I've now made the proposed modifications and got substantial speedups of 3-10. Details are in this extract from my C code. For the time being I will test this as a standalone extension module. When stable and tests are written, I'll submit as a patch to the numpy's _compiled_base.c. Cheers, Stephen /************************************************ * Faster versions of bincount and casting strings to integers * * Author: Stephen Simmons, [EMAIL PROTECTED] * Date: 11 March 2007 * * This module contains C code for functions I am using to accelerate * SQL-like aggregate functions for a column-oriented database based on numpy. * * subtotal's bincount is typically 3-10 times faster than numpy's bincount, * and more flexible * - takes bins as they are, without promoting their type to int32 * - takes weights as they are, without promoting their type to double * - ignores negative bins, rather than reporting an error * - allows optional int32/double output array to be passed in * - specify maximum bin number to use. Larger bins numbers are ignored * - only scans for max bin number if neither max_bin nor out array specified * * atoi is 30-60 times faster than casting a string array to an integer type, * and may optionally use a translation table. The translation table is * a convenient way to map sparse integer strings to adjacent bins before * using bincount. * * Typical usage * ------------- # Set up arrays 5,000,000 elts long. s1 has strings filled with '1' and '2'. # w are weights >>> s1 = numpy.ndarray(shape=5000000, dtype='S1'); s1[:]='2'; s1[1::2]='1' >>> w = numpy.arange(5000000,dtype='f4') # Using standard numpy functions, string conversion is slow >>> i = s1.astype('i1') -> 5.95 sec (!) >>> numpy.bincount(i) -> 113 ms >>> numpy.bincount(i, w) -> 272 ms >>> numpy.bincount(s1.astype('i1'), w) -> 6.12 sec # Using the faster functions here: >>> i = subtotal.atoi(s1) -> 90 ms (60x faster) >>> subtotal.bincount(i, max_bin=2) -> 31 ms (3.6x faster) >>> subtotal.bincount(i, w, max_bin=2) -> 51 ms (5.3x faster) # In both bases, output is array([2, 1, 2, ..., 1, 2, 1], dtype=int8) array([ 0, 2500000, 2500000]) array([ 0.00000000e+00, 6.25000000e+12, 6.24999750e+12]) # As an example of using a translate table, run bincount from atoi applying # a translate table for counting odd vs even strings. This does the whole lot # in 158ms, a speedup of 38x from 6.12 s above. >>> t = numpy.arange(256, dtype='i1') % 2 >>> subtotal.bincount(subtotal.atoi(s1, t), w, max_bin=t.max()) -> 158 ms array([ 6.24999750e+12, 6.25000000e+12]) * * These timings are based on numpy-1.0.1 Windows binary distribution * for Python 2.5, compared with building this code using standard distutils * without any particular compiler optimisations: C:\mnt\dev\subtotal> python setup.py build -cmingw32 * Processor is a 1.83GHz Pentium-M ****************************************************/ << rest of C code omitted >> Stephen Simmons wrote: > Hi, > > I'd like to propose some minor modifications to the function > bincount(arr, weights=None), so would like some feedback from other > uses of bincount() before I write this up as a proper patch, . > > Background: > bincount() has two forms: > - bincount(x) returns an integer array ians of length max(x)+1 where > ians[n] is the number of times n appears in x. > - bincount(x, weights) returns a double array dans of length max(x)+1 > where dans[n] is the sum of elements in the weight vector weights[i] > at the positions where x[i]==n > In both cases, all elements of x must be non-negative. > > Proposed changes: > (1) Remove the restriction that elements of x must be non-negative. > Currently bincount() starts by finding max(x) and min(x). If the min > value is negative, an exception is raised. This change proposes > dropping the initial search for min(x), and instead testing for > non-negativity while summing values in the return arrays ians or dans. > Any indexes where where x is negative will be silently ignored. This > will allow selective bincounts where values to ignore are flagged with > a negative bin number. > > (2) Allow an optional argument for maximum bin number. > Currently bincount(x) returns an array whose length is dependent on > max(x). It is sometimes preferable to specify the exact size of the > returned array, so this change would add a new optional argument, > max_bin, which is one less than the size of the returned array. Under > this change, bincount() starts by finding max(x) only if max_bin is > not specified. Then the returned array ians or dans is created with > length max_bin+1, and any indexes that would overflow the output array > are silently ignored. > > (3) Allow an optional output array, y. > Currently bincount() creates a new output array each time. Sometimes > it is preferable to add results to an existing output array, for > example, when the input array is only available in smaller chunks, or > for a progressive update strategy to avoid fp precision problems when > adding lots of small weights to larger subtotals. Thus we can add an > extra optional argument y that bypasses the creation of an output array. > > With these three change, the function signature of bincount() would > become: > bincount(x, weights=None, y=None, max_bin=None) > > > Anyway, that's the general idea. I'd be grateful for any feedback > before I code this up as a patch to _compiled_base.c. > > Cheers > > Stephen > > _______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion