Re: [Numpy-discussion] Consider improving numpy.outer's behavior with zero-dimensional vectors

2015-04-15 Thread josef.pktd
On Wed, Apr 15, 2015 at 6:40 PM, Nathaniel Smith wrote: > On Wed, Apr 15, 2015 at 6:08 PM, wrote: >> On Wed, Apr 15, 2015 at 5:31 PM, Neil Girdhar wrote: >>> Does it work for you to set >>> >>> outer = np.multiply.outer >>> >>> ? >>> >>> It's actually faster on my machine. >> >> I assume it doe

Re: [Numpy-discussion] Automatic number of bins for numpy histograms

2015-04-15 Thread Neil Girdhar
Cool, thanks for looking at this. P2 might still be better even if the whole dataset is in memory because of cache misses. Partition, which I guess is based on quickselect, is going to run over all of the data as many times as there are bins roughly, whereas p2 only runs over it once. From a cac

Re: [Numpy-discussion] Consider improving numpy.outer's behavior with zero-dimensional vectors

2015-04-15 Thread Nathaniel Smith
On Wed, Apr 15, 2015 at 6:08 PM, wrote: > On Wed, Apr 15, 2015 at 5:31 PM, Neil Girdhar wrote: >> Does it work for you to set >> >> outer = np.multiply.outer >> >> ? >> >> It's actually faster on my machine. > > I assume it does because np.corrcoef uses it, and it's the same type > of use cases

Re: [Numpy-discussion] Consider improving numpy.outer's behavior with zero-dimensional vectors

2015-04-15 Thread Neil Girdhar
I don't understand. Are you at pycon by any chance? On Wed, Apr 15, 2015 at 6:12 PM, wrote: > On Wed, Apr 15, 2015 at 6:08 PM, wrote: > > On Wed, Apr 15, 2015 at 5:31 PM, Neil Girdhar > wrote: > >> Does it work for you to set > >> > >> outer = np.multiply.outer > >> > >> ? > >> > >> It's act

Re: [Numpy-discussion] Consider improving numpy.outer's behavior with zero-dimensional vectors

2015-04-15 Thread josef.pktd
On Wed, Apr 15, 2015 at 6:08 PM, wrote: > On Wed, Apr 15, 2015 at 5:31 PM, Neil Girdhar wrote: >> Does it work for you to set >> >> outer = np.multiply.outer >> >> ? >> >> It's actually faster on my machine. > > I assume it does because np.corrcoef uses it, and it's the same type > of use cases

Re: [Numpy-discussion] Consider improving numpy.outer's behavior with zero-dimensional vectors

2015-04-15 Thread josef.pktd
On Wed, Apr 15, 2015 at 5:31 PM, Neil Girdhar wrote: > Does it work for you to set > > outer = np.multiply.outer > > ? > > It's actually faster on my machine. I assume it does because np.corrcoef uses it, and it's the same type of use case. However, I'm not using it very often (I prefer broadca

Re: [Numpy-discussion] Consider improving numpy.outer's behavior with zero-dimensional vectors

2015-04-15 Thread Neil Girdhar
Does it work for you to set outer = np.multiply.outer ? It's actually faster on my machine. On Wed, Apr 15, 2015 at 5:29 PM, wrote: > On Wed, Apr 15, 2015 at 7:35 AM, Neil Girdhar > wrote: > > Yes, I totally agree. If I get started on the PR to deprecate np.outer, > > maybe I can do it as p
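For readers following the thread, here is a minimal sketch of how the two spellings differ on a zero-dimensional input. The variable names are illustrative only and do not come from the messages; the timing claim above is not tested here.

```python
import numpy as np

scalar = np.float64(2.0)      # zero-dimensional input
vec = np.arange(3.0)          # one-dimensional input

# np.outer ravels both arguments first, so the 0-d input becomes length 1.
via_outer = np.outer(scalar, vec)

# The ufunc method keeps the input shapes: result shape is a.shape + b.shape.
via_ufunc = np.multiply.outer(scalar, vec)

print(via_outer.shape)   # (1, 3)
print(via_ufunc.shape)   # (3,)

# For two 1-D vectors the two spellings give the same result.
same_for_vectors = np.array_equal(np.outer(vec, vec), np.multiply.outer(vec, vec))
```

So `outer = np.multiply.outer` is a drop-in replacement only when both arguments are already 1-D; for other shapes the results differ in shape, which is the behavior under discussion.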

Re: [Numpy-discussion] Consider improving numpy.outer's behavior with zero-dimensional vectors

2015-04-15 Thread josef.pktd
On Wed, Apr 15, 2015 at 7:35 AM, Neil Girdhar wrote: > Yes, I totally agree. If I get started on the PR to deprecate np.outer, > maybe I can do it as part of the same PR? > > On Wed, Apr 15, 2015 at 4:32 AM, Sebastian Berg > wrote: >> >> Just a general thing, if someone has a few minutes, I thin

Re: [Numpy-discussion] IDE's for numpy development?

2015-04-15 Thread Joseph Martinot-Lagarde
On 08/04/2015 21:19, Yuxiang Wang wrote: > I think spyder supports code highlighting in C and that's all... > There's no way to compile in Spyder, is there? > Well, you could write a compilation script using Scons and run it from spyder! :) But no, spyder is very python-oriented and there is

[Numpy-discussion] [ANN] python-blosc v1.2.5

2015-04-15 Thread Valentin Haenel
Announcing python-blosc 1.2.5. What is new? This release contains support for Blosc v1.5.4 including changes to how the GIL is kept. This was required because Blosc was refactored in the v1.5.x line to remove global variables

Re: [Numpy-discussion] Automatic number of bins for numpy histograms

2015-04-15 Thread Jaime Fernández del Río
On Wed, Apr 15, 2015 at 9:14 AM, Eric Moore wrote: > This blog post, and the links within also seem relevant. Appears to have > python code available to try things out as well. > > > https://dataorigami.net/blogs/napkin-folding/19055451-percentile-and-quantile-estimation-of-big-data-the-t-digest

Re: [Numpy-discussion] Automatic number of bins for numpy histograms

2015-04-15 Thread Jaime Fernández del Río
On Wed, Apr 15, 2015 at 8:06 AM, Neil Girdhar wrote: > You got it. I remember this from when I worked at Google and we would > process (many many) logs. With enough bins, the approximation is still > really close. It's great if you want to make an automatic plot of data. > Calling numpy.partit

Re: [Numpy-discussion] Automatic number of bins for numpy histograms

2015-04-15 Thread Eric Moore
This blog post, and the links within also seem relevant. Appears to have python code available to try things out as well. https://dataorigami.net/blogs/napkin-folding/19055451-percentile-and-quantile-estimation-of-big-data-the-t-digest -Eric On Wed, Apr 15, 2015 at 11:24 AM, Benjamin Root wrot

Re: [Numpy-discussion] Automatic number of bins for numpy histograms

2015-04-15 Thread Benjamin Root
"Then you can set about convincing matplotlib and friends to use it by default" Just to note, this proposal was originally made over in the matplotlib project. We sent it over here where its benefits would have wider reach. Matplotlib's plan is not to change the defaults, but to offload as much as

Re: [Numpy-discussion] Automatic number of bins for numpy histograms

2015-04-15 Thread Neil Girdhar
You got it. I remember this from when I worked at Google and we would process (many many) logs. With enough bins, the approximation is still really close. It's great if you want to make an automatic plot of data. Calling numpy.partition a hundred times is probably slower than calling P^2 with n=
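As a point of reference for the comparison above, here is a rough sketch of the exact, in-memory baseline: equal-frequency bin edges taken with `numpy.partition`. It is not the P^2 algorithm, and it uses a single `partition` call with a sequence of ranks rather than one call per bin. The sample data, seed, and bin count are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, illustrative only
data = rng.normal(size=10_000)
nbins = 10                              # hypothetical bin count

# Ranks (positions in sorted order) whose values become the bin edges.
ranks = np.linspace(0, data.size - 1, nbins + 1).astype(int)

# One partition call places the element of each requested rank at its
# sorted position without sorting the whole array.
partitioned = np.partition(data, ranks)
edges = partitioned[ranks]

# Histogram over those edges: every bin holds roughly the same count.
counts, _ = np.histogram(data, bins=edges)
print(counts)
```

This still needs the whole dataset in memory, which is the case a streaming estimator such as P^2 is meant to avoid.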

Re: [Numpy-discussion] Automatic number of bins for numpy histograms

2015-04-15 Thread Jaime Fernández del Río
On Wed, Apr 15, 2015 at 4:36 AM, Neil Girdhar wrote: > Yeah, I'm not arguing, I'm just curious about your reasoning. That > explains why not C++. Why would you want to do this in C and not Python? > Well, the algorithm has to iterate over all the inputs, updating the estimated percentile posit

Re: [Numpy-discussion] Automatic number of bins for numpy histograms

2015-04-15 Thread Neil Girdhar
Yeah, I'm not arguing, I'm just curious about your reasoning. That explains why not C++. Why would you want to do this in C and not Python? On Wed, Apr 15, 2015 at 1:48 AM, Jaime Fernández del Río wrote: > On Tue, Apr 14, 2015 at 6:16 PM, Neil Girdhar > wrote: > >> If y

Re: [Numpy-discussion] Consider improving numpy.outer's behavior with zero-dimensional vectors

2015-04-15 Thread Neil Girdhar
Yes, I totally agree. If I get started on the PR to deprecate np.outer, maybe I can do it as part of the same PR? On Wed, Apr 15, 2015 at 4:32 AM, Sebastian Berg wrote: > Just a general thing, if someone has a few minutes, I think it would > make sense to add the ufunc.reduce thing to all of th

Re: [Numpy-discussion] Consider improving numpy.outer's behavior with zero-dimensional vectors

2015-04-15 Thread Sebastian Berg
Just a general thing, if someone has a few minutes, I think it would make sense to add the ufunc.reduce thing to all of these functions at least in the "See Also" or "Notes" section in the documentation. These special attributes are not that well known, and I think that might be a nice way to make
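A short sketch of the kind of equivalences such a "See Also" note could point at, assuming the familiar pairings between ufunc methods and the top-level functions. The array is made up for illustration.

```python
import numpy as np

x = np.arange(1, 7).reshape(2, 3)   # [[1, 2, 3], [4, 5, 6]]

# ufunc.reduce collapses one axis with the ufunc's binary operation.
sum_matches = np.array_equal(np.add.reduce(x, axis=0), np.sum(x, axis=0))
prod_matches = np.array_equal(np.multiply.reduce(x, axis=1), np.prod(x, axis=1))
max_matches = np.array_equal(np.maximum.reduce(x, axis=0), np.max(x, axis=0))

# ufunc.accumulate keeps the intermediate results of the reduction.
cumsum_matches = np.array_equal(np.add.accumulate(x, axis=1), np.cumsum(x, axis=1))

# ufunc.outer applies the operation to every pair of elements.
table = np.multiply.outer(np.arange(1, 4), np.arange(1, 4))
print(table.shape)   # (3, 3)
```

Note that the axis is passed explicitly: `ufunc.reduce` defaults to `axis=0`, while `np.sum` and `np.prod` default to reducing over all axes.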