Re: [Numpy-discussion] Silent Broadcasting considered harmful

2015-02-08 Thread Eelco Hoogendoorn
ussion of Numerical Python" >> >>> Subject: Re: [Numpy-discussion] Silent Broadcasting considered harmful >> >>> >> >>> >> >>> On Sun, Feb 8, 2015 at 2:14 PM, Stefan Reiterer >> wrote: >> >>>> >> >>>>

Re: [Numpy-discussion] Silent Broadcasting considered harmful

2015-02-08 Thread Eelco Hoogendoorn
This. (nd)arrays are a far more widespread concept than linear algebraic operations. If you want LA semantics, use the matrix subclass. Or don't, since simply sticking to the much more pervasive and general ndarray semantics is usually simpler and less confusing. On Sun, Feb 8, 2015 at 10:54 PM, W

Re: [Numpy-discussion] Silent Broadcasting considered harmful

2015-02-08 Thread Eelco Hoogendoorn
> I personally use Octave and/or Numpy for several years now and never ever needed broadcasting. But since it is still there there will be many users who need it, there will be some use for it. Uhm, yeah, there is some use for it. I'm all for explicit over implicit, but personally current broadcas

Re: [Numpy-discussion] Sorting refactor

2015-01-16 Thread Eelco Hoogendoorn
> > > > The data parallel constructs tend to crash the compiler, but task > > spawning seems to be stable in 4.9.2. I've still to see how it handles > > multiprocessing/fork. > > > > What do you mean by will be in 5.0, did they do a big push? > > gcc 5.0 cha

Re: [Numpy-discussion] Sorting refactor

2015-01-16 Thread Eelco Hoogendoorn
I don't know if there is a general consensus or guideline on these matters, but I am personally not entirely charmed by the use of behind-the-scenes parallelism, unless explicitly requested. Perhaps an algorithm can be made faster, but often these multicore algorithms are also less efficient, and

Re: [Numpy-discussion] Optimizing numpy's einsum expression (again)

2015-01-16 Thread Eelco Hoogendoorn
Thanks for taking the time to think about this; good work. Personally, I don't think a factor 5 memory overhead is much to sweat over. The most complex einsum I have ever needed in a production environment was 5/6 terms, and for what this anecdote is worth, speed was a far bigger concern to me tha

Re: [Numpy-discussion] Question about dtype

2014-12-13 Thread Eelco Hoogendoorn
This is a general problem in trying to use JSON to send arbitrary python objects. It's not made for that purpose; JSON itself only supports a very limited grammar (only one sequence type for instance, as you noticed), so in general you will need to specify your own encoding/decoding for more complex
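A minimal sketch of the "specify your own encoding/decoding" idea, assuming the object in question is a flat structured dtype (no nested or subarray fields). The helper names dtype_to_json and dtype_from_json are hypothetical and not part of NumPy; the only NumPy features relied on are dtype.descr and the np.dtype constructor.

```python
import json
import numpy as np

def dtype_to_json(dt):
    # Hypothetical helper: dt.descr is a list of (name, format) tuples
    # for a flat structured dtype, which JSON can store as nested lists.
    return json.dumps(dt.descr)

def dtype_from_json(text):
    # JSON has a single sequence type, so the tuples come back as lists;
    # np.dtype expects each field as a tuple, so convert them back.
    return np.dtype([tuple(field) for field in json.loads(text)])

dt = np.dtype([("x", "<f8"), ("label", "<i4")])
restored = dtype_from_json(dtype_to_json(dt))
```

The explicit tuple conversion on decoding is the step that JSON's single sequence type forces on the caller.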

Re: [Numpy-discussion] Should ndarray be a context manager?

2014-12-09 Thread Eelco Hoogendoorn
My impression is that this level of optimization does not and should not fall within the scope of numpy. -Original Message- From: "Sturla Molden" Sent: 9-12-2014 16:02 To: "numpy-discussion@scipy.org" Subject: [Numpy-discussion] Should ndarray be a context manager? I wonder if ndarra

Re: [Numpy-discussion] help using np.einsum for stacked matrix multiplication

2014-10-29 Thread Eelco Hoogendoorn
You need to specify your input format. Also, if your output matrix misses the NY dimension, that implies you wish to contract (sum) over it, which contradicts your statement that the 2x2 subblocks form the matrices to multiply with. In general, I think it would help if you give a little more backgr

Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-10-29 Thread Eelco Hoogendoorn
have them in a separate package. I'd rather have us discuss how to facilitate the integration of as many possible fft libraries with numpy behind a maximally uniform interface, rather than having us debate which fft library is 'best'. On Tue, Oct 28, 2014 at 6:21 PM, Sturla Molden wrote:

Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-10-28 Thread Eelco Hoogendoorn
If I may 'hijack' the discussion back to the meta-point: should we be having this discussion on the numpy mailing list at all? Perhaps the 'batteries included' philosophy made sense in the early days of numpy; but given that there are several fft libraries with their own pros and cons, and that m

Re: [Numpy-discussion] Choosing between NumPy and SciPy functions

2014-10-27 Thread Eelco Hoogendoorn
The same occurred to me when reading that question. My personal opinion is that such functionality should be deprecated from numpy. I don't know who said this, but it really stuck with me: but the power of numpy is first and foremost in it being a fantastic interface, not in being a library. There

Re: [Numpy-discussion] Memory efficient alternative for np.loadtxt and np.genfromtxt

2014-10-26 Thread Eelco Hoogendoorn
I'm not sure why the memory doubling is necessary. Isn't it possible to preallocate the arrays and write to them? I suppose this might be inefficient though, in case you end up reading only a small subset of rows out of a mostly corrupt file? But that seems to be a rather uncommon corner case. Eithe

Re: [Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-12 Thread Eelco Hoogendoorn
yeah, a shuffle function that does not shuffle indeed seems like a major source of bugs to me. Indeed one could argue that setting axis=None should suffice to give a clear enough declaration of intent; though I wouldn't mind typing the extra bit to ensure consistent semantics. On Sun, Oct 12, 201

Re: [Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-12 Thread Eelco Hoogendoorn
Thanks Warren, I think these are sensible additions. I would argue to treat the None-False condition as an error. Indeed I agree one might argue the correct behavior is to 'shuffle' the singleton block of data, which does nothing; but it's more likely to come up as an unintended error than as a nat

Re: [Numpy-discussion] 0/0 == 0?

2014-10-02 Thread Eelco Hoogendoorn
slightly OT; but fwiw, it's all ill-thought out nonsense from the start anyway. ALL numbers satisfy the predicate 0*x=0. what the IEEE calls 'not a number' would be more accurately called 'not a specific number', or 'a number'. what's a logical negation among computer scientists? On Fri, Oct 3, 201

Re: [Numpy-discussion] Proposal: add ndarray.keys() to return dtype.names

2014-10-01 Thread Eelco Hoogendoorn
tly, special methods show up in the ndarray > object when the dtype is datetime64, right? > > On Wed, Oct 1, 2014 at 10:13 AM, Eelco Hoogendoorn < > hoogendoorn.ee...@gmail.com> wrote: > >> Well, the method will have to be present on all ndarrays, since >> structured

Re: [Numpy-discussion] Proposal: add ndarray.keys() to return dtype.names

2014-10-01 Thread Eelco Hoogendoorn
kind of duck it is, so to speak. Indeed it seems like an atypical design pattern; but I don't see a problem with it. On Wed, Oct 1, 2014 at 4:08 PM, John Zwinck wrote: > On 1 Oct 2014 04:30, "Stephan Hoyer" wrote: > > > > On Tue, Sep 30, 2014 at 1:22 PM, Eelco

Re: [Numpy-discussion] Proposal: add ndarray.keys() to return dtype.names

2014-09-30 Thread Eelco Hoogendoorn
On more careful reading of your words, I think we agree; indeed, if keys() is present it should return an iterable; but I don't think it should be present for non-structured arrays. On Tue, Sep 30, 2014 at 10:21 PM, Eelco Hoogendoorn < hoogendoorn.ee...@gmail.com> wrote: > So a

Re: [Numpy-discussion] Proposal: add ndarray.keys() to return dtype.names

2014-09-30 Thread Eelco Hoogendoorn
So a non-structured array should return an empty list/iterable as its keys? That doesn't seem right to me, but perhaps you have a compelling example to the contrary. I mean, wouldn't we want the duck-typing to fail if it isn't a structured array? Throwing an AttributeError seems like the best thin
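The behaviour argued for here can be sketched as a free function. This is only an illustration of the proposal under discussion: structured_keys is a hypothetical name, and nothing like it is claimed to exist on ndarray. It relies on the documented fact that dtype.names is None for non-structured dtypes.

```python
import numpy as np

def structured_keys(arr):
    # Hypothetical helper, not a NumPy API. dtype.names is None for
    # non-structured arrays, so fail loudly rather than return an empty list.
    names = arr.dtype.names
    if names is None:
        raise AttributeError("array has no fields; dtype is not structured")
    return list(names)

points = np.zeros(3, dtype=[("x", "f8"), ("y", "f8")])
plain = np.zeros(3)
```

With this shape, duck-typed code that expects a mapping-like object fails immediately on a plain array instead of silently iterating over nothing.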

Re: [Numpy-discussion] Proposal: add ndarray.keys() to return dtype.names

2014-09-30 Thread Eelco Hoogendoorn
Sounds fair to me. Indeed the ducktyping argument makes sense, and I have a hard time imagining any namespace conflicts or other confusion. Should this attribute return None for non-structured arrays, or simply be undefined? On Tue, Sep 30, 2014 at 12:49 PM, John Zwinck wrote: > I first proposed

Re: [Numpy-discussion] Tracking and inspecting numpy objects

2014-09-15 Thread Eelco Hoogendoorn
wer! > > Best regards, > > Mads > > On 15/09/14 12:11, Sebastian Berg wrote: > > On Mo, 2014-09-15 at 12:05 +0200, Eelco Hoogendoorn wrote: > >> > >> > >> > >> On Mon, Sep 15, 2014 at 11:55 AM, Sebastian Berg > >>

Re: [Numpy-discussion] Tracking and inspecting numpy objects

2014-09-15 Thread Eelco Hoogendoorn
On Mon, Sep 15, 2014 at 11:55 AM, Sebastian Berg wrote: > On Mo, 2014-09-15 at 10:16 +0200, Mads Ipsen wrote: > > Hi, > > > > I am trying to inspect the reference count of numpy arrays generated by > > my application. > > > > Initially, I thought I could inspect the tracked objects using > > gc.g

Re: [Numpy-discussion] why does u.resize return None?

2014-09-11 Thread Eelco Hoogendoorn
agreed; I never saw the logic in returning None either. On Thu, Sep 11, 2014 at 4:27 PM, Neal Becker wrote: > It would be useful if u.resize returned the new array, so it could be used > for > chaining operations > > -- > -- Those who don't understand recursion are doomed to repeat it > > __

Re: [Numpy-discussion] Generalize hstack/vstack --> stack; Blockmatrices like in matlab

2014-09-08 Thread Eelco Hoogendoorn
on the world... I think just having this generalize stack feature would be nice start. Tetris could be built on top of that later. (Although, I do vote for at least 3 or 4 dimensional stacking, if possible). Cheers! Ben Root On Mon, Sep 8, 2014 at 12:41 PM, Eelco Hoogendoorn wrote: Sturl

Re: [Numpy-discussion] Generalize hstack/vstack --> stack; Block matrices like in matlab

2014-09-08 Thread Eelco Hoogendoorn
Sturla: I'm not sure if the intention is always unambiguous, for such more flexible arrangements. Also, I doubt such situations arise often in practice; if the arrays aren't a grid, they are probably a nested grid, and the code would most naturally concatenate them with nested calls to a stacking fun

Re: [Numpy-discussion] Does a `mergesorted` function make sense?

2014-09-04 Thread Eelco Hoogendoorn
of 3: 161 µs per loop > > In [11]: s = Series(a) > > # without the creation overhead > In [12]: %timeit s.unique() > 1 loops, best of 3: 75.3 µs per loop > > > > On Thu, Sep 4, 2014 at 2:29 PM, Eelco Hoogendoorn < > hoogendoorn.ee...@gmail.com> wrote: >

Re: [Numpy-discussion] Does a `mergesorted` function make sense?

2014-09-04 Thread Eelco Hoogendoorn
On Thu, Sep 4, 2014 at 8:14 PM, Eelco Hoogendoorn < hoogendoorn.ee...@gmail.com> wrote: > I should clarify: I am speaking about my implementation, I haven't looked at > the numpy implementation for a while so I'm not sure what it is up to. Note > that by 'almost free', w

Re: [Numpy-discussion] Does a `mergesorted` function make sense?

2014-09-04 Thread Eelco Hoogendoorn
or performance isn't that big a concern in the first place. On Thu, Sep 4, 2014 at 7:55 PM, Jaime Fernández del Río < jaime.f...@gmail.com> wrote: > On Thu, Sep 4, 2014 at 10:39 AM, Eelco Hoogendoorn < > hoogendoorn.ee...@gmail.com> wrote: > >> >> On Thu, Sep

Re: [Numpy-discussion] Does a `mergesorted` function make sense?

2014-09-04 Thread Eelco Hoogendoorn
On Thu, Sep 4, 2014 at 10:31 AM, Eelco Hoogendoorn < hoogendoorn.ee...@gmail.com> wrote: > > On Wed, Sep 3, 2014 at 6:46 PM, Jaime Fernández del Río < > jaime.f...@gmail.com> wrote: > >> On Wed, Sep 3, 2014 at 9:33 AM, Jaime Fernández del Río < >> jaime.f...

Re: [Numpy-discussion] Does a `mergesorted` function make sense?

2014-09-04 Thread Eelco Hoogendoorn
On Wed, Sep 3, 2014 at 6:46 PM, Jaime Fernández del Río < jaime.f...@gmail.com> wrote: > On Wed, Sep 3, 2014 at 9:33 AM, Jaime Fernández del Río < > jaime.f...@gmail.com> wrote: > >> On Wed, Sep 3, 2014 at 6:41 AM, Eelco Hoogendoorn < >> hoogendoorn.ee...@gmai

Re: [Numpy-discussion] Give Jaime Fernandez commit rights.

2014-09-03 Thread Eelco Hoogendoorn
+1; though I am relatively new to the scene, Jaime's contributions have always stood out to me as thoughtful. On Thu, Sep 4, 2014 at 12:42 AM, Ralf Gommers wrote: > > > > On Wed, Sep 3, 2014 at 11:48 PM, Robert Kern > wrote: > >> On Wed, Sep 3, 2014 at 10:47 PM, Charles R Harris >> wrote: >>

Re: [Numpy-discussion] Does a `mergesorted` function make sense?

2014-09-03 Thread Eelco Hoogendoorn
On Wed, Sep 3, 2014 at 4:07 AM, Jaime Fernández del Río < jaime.f...@gmail.com> wrote: > On Tue, Sep 2, 2014 at 5:40 PM, Charles R Harris < > charlesr.har...@gmail.com> wrote: >> >> >> What do you think about the suggestion of timsort? One would need to >> concatenate the arrays before sorting, bu

Re: [Numpy-discussion] Does a `mergesorted` function make sense?

2014-09-01 Thread Eelco Hoogendoorn
On Mon, Sep 1, 2014 at 2:05 PM, Charles R Harris wrote: > > > > On Mon, Sep 1, 2014 at 1:49 AM, Eelco Hoogendoorn < > hoogendoorn.ee...@gmail.com> wrote: > >> Sure, id like to do the hashing things out, but I would also like some >> preliminary feedback as to

Re: [Numpy-discussion] Does a `mergesorted` function make sense?

2014-09-01 Thread Eelco Hoogendoorn
rom a community perspective, a significant fraction of all stackoverflow numpy questions are (unknowingly) exactly about 'how to do grouping in numpy'. On Mon, Sep 1, 2014 at 4:36 AM, Charles R Harris wrote: > > > > On Sun, Aug 31, 2014 at 1:48 PM, Eelco Hoogendoorn < &

Re: [Numpy-discussion] Does a `mergesorted` function make sense?

2014-08-31 Thread Eelco Hoogendoorn
s, and so on. You mentioned getting the numpy core developers involved; are they not subscribed to this mailing list? I wouldn't be surprised; you'd hope there is a channel of discussion concerning development with higher signal to noise On Thu, Aug 28, 2014 at 1:49 AM, Eelco

Re: [Numpy-discussion] Does a `mergesorted` function make sense?

2014-08-27 Thread Eelco Hoogendoorn
ut the point would be to provide true > vectorization on those operations. > > The way I see it, numpy may not have to have a GroupBy implementation, but > it should at least enable implementing one that is fast and efficient over > any axis. > > > On Wed, Aug 27, 2014 a

Re: [Numpy-discussion] Does a `mergesorted` function make sense?

2014-08-27 Thread Eelco Hoogendoorn
in zip(*group_by(keys)(values)): print k, g.mean(0) On Wed, Aug 27, 2014 at 9:29 PM, Eelco Hoogendoorn < hoogendoorn.ee...@gmail.com> wrote: > f.i., this works as expected as well (100 keys of 1d int arrays and 100 > values of 1d float arrays): > > group_by(randint(0,4,(

Re: [Numpy-discussion] Does a `mergesorted` function make sense?

2014-08-27 Thread Eelco Hoogendoorn
f.i., this works as expected as well (100 keys of 1d int arrays and 100 values of 1d float arrays): group_by(randint(0,4,(100,2))).mean(rand(100,2)) On Wed, Aug 27, 2014 at 9:27 PM, Eelco Hoogendoorn < hoogendoorn.ee...@gmail.com> wrote: > If I understand you correctly, th

Re: [Numpy-discussion] Does a `mergesorted` function make sense?

2014-08-27 Thread Eelco Hoogendoorn
> seems to be material for a NEP here, and some guidance from one of the > numpy devs would be helpful in getting this somewhere. > > Jaime > > > On Wed, Aug 27, 2014 at 10:35 AM, Eelco Hoogendoorn < > hoogendoorn.ee...@gmail.com> wrote: > >> It wouldn't h

Re: [Numpy-discussion] Does a `mergesorted` function make sense?

2014-08-27 Thread Eelco Hoogendoorn
It wouldn't hurt to have this function, but my intuition is that its use will be minimal. If you are already working with sorted arrays, you already have a flop cost on that order of magnitude, and the optimized merge saves you a factor two at the very most. Using numpy means you are sacrificing fa

Re: [Numpy-discussion] np.unique with structured arrays

2014-08-22 Thread Eelco Hoogendoorn
Oh yeah this could be. Floating point equality and bitwise equality are not the same thing. -Original Message- From: "Jaime Fernández del Río" Sent: 22-8-2014 16:22 To: "Discussion of Numerical Python" Subject: Re: [Numpy-discussion] np.unique with structured arrays I can confirm,

Re: [Numpy-discussion] np.unique with structured arrays

2014-08-22 Thread Eelco Hoogendoorn
It does not sound like an issue with unique, but rather like a matter of floating point equality and representation. Do the 'identical' elements pass an equality test? -Original Message- From: "Nicolas P. Rougier" Sent: 22-8-2014 15:21 To: "Discussion of Numerical Python" Subject:

Re: [Numpy-discussion] Proposed new feature for numpy.einsum: repeated output subscripts as diagonal

2014-08-15 Thread Eelco Hoogendoorn
here is a snippet I extracted from a project with similar aims (integrating the functionality of einsum and numexpr, actually) Not much to it, but in case someone needs a reminder on how to use striding tricks: http://pastebin.com/kQNySjcj On Fri, Aug 15, 2014 at 5:20 PM, Eelco Hoogendoorn

Re: [Numpy-discussion] Proposed new feature for numpy.einsum: repeated output subscripts as diagonal

2014-08-15 Thread Eelco Hoogendoorn
indexing once... ill see if I can dig that up. On Fri, Aug 15, 2014 at 5:01 PM, Sebastian Berg wrote: > On Fr, 2014-08-15 at 16:42 +0200, Eelco Hoogendoorn wrote: > > Agreed; this addition occurred to me as well. Note that the > > implemenatation should be straightforward: j

Re: [Numpy-discussion] Proposed new feature for numpy.einsum: repeated output subscripts as diagonal

2014-08-15 Thread Eelco Hoogendoorn
Agreed; this addition occurred to me as well. Note that the implementation should be straightforward: just allocate an enlarged array, use some striding logic to construct the relevant view, and let einsum's internals act on the view. Hopefully, you won't even have to touch the guts of einsum at the
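The idea in this post can be sketched for the simplest case, a 1-D result written onto the diagonal of a square matrix. This is an illustration under stated assumptions, not the proposed einsum change itself: diagonal_view is a hypothetical helper, and the sketch assumes einsum accepts a non-contiguous array for its out argument.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def diagonal_view(a):
    # Hypothetical helper: a writable 1-D view of the main diagonal of a
    # square 2-D array. Stepping one row and one column at a time means
    # the stride is the sum of the two strides.
    n = a.shape[0]
    return as_strided(a, shape=(n,), strides=(a.strides[0] + a.strides[1],))

x = np.arange(3.0)
y = np.arange(3.0) + 1.0

# Allocate the enlarged output, then let einsum write into the diagonal view.
out = np.zeros((3, 3))
np.einsum("i,i->i", x, y, out=diagonal_view(out))
```

The off-diagonal entries stay zero because einsum only ever sees the strided view, never the full array.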

Re: [Numpy-discussion] New function `count_unique` to generate contingency tables.

2014-08-13 Thread Eelco Hoogendoorn
R and Matlab functions. I won't > update it right now, but if there is interest in putting it into numpy, > I'll rename it to avoid the pylab conflict. Anything along the lines of > `crosstab`, `xtable`, etc., would be fine with me. > > Warren > > > >> On

Re: [Numpy-discussion] New function `count_unique` to generate contingency tables.

2014-08-12 Thread Eelco Hoogendoorn
tened. I also agree that the extension you propose here is useful; but ideally, with a little more discussion on these subjects we can converge on an even more comprehensive overhaul On Tue, Aug 12, 2014 at 6:33 PM, Joe Kington wrote: > > > > On Tue, Aug 12, 2014 at 11:17 AM,

Re: [Numpy-discussion] New function `count_unique` to generate contingency tables.

2014-08-12 Thread Eelco Hoogendoorn
Thanks. Prompted by that stackoverflow question, and similar problems I had to deal with myself, I started working on a much more general extension to numpy's functionality in this space. Like you noted, things get a little panda-y, but I think there is a lot of pandas' functionality that could or

Re: [Numpy-discussion] Calculation of a hessian

2014-08-08 Thread Eelco Hoogendoorn
Do it in pure numpy? How about copying the source of numdifftools? What exactly is the obstacle to using numdifftools? There seem to be no licensing issues. In my experience, it's a crafty piece of work; and calculating a hessian correctly, accounting for all kinds of nasty floating point issues, i

Re: [Numpy-discussion] Preliminary thoughts on implementing __matmul__

2014-08-07 Thread Eelco Hoogendoorn
I don't expect stacked matrices/vectors to be used often, although there are some areas that might make heavy use of them, so I think we could live with the simple implementation, it's just a bit of a wart when there is broadcasting of arrays. Just to be clear, the '@' broadcasting differs from the

Re: [Numpy-discussion] Array2 subset of array1

2014-08-05 Thread Eelco Hoogendoorn
ah yes, that may indeed be what you want. depending on your datatype, you could access the underlying raw data as a string. b.tostring() in a.tostring() sort of works; but isn't entirely safe, as you may have false positive matches which aren't aligned to your datatype using str.find in combination

Re: [Numpy-discussion] Array2 subset of array1

2014-08-05 Thread Eelco Hoogendoorn
np.all(np.in1d(array1,array2)) On Tue, Aug 5, 2014 at 2:58 PM, Jurgens de Bruin wrote: > Hi, > > I am new to numpy so any help would be greatly appreciated. > > I have two arrays: > > array1 = np.arange(1,100+1) > array2 = np.arange(1,50+1) > > How can I calculate/determine if array2 is
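A short sketch showing that argument order matters in this kind of membership test: the first argument supplies the elements being tested, the second the set they are looked up in. It uses np.isin, the newer spelling of np.in1d, and reuses the two arrays from the quoted question; the result variable names are illustrative only.

```python
import numpy as np

array1 = np.arange(1, 100 + 1)
array2 = np.arange(1, 50 + 1)

# np.isin(a, b) is True where an element of `a` occurs somewhere in `b`.
array2_in_array1 = np.isin(array2, array1).all()  # is array2 a subset of array1?
array1_in_array2 = np.isin(array1, array2).all()  # is array1 a subset of array2?
```

For these inputs the two tests disagree, so it is worth checking which direction a given one-liner actually tests.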

Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays

2014-07-28 Thread Eelco Hoogendoorn
, the hierarchical summation would make it fairly easy to erase (and in any case would minimize) summation differences due to differences between logical and actual ordering in memory of the data, no? On Mon, Jul 28, 2014 at 5:22 PM, Sebastian Berg wrote: > On Mo, 2014-07-28 at 16:31 +0200, Eelco Hoo

Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays

2014-07-28 Thread Eelco Hoogendoorn
Sebastian: Those are good points. Indeed iteration order may already produce different results, even though the semantics of numpy suggest identical operations. Still, I feel this different behavior without any semantical clues is something to be minimized. Indeed copying might have large speed i

Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays

2014-07-28 Thread Eelco Hoogendoorn
To rephrase my most pressing question: may np.ones((N,2)).mean(0) and np.ones((2,N)).mean(1) produce different results with the implementation in the current master? If so, I think that would be very much regrettable; and if this is a minority opinion, I do hope that at least this gets documented i
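One way to sidestep the question of which summation path is taken, sketched under the assumption that a float64 accumulator is acceptable: np.mean takes a dtype argument that sets the type used for the accumulation and the result. The array size below is an arbitrary small value chosen for illustration, not the size at which float32 accumulation actually breaks down.

```python
import numpy as np

n = 100_000  # illustrative size only
tall = np.ones((n, 2), dtype=np.float32)
wide = np.ones((2, n), dtype=np.float32)

# Request a float64 accumulator explicitly, so the result does not depend
# on how the float32 reduction happens to be ordered or blocked.
mean_tall = tall.mean(axis=0, dtype=np.float64)
mean_wide = wide.mean(axis=1, dtype=np.float64)
```

This trades some speed for a result that is the same for both memory layouts, which is the explicit-over-implicit option raised elsewhere in the thread.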

Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays

2014-07-26 Thread Eelco Hoogendoorn
Perhaps I in turn am missing something; but I would suppose that any algorithm that requires multiple passes over the data is off the table? Perhaps I am being a little old fashioned and performance oriented here, but to make the ultra-majority of use cases suffer a factor two performance penalty f

Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays

2014-07-26 Thread Eelco Hoogendoorn
de most of the benefits, without any of the drawbacks? On Sat, Jul 26, 2014 at 3:53 PM, Julian Taylor < jtaylor.deb...@googlemail.com> wrote: > On 26.07.2014 15:38, Eelco Hoogendoorn wrote: > > > > Why is it not always used? > > for 1d reduction the iterator blocks by 8192

Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays

2014-07-26 Thread Eelco Hoogendoorn
I was wondering the same thing. Are there any known tradeoffs to this method of reduction? On Sat, Jul 26, 2014 at 12:39 PM, Sturla Molden wrote: > Sebastian Berg wrote: > > > chose more stable algorithms for such statistical functions. The > > pairwise summation that is in master now is very

Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays

2014-07-26 Thread Eelco Hoogendoorn
- From: "Julian Taylor" Sent: 26-7-2014 00:58 To: "Discussion of Numerical Python" Subject: Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays On 25.07.2014 23:51, Eelco Hoogendoorn wrote: > Ray: I'm not working with Hubble data, but yeah the

Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays

2014-07-25 Thread Eelco Hoogendoorn
Ray: I'm not working with Hubble data, but yeah these are all issues I've run into with my terabytes of microscopy data as well. Given that such raw data comes as uint16, it's best to do your calculations as much as possible in good old ints. What you compute is what you get, no obscure shenanig

Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays

2014-07-25 Thread Eelco Hoogendoorn
e to implement, given the current framework. The ability to specify different algorithms per kwarg wouldn't be a bad idea either, imo; or the ability to explicitly specify a separate output and accumulator dtype. On Fri, Jul 25, 2014 at 8:00 PM, Alan G Isaac wrote: > On 7/25/2014 1:40 PM, Ee

Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays

2014-07-25 Thread Eelco Hoogendoorn
Arguably, the whole of floating point numbers and their related shenanigans is not very pythonic in the first place. The accuracy of the output WILL depend on the input, to some degree or another. At the risk of repeating myself: explicit is better than implicit -Original Message- From:

Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays

2014-07-25 Thread Eelco Hoogendoorn
does not support the specified precision, rather than obtain subtly or horribly broken results without warning when moving your code to a different platform/compiler whatever. On Fri, Jul 25, 2014 at 5:37 AM, Eelco Hoogendoorn < hoogendoorn.ee...@gmail.com> wrote: > Perhaps it is a

Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays

2014-07-24 Thread Eelco Hoogendoorn
nal Message- From: "Alan G Isaac" Sent: 25-7-2014 00:10 To: "Discussion of Numerical Python" Subject: Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays On 7/24/2014 4:42 PM, Eelco Hoogendoorn wrote: > This isn't a bug report, but rather a feat

Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays

2014-07-24 Thread Eelco Hoogendoorn
Inaccurate and utterly wrong are subjective. If you want to be sufficiently strict, floating point calculations are almost always 'utterly wrong'. Granted, it would be nice if the docs specified the algorithm used. But numpy does not produce anything different than what a standard c loop or c++

Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays

2014-07-24 Thread Eelco Hoogendoorn
4 18:09 To: "Discussion of Numerical Python" Subject: Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays On 7/24/2014 5:59 AM, Eelco Hoogendoorn wrote to Thomas: > np.mean isn't broken; your understanding of floating point numbers is. This comment seems to conflate se

Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays

2014-07-24 Thread Eelco Hoogendoorn
ssage- From: "Julian Taylor" Sent: 24-7-2014 14:56 To: "Discussion of Numerical Python" Subject: Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays On Thu, Jul 24, 2014 at 1:33 PM, Fabien wrote: > Hi all, > > On 24.07.2014 11:59, Eelco H

Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays

2014-07-24 Thread Eelco Hoogendoorn
Arguably, this isn't a problem of numpy, but of programmers being trained to think of floating point numbers as 'real' numbers, rather than just a finite number of states with a funny distribution over the number line. np.mean isn't broken; your understanding of floating point numbers is. What you
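The "finite number of states" point can be made concrete with a small sketch. It is an illustration of float32 spacing in general, not a reproduction of the reported np.mean case: at 2**24 the gap between neighbouring float32 values is already 2, so adding 1 changes nothing.

```python
import numpy as np

big = np.float32(2**24)            # 16777216.0, exactly representable
bumped = big + np.float32(1.0)     # 16777217 is not representable in float32
gap = np.spacing(big)              # distance from `big` to the next float32
```

A naive running sum of float32 ones therefore stops growing once it reaches this value, which is the mechanism behind the inaccurate means discussed in this thread.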

Re: [Numpy-discussion] Find n closest values

2014-06-22 Thread Eelco Hoogendoorn
> > I tested your solution and it is faster by only a tiny amount but the way > you wrote it might open the door for other improvements. Thanks. > > > Nicolas > > On 22 Jun 2014, at 21:14, Eelco Hoogendoorn > wrote: > > > Protip: if you are writing yo

Re: [Numpy-discussion] Find n closest values

2014-06-22 Thread Eelco Hoogendoorn
> > > On Sun, Jun 22, 2014 at 5:16 PM, Nicolas P. Rougier < > nicolas.roug...@inria.fr> wrote: > > > > Thanks for the answer. > > I was secretly hoping for some kind of hardly-known numpy function that > would make things faster auto-magically... > > > >

Re: [Numpy-discussion] Find n closest values

2014-06-22 Thread Eelco Hoogendoorn
Also, if you use scipy.spatial.KDTree, make sure to use cKDTree; the native python kdtree is sure to be slow as hell. On Sun, Jun 22, 2014 at 7:05 PM, Eelco Hoogendoorn < hoogendoorn.ee...@gmail.com> wrote: > Well, if the spacing is truly uniform, then of course you don't rea

Re: [Numpy-discussion] Find n closest values

2014-06-22 Thread Eelco Hoogendoorn
> would make things faster auto-magically... > > > Nicolas > > > On 22 Jun 2014, at 10:30, Eelco Hoogendoorn > wrote: > > > Perhaps you could simplify some statements, but at least the algorithmic > complexity is fine, and everything is vectorized, so I doubt you wil

Re: [Numpy-discussion] Find n closest values

2014-06-22 Thread Eelco Hoogendoorn
Perhaps you could simplify some statements, but at least the algorithmic complexity is fine, and everything is vectorized, so I doubt you will get huge gains. You could take a look at the functions in scipy.spatial, and see how they perform for your problem parameters. On Sun, Jun 22, 2014 at 10

Re: [Numpy-discussion] Easter Egg or what I am missing here?

2014-05-21 Thread Eelco Hoogendoorn
I agree; this 'wart' has also messed with my code a few times. I didn't find it to be the case two years ago, but perhaps I should reevaluate if the scientific python stack has sufficiently migrated to python 3. On Thu, May 22, 2014 at 7:35 AM, Siegfried Gonzi wrote: > On 22/05/2014 00:37, numpy

Re: [Numpy-discussion] repeat an array without allocation

2014-05-05 Thread Eelco Hoogendoorn
If b is indeed big I don't see a problem with the python loop, elegance aside; but Cython will not beat it on that front. On Mon, May 5, 2014 at 9:34 AM, srean wrote: > Great ! thanks. I should have seen that. > > Is there any way array multiplication (as opposed to matrix > multiplication) can

Re: [Numpy-discussion] repeat an array without allocation

2014-05-04 Thread Eelco Hoogendoorn
nope; it's impossible to express A as a strided view on x, for the repeats you have. even if you had uniform repeats, it still would not work. that would make it easy to add an extra axis to x without a new allocation; but reshaping/merging that axis with axis=0 would again trigger a copy, as it wo
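The uniform-repeat case described here can be sketched as follows. The array and the repeat count of 3 are made up for illustration; the sketch uses np.broadcast_to to add the extra axis with a zero stride, then shows that merging that axis back into axis 0 forces a copy.

```python
import numpy as np

x = np.arange(4)

# Extra axis of length 3 with stride 0: every "repeat" aliases the same memory.
repeated = np.broadcast_to(x[:, None], (4, 3))

# Merging the repeat axis into axis 0 cannot be expressed with a single
# stride, so reshape has to allocate and copy.
merged = repeated.reshape(-1)
```

The merged result equals np.repeat(x, 3), but it no longer shares memory with x.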

Re: [Numpy-discussion] arrays and : behaviour

2014-05-01 Thread Eelco Hoogendoorn
Your problem isn't with colon indexing, but with the interpretation of the arguments to plot. multiple calls to plot with scalar arguments do not have the same result as a single call with array arguments. For this to work as intended, you would need plt.hold(True), for starters, and maybe there are

Re: [Numpy-discussion] string replace

2014-04-21 Thread Eelco Hoogendoorn
Indeed this isn't numpy, and I don't see how your colleagues' opinions have bearing on that issue; but anyway.. There isn't a 'python' way to do this, the best method involves some form of parsing library. Undoubtedly there is a one-line regex to do this kind of thing, but regexes are themselves the a

Re: [Numpy-discussion] numerical gradient, Jacobian, and Hessian

2014-04-21 Thread Eelco Hoogendoorn
I was going to suggest numdifftools; it's a very capable package in my experience. Indeed it would be nice to have it integrated into scipy. Also, in case trying to calculate a numerical gradient is a case of 'the math getting too bothersome' rather than no closed form gradient actually existing: T

Re: [Numpy-discussion] min depth to nonzero in 3d array

2014-04-17 Thread Eelco Hoogendoorn
I agree; argmax would be the best option here; though I would hardly call it abuse. It seems perfectly readable and idiomatic to me. Though the != comparison requires an extra pass over the array, that's the kind of tradeoff you make in using numpy. On Thu, Apr 17, 2014 at 7:45 PM, Stephan Hoyer wr
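A minimal sketch of the argmax idiom referred to here, assuming depth runs along axis 0 and that -1 is an acceptable marker for columns with no nonzero entry. Both choices, and the small test volume, are assumptions made for illustration.

```python
import numpy as np

# Toy volume with shape (depth, rows, cols); values are arbitrary.
volume = np.zeros((4, 2, 2), dtype=int)
volume[2, 0, 0] = 5
volume[1, 0, 1] = 7
volume[3, 0, 1] = 9

nonzero = volume != 0                    # the extra pass over the array
first = nonzero.argmax(axis=0)           # index of the first True along depth
depth = np.where(nonzero.any(axis=0), first, -1)  # -1 where nothing is nonzero
```

The any() mask is needed because argmax returns 0 for an all-False column, which would otherwise be indistinguishable from a hit at depth 0.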

Re: [Numpy-discussion] Wiki page for building numerical stuff on Windows

2014-04-12 Thread Eelco Hoogendoorn
ding numerical stuff on Windows Eelco Hoogendoorn wrote: > I wonder: how hard would it be to create a more 21st-century oriented BLAS, > relying more on code generation tools, and perhaps LLVM/JITting? > > Wouldn't we get ten times the portability with one-tenth the lines of code? >

Re: [Numpy-discussion] Wiki page for building numerical stuff on Windows

2014-04-12 Thread Eelco Hoogendoorn
I wonder: how hard would it be to create a more 21st-century oriented BLAS, relying more on code generation tools, and perhaps LLVM/JITting? Wouldn't we get ten times the portability with one-tenth the lines of code? Or is there too much dark magic going on in BLAS for such an approach to come clo

Re: [Numpy-discussion] Standard Deviation (std): Suggested change for "ddof" default value

2014-04-01 Thread Eelco Hoogendoorn
I agree; breaking code over this would be ridiculous. Also, I prefer the zero default, despite the mean/std combo probably being more common. On Tue, Apr 1, 2014 at 10:02 PM, Sturla Molden wrote: > Haslwanter Thomas wrote: > > > Personally I cannot think of many applications where it would be d
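
For reference, the difference between the two ddof settings under discussion, on a small made-up sample (the data values are purely illustrative):

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Current default: divide the summed squared deviations by N (ddof=0)
population_std = np.std(x)

# Alternative discussed in the thread: divide by N - 1 (ddof=1)
sample_std = np.std(x, ddof=1)
```

Here the squared deviations from the mean sum to 32, so the two results are sqrt(32/8) and sqrt(32/7) respectively.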

Re: [Numpy-discussion] Is there a pure numpy recipe for this?

2014-03-27 Thread Eelco Hoogendoorn
I'd recommend taking a look at pytables as well. It has support for out-of-core array computations on large arrays. On Thu, Mar 27, 2014 at 9:00 PM, RayS wrote: > Thanks for all of the suggestions; we are migrating to 64bit Python soon > as well. > The environments are Win7 and Mac Mavericks. >

Re: [Numpy-discussion] Is there a pure numpy recipe for this?

2014-03-26 Thread Eelco Hoogendoorn
Without looking ahead, here is what I came up with; but I see more elegant solutions have been found already.

    import numpy as np

    def as_dense(f, length):
        i = np.zeros(length+1, int)
        i[f[0]] = 1
        i[f[1]] = -1
        return np.cumsum(i)[:-1]

    def as_sparse(d):
        diff = np.diff(np.con
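
A runnable sketch in the spirit of the snippet above. as_dense follows the quoted code; the body of as_sparse is cut off in the archive, so the version below is only an assumed completion, not the author's. It also assumes f is a (starts, stops) pair of half-open intervals that neither overlap nor touch.

```python
import numpy as np

def as_dense(f, length):
    """Turn (starts, stops) interval bounds into a 0/1 indicator array."""
    starts, stops = f
    i = np.zeros(length + 1, dtype=int)
    i[starts] = 1    # +1 where an interval opens
    i[stops] = -1    # -1 where it closes (one past the last covered index)
    return np.cumsum(i)[:-1]

def as_sparse(d):
    """Assumed inverse: recover (starts, stops) from a 0/1 indicator array."""
    padded = np.concatenate(([0], d, [0]))  # pad so edge intervals are detected
    diff = np.diff(padded)
    starts = np.flatnonzero(diff == 1)
    stops = np.flatnonzero(diff == -1)
    return starts, stops

intervals = (np.array([1, 5]), np.array([3, 8]))
dense = as_dense(intervals, 8)
starts, stops = as_sparse(dense)
```

With these inputs the dense form covers indices 1-2 and 5-7, and the round trip returns the original bounds.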

Re: [Numpy-discussion] Implementing elementary matrices

2014-03-24 Thread Eelco Hoogendoorn
Sounds (marginally) useful; although elementary row/column operations are in practice usually better implemented directly by indexing rather than in an operator form. Though I can see a use for the latter. My suggestion: it's not a common enough operation to deserve a four-letter acronym (assuming tho
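
To illustrate the contrast between the two styles, a minimal sketch; the matrix and the particular operations are arbitrary choices for illustration, not taken from the proposal under discussion:

```python
import numpy as np

A = np.arange(9, dtype=float).reshape(3, 3)

# Indexing form: apply each elementary row operation in place on a copy
B = A.copy()
B[[0, 2]] = B[[2, 0]]    # swap rows 0 and 2
B[1] *= 2.0              # scale row 1
B[2] += 3.0 * B[0]       # add 3 times row 0 to row 2

# Operator form: the same operations as elementary matrices
E_swap = np.eye(3)[[2, 1, 0]]        # identity with rows 0 and 2 swapped
E_scale = np.diag([1.0, 2.0, 1.0])
E_add = np.eye(3)
E_add[2, 0] = 3.0
C = E_add.dot(E_scale).dot(E_swap).dot(A)   # rightmost operator acts first
```

Both routes give the same matrix; the indexing form avoids building and multiplying full 3x3 operators.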

Re: [Numpy-discussion] [help needed] associativity and precedence of '@'

2014-03-18 Thread Eelco Hoogendoorn
aybe that means > then 'same' because it's easier to remember !?) > > My two cents, > Sebastian Haase > > > On Tue, Mar 18, 2014 at 7:13 AM, Eelco Hoogendoorn > wrote: > > > > > > Perhaps this is a bit of a thread hijack; but this discussion got

Re: [Numpy-discussion] [help needed] associativity and precedence of '@'

2014-03-17 Thread Eelco Hoogendoorn
sions it's easy, but I always find that if > I've written code that works for 1 matrix or vector, 5 minutes later I want > it to work for fields of matrices or vectors. If we're just going by shape > there's no way to distinguish between a 2d field of matrices and a 3d fiel
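
One way to sidestep the shape ambiguity raised here is to name the axes explicitly, for example with np.einsum. The sketch below assumes a 2x3 field of 4x4 matrices applied to a 2x3 field of 4-vectors; the shapes and variable names are made up for illustration.

```python
import numpy as np

rng = np.random.RandomState(0)
matrices = rng.rand(2, 3, 4, 4)   # hypothetical 2x3 field of 4x4 matrices
vectors = rng.rand(2, 3, 4)       # hypothetical 2x3 field of 4-vectors

# The subscripts state which axes are matrix/vector axes;
# the ellipsis covers the field axes, whatever their number.
result = np.einsum('...ij,...j->...i', matrices, vectors)

# Reference computation for a single field position
single = matrices[0, 1].dot(vectors[0, 1])
```

The same expression works unchanged for a single matrix and vector, or for a field with more leading axes.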

Re: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator

2014-03-16 Thread Eelco Hoogendoorn
scheme, rather than maximum performance considerations. Ideally, the standard operator would pick a sensible default which can be inferred from the arguments, while allowing for explicit specification of the kind of algorithm used where this verbosity is worth the hassle. On Sun, Mar 16, 2014 at 5:3

Re: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator

2014-03-16 Thread Eelco Hoogendoorn
Different people work on different code and have different experiences here -- yours may or may not be typical. Pauli did some quick checks on scikit-learn & nipy & scipy, and found that in their test suites, uses of np.dot and uses of elementwise-multiplication are ~equally common: ht

Re: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator

2014-03-16 Thread Eelco Hoogendoorn
Note that I am not opposed to extra operators in python, and only mildly opposed to a matrix multiplication operator in numpy; but let me lay out the case against, for your consideration. First of all, the use of matrix semantics relative to array semantics is extremely rare; even in linear algeb
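
For readers unfamiliar with the distinction being debated, a tiny example of array (elementwise) semantics versus matrix semantics on plain ndarrays; the values are arbitrary:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[0, 1], [1, 0]])

elementwise = a * b         # array semantics: multiply matching entries
matrix_product = a.dot(b)   # matrix semantics: rows of a times columns of b
```

Here elementwise is [[0, 2], [3, 0]] while matrix_product is [[2, 1], [4, 3]].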

[Numpy-discussion] Pickling of memory aliasing patterns

2014-03-13 Thread Eelco Hoogendoorn
I have been working on a general function caching mechanism, and in doing so I stumbled upon the following quirk:

    @cached
    def foo(a,b):
        b[0] = 1
        return a[0]

    a = np.zeros(1)
    b = a[:]
    print foo(a, b)   #computes and returns 1
    print foo(a, b)   #gets 1 fro
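
The underlying behaviour can be reproduced without any caching decorator. The sketch below only shows that a pickle round trip does not preserve the aliasing between an array and a view of it; how the @cached decorator from the message uses pickling is not shown in the archive and is not assumed here.

```python
import pickle
import numpy as np

a = np.zeros(1)
b = a[:]                                  # b is a view onto a's memory

shared_before = np.shares_memory(a, b)    # the originals alias each other

# Round-trip both arrays through pickle together
a2, b2 = pickle.loads(pickle.dumps((a, b)))
shared_after = np.shares_memory(a2, b2)   # the copies no longer alias

b2[0] = 1   # writing through the unpickled "view" leaves a2 untouched
```

So any mechanism that keys on or replays pickled arguments sees two independent arrays where the caller had one buffer.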

Re: [Numpy-discussion] dtype promotion

2014-03-03 Thread Eelco Hoogendoorn
The tuple gets cast to an ndarray; which invokes a different codepath than the scalar addition. Somehow, numpy has gotten more aggressive at upcasting to float64 as of 1.8, but I haven't been able to discover the logic behind it either. On Mon, Mar 3, 2014 at 10:06 PM, Nicolas Rougier wrote: > >
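
A minimal illustration of the two codepaths, as they behave in recent numpy releases; the original question is not shown in the archive, so the float32 array below is an assumed example, and the exact behaviour of 1.8 may have differed in detail.

```python
import numpy as np

x = np.zeros(3, dtype=np.float32)

# A Python scalar does not force an upcast of the array's dtype
with_scalar = x + 1.0

# A tuple is first converted to an ndarray (float64 by default),
# so ordinary array-array promotion applies
with_tuple = x + (1.0, 2.0, 3.0)
```

Passing an explicitly typed array, such as np.array((1.0, 2.0, 3.0), dtype=np.float32), keeps the result in float32.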

Re: [Numpy-discussion] ANN: XDress v0.4

2014-02-27 Thread Eelco Hoogendoorn
That is good to know. The boost documentation makes it appear as if bjam is the only way to build boost.python, but good to see examples to the contrary! On Thu, Feb 27, 2014 at 2:19 PM, Toby St Clere Smithe < pyvienn...@tsmithe.net> wrote: > Eelco Hoogendoorn writes: > >

Re: [Numpy-discussion] ANN: XDress v0.4

2014-02-27 Thread Eelco Hoogendoorn
at 1:51 AM, Eelco Hoogendoorn < > hoogendoorn.ee...@gmail.com> wrote: > >> Thanks for the heads up, I wasn't aware of this project. While >> boost.python is a very nice package, its distributability is nothing short >> of nonexistent, so it's great to have

Re: [Numpy-discussion] ANN: XDress v0.4

2014-02-27 Thread Eelco Hoogendoorn
I have; but if I recall correctly, it does not solve the problem of distributing code that uses it, or does it? On Thu, Feb 27, 2014 at 10:51 AM, Toby St Clere Smithe < pyvienn...@tsmithe.net> wrote: > Hi, > > Eelco Hoogendoorn writes: > > Thanks for the heads up, I

Re: [Numpy-discussion] ANN: XDress v0.4

2014-02-26 Thread Eelco Hoogendoorn
Thanks for the heads up, I wasn't aware of this project. While boost.python is a very nice package, its distributability is nothing short of nonexistent, so it's great to have a pure python binding generator. One thing which I have often found frustrating is natural ndarray interop between python a

Re: [Numpy-discussion] Help Understanding Indexing Behavior

2014-02-25 Thread Eelco Hoogendoorn
To elaborate on what Julian wrote: it is indeed simply a convention; slices/ranges in python are from the start to one-past-the-end. The reason for the emergence of this convention is that C code using iterators looks most natural this way. This manifests in a simple for (i = 0; i < 5; i++), but al
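
A short sketch of what the one-past-the-end convention buys in practice; the list and indices are arbitrary:

```python
items = list(range(10))

start, stop = 2, 7
chunk = items[start:stop]          # elements 2, 3, 4, 5, 6; index 7 is excluded

# The length of a slice is simply stop - start
length_matches = len(chunk) == stop - start

# Adjacent slices tile the sequence with no overlap and no gap
k = 4
rejoined = items[:k] + items[k:]
```

The same holds for range(start, stop) and for numpy slices.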

Re: [Numpy-discussion] Proposal: Chaining np.dot with mdot helper function

2014-02-20 Thread Eelco Hoogendoorn
ce of np.einsum will be hard to beat On Thu, Feb 20, 2014 at 3:27 PM, Eric Moore wrote: > > > On Thursday, February 20, 2014, Eelco Hoogendoorn < > hoogendoorn.ee...@gmail.com> wrote: > >> If the standard semantics are not affected, and the most common >> two-argum
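
For context, three existing ways to write a chained matrix product in numpy. The shapes are arbitrary, and np.linalg.multi_dot is the chaining helper that numpy ships today, which may differ from the mdot proposed in this thread.

```python
import numpy as np

rng = np.random.RandomState(0)
A = rng.rand(4, 5)
B = rng.rand(5, 6)
C = rng.rand(6, 3)

chained = A.dot(B).dot(C)                        # explicit left-to-right chain
via_einsum = np.einsum('ij,jk,kl->il', A, B, C)  # one einsum expression
via_multi_dot = np.linalg.multi_dot([A, B, C])   # picks the evaluation order
```

All three agree up to floating-point rounding; they differ only in how the evaluation order is chosen.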
