On Tue, Dec 6, 2011 at 4:11 PM, Ralf Gommers <[email protected]> wrote:
>
> On Mon, Dec 5, 2011 at 8:43 PM, Ralf Gommers <[email protected]> wrote:
>>
>> Hi all,
>>
>> It's been a little over 6 months since the release of 1.6.0 and the NA
>> debate has quieted down, so I'd like to ask your opinion on the timing
>> of 1.7.0. It looks to me like we have a healthy amount of bug fixes and
>> small improvements, plus three larger chunks of work:
>>
>> - datetime
>> - NA
>> - Bento support
>>
>> My impression is that both datetime and NA are releasable, but should
>> be labeled "tech preview" or something similar, because they may still
>> see significant changes. Please correct me if I'm wrong.
>>
>> There's still some maintenance work to do and pull requests to merge,
>> but a beta release by Christmas should be feasible.
>
> To be a bit more detailed here, these are the most significant pull
> requests / patches that I think can be merged with a limited amount of
> work:
>
> - meshgrid enhancements: http://projects.scipy.org/numpy/ticket/966
> - sample_from function: https://github.com/numpy/numpy/pull/151
> - loadtable function: https://github.com/numpy/numpy/pull/143
>
> Other maintenance things:
>
> - un-deprecate putmask
> - clean up causes of "DType strings 'O4' and 'O8' are deprecated..."
> - fix failing einsum and polyfit tests
> - update the release notes
>
> Cheers,
> Ralf
>
>> What do you all think?
>>
>> Cheers,
>> Ralf
This isn't the place for this discussion, but we should start talking
about building a *high performance* flat file loading solution with good
column type inference and sensible defaults. It's clear that loadtable is
aiming for maximum compatibility: for example, I can read a 2800x30 file
in < 50 ms with the read_table / read_csv functions I recently wrote
myself in Cython (compared with loadtable taking > 1 s, as quoted in the
pull request), but I don't handle European decimal formats and lots of
other sources of unruliness.

I personally don't believe in sacrificing an order of magnitude of
performance in the 90% case for the sake of the 10% case, so maybe it
makes sense to have two functions around: a superfast custom CSV reader
for well-behaved data, and a slower but highly flexible function like
loadtable to fall back on. I think R has two functions, read.csv and
read.csv2, where read.csv2 is capable of dealing with things like
European decimal format.

- Wes
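A minimal sketch of that two-function split, using np.loadtxt and
np.genfromtxt as stand-ins for the fast and flexible readers (the names
read_csv_fast, read_csv_flexible, and read_csv are hypothetical, not an
existing API):

    import numpy as np

    def read_csv_fast(path, delimiter=","):
        # Fast path: assumes clean, homogeneous numeric data with no
        # header quirks, missing values, or locale-specific decimals.
        return np.loadtxt(path, delimiter=delimiter)

    def read_csv_flexible(path, delimiter=","):
        # Slow path: per-column dtype inference (dtype=None) and
        # missing-value handling, at a significant performance cost.
        return np.genfromtxt(path, delimiter=delimiter, dtype=None,
                             names=True)

    def read_csv(path, delimiter=","):
        # Try the fast reader first; fall back to the flexible one on
        # anything it cannot parse.
        try:
            return read_csv_fast(path, delimiter=delimiter)
        except ValueError:
            return read_csv_flexible(path, delimiter=delimiter)

Falling back on ValueError keeps the common, well-behaved case fast while
degrading gracefully for messy files; a real implementation would
presumably also let the caller choose the parser explicitly.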
