On Tue, Oct 27, 2015 at 7:30 AM, Benjamin Root <ben.v.r...@gmail.com> wrote:

> FWIW, when I needed a fast Fixed Width reader
>

was there potentially no whitespace between fields in that case? In which
case, it really is a different use-case than delimited text -- if it's at
all common, a version written in C would be nice and fast, and not hard
to do.

But fromstring never would have helped you with that anyway :-)

-CHB

> for a very large dataset last year, I found that np.genfromtxt() was
> faster than pandas' read_fwf(). IIRC, pandas' text reading code fell back
> to pure python for fixed width scenarios.
>
> On Fri, Oct 23, 2015 at 8:22 PM, Chris Barker - NOAA Federal
> <chris.bar...@noaa.gov> wrote:
>
>> Grabbing the pandas csv reader would be great, and I hope it happens
>> sooner than later, though alas, I haven't the spare cycles for it either.
>>
>> In the meantime though, can we put a deprecation Warning in when using
>> fromstring() on text files? It's really pretty broken.
>>
>> -Chris
>>
>> On Oct 23, 2015, at 4:02 PM, Jeff Reback <jeffreb...@gmail.com> wrote:
>>
>> On Oct 23, 2015, at 6:49 PM, Nathaniel Smith <n...@pobox.com> wrote:
>>
>> On Oct 23, 2015 3:30 PM, "Jeff Reback" <jeffreb...@gmail.com> wrote:
>> >
>> > On Oct 23, 2015, at 6:13 PM, Charles R Harris
>> > <charlesr.har...@gmail.com> wrote:
>> >
>> >> On Thu, Oct 22, 2015 at 5:47 PM, Chris Barker - NOAA Federal
>> >> <chris.bar...@noaa.gov> wrote:
>> >>>
>> >>>> I think it would be good to keep the usage to read binary data at
>> >>>> least.
>> >>>
>> >>> Agreed -- it's only the text file reading I'm proposing to deprecate.
>> >>> It was kind of weird to cram it in there in the first place.
>> >>>
>> >>> Oh, fromfile() has the same issues.
>> >>>
>> >>> Chris
>> >>>
>> >>>> Or is there a good alternative to `np.fromstring(<bytes>, dtype=...)`?
>> >>>> -- Marten
>> >>>>
>> >>>> On Thu, Oct 22, 2015 at 1:03 PM, Chris Barker
>> >>>> <chris.bar...@noaa.gov> wrote:
>> >>>>>
>> >>>>> There was just a question about a bug/issue with scipy.fromstring
>> >>>>> (which is numpy.fromstring) when used to read integers from a text
>> >>>>> file.
>> >>>>>
>> >>>>> https://mail.scipy.org/pipermail/scipy-user/2015-October/036746.html
>> >>>>>
>> >>>>> fromstring() is buggy and inflexible for reading text files --
>> >>>>> and it is a very, very ugly mess of code. I dug into it a while
>> >>>>> back, and gave up -- just too much of a mess!
>> >>>>>
>> >>>>> So we really should completely re-implement it, or deprecate it. I
>> >>>>> doubt anyone is going to do a big refactor, so that means
>> >>>>> deprecating it.
>> >>>>>
>> >>>>> Also -- if we do want a fast "read numbers from text files" function
>> >>>>> (which would be nice, actually), it really should get a new name
>> >>>>> anyway.
>> >>>>>
>> >>>>> (and the hopefully coming new dtype system would make it easier to
>> >>>>> write cleanly)
>> >>>>>
>> >>>>> I'm not sure what deprecating something means, though -- have it
>> >>>>> raise a deprecation warning in the next version?
>> >>>>>
>> >>
>> >> There was discussion at SciPy 2015 of separating out the text reading
>> >> abilities of Pandas so that numpy could include it. We should contact
>> >> Jeff Reback and see about moving that forward.
>> >
>> > IIRC Thomas Caswell was interested in doing this :)
>>
>> When he was in Berkeley a few weeks ago he assured me that every night
>> since SciPy he has dutifully been feeling guilty about not having done it
>> yet. I think this week his paltry excuse is that he's "on his honeymoon" or
>> something.
>>
>> ...which is to say that if someone has some spare cycles to take this
>> over then I think that might be a nice wedding present for him :-).
>>
>> (The basic idea is to take the text reading backend behind
>> pandas.read_csv and extract it into a standalone package that pandas could
>> depend on, and that could also be used by other packages like numpy (among
>> others -- I think dato's SFrame package has a fork of this code as well?))
>>
>> -n
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>> I can certainly provide guidance on how/what to extract but don't have
>> spare cycles myself for this :(

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov
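
For reference, a minimal sketch of the fixed-width case discussed above, where fields are packed together with no whitespace between them. It uses np.genfromtxt() with a sequence of integer field widths passed as `delimiter`. The field widths and the two sample rows are made up for illustration and are not taken from any dataset mentioned in this thread.

```python
import io

import numpy as np

# Hypothetical sample: three fields of widths 4, 3 and 5 characters,
# packed together with no whitespace between them.
raw = io.StringIO(
    "00123.51e-03\n"
    "03450102.250\n"
)

# Passing a sequence of integers as `delimiter` tells genfromtxt to
# split each line at fixed column widths instead of at a separator.
data = np.genfromtxt(raw, delimiter=(4, 3, 5), dtype=float)

print(data.shape)
print(data)
```

genfromtxt() does its parsing in Python, so this only illustrates the use-case; it says nothing about the speed a version written in C could reach.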