Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Derek Homeier
On 30.06.2014, at 23:10, Jeff Reback wrote: > In pandas 0.14.0, generic whitespace IS parsed via the c-parser, e.g. > specifying '\s+' as a separator. Not sure when you were playing last with > pandas, but the c-parser has been in place since late 2012. (version 0.8.0) > > http://pandas-docs.g

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Jeff Reback
In pandas 0.14.0, generic whitespace IS parsed via the c-parser, e.g. specifying '\s+' as a separator. Not sure when you were playing last with pandas, but the c-parser has been in place since late 2012. (version 0.8.0) http://pandas-docs.github.io/pandas-docs-travis/whatsnew.html#text-parsing-a

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Derek Homeier
On 30 Jun 2014, at 04:56 pm, Nathaniel Smith wrote: >> A real need, which had also been discussed at length, is a truly performant >> text IO >> function (i.e. one using a compiled ASCII number parser, and optimally also >> a more >> memory-efficient one), but unfortunately all people intereste

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Chris Barker
On Mon, Jun 30, 2014 at 9:31 AM, Nathaniel Smith wrote: > On 30 Jun 2014 17:05, "Chris Barker" wrote: > > > Anyway, this all ties in with the text file parsing issues... > > Only tangentially though :-) > well, a fast text parser (and "text mode") input file will either need to deal with Unico

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Nathaniel Smith
On 30 Jun 2014 17:05, "Chris Barker" wrote: >> >> It's also an interesting >> question whether they've fixed the unicode/binary issues, > > > Which brings up the "how do we handle text/strings in numpy? issue. We had a good thread going here about what the 'S' data type should be , what with py3 a

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Chris Barker
> > It's also an interesting > question whether they've fixed the unicode/binary issues, Which brings up the "how do we handle text/strings in numpy? issue. We had a good thread going here about what the 'S' data type should be , what with py3 and all, but I don't think we ever really resolved th

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Nathaniel Smith
On Mon, Jun 30, 2014 at 3:47 PM, Derek Homeier wrote: > Does it make sense to keep maintaining both functions at all? IIRC the idea that > loadtxt would be the faster version of the two has been discarded long ago, > thus it seems there is very little, if anything, loadtxt can do that cannot > be d

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Derek Homeier
On 30 Jun 2014, at 04:39 pm, Nathaniel Smith wrote: > On Mon, Jun 30, 2014 at 12:33 PM, Julian Taylor > wrote: >> genfromtxt and loadtxt need an almost full rewrite to fix the botched >> python3 conversion of these functions. There are a couple threads >> about this on this list already. >> The

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Nathaniel Smith
On Mon, Jun 30, 2014 at 12:33 PM, Julian Taylor wrote: > genfromtxt and loadtxt need an almost full rewrite to fix the botched > python3 conversion of these functions. There are a couple threads > about this on this list already. > There are numerous PRs fixing stuff in these functions which I > c

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Julian Taylor
genfromtxt and loadtxt need an almost full rewrite to fix the botched python3 conversion of these functions. There are a couple threads about this on this list already. There are numerous PRs fixing stuff in these functions which I currently all -1'd because we need to fix the underlying unicode is

[Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Derek Homeier
Hi all, I was just having a new look into the mess that is, imo, the support for automatic line ending recognition in genfromtxt, and more generally, the Python file openers. I am glad at least reading gzip files is no longer entirely broken in Python3, but actually detecting in particular “old
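The line-ending problem raised here can be sidestepped outside genfromtxt. The following is a minimal sketch, not what the thread settled on: it assumes ASCII input and lets io.TextIOWrapper do the universal-newline translation before the lines reach np.genfromtxt. The helper name read_any_newlines is hypothetical.

```python
import io

import numpy as np


def read_any_newlines(raw_bytes, **kwargs):
    """Parse numeric text whose lines may end in CR, CRLF or LF (hypothetical helper)."""
    # newline=None turns on universal-newline mode for reading, so
    # '\r', '\r\n' and '\n' are all handed back as '\n'.
    text = io.TextIOWrapper(io.BytesIO(raw_bytes), encoding="ascii", newline=None)
    lines = list(text)  # one string per logical line
    # genfromtxt accepts a list of strings and treats each item as a line.
    return np.genfromtxt(lines, **kwargs)


old_mac = b"1 2 3\r4 5 6\r"  # classic Mac OS line endings
arr = read_any_newlines(old_mac)
```

Decoding first and parsing second keeps the newline handling in the standard library, at the cost of holding every line in memory.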

Re: [Numpy-discussion] genfromtxt and gzip

2013-06-11 Thread Derek Homeier
On 05.06.2013, at 9:52AM, Ted To wrote: >> From the list archives (2011), I noticed that there is a bug in the > python gzip module that causes genfromtxt to fail with python 2 but this > bug is not a problem for python 3. When I tried to use genfromtxt and > python 3 with a gzip'ed csv file, I

[Numpy-discussion] genfromtxt and gzip

2013-06-05 Thread Ted To
Hi all, From the list archives (2011), I noticed that there is a bug in the python gzip module that causes genfromtxt to fail with python 2 but this bug is not a problem for python 3. When I tried to use genfromtxt and python 3 with a gzip'ed csv file, I instead got: IOError: Mode rbU not suppo
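One way around the mode error is to open the compressed file yourself and hand genfromtxt the open handle. This is a sketch under assumptions: Python 3.3 or later (where gzip.open accepts text mode "rt"), UTF-8 content, and a hypothetical helper name genfromtxt_gzip. The demo file is written to a temporary directory only so the example is self-contained.

```python
import gzip
import os
import tempfile

import numpy as np


def genfromtxt_gzip(path, **kwargs):
    """Read a gzip-compressed text table via an explicit text-mode handle."""
    # "rt" asks gzip.open for a decoded text stream instead of raw bytes.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return np.genfromtxt(handle, **kwargs)


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv.gz")
    with gzip.open(demo_path, "wt", encoding="utf-8") as out:
        out.write("1,2,3\n4,5,6\n")
    data = genfromtxt_gzip(demo_path, delimiter=",")
```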

Re: [Numpy-discussion] genfromtxt() skips comments

2013-05-31 Thread Albert Kottke
Now try the same thing with np.recfromcsv(). I get the following (Python 3.3): >>> import io >>> b = io.BytesIO(b"!blah\n!blah\n!blah\n!A:B:C\n1:2:3\n4:5:6\n") >>> np.recfromcsv(b, delimiter=':', comments='!') ... ValueError: Some errors were detected ! Line #5 (got 3 columns instead of 1)

Re: [Numpy-discussion] genfromtxt() skips comments

2013-05-31 Thread Albert Kottke
I agree that "last comment line before the first line of data" is more descriptive. Regarding the location of the names. I thought taking it from the last comment line before the first line of data made sense because it would permit reading of just the data with np.loadtxt(), but also permit creat

Re: [Numpy-discussion] genfromtxt() skips comments

2013-05-31 Thread Benjamin Root
On Fri, May 31, 2013 at 5:08 PM, Albert Kottke wrote: > I noticed that genfromtxt() did not skip comments if the keyword names is > not True. If names is True, then genfromtxt() would take the first line as > the names. I am proposing a fix to genfromtxt that skips all of the > comments in a file,

[Numpy-discussion] genfromtxt() skips comments

2013-05-31 Thread Albert Kottke
I noticed that genfromtxt() did not skip comments if the keyword names is not True. If names is True, then genfromtxt() would take the first line as the names. I am proposing a fix to genfromtxt that skips all of the comments in a file, and potentially using the last comment line for names. This wi

Re: [Numpy-discussion] genfromtxt

2011-10-11 Thread Derek Homeier
Hi Nils, On 11 Oct 2011, at 16:34, Nils Wagner wrote: > How do I use genfromtxt to read a file with the following > lines > > 11 2.2592365264892578D+01 > 22 2.2592365264892578D+01 > 13 2.669845581055D+00 >

[Numpy-discussion] genfromtxt

2011-10-11 Thread Nils Wagner
Hi all, How do I use genfromtxt to read a file with the following lines 11 2.2592365264892578D+01 22 2.2592365264892578D+01 13 2.669845581055D+00 33 2.2592365264892578D+01
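A minimal sketch of one possible approach to the Fortran "D" exponents, not necessarily the answer given in the thread: rewrite the exponent marker to "E" before parsing. It assumes the file holds only numeric columns, since a blanket replacement would also alter any text field containing the letter D. The helper name read_fortran_doubles is hypothetical.

```python
import io

import numpy as np

sample = """\
11 2.2592365264892578D+01
22 2.2592365264892578D+01
13 2.669845581055D+00
33 2.2592365264892578D+01
"""


def read_fortran_doubles(text):
    """Parse whitespace-separated numbers that use Fortran D exponents."""
    # Python's float() understands 'E' exponents but not 'D'.
    cleaned = text.replace("D", "E").replace("d", "e")
    return np.genfromtxt(io.StringIO(cleaned))


values = read_fortran_doubles(sample)
```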

Re: [Numpy-discussion] genfromtxt converter question

2011-06-18 Thread Derek Homeier
On 18 Jun 2011, at 04:48, gary ruben wrote: > Thanks guys - I'm happy with the solution for now. FYI, Derek's > suggestion doesn't work in numpy 1.5.1 either. > For any developers following this thread, I think this might be a nice > use case for genfromtxt to handle in future. Numpy 1.6.0 and ab

Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread Olivier Delalleau
For the hardcoded part, you can easily read the first line of your file and split it with the same delimiter to know the number of columns. It's sure it'd be best to be able to be able to skip this part, but you don't need to hardcode this number into your code at least. Something like: n_cols = le
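The idea of reading the first line to learn the column count can be sketched as follows. This is an illustration only, not the code from the message (which is cut off in this listing); the helper name count_columns is hypothetical, and it assumes every row has as many fields as the first.

```python
import io


def count_columns(handle, delimiter=None):
    """Return the number of fields on the first line, then rewind the handle."""
    first_line = handle.readline()
    handle.seek(0)  # rewind so a later reader still sees the whole file
    # Strip only the line ending, so empty leading or trailing fields survive.
    return len(first_line.rstrip("\r\n").split(delimiter))


demo = io.StringIO("1.0 2.0 3.0 4.0\n5.0 6.0 7.0 8.0\n")
n_cols = count_columns(demo)
```

The result can then feed something like usecols=range(n_cols) instead of a hard-coded number.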

Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread gary ruben
Thanks guys - I'm happy with the solution for now. FYI, Derek's suggestion doesn't work in numpy 1.5.1 either. For any developers following this thread, I think this might be a nice use case for genfromtxt to handle in future. As a corollary of this problem, I wonder whether there's a human-readabl

Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread Derek Homeier
On 17.06.2011, at 11:01PM, Olivier Delalleau wrote: >> You were just overdoing it by already creating an array with the converter, >> this apparently caused genfromtxt to create a structured array from the >> input (which could be converted back to an ndarray, but that can prove >> tricky as we

Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread Olivier Delalleau
2011/6/17 Derek Homeier > Hi Gary, > > On 17.06.2011, at 5:39PM, gary ruben wrote: > > Thanks for the hints Olivier and Bruce. Based on them, the following > > is a working solution, although I still have that itchy sense that > genfromtxt > > should be able to do it directly. > > > > import nump

Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread Derek Homeier
Hi Gary, On 17.06.2011, at 5:39PM, gary ruben wrote: > Thanks for the hints Olivier and Bruce. Based on them, the following > is a working solution, although I still have that itchy sense that genfromtxt > should be able to do it directly. > > import numpy as np > from StringIO import StringIO >

Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread gary ruben
Thanks for the hints Olivier and Bruce. Based on them, the following is a working solution, although I still have that itchy sense that genfromtxt should be able to do it directly. import numpy as np from StringIO import StringIO a = StringIO('''\ (-3.9700,-5.0400) (-1.1318,-2.5693) (-4.6027,-0.
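For comparison, here is a sketch that avoids genfromtxt converters altogether and pulls the parenthesised pairs out with a regular expression. It is a swapped-in technique, not the solution posted in this message. The first sample row reuses values visible in the thread; the second row is made up for illustration, and parse_complex_rows is a hypothetical name.

```python
import re

import numpy as np

# Matches "(real,imag)" and captures the two numbers as text.
PAIR = re.compile(r"\(\s*([^,()\s]+)\s*,\s*([^,()\s]+)\s*\)")


def parse_complex_rows(text):
    """Turn lines of '(re,im)' pairs into a 2-D complex64 array."""
    rows = []
    for line in text.splitlines():
        pairs = PAIR.findall(line)
        if pairs:  # skip blank or non-matching lines
            rows.append([complex(float(re_part), float(im_part))
                         for re_part, im_part in pairs])
    return np.array(rows, dtype=np.complex64)


sample = "(-3.9700,-5.0400) (-1.1318,-2.5693)\n(1.0,2.0) (3.0,-4.0)\n"
parsed = parse_complex_rows(sample)
```

Using float() instead of eval() also avoids executing file contents as Python code.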

Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread Bruce Southey
On 06/17/2011 08:51 AM, Olivier Delalleau wrote: 2011/6/17 Bruce Southey On 06/17/2011 08:22 AM, gary ruben wrote: > Thanks Olivier, > Your suggestion gets me a little closer to what I want, but doesn't > quite work. Replacing the conversion with

Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread Olivier Delalleau
2011/6/17 Bruce Southey > On 06/17/2011 08:22 AM, gary ruben wrote: > > Thanks Olivier, > > Your suggestion gets me a little closer to what I want, but doesn't > > quite work. Replacing the conversion with > > > > c = lambda x:np.cast[np.complex64](complex(*eval(x))) > > b = np.genfromtxt(a,conve

Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread Bruce Southey
On 06/17/2011 08:22 AM, gary ruben wrote: > Thanks Olivier, > Your suggestion gets me a little closer to what I want, but doesn't > quite work. Replacing the conversion with > > c = lambda x:np.cast[np.complex64](complex(*eval(x))) > b = np.genfromtxt(a,converters={0:c, 1:c, 2:c, > 3:c},dtype=None,

Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread gary ruben
Thanks Olivier, Your suggestion gets me a little closer to what I want, but doesn't quite work. Replacing the conversion with c = lambda x:np.cast[np.complex64](complex(*eval(x))) b = np.genfromtxt(a,converters={0:c, 1:c, 2:c, 3:c},dtype=None,delimiter=18,usecols=range(4)) produces [[(-3.970

Re: [Numpy-discussion] genfromtxt converter question

2011-06-17 Thread Olivier Delalleau
If I understand correctly, your error is that you convert only the second column, because your converters dictionary contains a single key (1). If you have it contain keys from 0 to 3 associated to the same function, it should work. -=- Olivier 2011/6/17 gary ruben > I'm trying to read a file c

[Numpy-discussion] genfromtxt converter question

2011-06-16 Thread gary ruben
I'm trying to read a file containing data formatted as in the following example using genfromtxt and I'm doing something wrong. It almost works. Can someone point out my error, or suggest a simpler solution to the ugly converter function? I thought I'd leave in the commented-out line for future ref

Re: [Numpy-discussion] genfromtxt behaviour

2010-10-29 Thread Pierre GM
On Oct 29, 2010, at 2:59 PM, Matt Studley wrote: > > >>> How can I do my nice 2d slicing on the latter? >>> >>> array([('a', 2, 3), ('b', 5, 6), ('c', 8, 9)], >>> dtype=[('f0', '|S1'), ('f1', ' >> Select a column by its name: >> yourarray['f0'] > > Super! > > So I would need to get the

[Numpy-discussion] genfromtxt behaviour

2010-10-29 Thread Matt Studley
>> How can I do my nice 2d slicing on the latter? >> >> array([('a', 2, 3), ('b', 5, 6), ('c', 8, 9)], >> dtype=[('f0', '|S1'), ('f1', 'Select a column by its name: >yourarray['f0'] Super! So I would need to get the dtype object... myData[ myData.dtype.names[0] ] in order to index by col
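The field-name indexing discussed here can be sketched on a small hand-built array. The dtype below is an assumption (a one-character unicode field plus two integers) standing in for what genfromtxt returned in the thread.

```python
import numpy as np

my_data = np.array(
    [("a", 2, 3), ("b", 5, 6), ("c", 8, 9)],
    dtype=[("f0", "U1"), ("f1", "i8"), ("f2", "i8")],
)

# Select a column by name, looking the name up by position first.
first_name = my_data.dtype.names[0]
first_col = my_data[first_name]

# Stack the numeric fields into an ordinary 2-D array for [:, i] slicing.
numeric = np.column_stack([my_data[name] for name in my_data.dtype.names[1:]])
```

A structured array is one-dimensional with named fields, which is why my_data[:, 0] fails while my_data["f0"] works.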

Re: [Numpy-discussion] genfromtxt behaviour

2010-10-29 Thread Pierre GM
On Oct 29, 2010, at 2:35 PM, Matt Studley wrote: > Hi all > > first, please forgive me for my ignorance - I am taking my first > stumbling steps with numpy and scipy. No problem, it's educational > I am having some difficulty with the behaviour of genfromtxt. > > s = SIO.StringIO("""1, 2, 3 >

[Numpy-discussion] genfromtxt behaviour

2010-10-29 Thread Matt Studley
Hi all first, please forgive me for my ignorance - I am taking my first stumbling steps with numpy and scipy. I am having some difficulty with the behaviour of genfromtxt. s = SIO.StringIO("""1, 2, 3 4, 5, 6 7, 8, 9""") g= genfromtxt(s, delimiter=', ', dtype=None) print g[:,0] This produces th

Re: [Numpy-discussion] genfromtxt usage

2010-08-25 Thread Stéfan van der Walt
Hi Antoine On 25 August 2010 10:44, Antoine Dechaume wrote: > Hello, > I am trying to read a file with a variable number of values on each line, > using genfromtxt and missing_values or filling_values arguments. > The usage of those arguments is not clear in the documentation, if what I am > try

[Numpy-discussion] genfromtxt usage

2010-08-25 Thread Antoine Dechaume
Hello, I am trying to read a file with a variable number of values on each line, using genfromtxt and missing_values or filling_values arguments. The usage of those arguments is not clear in the documentation, if what I am trying to do is possible, how could I do it? Thanks, Antoine.
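As a point of comparison, rows of unequal length can be padded by hand before building the array. This is a plain-Python sketch, not a genfromtxt feature, and read_ragged is a hypothetical helper; it assumes every token is numeric.

```python
import numpy as np


def read_ragged(lines, delimiter=None, fill=np.nan):
    """Build a 2-D float array from rows of unequal length, padding with `fill`."""
    rows = [[float(tok) for tok in line.split(delimiter)]
            for line in lines if line.strip()]
    width = max((len(row) for row in rows), default=0)
    out = np.full((len(rows), width), fill, dtype=float)
    for i, row in enumerate(rows):
        out[i, :len(row)] = row  # left-align the values, leave the rest as fill
    return out


padded = read_ragged(["1 2 3", "4 5", "6"])
```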

Re: [Numpy-discussion] genfromtxt documentation : review needed

2009-10-16 Thread Pierre GM
On Oct 16, 2009, at 8:29 AM, Skipper Seabold wrote: > Great work! I am especially glad to see the better documentation on > missing values, as I didn't fully understand how to do this. A few > small comments and a small attached diff with a few nitpicking > grammatical changes and some of what'

Re: [Numpy-discussion] genfromtxt documentation : review needed

2009-10-16 Thread Skipper Seabold
On Thu, Oct 15, 2009 at 7:08 PM, Pierre GM wrote: > All, > Here's a first draft for the documentation of np.genfromtxt. > It took me longer than I thought, but that way I uncovered and fixed some > bugs. > Please send me your comments/reviews/etc > I count especially on our documentation specialist

[Numpy-discussion] genfromtxt documentation : review needed

2009-10-15 Thread Pierre GM
All, Here's a first draft for the documentation of np.genfromtxt. It took me longer than I thought, but that way I uncovered and fixed some bugs. Please send me your comments/reviews/etc I count especially on our documentation specialist to let me know where to put it. Thx in advance P. do

Re: [Numpy-discussion] genfromtxt - the return

2009-10-07 Thread Pierre GM
On Oct 7, 2009, at 3:54 PM, Bruce Southey wrote: > > Anyhow, I do like what genfromtxt is doing so merging multiple > delimiters of the same type is not really needed. Thinking about it, merging multiple delimiters of the same type can be tricky: how do you distinguish between, say, "AAA\t\tC
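The ambiguity described here shows up with plain string splitting. The line below is a hypothetical three-column record whose middle field is empty.

```python
line = "A\t\tC"

# Splitting on an explicit tab keeps the empty middle field.
explicit = line.split("\t")

# Splitting on runs of whitespace merges the two tabs and drops the field.
merged = line.split()
```

With merged delimiters there is no way to tell a missing value from extra padding, which is the argument against merging by default.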

Re: [Numpy-discussion] genfromtxt - the return

2009-10-07 Thread Bruce Southey
On 10/07/2009 02:14 PM, Christopher Barker wrote: > Pierre GM wrote: > >> On Oct 6, 2009, at 10:08 PM, Bruce Southey wrote: >> >>> option to merge delimiters - actually in SAS it is default >>> > Wow! that sure strikes me as a bad choice. > > >> Ahah! I get it. Well, I remembe

Re: [Numpy-discussion] genfromtxt - the return

2009-10-07 Thread Christopher Barker
Pierre GM wrote: > On Oct 6, 2009, at 10:08 PM, Bruce Southey wrote: >> option to merge delimiters - actually in SAS it is default Wow! that sure strikes me as a bad choice. > Ahah! I get it. Well, I remember that we discussed something like that a > few months ago when I started working on np.

Re: [Numpy-discussion] genfromtxt - the return

2009-10-06 Thread Pierre GM
On Oct 6, 2009, at 11:01 PM, Skipper Seabold wrote: > > In keeping with the making some work for you theme, I filed an > enhancement ticket for one change that we discussed and another IMO > useful addition. http://projects.scipy.org/numpy/ticket/1238 > > I think it would be nice if we could do >

Re: [Numpy-discussion] genfromtxt - the return

2009-10-06 Thread Skipper Seabold
On Tue, Oct 6, 2009 at 10:27 PM, Pierre GM wrote: >> Anyhow, I am really impressed on how this function works. > > Thx. I hope things haven't been slowed down too much. In keeping with the making some work for you theme, I filed an enhancement ticket for one change that we discussed and another

Re: [Numpy-discussion] genfromtxt - the return

2009-10-06 Thread Skipper Seabold
On Tue, Oct 6, 2009 at 10:08 PM, Bruce Southey wrote: > On Tue, Oct 6, 2009 at 4:04 PM, Pierre GM wrote: >> >> On Oct 6, 2009, at 4:43 PM, Christopher Barker wrote: >> >>> Pierre GM wrote: > I think that the default invalid_raise should be True. Mmh, OK, that's a +1/) for invalid_ra

Re: [Numpy-discussion] genfromtxt - the return

2009-10-06 Thread Pierre GM
On Oct 6, 2009, at 10:08 PM, Bruce Southey wrote: > No, just seeing what sort of problems I can create. This case is > partly based on if someone is using tab-delimited then they need to > set the delimiter='\t' otherwise it gives an error. Also I often parse > text files so, yes, you have to be c

Re: [Numpy-discussion] genfromtxt - the return

2009-10-06 Thread Bruce Southey
On Tue, Oct 6, 2009 at 4:04 PM, Pierre GM wrote: > > On Oct 6, 2009, at 4:43 PM, Christopher Barker wrote: > >> Pierre GM wrote: I think that the default invalid_raise should be True. >>> >>> Mmh, OK, that's a +1/) for invalid_raise=true. Anybody else ? >> >> yup -- make it +2 -- ignoring err

Re: [Numpy-discussion] genfromtxt - the return

2009-10-06 Thread Pierre GM
On Oct 6, 2009, at 4:43 PM, Christopher Barker wrote: > Pierre GM wrote: >>> I think that the default invalid_raise should be True. >> >> Mmh, OK, that's a +1/) for invalid_raise=true. Anybody else ? > > yup -- make it +2 -- ignoring errors and losing data by default is a > "bad idea"! OK then,

Re: [Numpy-discussion] genfromtxt - the return

2009-10-06 Thread Christopher Barker
Pierre GM wrote: >> I think that the default invalid_raise should be True. > > Mmh, OK, that's a +1/) for invalid_raise=true. Anybody else ? yup -- make it +2 -- ignoring errors and losing data by default is a "bad idea"! >> One 'feature' is that there is no way to indicate multiple delimiters

Re: [Numpy-discussion] genfromtxt - the return

2009-10-06 Thread Pierre GM
On Oct 6, 2009, at 2:42 PM, Bruce Southey wrote: >> > Hi, > Excellent as the changes appear to address incorrect number of > delimiters. They should also give some extra info if there's a problem w/ the converters. > I think that the default invalid_raise should be True. Mmh, OK, that's a +

Re: [Numpy-discussion] genfromtxt - the return

2009-10-06 Thread Bruce Southey
On 10/05/2009 02:13 PM, Pierre GM wrote: > All, > Could you try r7449 ? I introduced some mechanisms to keep track of > invalid lines (where the number of columns doesn't match what's > expected). By default, a warning is emitted and these lines are > skipped, but an optional argument gives the possi

[Numpy-discussion] genfromtxt - the return

2009-10-05 Thread Pierre GM
All, Could you try r7449 ? I introduced some mechanisms to keep track of invalid lines (where the number of columns doesn't match what's expected). By default, a warning is emitted and these lines are skipped, but an optional argument gives the possibility to raise an exception instead. Now,
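The two behaviours described here can be sketched with the invalid_raise keyword as it exists in released NumPy. The flag is passed explicitly in both calls so the example does not depend on which default a given version ships; the sample text is made up, with a short second row.

```python
import io
import warnings

import numpy as np

text = "1,2,3\n4,5\n6,7,8\n"  # the middle line has only two columns

# invalid_raise=False: the malformed line is skipped and a warning is issued.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the warning for this demo
    kept = np.genfromtxt(io.StringIO(text), delimiter=",", invalid_raise=False)

# invalid_raise=True: the same input raises instead of silently losing a row.
try:
    np.genfromtxt(io.StringIO(text), delimiter=",", invalid_raise=True)
    raised = False
except ValueError:
    raised = True
```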

Re: [Numpy-discussion] genfromtxt to structured array

2009-09-25 Thread Ryan May
On Fri, Sep 25, 2009 at 4:30 PM, Timmie wrote: > Hello, > this may be an easier question. > > I want to load data into a structured array with getting the names from the > column header (names=True). > > The data looks like: > >    ;month;day;hour;value >    1995;1;1;01;0 > > > but loading only wo

[Numpy-discussion] genfromtxt to structured array

2009-09-25 Thread Timmie
Hello, this may be an easier question. I want to load data into a structured array, getting the names from the column header (names=True). The data looks like: ;month;day;hour;value 1995;1;1;01;0 but loading only works if changed to: year;month;day;hour;value
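One possible workaround when the first header field is empty is to skip the header line and pass the names explicitly. This is a sketch, not necessarily the reply given in the thread; the first data row comes from the post and the second is made up so the result has more than one record.

```python
import io

import numpy as np

text = ";month;day;hour;value\n1995;1;1;01;0\n1995;1;1;02;5\n"

arr = np.genfromtxt(
    io.StringIO(text),
    delimiter=";",
    skip_header=1,  # ignore the header whose first field is blank
    names=["year", "month", "day", "hour", "value"],
)
```

With no dtype given, every field is read as a float and can be selected by name, for example arr["year"].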

Re: [Numpy-discussion] genfromtxt view with object dtype

2009-02-04 Thread Brent Pedersen
On Wed, Feb 4, 2009 at 8:51 PM, Pierre GM wrote: > OK, Brent, try r6341. > I fixed genfromtxt for cases like yours (explicit dtype involving a > np.object). > Note that the fix won't work if the dtype is nested and involves > np.objects (as we would hit the pb of renaming fields we observed...). >

Re: [Numpy-discussion] genfromtxt view with object dtype

2009-02-04 Thread Pierre GM
OK, Brent, try r6341. I fixed genfromtxt for cases like yours (explicit dtype involving a np.object). Note that the fix won't work if the dtype is nested and involves np.objects (as we would hit the pb of renaming fields we observed...). Let me know how it goes. P. On Feb 4, 2009, at 4:03 PM,

Re: [Numpy-discussion] genfromtxt view with object dtype

2009-02-04 Thread Brent Pedersen
On Wed, Feb 4, 2009 at 9:36 AM, Pierre GM wrote: > > On Feb 4, 2009, at 12:09 PM, Brent Pedersen wrote: > >> hi, i am using genfromtxt, with a dtype like this: >> [('seqid', '|S24'), ('source', '|S16'), ('type', '|S16'), ('start', >> '> ' > Brent, > Please post a simple, self-contained example wit

Re: [Numpy-discussion] genfromtxt view with object dtype

2009-02-04 Thread Pierre GM
On Feb 4, 2009, at 12:09 PM, Brent Pedersen wrote: > hi, i am using genfromtxt, with a dtype like this: > [('seqid', '|S24'), ('source', '|S16'), ('type', '|S16'), ('start', > ' 'http://projects.scipy.org/mailman/listinfo/numpy-discussion

[Numpy-discussion] genfromtxt view with object dtype

2009-02-04 Thread Brent Pedersen
hi, i am using genfromtxt, with a dtype like this: [('seqid', '|S24'), ('source', '|S16'), ('type', '|S16'), ('start', 'http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] genfromtxt

2009-01-21 Thread Brent Pedersen
On Wed, Jan 21, 2009 at 9:39 PM, Pierre GM wrote: > Brent, > Mind trying r6330 and let me know if it works for you ? Make sure that > you use names=True to detect a header. > P. > yes, works perfectly. thanks! -brent ___ Numpy-discussion mailing list Nu

Re: [Numpy-discussion] genfromtxt

2009-01-21 Thread Pierre GM
Brent, Mind trying r6330 and let me know if it works for you ? Make sure that you use names=True to detect a header. P.
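A short sketch of the names=True usage mentioned here, applied to the commented header from the original question. It assumes a NumPy recent enough to accept the encoding keyword (1.14 or later); dtype=None asks genfromtxt to guess a type per column.

```python
import io

import numpy as np

text = "#gender age weight\nM 21 72.10\nF 35 58.33\nM 33 21.99\n"

# names=True takes the field names from the first line, even when that
# line starts with the comment character.
table = np.genfromtxt(io.StringIO(text), names=True, dtype=None, encoding="utf-8")
```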

Re: [Numpy-discussion] genfromtxt

2009-01-21 Thread Pierre GM
Brent, Currently, no, you won't be able to retrieve the header if it's commented. I'll see what I can do. P.

[Numpy-discussion] genfromtxt

2009-01-21 Thread Brent Pedersen
hi, i'm using the new genfromtxt stuff in numpy svn, looks great, pierre and any who contributed. is there a way to have the header commented and still be able to have it recognized as the header? e.g. #gender age weight M 21 72.10 F 35 58.33 M 33 21.99 if i use np.loadtxt or genfromt