Re: [Numpy-discussion] More loadtxt() changes

2008-11-28 Thread Pierre GM
Manuel, Give me the week-end to come up with something. What you want is already doable with the current implementation of np.loadtxt, through the converter keyword. Support for missing data will be covered in a separate function, most likely to be put in numpy.ma.io at term. On Nov 28, 2
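As a rough illustration of the converter approach mentioned above, the sketch below reads "|"-separated, space-padded fields with np.loadtxt and maps blank fields to NaN. The sample data and the to_float helper are assumptions made up for this example, not code from the thread. Note that the keyword is spelled converters.

```python
import io
import numpy as np

def to_float(field):
    # Hypothetical helper: older NumPy versions hand converters bytes,
    # newer ones hand them str, so accept both.
    if isinstance(field, bytes):
        field = field.decode()
    field = field.strip()
    # Treat a blank (all-space) field as a missing value.
    return float(field) if field else np.nan

# Assumed sample: "|" separator plus space-padded fields.
text = "1.0|    |3.0\n4.0| 5.0|   \n"

data = np.loadtxt(
    io.StringIO(text),
    delimiter="|",
    converters={0: to_float, 1: to_float, 2: to_float},
)
print(data)
```

One converter per column is needed here, which is the inconvenience raised elsewhere in the thread for files with many columns.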

Re: [Numpy-discussion] More loadtxt() changes

2008-11-28 Thread Manuel Metz
Pierre GM wrote: > On Nov 27, 2008, at 3:08 AM, Manuel Metz wrote: >> Certainly, yes! Dealing with fixed-length fields would be necessary. >> The >> case I had in mind had both -- a separator ("|") __and__ fixed-length >> fields -- and is probably very special in that sense. But such >> data-file

Re: [Numpy-discussion] More loadtxt() changes

2008-11-27 Thread Nils Wagner
On Thu, 27 Nov 2008 09:08:41 +0100 Manuel Metz <[EMAIL PROTECTED]> wrote: > Pierre GM wrote: >> On Nov 26, 2008, at 5:55 PM, Ryan May wrote: >> >>> Manuel Metz wrote: Ryan May wrote: > 3) Better support for missing values. The docstring >mentions a > way of > handling mi

Re: [Numpy-discussion] More loadtxt() changes

2008-11-27 Thread Pierre GM
On Nov 27, 2008, at 3:08 AM, Manuel Metz wrote: >> > > Certainly, yes! Dealing with fixed-length fields would be necessary. > The > case I had in mind had both -- a separator ("|") __and__ fixed-length > fields -- and is probably very special in that sense. But such > data-files exist out there

Re: [Numpy-discussion] More loadtxt() changes

2008-11-27 Thread Manuel Metz
Pierre GM wrote: > On Nov 26, 2008, at 5:55 PM, Ryan May wrote: > >> Manuel Metz wrote: >>> Ryan May wrote: 3) Better support for missing values. The docstring mentions a way of handling missing values by passing in a converter. The problem with this is that you have

Re: [Numpy-discussion] More loadtxt() changes

2008-11-26 Thread Pierre GM
On Nov 26, 2008, at 5:55 PM, Ryan May wrote: > Manuel Metz wrote: >> Ryan May wrote: >>> 3) Better support for missing values. The docstring mentions a >>> way of >>> handling missing values by passing in a converter. The problem >>> with this is >>> that you have to pass in a converter for

Re: [Numpy-discussion] More loadtxt() changes

2008-11-26 Thread Ryan May
Manuel Metz wrote: > Ryan May wrote: >> 3) Better support for missing values. The docstring mentions a way of >> handling missing values by passing in a converter. The problem with this is >> that you have to pass in a converter for *every column* that will contain >> missing values. If you have

Re: [Numpy-discussion] More loadtxt() changes

2008-11-26 Thread Ryan May
John Hunter wrote: > On Tue, Nov 25, 2008 at 11:23 PM, Ryan May <[EMAIL PROTECTED]> wrote: > >> Updated patch attached. This includes: >> * Updated docstring >> * New tests >> * Fixes for previous issues >> * Fixes to make new tests actually work >> >> I appreciate any and all feedback. > >

Re: [Numpy-discussion] More loadtxt() changes

2008-11-26 Thread John Hunter
On Tue, Nov 25, 2008 at 11:23 PM, Ryan May <[EMAIL PROTECTED]> wrote: > Updated patch attached. This includes: > * Updated docstring > * New tests > * Fixes for previous issues > * Fixes to make new tests actually work > > I appreciate any and all feedback. I'm having trouble applying your p

Re: [Numpy-discussion] More loadtxt() changes

2008-11-26 Thread Manuel Metz
Ryan May wrote: > Hi, > > I have a couple more changes to loadtxt() that I'd like to code up in time > for 1.3, but I thought I should run them by the list before doing too much > work. These are already implemented in some fashion in > matplotlib.mlab.csv2rec(), but the code bases are different

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread Ryan May
Pierre GM wrote: On Nov 25, 2008, at 10:02 PM, Ryan May wrote: Pierre GM wrote: * Your locked version of update won't probably work either, as you force the converter to output a string (you set the status to largest possible, that's the one that outputs strings). Why don't you set the status

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread Ryan May
Pierre GM wrote: > On Nov 25, 2008, at 10:02 PM, Ryan May wrote: >> Pierre GM wrote: >>> * Your locked version of update won't probably work either, as you >>> force >>> the converter to output a string (you set the status to largest >>> possible, that's the one that outputs strings). Why don't y

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread Pierre GM
On Nov 25, 2008, at 10:02 PM, Ryan May wrote: > Pierre GM wrote: >> >> * Your locked version of update won't probably work either, as you >> force >> the converter to output a string (you set the status to largest >> possible, that's the one that outputs strings). Why don't you set the >> status

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread Ryan May
Pierre GM wrote: > Ryan, > Quick comments: > > * I already have some unittests for StringConverter, check the file I > attach. Ok, great. > * Your str2bool will probably mess things up in upgrade compared to the > one JDH had written (the one I sent you): you don't wanna use > int(bool(value)

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread Charles R Harris
On Tue, Nov 25, 2008 at 5:00 PM, Pierre GM <[EMAIL PROTECTED]> wrote: > All, another question: > What's the best way to have some kind of sandbox for code like the one Ryan > is writing ? So that we can try it, modify it, without committing anything to > SVN yet ? > Probably make a branch and do

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread Pierre GM
Ryan, Quick comments: * I already have some unittests for StringConverter, check the file I attach. * Your str2bool will probably mess things up in upgrade compared to the one JDH had written (the one I sent you): you don't wanna use int(bool(value)), as it'll always give you 0 or 1 when
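The int(bool(value)) concern can be seen in a few lines. The point appears to be that a converter which never raises cannot signal that a column needs a different type. The strict str2bool below is a hypothetical stand-in written for this example; it is not the function JDH wrote or the one in Ryan's patch.

```python
def str2bool(value):
    """Hypothetical strict converter: accept only 'true'/'false' spellings."""
    text = value.strip().lower()
    if text == "true":
        return True
    if text == "false":
        return False
    # Raising lets a caller fall back to another converter.
    raise ValueError("not a boolean: %r" % value)

# bool() of any non-empty string is True, so this never fails
# and never distinguishes the inputs.
print(int(bool("False")))  # 1
print(int(bool("3.14")))   # 1

# The strict version rejects non-boolean text instead.
try:
    str2bool("3.14")
    rejected = False
except ValueError:
    rejected = True
print(rejected)  # True
```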

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread Ryan May
Pierre GM wrote: Sounds like a plan. Wouldn't mind getting more feedback from fellow users before we get too deep, however... Ok, I've attached, as a first cut, a diff against SVN HEAD that does (I think) what I'm looking for. It passes all of the old tests and passes my own quick test. A

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread Pierre GM
Oh don't mention... However, I'd be quite grateful if you could give an eye to the pb of mixing np.scalars and 0d subclasses of ndarray: looks like it's a C pb, quite out of my league... http://scipy.org/scipy/numpy/ticket/826 http://article.gmane.org/gmane.comp.python.numeric.general/26354/ma

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread Travis E. Oliphant
Pierre GM wrote: > OK then, I'll take care of that over the next few weeks... > > Thanks Pierre. -Travis

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread Travis E. Oliphant
John Hunter wrote: > On Tue, Nov 25, 2008 at 12:16 PM, Pierre GM <[EMAIL PROTECTED]> wrote: > > >> A la mlab.csv2rec ? It could work with a bit more tweaking, basically >> following John Hunter's et al. path. What happens when the column names are >> unknown (read from the header) or wrong ? >>

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread Pierre GM
OK then, I'll take care of that over the next few weeks... On Nov 25, 2008, at 4:56 PM, John Hunter wrote: > On Tue, Nov 25, 2008 at 2:01 PM, Pierre GM <[EMAIL PROTECTED]> > wrote: >> >> On Nov 25, 2008, at 2:26 PM, John Hunter wrote: >>> >>> Yes, I've said on a number of occasions I'd like to

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread John Hunter
On Tue, Nov 25, 2008 at 2:01 PM, Pierre GM <[EMAIL PROTECTED]> wrote: > > On Nov 25, 2008, at 2:26 PM, John Hunter wrote: >> >> Yes, I've said on a number of occasions I'd like to see these >> functions in numpy, since a number of them make more sense as numpy >> methods than as stand alone functio

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread Pierre GM
On Nov 25, 2008, at 3:33 PM, Ryan May wrote: > > You couldn't run this loop on the array returned by np.loadtxt() (by > masking on the appropriate fill value)? Yet an extra loop... Doable, yes... But meh.
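For reference, masking on a fill value after the fact might look like the sketch below. The sentinel -999.0 and the sample data are assumptions for illustration; np.ma.masked_values does the extra pass over the array that is being discussed.

```python
import io
import numpy as np

FILL = -999.0  # assumed sentinel marking missing entries

text = "1.0,-999.0,3.0\n4.0,5.0,-999.0\n"
data = np.loadtxt(io.StringIO(text), delimiter=",")

# Second pass: mask every entry equal to the fill value.
masked = np.ma.masked_values(data, FILL)
print(masked.mask)
print(masked.mean(axis=0))  # column means ignore masked entries
```

This only works when the sentinel is a valid value of the column's dtype, which is part of why a dedicated function in numpy.ma was proposed.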

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread Ryan May
Pierre GM wrote: > Nope, we still need to double check whether there's any missing data > in any field of the line we process, independently of the conversion. > So there must be some extra loop involved, and I'd need a special > function in numpy.ma to take care of that. So our options are >

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread Pierre GM
>> It shouldn't create any *extra* temporaries since we already make a >> list > of lists before creating the final array. It just introduces an extra > looping step. (I'd reuse the existing list of lists). Cool then, go for it. > If my understanding of > StringConverter is correct, tweaking

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread Ryan May
> On Nov 25, 2008, at 2:37 PM, Ryan May wrote: >> What about doing the parsing and type inference in a loop and holding >> onto the already split lines? Then loop through the lines with the >> converters that were finally chosen? In addition to making my usecase >> work, this has the benefit of n
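The two-pass idea (split every line once, settle on converters, then convert the stored pieces) can be sketched in plain Python. The names infer_converter and load_rows are hypothetical, and the int, float, str fallback order is an assumption for this example; it is not the StringConverter logic from the patch.

```python
def infer_converter(values):
    """Return the first of int, float, str that converts every value."""
    for conv in (int, float):
        try:
            for v in values:
                conv(v)
        except ValueError:
            continue  # this type fails for some value, try the next one
        return conv
    return str

def load_rows(lines, delimiter=","):
    # Pass 1: split each non-blank line once and keep the pieces.
    rows = [line.strip().split(delimiter) for line in lines if line.strip()]
    columns = list(zip(*rows))
    converters = [infer_converter(col) for col in columns]
    # Pass 2: apply the chosen converters to the stored split rows.
    return [tuple(conv(v) for conv, v in zip(converters, row)) for row in rows]

rows = load_rows(["1,2.5,abc", "2,3,def"])
print(rows)  # [(1, 2.5, 'abc'), (2, 3.0, 'def')]
```

Because the split rows are reused, the second pass adds a loop but no second read of the file.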

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread Pierre GM
On Nov 25, 2008, at 2:26 PM, John Hunter wrote: > > Yes, I've said on a number of occasions I'd like to see these > functions in numpy, since a number of them make more sense as numpy > methods than as stand alone functions. Great. Could we think about getting that on for 1.3x, would you have t

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread Ryan May
Pierre GM wrote: > On Nov 25, 2008, at 2:06 PM, Ryan May wrote: >> 1) It looks like the function returns a structured array rather than a >> rec array, so that fields are obtained by doing a dictionary access. >> Since it's a dictionary access, is there any reason that the header >> needs to be mun

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread John Hunter
On Tue, Nov 25, 2008 at 12:16 PM, Pierre GM <[EMAIL PROTECTED]> wrote: > A la mlab.csv2rec ? It could work with a bit more tweaking, basically > following John Hunter's et al. path. What happens when the column names are > unknown (read from the header) or wrong ? > > Actually, I'd like John to co

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread Pierre GM
On Nov 25, 2008, at 2:06 PM, Ryan May wrote: > > 1) It looks like the function returns a structured array rather than a > rec array, so that fields are obtained by doing a dictionary access. > Since it's a dictionary access, is there any reason that the header > needs to be munged to replace chara
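The difference between the two access styles can be shown briefly. This is a sketch with made-up field names: a structured array is indexed by field name like a dictionary, so a name such as "wind speed" is usable as-is, while attribute access on a recarray only helps for names that are valid Python identifiers.

```python
import numpy as np

# Assumed example fields; "wind speed" is not a valid identifier.
data = np.array(
    [(1.5, 20.0), (2.5, 21.0)],
    dtype=[("wind speed", float), ("temp", float)],
)

# Structured array: dictionary-style access works for any field name.
print(data["wind speed"])

# Record array view: attribute access works for identifier-like names.
rec = data.view(np.recarray)
print(rec.temp)

# Item access is still available on the recarray for the other name.
print(rec["wind speed"])
```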

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread Ryan May
Pierre GM wrote: > Ryan, > FYI, I've been coding over the last couple of weeks an extension of > loadtxt for a better support of masked data, with the option to read > column names in a header. Please find an example below (I also have > unittest). Most of the work is actually inspired from mat

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread Christopher Barker
Pierre GM wrote: >> would it possible to specify column header, rather than number here? > > A la mlab.csv2rec ? I'll have to take a look at that. > following John Hunter's et al. path. What happens when the column > names are unknown (read from the header) or wrong ? well, my use case is tha

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread Pierre GM
On Nov 25, 2008, at 12:30 PM, Christopher Barker wrote: > > """ > missing : string, optional > A string representing a missing value, irrespective of the > column where it appears (e.g., ``'missing'`` or ``'unused'``). > """ > > It might be nice if "missing" could be a sequence of strings,

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread Christopher Barker
Pierre GM wrote: > FYI, I've been coding over the last couple of weeks an extension of > loadtxt for a better support of masked data, with the option to read > column names in a header. Please find an example below great, thanks! this could be very useful to me. Two comments: """ missing : st

Re: [Numpy-discussion] More loadtxt() changes

2008-11-25 Thread Pierre GM
Ryan, FYI, I've been coding over the last couple of weeks an extension of loadtxt for a better support of masked data, with the option to read column names in a header. Please find an example below (I also have unittest). Most of the work is actually inspired from matplotlib's mlab.csv2re

[Numpy-discussion] More loadtxt() changes

2008-11-25 Thread Ryan May
Hi, I have a couple more changes to loadtxt() that I'd like to code up in time for 1.3, but I thought I should run them by the list before doing too much work. These are already implemented in some fashion in matplotlib.mlab.csv2rec(), but the code bases are different enough, that pretty much onl