On 03/31/2011 10:08 AM, Ralf Gommers wrote:
> On Thu, Mar 31, 2011 at 5:03 PM, Bruce Southey<[email protected]> wrote:
>> On Wed, Mar 30, 2011 at 9:53 PM, Charles R Harris
>> <[email protected]> wrote:
>>>
>>> On Sun, Mar 27, 2011 at 4:09 AM, Paul Anton Letnes
>>> <[email protected]> wrote:
>>>> On 26. mars 2011, at 21.44, Derek Homeier wrote:
>>>>
>>>>> Hi Paul,
>>>>>
>>>>> having had a look at the other tickets you dug up,
>>>>>
>> [snip]
>>>>>> 1071:
>>>>>> It is not clear to me whether loadtxt is supposed to support
>>>>>> missing values in the fashion indicated in the ticket.
>>>>> In principle it should at least allow you to, by the use of converters
>>>>> as described there.
>>>>> The problem is, the default delimiter is described as 'any
>>>>> whitespace', which in the present implementation obviously includes
>>>>> any number of blanks or tabs. These are therefore treated differently
>>>>> from delimiters like ',' or '&'. I'd reckon there are too many people
>>>>> actually relying on this behaviour to silently change it (e.g. I know
>>>>> plenty of tables with columns separated by either one or several
>>>>> tabs depending on the length of the previous entry). But the tab is
>>>>> apparently also treated differently if explicitly specified with
>>>>> "delimiter='\t'" - and in that case using a converter à la
>>>>> {2: lambda s: float(s or 'Nan')} is working for fields in the middle
>>>>> of the line, but not at the end - clearly warrants improvement. I've
>>>>> prepared a patch working for Python3 as well.
>>>> Great!
>>>>
>> This is an invalid ticket because the docstring clearly states, in
>> 3 different yet critical places, that missing values are not handled
>> here:
>>
>> "Each row in the text file must have the same number of values."
>> "genfromtxt : Load data with missing values handled as specified."
>> " This function aims to be a fast reader for simply formatted files. The
>> `genfromtxt` function provides more sophisticated handling of, e.g.,
>> lines with missing values."
>>
>> Really I am trying to separate the usage of loadtxt and genfromtxt to
>> avoid unnecessary duplication and confusion. Part of this is
>> historical because loadtxt was added in 2007 and genfromtxt was added
>> in 2009. So really certain features of loadtxt have been 'kept' for
>> backwards compatibility purposes yet these features can be 'abused' to
>> handle missing data. But I really consider that any missing values
>> should cause loadtxt to fail.
> I agree with you Bruce, but it would be easier to discuss this on the
> tickets instead of here. Could you add your comments there please?
>
> Ralf
>
'Easier' seems a contradiction when you have to use a captcha... Sure, I
will add more comments there.

Bruce
_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
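
For reference, the loadtxt/genfromtxt distinction discussed in this thread can be sketched roughly as below. This is only a minimal illustration, not code from the tickets or the patch: the sample data and the helper names read_with_genfromtxt and read_with_loadtxt are made up here, and the converter simply follows the float(s or 'Nan') idiom quoted above. Only a missing field in the middle of a line is shown, since the thread reports that the end-of-line case did not work with loadtxt at the time.

```python
from io import StringIO

import numpy as np

# Hypothetical tab-separated sample; the second row has an empty middle field.
SAMPLE = "1\t2\t3\n4\t\t6\n"


def read_with_genfromtxt(text):
    # With an explicit delimiter, genfromtxt treats an empty field as a
    # missing value and fills it with nan for float columns by default.
    return np.genfromtxt(StringIO(text), delimiter="\t")


def read_with_loadtxt(text):
    # Converter in the style quoted in the thread: map an empty field to nan.
    # Depending on the NumPy version, `s` may be str or bytes; an empty value
    # is falsy in both cases, so `s or "nan"` falls back to "nan".
    def to_float(s):
        return float(s or "nan")

    # Column index 1 is the column with the missing entry in SAMPLE.
    return np.loadtxt(StringIO(text), delimiter="\t", converters={1: to_float})


a = read_with_genfromtxt(SAMPLE)
b = read_with_loadtxt(SAMPLE)
print(a)
print(b)
```

Whether loadtxt should accept this kind of converter workaround at all, or simply fail on missing values as argued above, is the open question for the ticket; the sketch only shows what each function is being asked to do.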
