Ignore that last mail, I hit send instead of save by mistake.
Between you, you seem to be right: it's a problem with loading the array of
strings. There must be some large strings in the first 'rowname' column. If
this column is left out, it works fine (even as strings).
Many thanks, sorry for all
Apologies for the multiple posts, people. My posting to the forum was
pending for a long time, so I deleted it and tried emailing directly. I
didn't think they'd all be sent out.
Gokan, thanks for the reply, I hope you get this one.
"Here I use loadtxt to read ~89 MB txt file. Can you use loadtxt
Dave Wood wrote:
> Well, I suppose they are all considered to be strings here. I haven't
> tried to convert the numbers to floats yet.
This could be an issue. For strings, numpy creates an array of strings,
all of the same length, so each element is as big as the largest one:
In [13]: l
Out[13]
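
A minimal sketch of the fixed-width behaviour described above, using made-up values (the list and variable names are illustrative, not taken from the original session). On Python 3 the default string dtype is Unicode at 4 bytes per character, so the numbers differ from a 2009-era byte-string array, but the principle is the same:

```python
import numpy as np

# Hypothetical data: two short strings and one long one.
values = ["a", "bb", "x" * 100]
arr = np.array(values)

# Every element is stored at the width of the longest string,
# so one long entry inflates the storage for all of them.
print(arr.dtype)     # fixed-width string dtype sized to the longest entry
print(arr.itemsize)  # bytes used per element
print(arr.nbytes)    # itemsize * number of elements
```

So a single very long row name in a 56k-row column would be paid for 56k times over.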
On Wed, Sep 23, 2009 at 9:42 AM, davew wrote:
>
> Hi,
>
> I've got a fairly large (but not huge, 58 MB) tab separated text file, with
> approximately 200 columns and 56k rows of numbers and strings.
>
> Here's a snippet of my code to create a numpy matrix from the data file...
>
>
>
> data
On 09/23/2009 10:00 AM, Dave Wood wrote:
"If the text file has 'numbers and strings' how is numpy meant to know
what dtype to use?
Please try genfromtxt especially if columns contain both numbers and
strings."
Well, I suppose they are all considered to be strings here. I haven't
tried to convert
"If the text file has 'numbers and strings' how is numpy meant to know
what dtype to use?
Please try genfromtxt especially if columns contain both numbers and
strings."
Well, I suppose they are all considered to be strings here. I haven't tried
to convert the numbers to floats yet.
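
For reference, a rough sketch of what the genfromtxt suggestion might look like on a mixed file. The two-row sample is invented for illustration, and `dtype=None` asks numpy to guess a type per column, which may not suit the real data:

```python
import io
import numpy as np

# Made-up stand-in for the tab separated file: one name column, two numeric.
sample = io.StringIO("row_a\t1.0\t2.5\nrow_b\t3.0\t4.5\n")

# dtype=None lets genfromtxt infer a type for each column; the result is a
# structured array whose fields get the default names f0, f1, f2.
table = np.genfromtxt(sample, delimiter="\t", dtype=None, encoding="utf-8")

names = table["f0"]   # string column
first = table["f1"]   # float column
print(table.dtype.names)
print(first)
```

With a real file, the path would be passed in place of the StringIO object.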
"What happens
On 09/23/2009 08:42 AM, davew wrote:
> Hi,
>
> I've got a fairly large (but not huge, 58 MB) tab separated text file, with
> approximately 200 columns and 56k rows of numbers and strings.
>
> Here's a snippet of my code to create a numpy matrix from the data file...
>
>
>
> data = map(lambd
Hi,
I've got a fairly large (but not huge, 58 MB) tab separated text file, with
approximately 200 columns and 56k rows of numbers and strings.
Here's a snippet of my code to create a numpy matrix from the data file...
data = map(lambda x : x.strip().split('\t'), sys.stdin.readlines())
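
One possible alternative, sketched under the assumption that the first column holds row names and every remaining column is numeric (the thread does not confirm this). It reads line by line instead of calling readlines(), keeps the names in a plain Python list, and stores only the numbers in the array. The sample input is invented; in the real script the loop would iterate over sys.stdin:

```python
import io
import numpy as np

# Made-up stand-in for sys.stdin.
sample = io.StringIO("row_a\t1.0\t2.5\nrow_b_with_a_long_name\t3.0\t4.5\n")

rownames = []  # kept as ordinary Python strings, not fixed-width
rows = []
for line in sample:
    fields = line.rstrip("\n").split("\t")
    rownames.append(fields[0])
    rows.append([float(v) for v in fields[1:]])

# Only the numeric part goes into the array, at 8 bytes per value.
data = np.array(rows, dtype=float)
print(data.shape)
```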