In article <CANm_+Zqmsgo8Q+Oz_0RCya-hJv4Q7PqynDb=lzrgvbtxgy3...@mail.gmail.com>, Anne Archibald <aarch...@physics.mcgill.ca> wrote:
> There was also some work on a semi-mutable array type that allowed
> appending along one axis, then 'freezing' to yield a normal numpy
> array (unfortunately I'm not sure how to find it in the mailing list
> archives). One could write such a setup by hand, using mmap() or
> realloc(), but I'd be inclined to simply write a filter that converted
> the text file to some sort of binary file on the fly, value by value.
> Then the file can be loaded in or mmap()ed. A 1 Gb text file is a
> miserable object anyway, so it might be desirable to convert to (say)
> HDF5 and then throw away the text file.

Thank you and the others for your help. It seems a shame that loadtxt has no argument for the predicted length, which would allow preallocation and less appending/copying of data. And yes, reading the whole file first to figure out how many elements it has seems sensible to me -- at least as a switchable behavior, and preferably as the default. 1 Gb isn't that large on modern systems, but loadtxt is filling up all 6 Gb of RAM reading it!

I'll suggest the HDF5 solution to my colleague. Meanwhile, I think he's hacked around the problem by reading the file through once to figure out the array length, allocating that, and then reading the data in with a Python loop. It sounds slow, but it's working.

-- Russell
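
A minimal sketch of Anne's convert-then-mmap suggestion -- the file names, the whitespace-delimited format, and the float64 dtype are all assumptions:

import numpy as np

def text_to_binary(txt_path, bin_path, dtype=np.float64):
    # Stream the text file into raw binary, value by value, so the
    # whole thing never has to sit in memory at once.
    with open(txt_path) as src, open(bin_path, "wb") as dst:
        for line in src:
            np.array(line.split(), dtype=dtype).tofile(dst)

text_to_binary("data.txt", "data.bin")
# The binary file can then be mmap()ed; reshape if the data is 2-D.
data = np.memmap("data.bin", dtype=np.float64, mode="r")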
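
A sketch of the convert-to-HDF5 route, assuming the h5py package and a whitespace-delimited table with a known column count; it streams the text in chunks into a resizable dataset, so the 1 Gb file never has to fit in RAM:

import h5py

def text_to_hdf5(txt_path, h5_path, ncols, chunk_rows=100000):
    with open(txt_path) as src, h5py.File(h5_path, "w") as f:
        # A dataset that is resizable (appendable) along axis 0.
        dset = f.create_dataset("data", shape=(0, ncols),
                                maxshape=(None, ncols), dtype="f8")
        buf = []
        for line in src:
            if line.strip():
                buf.append([float(x) for x in line.split()])
            if len(buf) >= chunk_rows:
                n = dset.shape[0]
                dset.resize(n + len(buf), axis=0)
                dset[n:] = buf
                buf = []
        if buf:  # flush whatever is left over
            n = dset.shape[0]
            dset.resize(n + len(buf), axis=0)
            dset[n:] = buf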
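
And a sketch of the two-pass workaround Russell's colleague used -- one pass to count the rows, then preallocate and fill; the function name, file name, and column count are hypothetical:

import numpy as np

def loadtxt_preallocated(path, ncols, dtype=np.float64):
    # Pass 1: count the data rows so the array can be allocated up front.
    with open(path) as f:
        nrows = sum(1 for line in f if line.strip())
    out = np.empty((nrows, ncols), dtype=dtype)
    # Pass 2: fill the preallocated array row by row, no resizing/copying.
    with open(path) as f:
        i = 0
        for line in f:
            if line.strip():
                out[i] = [float(x) for x in line.split()]
                i += 1
    return out

data = loadtxt_preallocated("data.txt", ncols=3)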