On Nov 17, 2007 8:20 PM, Kent Johnson <[EMAIL PROTECTED]> wrote:
> I would wrap the record buffering into a generator function and probably
> use plain slicing to return the individual records instead of StringIO.
> I have a writeup on generators here:
>
> http://personalpages.tds.net/~kent37/kk/00004.html
>
> Kent
I can see the benefit of wrapping the buffering in a generator - after all, a large point of the exercise is to make this and all future programs simpler to write and to debug - but I don't understand the second part:

> use plain slicing to return the individual records instead of StringIO.

I hope I'm not being obtuse, but could you clarify that?

Meditating further, I begin to see a way through: the generator reads a chunk 4096 records long, an internal offset keeps track of how far along to slice off the next record, and I need to check that I don't try to slice past the end of the loaf. (I've sketched my reading of this below.) But I'm not sure I see how this makes my life better than using StringIO - especially since I'm actually using cStringIO, with a "just-in-case" fallback in the import section (also shown below), and it seems to be pretty fast.

Of course, if I load the entire file into memory I only need to check the size once, but several of my clients have transaction files (to name only one file) that are larger than their physical RAM, so that approach would seem... problematic. I'm trying to enhance performance, not kill it.
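To make sure I'm reading you right, here's a rough sketch of what I think you mean. The file name, record size, and process step are placeholders I made up, and I haven't profiled any of this:

    def read_records(path, rec_size, chunk_records=4096):
        """Yield fixed-size records from path, one record at a time.

        Reads chunk_records records per disk read, then hands out
        individual records by plain slicing - no StringIO involved.
        """
        chunk_bytes = rec_size * chunk_records
        f = open(path, 'rb')
        while True:
            chunk = f.read(chunk_bytes)
            if not chunk:
                break
            # A slice can't run past the end of a string - it just
            # comes back short - so a truncated final record at EOF
            # needs no special case.
            for offset in xrange(0, len(chunk), rec_size):
                yield chunk[offset:offset + rec_size]
        # Closed once the generator is exhausted; an abandoned
        # generator relies on garbage collection to close the file.
        f.close()

    # Hypothetical usage: 128-byte records in 'transactions.dat'.
    for record in read_records('transactions.dat', 128):
        pass    # real code would process the record here

If that's the idea, I see that the slicing-past-the-end worry takes care of itself.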
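For completeness, the fallback import I mentioned is just the usual idiom:

    try:
        from cStringIO import StringIO    # C implementation, faster
    except ImportError:
        from StringIO import StringIO     # pure-Python fallback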
Thanks in advance for your insight -

Marc
-- 
www.fsrtechnologies.com

_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor