On Nov 17, 2007 8:20 PM, Kent Johnson <[EMAIL PROTECTED]> wrote:
> I would wrap the record buffering into a generator function and probably
> use plain slicing to return the individual records instead of StringIO.
> I have a writeup on generators here:
>
> http://personalpages.tds.net/~kent37/kk/00004.html
>
> Kent
I can see the benefit of wrapping the buffering in a generator - after all, a large point of the exercise is to make this and all future programs simpler to write and to debug - but I don't understand the second part:

> use plain slicing to return the individual records instead of StringIO.

I hope I'm not being obtuse, but could you clarify that?

Meditating further, I begin to see a way through: the generator reads a chunk 4096 records long, an internal offset keeps track of how far along to slice off the next record, and I need to check that I don't try to slice past the end of the loaf. (I've sketched my reading of this below.) But I'm not sure I see how this makes my life better than using StringIO - especially since I'm actually using cStringIO, with a "just-in-case" fallback in the import section (also shown below), and it seems to be pretty fast.

Of course, if I load the entire file into memory I only need to check the size once, but several of my clients have transaction files (to name only one file) that are larger than their physical RAM, so that approach would seem... problematic. I'm trying to enhance performance, not kill it.
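To make sure I'm reading you right, here's a rough sketch of what I think you mean. The file name, record size, and process step are placeholders I made up, and I haven't profiled any of this:

    def read_records(path, rec_size, chunk_records=4096):
        """Yield fixed-size records from path, one record at a time.

        Reads chunk_records records per disk read, then hands out
        individual records by plain slicing - no StringIO involved.
        """
        chunk_bytes = rec_size * chunk_records
        f = open(path, 'rb')
        while True:
            chunk = f.read(chunk_bytes)
            if not chunk:
                break
            # A slice can't run past the end of a string - it just
            # comes back short - so a truncated final record at EOF
            # needs no special case.
            for offset in xrange(0, len(chunk), rec_size):
                yield chunk[offset:offset + rec_size]
        # Closed once the generator is exhausted; an abandoned
        # generator relies on garbage collection to close the file.
        f.close()

    # Hypothetical usage: 128-byte records in 'transactions.dat'.
    for record in read_records('transactions.dat', 128):
        pass    # real code would process the record here

If that's the idea, I see that the slicing-past-the-end worry takes care of itself.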
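For completeness, the fallback import I mentioned is just the usual idiom:

    try:
        from cStringIO import StringIO    # C implementation, faster
    except ImportError:
        from StringIO import StringIO     # pure-Python fallback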
Thanks in advance for your insight -

Marc
-- 
www.fsrtechnologies.com

_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor