On Nov 18, 2007 5:15 AM, Kent Johnson <[EMAIL PROTECTED]> wrote:

> Marc Tompkins wrote:
> > On Nov 17, 2007 8:20 PM, Kent Johnson <[EMAIL PROTECTED]> wrote:
> >
> >     use plain slicing to return the individual records instead of
> >     StringIO.
> >
> > I hope I'm not being obtuse, but could you clarify that?
>
> I think it will simplify the looping. A sketch, probably needs work:
>
> def by_record(path, recLen):
>     with open(path, 'rb') as inFile:
>         inFile.read(recLen)  # throw away the header record
>         while True:
>             buf = inFile.read(recLen*4096)
>             if not buf:
>                 return
>             for ix in range(0, len(buf), recLen):
>                 yield buf[ix:ix+recLen]
>
> > I'm not sure I see how this makes my
> > life better than using StringIO (especially since I'm actually using
> > cStringIO, with a "just-in-case" fallback in the import section, and it
> > seems to be pretty fast.)
>
> This version seems simpler and more readable to me.
>
> Kent

It does look lean and mean, true. I'll time this against the cStringIO version. One thing, though - I think I need to do

    if len(buf) < recLen:
        return

rather than

    if not buf:
        return

I'll have to experiment again to refresh my memory, but I believe I tried that in one of my first iterations (about a year ago, so I may be remembering wrong). If I remember correctly, read() was still returning a result at the end of the file: something non-empty but shorter than a full record, which doesn't evaluate to false. As you can imagine, hilarity ensued when I tried to slice the last record. Of course, I may have hallucinated that while on an extended caffeine jag, so feel free to disregard!

-- 
www.fsrtechnologies.com
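
For anyone following the thread, here is a minimal sketch of the short-read case discussed above. It is not the code either poster actually ran. It is written for current Python 3 (bytes, not the cStringIO-era strings), and it assumes the file may end with a partial record. The `recs_per_chunk` parameter, the `demo` helper, and the sample data are hypothetical names made up for illustration. Note that a chunk can hold several whole records plus a short tail, so this sketch trims each chunk to a whole number of records rather than only testing `len(buf) < recLen`; whether a trailing fragment should be dropped or reported is a design choice that depends on the data.

```python
import os
import tempfile


def by_record(path, recLen, recs_per_chunk=4096):
    """Yield whole fixed-length records, skipping the header record.

    Assumption: a trailing fragment shorter than recLen is silently dropped.
    """
    with open(path, 'rb') as inFile:
        inFile.read(recLen)  # throw away the header record
        while True:
            buf = inFile.read(recLen * recs_per_chunk)
            # Number of bytes that make up complete records in this chunk.
            whole = len(buf) - len(buf) % recLen
            if whole == 0:
                # Either end of file (empty read) or only a short tail is left.
                return
            for ix in range(0, whole, recLen):
                yield buf[ix:ix + recLen]


def demo():
    """Hypothetical check: header + 3 records + a 2-byte partial tail."""
    data = b'HEAD' + b'AAAA' + b'BBBB' + b'CCCC' + b'DD'
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
        path = tmp.name
    try:
        # A small chunk size forces the tail to arrive glued to a whole record.
        return list(by_record(path, 4, recs_per_chunk=2))
    finally:
        os.remove(path)


if __name__ == '__main__':
    print(demo())  # [b'AAAA', b'BBBB', b'CCCC']
```

In the demo, the second chunk read comes back as `b'CCCCDD'`: it is non-empty, so `if not buf` would not stop the loop, and plain slicing over `len(buf)` would yield the 2-byte fragment `b'DD'` as if it were a record.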