In article <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]> wrote:
>Thanks to all who replied. It's very appreciated.
>
>Yes, I had to doublecheck line counts and the number of lines is ~16
>million (instead of stated 1.6B).
>
>Also:
>
>>What is a "Unicode text file"? How is it encoded: utf8, utf16, utf16le,
>>utf16be, ??? If you don't know, do this:
>The file is UTF-8
>
>> Do the first two characters always belong to the ASCII subset?
>Yes, first two always belong to ASCII subset
>
>> What are you going to do with it after it's sorted?
>I need to isolate all lines that start with two characters (zz to be
>particular)

As in:

    grep '^zz' longfile > aapje

You will have a hard time beating that with Python, in every respect.

<SNIP>
>
>Cheers,
>
>Ira

Regards, Albert

-- 
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- like all pyramid schemes -- ultimately falters.
[EMAIL PROTECTED]&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
