OK - you handled the problem regarding reading to end-of-file. Yes it
takes a lot longer, because now you are actually iterating through
match_zips for each line.

How large are these files? Consider creating a set from match_zips. As
lists get longer, set membership tests become much faster than list
membership tests.
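
A minimal sketch of the set idea, assuming match_zips has already been
read with readlines(). The sample values here are made up for
illustration and are not from your files:

```python
# Hypothetical sample data standing in for zips.readlines().
match_zips = ['12345\n', '23456\n', '12345\n']

# Build the set once; duplicates collapse, and each lookup is a hash
# probe instead of a scan through the whole list.
zip_set = set(match_zips)

print('23456\n' in zip_set)   # membership test against the set
print(len(zip_set))           # number of distinct entries
```

The set is built once before the loop over the input file, so the cost
of building it is paid a single time rather than once per line.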

If the outfile is empty that means that line[149:154] is never in
match_zips.

I suggest you take a look at match_zips. You will find a list of strings
of length 6, which cannot match line[149:154], a string of length 5.
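
The length mismatch can be seen in a small sketch like the one below.
The zip values and the record are invented for illustration; only the
slice line[149:154] and the name match_zips come from your code, and
zip_set is a name I made up:

```python
# Hypothetical stand-in for zips.readlines(): each entry keeps its newline.
match_zips = ['12345\n', '23456\n']

# Strip the trailing newline so each entry has length 5, like line[149:154].
zip_set = set(z.rstrip('\n') for z in match_zips)

# Made-up record: 149 filler characters followed by a 5-digit zip.
line = 'x' * 149 + '23456' + ' rest of record\n'

print(len(match_zips[0]))            # length includes the newline
print(line[149:154] in match_zips)   # compares '23456' with '23456\n'
print(line[149:154] in zip_set)      # compares against stripped entries
```

Stripping the newline from the zip entries is one option; appending
'\n' to the slice before testing would be another.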

I am still struggling with this. I have simplified the code because I need to understand the principle.

#!/usr/bin/env python

import string

def main():
     infile = open("filex")
     outfile = open("results_testx", "w")
     zips = open("zippys", "r")
     match_zips = zips.readlines()
     lines = [line for line in infile if line[0:3] + '\n' in match_zips]
     outfile.write(''.join(lines))
     print line[0:3]
     zips.close()
     infile.close()
     outfile.close()
main()

filex:
112332424
23423423423
34523423423
456234234234
234234234234
5672342342
683824242

zippys:
123
123
234
345
456
567
678
555


I want to output records from filex whose first 3 characters match a record in zippys. Output:
23423423423
34523423423
456234234234
234234234234
5672342342

I am not sure where I should put a '\n' or tweak something that I just cannot see.

Thanks
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
