OK - you handled the problem regarding reading to end-of-file. Yes, it
takes a lot longer, because now you are actually iterating through
match_zips for each line.
How large are these files? Consider creating a set from match_zips. As
lists get longer, set membership tests become faster than list
membership tests.
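A minimal sketch of the set suggestion, assuming Python 3. The ZIP
strings below are made-up sample data with newlines already removed,
not the contents of the real file:

```python
# Hypothetical sample data standing in for the entries read from the zips file.
match_zips = ["12345", "23456", "34567", "23456"]

# Build the set once, before looping over the input lines.
# Duplicates collapse, and each lookup is a hash probe instead of a list scan.
zip_set = set(match_zips)

print("23456" in zip_set)   # membership test against the set
print("99999" in zip_set)
print(len(zip_set))         # number of distinct entries
```

The list only has to be converted once, so the cost of building the set
is paid back as soon as more than a handful of lines are tested.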
If the outfile is empty, that means that line[149:154] is never in
match_zips.
I suggest you take a look at match_zips. You will find a list of
strings of length 6, which cannot match line[149:154], a string of
length 5.
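A small sketch of why the lengths can differ, assuming Python 3 and
assuming the extra character is the newline that readlines() keeps at
the end of each line. The io.StringIO object is an in-memory stand-in
for the real zips file, and its contents are invented:

```python
import io

# In-memory stand-in for the zips file; the real contents are not shown here.
zips = io.StringIO("12345\n23456\n")
match_zips = zips.readlines()

# readlines() keeps the line terminator, so each entry is 5 digits plus '\n'.
print(repr(match_zips[0]))
print(len(match_zips[0]))

# Stripping whitespace leaves the 5 characters a slice like line[149:154] can equal.
cleaned = [z.strip() for z in match_zips]
print(len(cleaned[0]))
```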
I am still struggling with this... I have simplified the code because
I need to understand the principle.
#!/usr/bin/env python
import string

def main():
    infile = open("filex")
    outfile = open("results_testx", "w")
    zips = open("zippys", "r")
    match_zips = zips.readlines()
    lines = [line for line in infile if line[0:3] + '\n' in match_zips]
    outfile.write(''.join(lines))
    print line[0:3]
    zips.close()
    infile.close()
    outfile.close()

main()
filex:
112332424
23423423423
34523423423
456234234234
234234234234
5672342342
683824242
zippys:
123
123
234
345
456
567
678
555
I want to output the records from filex whose first 3 characters match
a record in zippys. Desired output:
23423423423
34523423423
456234234234
234234234234
5672342342
I am not sure whether I should put a '\n' somewhere or tweak something
that I just cannot see.
Thanks
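
One possible way to get that output, offered as a sketch rather than a
definitive fix. It assumes Python 3, uses io.StringIO objects as
in-memory copies of the filex and zippys samples above, and strips each
key instead of appending '\n' to the slice, so that a missing final
newline or trailing spaces in zippys cannot break the comparison:

```python
import io

# In-memory copies of the sample data from the message; real code would open the files.
filex = io.StringIO(
    "112332424\n23423423423\n34523423423\n456234234234\n"
    "234234234234\n5672342342\n683824242\n"
)
zippys = io.StringIO("123\n123\n234\n345\n456\n567\n678\n555\n")

# Strip each key and collect the results in a set for fast lookups.
match_zips = set(z.strip() for z in zippys)

# Keep every record whose first three characters are one of the keys.
lines = [line for line in filex if line[0:3] in match_zips]

# Each kept line still ends with its own newline, so join without a separator.
print("".join(lines), end="")
```

With the sample data this keeps the five records listed as the desired
output and drops 112332424 and 683824242.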