On Tue, 24 May 2011 06:53:30 am Spyros Charonis wrote:
> Hello List,
>
> I'm trying to read some sequence files and modify them to a
> particular [...]

You should almost never modify files in place, especially if you need to
insert text. It *might*, sometimes, be acceptable to modify files in
place if you are just over-writing what is already there, but absolutely
not if you have to insert text!

The problem is that file systems don't support insert. They support
shrinking files, adding to the end, and overwriting in place. To insert,
you have to do a LOT more work, which is slow, fragile and risky: if
something goes wrong, you end up with a corrupted file.

It is almost always better to read the file into memory, process it,
then write the output back out to the file.

You ask:

> for line sequence file:
>     if line.startswith('>P1; ICA ....)
>         make a newline
>         go to list with extracted tt; fields*
>         find the one with the same query (tt; ICA1 ...)*
>         insert this field in the newline

This is better written as some variation of:

    infile = open('sequence file', 'r')
    outfile = open('processed file', 'w')
    for line in infile:
        outfile.write(line)
        if line.startswith('>P1; ICA'):
            new_line = ...  #### what to do here???
            # write() does not add a line ending, so add one ourselves
            outfile.write(new_line + '\n')
    outfile.close()
    infile.close()

The problem then becomes, how to calculate the new_line above. Break
that into steps: you have a line that looks like ">P1; ICA1_HUMAN" and
you want to extract the ICA... part.

    def extract_ica(line):
        line = line.strip()
        if not line.startswith('>P1;'):
            raise ValueError('not a >P1 line')
        # everything after the first semicolon is the field we want
        p = line.index(';')
        s = line[p+1:]
        s = s.strip()
        if s.startswith('ICA'):
            return s
        else:
            raise ValueError('no ICA... field in line')

Meanwhile, you have a dict (not a list, a dictionary) that looks like
this:

    descriptions = {
        'ICA1_BOVINE': description,
        'ICA1_HUMAN': description,
        ...}

If you need help assembling this dict, just ask.

With a dict, searches are easy.
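
For what it's worth, here is a minimal sketch of assembling that dict. It assumes your extracted fields are strings that look something like 'tt; ICA1_HUMAN some description text'. That layout is only my guess, and the names build_descriptions and tt_fields are made up for illustration, so adjust to suit your real data:

```python
def build_descriptions(tt_fields):
    """Map each key (e.g. 'ICA1_HUMAN') to its description text.

    Assumes each field looks like 'tt; KEY description...'. That
    layout is a guess, not something taken from the real files.
    """
    descriptions = {}
    for field in tt_fields:
        field = field.strip()
        if not field.startswith('tt;'):
            continue  # skip anything that isn't a tt; field
        rest = field[len('tt;'):].strip()
        if not rest:
            continue  # nothing after 'tt;'
        # Split once: the first word is the key, the remainder
        # (if any) is the description.
        parts = rest.split(None, 1)
        key = parts[0]
        descr = parts[1] if len(parts) > 1 else ''
        descriptions[key] = descr
    return descriptions


# Hypothetical sample data, only to show the shape of the result.
tt_fields = [
    'tt; ICA1_HUMAN example description one',
    'tt; ICA1_BOVINE example description two',
]
descriptions = build_descriptions(tt_fields)
```

Note that if the same key turns up twice, the later entry silently replaces the earlier one. If that matters for your data, check for the key before assigning.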

Making the new line takes three short lines of code:

    key = extract_ica(line)
    descr = descriptions[key]
    new_line = 'tt; ' + key + ' ' + descr


-- 
Steven D'Aprano