Hello List, I'm trying to read some sequence files and modify them to a particular format. These files are structured something like:
    >P1; ICA1_HUMAN
    AAEVDTG..... (A very long sequence of letters)
    >P1;ICA1_BOVIN
    TRETG....(A very long sequence of letters)
    >P1;ICA2_HUMAN
    WKH.....(another sequence)

I also read a database file that holds the information I need to modify my sequence files. I must extract one of the data fields from the database (I have done this) and place it in the sequence file (structure shown above). The relevant database fields look like:

    tt; ICA1_HUMAN Description
    tt; ICA1_BOVIN Description
    tt; ICA2_HUMAN Description

What I would like is to extract the tt; fields (I already have code for that), then read through the sequence file and insert the tt; field corresponding to each >P1 header right underneath that header. Basically, I need a new line every time >P1 occurs in the sequence file, and I need to paste the corresponding tt; field into that new line (for P1; ICA1_HUMAN, that would be ICA1_HUMAN Description, etc).

The pseudocode would go like this:

    for line in sequence file:
        if line.startswith('>P1; ICA ....)
            make a new line
            go to list with extracted tt; fields*
            find the one with the same query (tt; ICA1 ...)*
            insert this field in the new line

The steps marked * are the ones I am not sure how to implement. What logical structure would I need to make Python match a tt; field (I already have the list of entries) whenever it finds a header with the same content?

Apologies for the verbosity, but I did want to be clear as it is quite specific.

S.
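One possible way to handle the two starred steps is to turn the list of tt; entries into a dictionary keyed by the entry name, so that each >P1 header can be looked up directly instead of searching the list every time. The sketch below is only an illustration under some assumptions: each header sits on its own line, the entry name is the first word after ">P1;" (with or without a space), and each tt; entry looks like "tt; NAME Description". The function names build_tt_lookup and annotate_sequences are hypothetical, and the sample data is a shortened placeholder for the real files.

```python
def build_tt_lookup(tt_lines):
    """Map each entry name (e.g. 'ICA1_HUMAN') to its 'NAME Description' text."""
    lookup = {}
    for line in tt_lines:
        if not line.startswith("tt;"):
            continue
        field = line[len("tt;"):].strip()
        if not field:
            continue
        # The first word of the field is assumed to be the entry name.
        entry_id = field.split(None, 1)[0]
        lookup[entry_id] = field
    return lookup


def annotate_sequences(seq_lines, lookup):
    """Yield every sequence line, adding the matching tt; text after each >P1 header."""
    for line in seq_lines:
        line = line.rstrip("\n")
        yield line
        if line.startswith(">P1;"):
            # Works for both '>P1; ICA1_HUMAN' and '>P1;ICA1_BOVIN'.
            parts = line[len(">P1;"):].split()
            entry_id = parts[0] if parts else ""
            if entry_id in lookup:
                yield lookup[entry_id]


# Placeholder data standing in for the real database fields and sequence file.
tt_fields = [
    "tt; ICA1_HUMAN Description",
    "tt; ICA1_BOVIN Description",
    "tt; ICA2_HUMAN Description",
]
sequence_lines = [
    ">P1; ICA1_HUMAN\n",
    "AAEVDTG\n",
    ">P1;ICA1_BOVIN\n",
    "TRETG\n",
    ">P1;ICA2_HUMAN\n",
    "WKH\n",
]

annotated = list(annotate_sequences(sequence_lines, build_tt_lookup(tt_fields)))
print("\n".join(annotated))
```

With a real file, the lines could come from iterating over an open file object instead of a list, and the annotated lines could be written to a new file. Headers with no matching tt; entry are passed through unchanged here; whether that should instead be reported as an error is a design choice.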
_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor