On 06-08-2009 at 01:54:46 PeroMHC <[email protected]> wrote:
This snippet represents 3 individual DNA sequences. Each sequence is
identified by the line starting with >
The complete file has about 10 million individual sequences.
A simple enough problem: I want to read in this data, and cut out the
last 76 letters (nucleotides) from each individual sequence and send
them to a new txt file with a similar format.
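If I understand correctly, you want something like this:
    # Assumes each sequence sits on a single line below its '>' header.
    with open(path_to_the_input_file) as fasta:
        with open(path_to_the_output_file, 'w') as nucleotides:
            for line in fasta:
                if line.startswith('>'):
                    # placeholder header; put the real identifier here
                    nucleotides.write('> foo bar length=76\n')
                elif line.strip():
                    # keep only the last 76 nucleotides of the sequence
                    nucleotides.write(line.strip()[-76:] + '\n')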
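If a sequence may be wrapped over several lines, going line by line is not enough, because the last 76 letters can span more than one line. Below is a rough sketch of one way to handle that, assuming the usual layout of a '>' header followed by one or more sequence lines. The names read_fasta and write_tails are made up for this example, and the original header is copied through unchanged, so any length= field in it would still show the old length.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs, joining wrapped sequence lines."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith('>'):
            # a new record begins: emit the one collected so far
            if header is not None:
                yield header, ''.join(chunks)
            header, chunks = line, []
        elif header is not None:
            chunks.append(line)
    # emit the final record
    if header is not None:
        yield header, ''.join(chunks)


def write_tails(fasta, out, n=76):
    """Write each header followed by the last n letters of its sequence."""
    for header, seq in read_fasta(fasta):
        out.write(header + '\n')
        out.write(seq[-n:] + '\n')


# Small in-memory demonstration; with real files, pass objects from open().
demo_in = io.StringIO(">s1\nACG\nTA\n>s2\nGG\n")
demo_out = io.StringIO()
write_tails(demo_in, demo_out, n=3)
```

Since the generator holds only one record at a time, memory use should stay small even with about 10 million sequences.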
Cheers,
*j
--
Jan Kaliszewski (zuo) <[email protected]>