On 3/25/2010 11:34 AM, kumar s wrote:
Dear group:
I need some tips/help from experts.
I have two tab-delimited files.
One file is 4K lines. The other file is 40K lines.
I want to search the contents of one file against the other and print the lines that match.
File 1:
chr X Y
chr1 8337733 8337767 NM_001042682_cds_0_0_chr1_8337734_r 0 - RERE
chr1 8338065 8338246 NM_001042682_cds_1_0_chr1_8338066_r 0 - RERE
chr1 8338746 8338893 NM_001042682_cds_2_0_chr1_8338747_r 0 - RERE
chr1 8340842 8341563 NM_001042682_cds_3_0_chr1_8340843_r 0 - RERE
chr1 8342410 8342633 NM_001042682_cds_4_0_chr1_8342411_r 0 - RERE
File 2:
Chr X Y
chr1 871490 871491
chr1 925085 925086
chr1 980143 980144
chr1 1548655 1548656
chr1 1589675 1589676
chr1 1977853 1977854
chr1 3384899 3384900
chr1 3406309 3406310
chr1 3732274 3732275
I want to search if file 2 X is greater or less than X and Y and print the line of
file 2 and the last column of file 1:
I don't understand your desired result. Could you post a very simplified
example of file1 and file2 and the desired result?
for j in file2:
    col = j.split('\t')
    for k in file1:

Note that the first time this loop is executed it will leave file1 at
eof. Subsequent executions of this loop will process no lines as file1
is at eof.

        cols = k.split('\t')
        if col[1]> cols[1]:

Note that this compares 2 strings, which may not give the same result as
integer comparison.

            if col[1]< cols[2]:
                print j +'\t'+cols[6]
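
The two notes above can be illustrated with a small sketch. It is written for Python 3 and uses in-memory io.StringIO objects as hypothetical stand-ins for the real files. The sample coordinates in file2 are invented so that they fall inside the file1 interval, and the chromosome check is an assumption that the original loop does not make. It also assumes the header line has already been skipped, since int() would fail on it.

```python
import io

# Hypothetical in-memory stand-ins for the two tab-delimited files.
file1 = io.StringIO("chr1\t8337733\t8337767\tNM_x\t0\t-\tRERE\n")
file2 = io.StringIO("chr1\t8337740\t8337741\nchr1\t8337750\t8337751\n")

# Pitfall 1: a file object can be iterated only once.
first_pass = [line for line in file1]
second_pass = [line for line in file1]   # empty: already at eof

# Pitfall 2: strings compare character by character, not numerically.
string_result = '925085' > '8337733'     # True, because '9' > '8'
int_result = 925085 > 8337733            # False

# One possible fix: read file1 into a list once, converting the
# coordinates to int, then scan that list for every line of file2.
file1.seek(0)
intervals = []
for line in file1:
    cols = line.rstrip('\n').split('\t')
    intervals.append((cols[0], int(cols[1]), int(cols[2]), cols[6]))

matches = []
for line in file2:
    col = line.rstrip('\n').split('\t')
    x = int(col[1])
    for chrom, start, end, gene in intervals:
        # Assumption: a match also requires the same chromosome.
        if col[0] == chrom and start < x < end:
            matches.append(line.rstrip('\n') + '\t' + gene)

for m in matches:
    print(m)
```

Stripping the newline before joining also keeps the gene name on the same output line as the file 2 record.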
This prints a lot of duplicate lines and is slow. Is there any other way I can
make it fast?
How can a dictionary be made from file 1? I mean unique keys that are common to
file 1 and 2.
thanks
Kumar.
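
On the dictionary question, one possible approach (a sketch under assumptions, not necessarily what the poster needs) is to key a dictionary on the chromosome name and store each chromosome's intervals as a sorted list, so that each file 2 position can be located with a binary search instead of a scan of all of file 1. The function names build_index and find_gene are hypothetical, as is the gene name GENE_B. The sketch assumes Python 3, that header lines are skipped, and that intervals on one chromosome do not overlap.

```python
import bisect
from collections import defaultdict

def build_index(file1_lines):
    """Map chromosome -> sorted list of (start, end, gene) tuples."""
    index = defaultdict(list)
    for line in file1_lines:
        cols = line.rstrip('\n').split('\t')
        index[cols[0]].append((int(cols[1]), int(cols[2]), cols[6]))
    for intervals in index.values():
        intervals.sort()
    return index

def find_gene(index, chrom, x):
    """Return the gene whose interval strictly contains x, or None.

    Assumes intervals on one chromosome do not overlap.
    """
    intervals = index.get(chrom, [])
    # Position of the first interval whose start is >= x.
    pos = bisect.bisect_left(intervals, (x,))
    if pos == 0:
        return None
    start, end, gene = intervals[pos - 1]
    return gene if start < x < end else None

# Usage with two sample rows (GENE_B is a made-up name for illustration).
file1_lines = [
    "chr1\t8337733\t8337767\tNM_a\t0\t-\tRERE\n",
    "chr1\t8338065\t8338246\tNM_b\t0\t-\tGENE_B\n",
]
index = build_index(file1_lines)
print(find_gene(index, 'chr1', 8338100))
print(find_gene(index, 'chr1', 871490))
```

Because each position maps to at most one interval, this lookup also avoids printing the same file 2 line more than once.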
_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
--
Bob Gailer
919-636-4239
Chapel Hill NC