Dear group:

I need some tips/help from experts. 

I have two tab-delimited files. 
One file has 4K lines. The other file has 40K lines. 

I want to search the contents of one file against the other and print the lines that satisfy a condition.


File 1:
chr         X           Y
chr1    8337733 8337767 NM_001042682_cds_0_0_chr1_8337734_r     0       -       RERE
chr1    8338065 8338246 NM_001042682_cds_1_0_chr1_8338066_r     0       -       RERE
chr1    8338746 8338893 NM_001042682_cds_2_0_chr1_8338747_r     0       -       RERE
chr1    8340842 8341563 NM_001042682_cds_3_0_chr1_8340843_r     0       -       RERE
chr1    8342410 8342633 NM_001042682_cds_4_0_chr1_8342411_r     0       -       RERE


File 2:
Chr          X         Y
chr1    871490  871491
chr1    925085  925086
chr1    980143  980144
chr1    1548655 1548656
chr1    1589675 1589676
chr1    1977853 1977854
chr1    3384899 3384900
chr1    3406309 3406310
chr1    3732274 3732275


I want to check whether X in file 2 falls between X and Y of a line in file 1 
and, if so, print the line of file 2 together with the last column of file 1:


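# assumes file1 and file2 are lists of lines with the header lines skipped
for j in file2:
    col = j.rstrip('\n').split('\t')
    for k in file1:
        cols = k.rstrip('\n').split('\t')
        # compare as integers; comparing the strings gives wrong matches
        if int(cols[1]) < int(col[1]) < int(cols[2]):
            print(j.rstrip('\n') + '\t' + cols[6])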


This prints a lot of duplicate lines and is slow.  Is there another way to make 
it faster?

How can a dictionary be built from file 1? I mean one with unique keys that are 
common to files 1 and 2. 
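
To make the dictionary idea concrete, here is a rough sketch of what I have in 
mind. It is only a guess at an approach, not something I have tried on the real 
data. The chromosome name (the one value the two files share) is the key, and 
each value is a sorted list of (start, end, name) tuples from file 1. The 
function names build_index and find_hits are made up for illustration, and the 
sketch assumes file 1 always has 7 tab-separated columns.

```python
import bisect
from collections import defaultdict


def build_index(file1_lines):
    """Group file 1 intervals by chromosome: {chr: sorted [(start, end, name)]}."""
    index = defaultdict(list)
    for line in file1_lines:
        cols = line.rstrip('\r\n').split('\t')
        if len(cols) < 7 or not cols[1].isdigit():
            continue  # skip the header line and anything malformed
        index[cols[0]].append((int(cols[1]), int(cols[2]), cols[6]))
    for intervals in index.values():
        intervals.sort()
    return index


def find_hits(index, file2_lines):
    """Yield (file 2 line, name) once for each start < X < end match."""
    seen = set()  # suppresses duplicate output lines
    for line in file2_lines:
        text = line.rstrip('\r\n')
        cols = text.split('\t')
        if len(cols) < 2 or not cols[1].isdigit():
            continue  # skip the header line and anything malformed
        x = int(cols[1])
        intervals = index.get(cols[0], [])
        # Everything before pos has start < x; later intervals cannot match.
        pos = bisect.bisect_left(intervals, (x,))
        for start, end, name in intervals[:pos]:
            if x < end and (text, name) not in seen:
                seen.add((text, name))
                yield text, name
```

With this, each line of file 2 is compared only against intervals on the same 
chromosome that start before X, instead of against all 4K lines of file 1. The 
scan over the remaining intervals is still linear, so this is a partial 
speed-up rather than a proper interval search.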

thanks
Kumar.


      

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor