Hi all, I am a biologist trying to learn Python. Please help me with the
code below.
The number of files can be more than 50.
>>> file1
cNo gene year
1 A,B 2004
2 C,D 2008
3 K,L 2011
>>> file2
cNo gene year
1 a,e 2001
2 d,c,p 2003
3 x,y,x 2000
4 m,n 1988
>>> file3
cNo gene year
1 R,S 2002
2 X 2005
3 A,Q 2002
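Condition: compare the 'gene' column of each file against all the other
files and find rows that share all or some of their genes (upper and lower
case appear to count as the same gene, e.g. A matches a). For each match,
report the file 'name', the 'cNo' and the 'genes' of every row involved.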
final_output
name cNo genes
file1,file2,file3 1,1,3 [A,B], [a,e], [A,Q]
file1,file2 2,2 [C,D], [d,c,p]
file2,file3 3,2 [x,y,x], [X]
import functools

import pandas as pd

file1 = pd.DataFrame({'cNo': [1, 2, 3],
                      'gene': ['A,B', 'C,D', 'K,L'],
                      'year': [2004, 2008, 2011]})
file2 = pd.DataFrame({'cNo': [1, 2, 3, 4],
                      'gene': ['a,e', 'd,c,p', 'x,y,x', 'm,n'],
                      'year': [2001, 2003, 2000, 1988]})
file3 = pd.DataFrame({'cNo': [1, 2, 3],
                      'gene': ['R,S', 'X', 'A,Q'],
                      'year': [2002, 2005, 2002]})
files = [file1, file2, file3]

# I don't know how to merge by partial matching of a column.
# This merge only matches identical 'gene' strings, so here it returns an
# empty frame. (The column is named 'gene', not 'genes'.)
df = functools.reduce(lambda left, right: pd.merge(left, right, on='gene'), files)
print(df)
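
One possible way to get the partial matching, as a rough sketch rather than a definitive solution: split each 'gene' string into single genes, lower-case them for comparison, and group the rows by gene instead of merging. This assumes that case should be ignored, that genes are separated by commas without spaces, and that a match only counts when the gene occurs in more than one file. The names `files`, `stacked`, `per_gene`, `shared` and `result` are made up for illustration.

```python
import pandas as pd

file1 = pd.DataFrame({'cNo': [1, 2, 3],
                      'gene': ['A,B', 'C,D', 'K,L'],
                      'year': [2004, 2008, 2011]})
file2 = pd.DataFrame({'cNo': [1, 2, 3, 4],
                      'gene': ['a,e', 'd,c,p', 'x,y,x', 'm,n'],
                      'year': [2001, 2003, 2000, 1988]})
file3 = pd.DataFrame({'cNo': [1, 2, 3],
                      'gene': ['R,S', 'X', 'A,Q'],
                      'year': [2002, 2005, 2002]})

# Hypothetical mapping of file name -> DataFrame; with 50+ files this
# could be filled in a loop instead of by hand.
files = {'file1': file1, 'file2': file2, 'file3': file3}

# Stack all files into one table and remember where each row came from.
stacked = pd.concat(
    [df.assign(name=name) for name, df in files.items()],
    ignore_index=True,
)

# One row per single gene; 'key' is the lower-cased gene used for matching
# (assumption: case does not matter).
per_gene = (
    stacked.assign(key=stacked['gene'].str.lower().str.split(','))
    .explode('key')
    .reset_index(drop=True)
    # A gene repeated inside one row (e.g. 'x,y,x') should count only once.
    .drop_duplicates(subset=['name', 'cNo', 'key'])
)

# Keep only genes that occur in more than one file.
n_files = per_gene.groupby('key')['name'].transform('nunique')
shared = per_gene[n_files > 1]

# Collect file names, cNo values and original gene strings per shared gene.
# Rows that match on several genes (c and d here) produce identical groups,
# so duplicates are dropped at the end.
result = (
    shared.groupby('key', sort=False)
    .agg(
        name=('name', ','.join),
        cNo=('cNo', lambda s: ','.join(map(str, s))),
        genes=('gene', lambda s: ', '.join(f'[{g}]' for g in s)),
    )
    .drop_duplicates()
    .reset_index(drop=True)
)
print(result)
```

For the three sample frames this should give the three rows of the expected output. Note that it groups by single gene, so rows connected only through a chain of different genes end up in separate output rows.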