[Tutor] just one question
Hi,

I want to ask one thing. Suppose I have a .txt file with content like this:

47 8 ALA H H 7.85 0.02 1
48 8 ALA HAH 2.98 0.02 1
49 8 ALA HBH 1.05 0.02 1
50 8 ALA C C179.39 0.3 1
51 8 ALA CAC 54.67 0.3 1
52 8 ALA CBC 18.85 0.3 1
53 8 ALA N N123.95 0.3 1
10715 ALA H H 8.05 0.02 1
10815 ALA HAH 4.52 0.02 1
10915 ALA HBH 1.29 0.02 1
11015 ALA C C177.18 0.3 1
11115 ALA CAC 52.18 0.3 1
11215 ALA CBC 20.64 0.3 1
11315 ALA N N119.31 0.3 1
15421 ALA H H 7.66 0.02 1
15521 ALA HAH 4.05 0.02 1
15621 ALA HBH 1.39 0.02 1
15721 ALA C C179.35 0.3 1
15821 ALA CAC 54.33 0.3 1

Now I want to make another .txt file that first writes the position of each ALA (say 8, 15, 21), then its name (ALA), and then the fifth-column value for only three atoms: C, CA and CB. That is, it should be something like:

8 ALA C = 179.39 CA = 54.67 CB = 18.85
15 ALA C = 177.18 CA = 52.18 CB = 20.64
21 ALA C = 179.35 CA = 54.33 CB =

If some value is not there, it should be left blank. I am new to Python, but this is what we want, so how can I do it with a Python script?

Amrita Kumari
Research Fellow
IISER Mohali
Chandigarh
INDIA
___
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] just one question
Thank you very much sir, now it is working. It is giving the result I wanted.

Thanks,
Amrita

> Please use reply-all, so that emails go to the list as well.
>
> 2009/7/16 :
>> Thank you for the help. It is working and giving the result, but the only
>> problem is that it is making a very big file, as it is searching for each
>> position of ALA and first writing its C value, then CA, then CB, like that.
>> Is it possible to do all these things but give in the output only the
>> values of C, CA and CB for each position of ALA?
>>
>> Like instead of giving all these:
>>
>> 23 ALA C = CA = CB =
>> 21 ALA C = 179.35 CA = 54.33 CB = 17.87
>> 15 ALA C = 177.18 CA = 52.18 CB = 20.64
>> 8 ALA C = 179.39 CA = 54.67 CB = 18.85
>> 23 ALA C = CA = CB =
>> 21 ALA C = 179.35 CA = 54.33 CB = 17.87
>> .
>>
>> it will only give:
>>
>> 8 ALA C = 179.39 CA = 54.67 CB = 18.85
>> 15 ALA C = 177.18 CA = 52.18 CB = 20.64
>> 21 ALA C = 179.35 CA = 54.33 CB = 17.87
>> 23 ALA C = 179.93 CA = 55.84 CB = 17.55
>> 33 ALA C = 179.24 CA = 55.58 CB = 19.75
>> 38 ALA C = 178.95 CA = 54.33 CB = 18.30
>>
>> Thanks,
>> Amrita
>
> Either you're not entering the code correctly, or the input file is
> different to what you've shown us so far.
>
> I think you need to send me a copy of the input file - or at least a
> larger sample than we've had so far so we can see what we're dealing
> with.
>
> The code should be:
>
> from __future__ import with_statement
> from collections import defaultdict
> from decimal import Decimal
>
> atoms = defaultdict(dict)
>
> with open("file1.txt") as f:
>     for line in f:
>         try:
>             n, pos, ala, at, symb, weight, rad, count = line.split()
>         except ValueError:
>             continue
>         else:
>             atoms[int(pos)][at] = Decimal(weight)
>
> # modify these lines to fit your needs:
> positionsNeeded = (8, 15, 21)
> atomsNeeded = ("C", "CA", "CB")
>
> for k, v in atoms.iteritems():
>     print k, "ALA C = %s CA = %s CB = %s" % tuple(v.get(a, "") for a in atomsNeeded)
>
> Check you've got the indentation (the spaces at the start of lines)
> correct, exactly how it is above: this is VERY important in python.
>
> --
> Rich "Roadie Rich" Lovely
> There are 10 types of people in the world: those who know binary,
> those who do not, and those who are off by one.
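For readers on Python 3, the same grouping idea can be sketched as below. This is only an illustration, not the code from the thread: it assumes every usable row has eight whitespace-separated fields (number, position, residue, atom, element, value, error, flag), and the names collect_shifts and format_rows are made up for the example. Rows that do not split into eight fields are skipped, and each position is printed once, in ascending order.

```python
from collections import defaultdict

def collect_shifts(lines, wanted=("C", "CA", "CB")):
    """Group shift values by residue position, keeping only the wanted atoms."""
    atoms = defaultdict(dict)
    for line in lines:
        fields = line.split()
        if len(fields) != 8:
            continue  # skip blank or malformed rows (assumed 8-field layout)
        _, pos, _, atom, _, value, _, _ = fields
        if atom in wanted:
            atoms[int(pos)][atom] = value  # keep the text exactly as written
    return atoms

def format_rows(atoms, residue="ALA", wanted=("C", "CA", "CB")):
    """Return one output line per position, sorted by position."""
    rows = []
    for pos in sorted(atoms):
        cells = []
        for atom in wanted:
            value = atoms[pos].get(atom, "")
            cells.append(f"{atom} = {value}".rstrip())  # blank if missing
        rows.append(f"{pos} {residue} " + " ".join(cells))
    return rows

# Hypothetical sample rows in the assumed layout
sample = [
    "50 8 ALA C C 179.39 0.3 1",
    "51 8 ALA CA C 54.67 0.3 1",
    "157 21 ALA C C 179.35 0.3 1",
]
for row in format_rows(collect_shifts(sample)):
    print(row)
```

Because the values are stored in a dict keyed by position, re-reading the same position never produces a second output line, which is the behaviour asked for above.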
[Tutor] how to join two different files
Hi,

I have two large data files with different columns, and I want to join them into a single multi-column data file. I tried the command:

>>> file('ala', 'w').write(file('/home/amrita/alachems/chem2.txt', 'r').read()+file('/home/amrita/pdbfile/pdb2.txt', 'r').read())

but it is printing the second file after the first, whereas I want to join them column-wise, like:

FileA     FileB     FileC
12        14        12 14
15    +   16    =   15 16
18        17        18 17
20        19        20 19

What command should I use?

Thanks,
Amrita
Re: [Tutor] how to join two different files
Thank you sir, it is working. But one more thing I want to ask: what if my files have entries like

fileA and fileB 12 10 13 12 14 15

that is, if their numbers of entries do not match, how do I combine them? (Both input files have more than one column.)

Thanks,
Amrita

>> Maybe you could break that up a bit? This is the tutor list, not a
>> one-liner competition!
>
> rather than one-liners, we can try to create the most "Pythonic"
> solution. below's my entry. :-)
>
> cheers,
> -wesley
>
> myMac$ cat parafiles.py
> #!/usr/bin/env python
>
> from itertools import izip
> from os.path import exists
>
> def parafiles(*files):
>     vec = (open(f) for f in files if exists(f))
>     data = izip(*vec)
>     [f.close() for f in vec]
>     return data
>
> for data in parafiles('fileA.txt', 'fileB.txt'):
>     print ' '.join(d.strip() for d in data)
>
> myMac$ cat fileA.txt
> FileA
> 12
> 15
> 18
> 20
>
> myMac$ cat fileB.txt
> FileB
> 14
> 16
> 18
> 20
> 22
>
> myMac$ parafiles.py
> FileA FileB
> 12 14
> 15 16
> 18 18
> 20 20
>
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> "Core Python Programming", Prentice Hall, (c)2007,2001
> "Python Fundamentals", Prentice Hall, (c)2009
> http://corepython.com
>
> wesley.j.chun :: wescpy-at-gmail.com
> python training and technical consulting
> cyberweb.consulting : silicon valley, ca
> http://cyberwebconsulting.com
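For files of unequal length, one possible approach in Python 3 is itertools.zip_longest, which keeps going until the longest input is exhausted and pads the shorter ones. The sketch below is only an illustration: join_columns and the in-memory sample lists are invented for the example, and real code would pass open file objects instead of lists.

```python
from itertools import zip_longest

def join_columns(*sources, fill=""):
    """Pair up lines from several sources; shorter ones are padded with `fill`."""
    rows = []
    for parts in zip_longest(*sources, fillvalue=fill):
        # strip newlines from each piece, then drop padding left at the end
        rows.append(" ".join(p.strip() for p in parts).rstrip())
    return rows

# Hypothetical multi-column inputs with different numbers of lines
file_a = ["12 1.0\n", "13 2.0\n", "14 3.0\n", "15 4.0\n"]
file_b = ["10 x\n", "12 y\n"]

for row in join_columns(file_a, file_b):
    print(row)
```

With an empty fill the columns of later files shift left once an earlier file runs out, so a visible placeholder such as fill="nil" may be the safer choice when the result is read back by another program.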
[Tutor] how to fill zero value and join two column
Hi,

I have two text files with entries as follows:

fileA
33 ALA H = 7.57 N = 121.52 CA = 55.58 HA = 3.89 C = 179.24
38 ALA H = 8.29 N = 120.62 CA = 54.33 HA = 4.04 C = 178.95
8 ALA H = 7.85 N = 123.95 CA = 54.67 HA = C =

fileB
8 ALA helix (helix_alpha, helix1)
21 ALA helix (helix_alpha, helix2)
23 ALA helix (helix_alpha, helix2)

Now I want to make another file in which the matching entries from the two files are printed together, with zero values for those atoms which do not have any value in fileA. So the result will be something like:

fileC
8 ALA H = 7.85 N = 123.95 CA = 54.67 HA =0.00 C =0.00|8 ALA helix (helix_alpha, helix1)

I tried to merge these two files using commands like:

from collections import defaultdict
>>> def merge(sources):
... if __name__ == "__main__":
... a = open("/home/amrita/alachems/chem100.txt")
... c = open("/home/amrita/secstr/secstr100.txt")
...def source(stream):
...return (line.strip() for line in stream)
...for m in merge([source(x) for x in [a,c]]):
...print "|".join(c.ljust(10) for c in m)
...

but it is not giving any value.

Thanks,
Amrita
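One way to approach this, sketched in Python 3 under the assumption that the first whitespace-separated field of every line is the residue number in both files: index fileB by that number, then walk fileA, fill the blank values, and join the lines that have a partner. The helper names fill_blanks and merge_by_residue are illustrative only.

```python
import re

def fill_blanks(line, default="0.00"):
    """Put `default` after every '=' that is not followed by a value."""
    # an '=' counts as blank when the next non-space character is a letter
    # (the next atom name) or the end of the line
    filled = re.sub(r"=\s*(?=[A-Za-z]|$)", f"= {default} ", line.strip())
    return filled.strip()

def merge_by_residue(shift_lines, structure_lines):
    """Join lines from both inputs that share the same leading residue number."""
    structure = {}
    for line in structure_lines:
        if line.strip():
            structure[line.split()[0]] = line.strip()
    merged = []
    for line in shift_lines:
        if not line.strip():
            continue
        key = line.split()[0]
        if key in structure:  # entries without a partner are left out
            merged.append(fill_blanks(line) + "|" + structure[key])
    return merged

file_a = [
    "33 ALA H = 7.57 N = 121.52 CA = 55.58 HA = 3.89 C = 179.24",
    "8 ALA H = 7.85 N = 123.95 CA = 54.67 HA = C =",
]
file_b = [
    "8 ALA helix (helix_alpha, helix1)",
    "21 ALA helix (helix_alpha, helix2)",
]
for row in merge_by_residue(file_a, file_b):
    print(row)
```

The dictionary lookup means the two files do not need to be in the same order, which matters here because residue 8 comes last in fileA but first in fileB.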
[Tutor] how to get blank value
Hi,

I have a file with lines like:

48 ALA H = 8.33 N = 120.77 CA = 55.18 HA = 4.12 C = 181.50
104 ALA H = 7.70 N = CA = HA = 4.21 C =
85 ALA H = 8.60 N = CA = HA = 4.65 C =

Now I want to make two more files: one with the lines for which C is missing, and another with the lines for which N, CA and C are all missing. I tried it this way:

import re
expr = re.compile("C = None")
f = open("helix.dat")
for line in f:
    if expr.search(line):
        print line

but I am not getting the desired output.

Amrita
Re: [Tutor] how to get blank value
> What is the python command for searching blank value of a parameter?
> Please use Reply All to send it to the list as well.
>
>> I am trying it in this way also:
>>
>> import re
>> expr = re.compile("C")
>
> This will find all lines with the letter C in them.
> Which from your data is all of them. Look at the regex documentation
> to see how to represent the end of a line (or, slightly more complex, a
> non digit).
>
>> f = open('chem.txt')
>> for line in f:
>>     expr.search(line)
>>         if 'C = '
>
> This is invalid Python, the second level of indentation should produce an
> error!
> Also you are not doing anything with the result of your search, you just
> throw it away.
>
> You need something like
>
> for line in open('chem.txt'):
>     if expr.search(line):
>         print line
>
> HTH,
>
> Alan g.
>
>> > wrote
>> >
>> >> 48 ALA H = 8.33 N = 120.77 CA = 55.18 HA = 4.12 C = 181.50
>> >> 104 ALA H = 7.70 N = CA = HA = 4.21 C =
>> >>
>> >> Now i want to make two another file in which i want to put those lines
>> >> for which C is missing and another one for which N,CA and C all are
>> >> missing,
>> >>
>> >> I tried in this way:
>> >> import re
>> >> expr = re.compile("C = None")
>> >
>> > This will search for the literal string 'C = None' which does not exist
>> > in your data.
>> > You need to search for 'C = ' at the end of the line (assuming it is
>> > always there.
>> > Otherwise you need to search for 'C = ' followed by a non number.)
>> >
>> > HTH,
>> >
>> > --
>> > Alan Gauld
>> > Author of the Learn to Program web site
>> > http://www.alan-g.me.uk/
[Tutor] how to get blank value
Hi,

I have a file with lines like:

48 ALA H = 8.33 N = 120.77 CA = 55.18 HA = 4.12 C = 181.50
104 ALA H = 7.70 N = 121.21 CA = 54.32 HA = 4.21 C =
85 ALA H = 8.60 N = CA = HA = 4.65 C =

Now I want to make two more files: one with the lines for which C is missing, and another with the lines for which N, CA and C are all missing. With these commands:

import re
f = open('chem.txt')
for line in f:
    if re.search('C = ',line):
        print line

I am getting the lines for which the C value is there, but how do I get the ones for which it doesn't have any value? I did a Google search but still have not found the answer.

Amrita
Re: [Tutor] how to get blank value
Sorry to say, but I still have not got a solution to my problem. I tried this:

import re
if __name__ == '__main__':
    data = open('chem.txt').readlines()
    for line in data:
        RE = re.compile('C = (.)',re.M)
        matches = RE.findall(line)
        for m in matches:
            print line

but with this, too, I am getting the lines for which the C value is there.

Amrita
Re: [Tutor] how to get blank value
It is not giving any value, and there is no error:

ph08...@sys53:~> python trial.py
ph08...@sys53:~>

It just returns to the shell. Thanks for the help.

Amrita

> amr...@iisermohali.ac.in wrote:
>> Sorry to say, but till now I have not got the solution of my problem, I
>> tried with this command:-
>>
>> import re
>>
> # assuming H = , N = , CA = , HA = and C = always present in that order
> if __name__ == '__main__':
>     data = open('chem.txt').readlines()
>     for line in data:
>         line = line.split('=')
>         if not line[5]: # C value missing
>             if len(line[2])==1 and len(line[3])==1: # N and CA values missing
>                 print "all missing", line
>             else:
>                 print "C missing", line
>
> --
> Bob Gailer
> Chapel Hill NC
> 919-636-4239
Re: [Tutor] how to get blank value
With this data it is giving output, but when I tried

if __name__ == '__main__':
    data = open('chem.txt').readlines()
    for line in data:
        line2 = line.split('=')
        if not line2[5]: # C value missing
            if len(line2[2]) <= 5 and len(line2[3]) <= 5: # N and CA values missing
                print "all missing", line
            else:
                print "C missing", line

with the data in a .txt file, it is not giving any output. Actually I have a few large data files. I want to put the lines for which only the C value is missing in one file, and the lines for which N, CA and C are all missing in another.

Thanks for the help.
Amrita

> amr...@iisermohali.ac.in wrote:
>> It is not giving any value, without any error
>>
>> ph08...@sys53:~> python trial.py
>> ph08...@sys53:~>
>> it is coming out from shell.
>>
> Try this. I embedded the test data to simplify testing:
>
> data = """48 ALA H = 8.33 N = 120.77 CA = 55.18 HA = 4.12 C = 181.50
> 104 ALA H = 7.70 N = 121.21 CA = 54.32 HA = 4.21 C =
> 85 ALA H = 8.60 N = CA = HA = 4.65 C =""".split('\n')
> for line in data:
>     line2 = line.split('=')
>     if not line2[5]: # C value missing
>         if len(line2[2]) <= 5 and len(line2[3]) <= 5: # N and CA values missing
>             print "all missing", line
>         else:
>             print "C missing", line
>
> --
> Bob Gailer
> Chapel Hill NC
> 919-636-4239
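A likely reason the file-based version prints nothing while the embedded data works: readlines() keeps the newline at the end of every line, so the piece after the last "=" is "\n" (or " \n") rather than an empty string, and "not line2[5]" is never true. The Python 3 sketch below sidesteps that by parsing each "ATOM = value" pair with a regular expression instead of relying on string lengths. It assumes the line layout shown in the thread; parse_shifts and classify are names invented for this example.

```python
import re

# an atom label, '=', then an optional number
PAIR = re.compile(r"([A-Z]+[0-9]*)\s*=\s*([-+]?\d+(?:\.\d+)?)?")

def parse_shifts(line):
    """Return {atom: float or None} for every 'ATOM = value' pair on the line."""
    return {atom: (float(val) if val else None)
            for atom, val in PAIR.findall(line)}

def classify(line):
    """Label a line as 'has_c', 'c_missing' or 'all_missing' (N, CA and C blank)."""
    shifts = parse_shifts(line)
    if shifts.get("C") is not None:
        return "has_c"
    if shifts.get("N") is None and shifts.get("CA") is None:
        return "all_missing"
    return "c_missing"

data = [
    "48 ALA H = 8.33 N = 120.77 CA = 55.18 HA = 4.12 C = 181.50\n",
    "104 ALA H = 7.70 N = 121.21 CA = 54.32 HA = 4.21 C = \n",
    "85 ALA H = 8.60 N = CA = HA = 4.65 C =\n",
]
for line in data:
    print(classify(line), line.strip())
```

To produce the two output files, the same loop could write each line to one of two open files depending on the label, instead of printing it.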
Re: [Tutor] how to get blank value
Thanks for the help Sir, but with these commands it is showing an error:

ph08...@sys53:~> python trip.py
Traceback (most recent call last):
  File "trip.py", line 6, in <module>
    from pyparsing import *
ImportError: No module named pyparsing

> Ok, I've seen various passes at this problem using regex, split('='),
> etc., but the solutions seem fairly fragile, and the OP doesn't seem
> happy with any of them. Here is how this problem looks if you were going
> to try breaking it up with pyparsing:
> - Each line starts with an integer, and the string "ALA"
> - "ALA" is followed by a series of "X = 1.2"-type attributes, where the
>   value part might be missing.
>
> And to implement (with a few bells and whistles thrown in for free):
>
> data = """48 ALA H = 8.33 N = 120.77 CA = 55.18 HA = 4.12 C = 181.50
> 104 ALA H = 7.70 N = 121.21 CA = 54.32 HA = 4.21 C =
> 85 ALA H = 8.60 N = CA = HA = 4.65 C =""".splitlines()
>
> from pyparsing import *
>
> # define some basic data expressions
> integer = Word(nums)
> real = Combine(Word(nums) + "." + Word(nums))
>
> # use parse actions to automatically convert numeric
> # strings to actual numbers at parse time
> integer.setParseAction(lambda tokens:int(tokens[0]))
> real.setParseAction(lambda tokens:float(tokens[0]))
>
> # define expressions for 'X = 1.2' assignments; note that the
> # value might be missing, so use Optional - we'll fill in
> # a default value of 0.0 if no value is given
> keyValue = Word(alphas.upper()) + '=' + \
>     Optional(real|integer, default=0.0)
>
> # define overall expression for the data on a line
> dataline = integer + "ALA" + OneOrMore(Group(keyValue))("kvdata")
>
> # attach parse action to define named values in the returned tokens
> def assignDataByKey(tokens):
>     for k,_,v in tokens.kvdata:
>         tokens[k] = v
> dataline.setParseAction(assignDataByKey)
>
> # for each line in the input data, parse it and print some of the data
> # fields
> for d in data:
>     print d
>     parsedData = dataline.parseString(d)
>     print parsedData.dump()
>     print parsedData.CA
>     print parsedData.N
>     print
>
> Prints out:
>
> 48 ALA H = 8.33 N = 120.77 CA = 55.18 HA = 4.12 C = 181.50
> [48, 'ALA', ['H', '=', 8.3300000000000001], ['N', '=', 120.77], ['CA',
> '=', 55.18], ['HA', '=', 4.1200000000000001], ['C', '=', 181.5]]
> - C: 181.5
> - CA: 55.18
> - H: 8.33
> - HA: 4.12
> - N: 120.77
> - kvdata: [['H', '=', 8.3300000000000001], ['N', '=', 120.77], ['CA', '=',
> 55.18], ['HA', '=', 4.1200000000000001], ['C', '=', 181.5]]
> 55.18
> 120.77
>
> 104 ALA H = 7.70 N = 121.21 CA = 54.32 HA = 4.21 C =
> [104, 'ALA', ['H', '=', 7.7000000000000002], ['N', '=',
> 121.20999999999999], ['CA', '=', 54.32], ['HA', '=', 4.21], ['C', '=', 0.0]]
> - C: 0.0
> - CA: 54.32
> - H: 7.7
> - HA: 4.21
> - N: 121.21
> - kvdata: [['H', '=', 7.7000000000000002], ['N', '=', 121.20999999999999],
> ['CA', '=', 54.32], ['HA', '=', 4.21], ['C', '=', 0.0]]
> 54.32
> 121.21
>
> 85 ALA H = 8.60 N = CA = HA = 4.65 C =
> [85, 'ALA', ['H', '=', 8.5999999999999996], ['N', '=', 0.0], ['CA', '=',
> 0.0], ['HA', '=', 4.6500000000000004], ['C', '=', 0.0]]
> - C: 0.0
> - CA: 0.0
> - H: 8.6
> - HA: 4.65
> - N: 0.0
> - kvdata: [['H', '=', 8.5999999999999996], ['N', '=', 0.0], ['CA', '=',
> 0.0], ['HA', '=', 4.6500000000000004], ['C', '=', 0.0]]
> 0.0
> 0.0
>
> Learn more about pyparsing at http://pyparsing.wikispaces.com.
>
> -- Paul
[Tutor] arrangement of datafile
Hi,

I am new to programming and want to try Python (which is simple and easy to learn) to solve one problem. I have various long files like this:

1 GLY HA2=3.7850 HA3=3.9130
2 SER H=8.8500 HA=4.3370 N=115.7570
3 LYS H=8.7530 HA=4.0340 HB2=1.8080 N=123.2380
4 LYS H=7.9100 HA=3.8620 HB2=1.7440 HG2=1.4410 N=117.9810
5 LYS H=7.4450 HA=4.0770 HB2=1.7650 HG2=1.4130 N=115.4790
6 LEU H=7.6870 HA=4.2100 HB2=1.3860 HB3=1.6050 HG=1.5130 HD11=0.7690 HD12=0.7690 HD13=0.7690 N=117.3260
7 PHE H=7.8190 HA=4.5540 HB2=3.1360 N=117.0800
8 PRO HD2=3.7450
9 GLN H=8.2350 HA=4.0120 HB2=2.1370 N=116.3660
10 ILE H=7.9790 HA=3.6970 HB=1.8800 HG21=0.8470 HG22=0.8470 HG23=0.8470 HG12=1.6010 HG13=2.1670 N=119.0300
11 ASN H=7.9470 HA=4.3690 HB3=2.5140 N=117.8620
12 PHE H=8.1910 HA=4.1920 HB2=3.1560 N=121.2640
13 LEU H=8.1330 HA=3.8170 HB3=1.7880 HG=1.5810 HD11=0.8620 HD12=0.8620 HD13=0.8620 N=119.1360
...

where the first column is the residue number. What I want is to print each atom's chemical shift value one by one along with the residue number. For example, for atom HA2 it should be:

1 HA2=3.7850
2 HA2=nil
3 HA2=nil
.
..
13 HA2=nil

Similarly, for atom HA3 it should be:

1 HA3=3.9130
2 HA3=nil
3 HA3=nil
...
13 HA3=nil

while for atom H it should be:

1 H=nil
2 H=8.8500
3 H=8.7530
4 H=7.9100
5 H=7.4450

In some files the residue numbers are not continuous; some are missing in between. I want to write Python code to solve this problem, but I don't know how to split the data file and print the desired output. This matters because I need to compare each atom's chemical shift value with values generated by another web-based tool. The number of atoms differs from row to row, and the same atom appears at different positions in different residues, so I don't know how to split them. Please help me solve this problem.

Thanks,
Amrita
___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
[Tutor] arrangement of datafile
Hi,

On 17th Dec. I posted a question about how to arrange a data file in a particular fashion, so that I have only the residue number and the chemical shift value of the atom, as:

1 H=nil
2 H=8.8500
3 H=8.7530
4 H=7.9100
5 H=7.4450

Peter replied to this mail, but since I had not subscribed to the tutor mailing list earlier, I did not receive the reply. I apologize for my mistake. Today I checked his reply, and he asked me to do a few things:

Can you read a file line by line?
Can you split the line into a list of strings at whitespace occurrences?
Can you extract the first item from the list and convert it to an int?
Can you remove the first two items from the list?
Can you split the items in the list at the "="?

I tried these and here is the code:

f=open('filename')
lines=f.readlines()
new=lines.split()
number=int(new[0])
mylist=[i.split('=')[0] for i in new]

One thing I don't understand is why you asked me to remove the first two items from the list. And is the above code all right? Can it produce output like the one you mentioned:

{1: {'HA2': 3.785, 'HA3': 3.913},
 2: {'H': 8.85, 'HA': 4.337, 'N': 115.757},
 3: {'H': 8.753, 'HA': 4.034, 'HB2': 1.808, 'N': 123.238},
 4: {'H': 7.91, 'HA': 3.862, 'HB2': 1.744, 'HG2': 1.441, 'N': 117.981},
 5: {'H': 7.445, 'HA': 4.077, 'HB2': 1.765, 'HG2': 1.413, 'N': 115.479},
 6: {'H': 7.687, 'HA': 4.21, 'HB2': 1.386, 'HB3': 1.605, 'HD11': 0.769, 'HD12': 0.769, 'HD13': 0.769, 'HG': 1.513, 'N': 117.326},
 7: {'H': 7.819, 'HA': 4.554, 'HB2': 3.136, 'N': 117.08},
 8: {'HD2': 3.745},
 9: {'H': 8.235, 'HA': 4.012, 'HB2': 2.137, 'N': 116.366},
 10: {'H': 7.979, 'HA': 3.697, 'HB': 1.88, 'HG12': 1.601, 'HG13': 2.167, 'HG21': 0.847, 'HG22': 0.847, 'HG23': 0.847, 'N': 119.03},
 11: {'H': 7.947, 'HA': 4.369, 'HB3': 2.514, 'N': 117.862},
 12: {'H': 8.191, 'HA': 4.192, 'HB2': 3.156, 'N': 121.264},
 13: {'H': 8.133, 'HA': 3.817, 'HB3': 1.788, 'HD11': 0.862, 'HD12': 0.862, 'HD13': 0.862, 'HG': 1.581, 'N': 119.136}}

If not, then please help point out my mistake so that I can get the correct output. Thank you for your help and time.

Thanks,
Amrita
Re: [Tutor] arrangement of datafile
Hi,

My data file is something like this:

1 GLY HA2=3.7850 HA3=3.9130
2 SER H=8.8500 HA=4.3370 N=115.7570
3 LYS H=8.7530 HA=4.0340 HB2=1.8080 N=123.2380
4 LYS H=7.9100 HA=3.8620 HB2=1.7440 HG2=1.4410 N=117.9810
5 LYS H=7.4450 HA=4.0770 HB2=1.7650 HG2=1.4130 N=115.4790
6 LEU H=7.6870 HA=4.2100 HB2=1.3860 HB3=1.6050 HG=1.5130 HD11=0.7690 HD12=0.7690 HD13=0.7690 N=117.3260
7 PHE H=7.8190 HA=4.5540 HB2=3.1360 N=117.0800
8 PRO HD2=3.7450
9 GLN H=8.2350 HA=4.0120 HB2=2.1370 N=116.3660
10 ILE H=7.9790 HA=3.6970 HB=1.8800 HG21=0.8470 HG22=0.8470 HG23=0.8470 HG12=1.6010 HG13=2.1670 N=119.0300
11 ASN H=7.9470 HA=4.3690 HB3=2.5140 N=117.8620
12 PHE H=8.1910 HA=4.1920 HB2=3.1560 N=121.2640
13 LEU H=8.1330 HA=3.8170 HB3=1.7880 HG=1.5810 HD11=0.8620 HD12=0.8620 HD13=0.8620 N=119.1360
...

where the first column is the residue number, and I want to print each atom's chemical shift value one by one along with the residue number. For example, for atom HA2 it should be:

1 HA2=3.7850
2 HA2=nil
3 HA2=nil
.
..
13 HA2=nil

Similarly, for atom HA3 it should be:

1 HA3=3.9130
2 HA3=nil
3 HA3=nil
...
13 HA3=nil

while for atom H it should be:

1 H=nil
2 H=8.8500
3 H=8.7530
4 H=7.9100
5 H=7.4450

Can you suggest how to produce nested dicts like this:

{1: {'HA2': 3.785, 'HA3': 3.913},
 2: {'H': 8.85, 'HA': 4.337, 'N': 115.757},
 3: {'H': 8.753, 'HA': 4.034, 'HB2': 1.808, 'N': 123.238},
 4: {'H': 7.91, 'HA': 3.862, 'HB2': 1.744, 'HG2': 1.441, 'N': 117.981},
 5: {'H': 7.445, 'HA': 4.077, 'HB2': 1.765, 'HG2': 1.413, 'N': 115.479},
 6: {'H': 7.687, 'HA': 4.21, 'HB2': 1.386, 'HB3': 1.605, 'HD11': 0.769, 'HD12': 0.769, 'HD13': 0.769, 'HG': 1.513, 'N': 117.326},
 7: {'H': 7.819, 'HA': 4.554, 'HB2': 3.136, 'N': 117.08},
 8: {'HD2': 3.745},
 9: {'H': 8.235, 'HA': 4.012, 'HB2': 2.137, 'N': 116.366},
 10: {'H': 7.979, 'HA': 3.697, 'HB': 1.88, 'HG12': 1.601, 'HG13': 2.167, 'HG21': 0.847, 'HG22': 0.847, 'HG23': 0.847, 'N': 119.03},
 11: {'H': 7.947, 'HA': 4.369, 'HB3': 2.514, 'N': 117.862},
 12: {'H': 8.191, 'HA': 4.192, 'HB2': 3.156, 'N': 121.264},
 13: {'H': 8.133, 'HA': 3.817, 'HB3': 1.788, 'HD11': 0.862, 'HD12': 0.862, 'HD13': 0.862, 'HG': 1.581, 'N': 119.136}}

Thanks,
Amrita

On Wed, Dec 25, 2013 at 7:28 PM, Dave Angel wrote:

> On Wed, 25 Dec 2013 16:17:27 +0800, Amrita Kumari wrote:
>
>> I tried these and here is the code:
>>
>> f=open('filename')
>> lines=f.readlines()
>> new=lines.split()
>
> That line will throw an exception.
>
>> number=int(new[0])
>> mylist=[i.split('=')[0] for i in new]
>>
>> one thing I don't understand is why you asked to remove first two
>> items from the list?
>
> You don't show us the data file, but presumably he would ask that because
> the first two lines held different formats of data. Like your number= line
> was intended to fetch a count from only line zero?
>
>> and is the above code alright?, it can produce
>> output like the one you mentioned:
>> {1: {'HA2': 3.785, 'HA3': 3.913},
>> 2: {'H': 8.85, 'HA': 4.337, 'N': 115.757},
>
> The code above won't produce a dict of dicts. It won't even get past the
> exception. Please use copy/paste.
>
> --
> DaveA
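The exception mentioned above comes from calling split() on the list returned by readlines(); a list has no split method, so each line has to be split on its own. A minimal Python 3 sketch of the nested-dict approach follows. It assumes each data line starts with a residue number and a residue name, followed by ATOM=value items; parse_residues and atom_column are illustrative names, not code from the thread.

```python
def parse_residues(lines):
    """Build {residue_number: {atom: shift}} from 'N RES ATOM=value ...' lines."""
    shifts = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2 or not fields[0].isdigit():
            continue  # skip blank lines and filler such as '...'
        number = int(fields[0])
        shifts[number] = {}
        # fields[1] is the residue name; everything after it is ATOM=value
        for item in fields[2:]:
            if "=" not in item:
                continue
            atom, _, value = item.partition("=")
            shifts[number][atom] = float(value)
    return shifts

def atom_column(shifts, atom):
    """One 'number ATOM=value' line per residue, with 'nil' when absent."""
    rows = []
    for number in sorted(shifts):
        if atom in shifts[number]:
            rows.append(f"{number} {atom}={shifts[number][atom]:.4f}")
        else:
            rows.append(f"{number} {atom}=nil")
    return rows

sample = [
    "1 GLY HA2=3.7850 HA3=3.9130",
    "2 SER H=8.8500 HA=4.3370 N=115.7570",
    "...",
]
table = parse_residues(sample)
for atom in ("HA2", "H"):
    print("\n".join(atom_column(table, atom)))
```

Only residues that actually occur in the file are listed, so gaps in the residue numbering are simply carried through to the output.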
[Tutor] Fwd: arrangement of datafile
Sorry, I forgot to add the tutor mailing list. Please help with the question below.

-- Forwarded message --
From: Amrita Kumari
Date: Fri, Jan 3, 2014 at 2:42 PM
Subject: Re: [Tutor] arrangement of datafile
To: Evans Anyokwu

Hi,

I have saved my data in CSV format, and now it looks like this:

2,ALA,C=178.255,CA=53.263,CB=18.411,,
3,LYS,H=8.607,C=176.752,CA=57.816,CB=31.751,N=119.081
4,ASN,H=8.185,C=176.029,CA=54.712,CB=38.244,N=118.255
5,VAL,H=7.857,HG11=0.892,HG12=0.892,HG13=0.892,HG21=0.954,HG22=0.954,HG23=0.954,C=177.259,CA=64.232,CB=31.524,CG1=21.402,CG2=21.677,N=119.998
6,ILE,H=8.062,HG21=0.827,HG22=0.827,HG23=0.827,HD11=0.807,HD12=0.807,HD13=0.807,C=177.009,CA=63.400,CB=37.177,CG2=17.565,CD1=13.294,N=122.474
7,VAL,H=7.993,HG11=0.879,HG12=0.879,HG13=0.879,HG21=0.957,HG22=0.957,HG23=0.957,C=177.009,CA=65.017,CB=31.309,CG1=21.555,CG2=22.369,N=120.915
8,LEU,H=8.061,HD11=0.844,HD12=0.844,HD13=0.844,HD21=0.810,HD22=0.810,HD23=0.810,C=178.655,CA=56.781,CB=41.010,CD1=25.018,CD2=23.824,N=121.098
9,ASN,H=8.102,C=176.695,CA=54.919,CB=38.674,N=118.347
10,ALA,H=8.388,HB1=1.389,HB2=1.389,HB3=1.389,C=178.263,CA=54.505,CB=17.942,N=124.124,
11,ALA,H=8.279,HB1=1.382,HB2=1.382,HB3=1.382,C=179.204,CA=54.298,CB=17.942,N=119.814,
12,SER,H=7.952,C=175.873,CA=60.140,CB=63.221,N=113.303
13,ALA,H=7.924,HB1=1.382,HB2=1.382,HB3=1.382,C=178.420,CA=53.470,CB=18.373,N=124.308,
...

with the fields comma separated. I can open the file as

infile = open('inputfile.csv', 'r')

I can read each line through

data = infile.readlines()

and I can split each line into a list of strings at the commas as

for line in data:
    csvline = line.strip().split(",")

After this, please guide me on how to proceed. I am new to programming but want to learn Python.

Thanks,
Amrita

On 12/28/13, Evans Anyokwu wrote:
> One thing that I've noticed is that there is no structure to your data.
> Some have missing *fields* -so making the use of regex out of the question.
>
> Without seeing your code, I'd suggest saving the data as a separated value
> file and parse it. Python has a good csv support.
>
> Get this one sorted out first then we can move on to the nested list.
>
> Good luck.
> Evans
Re: [Tutor] Fwd: arrangement of datafile
Hi Steven,

I tried this code:

import csv
with open('file.csv') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
        row[0] = int(row[0])

Up to this point it is OK; it gives the output:

['1' , ' GLY' , 'HA2=3.7850' , 'HA3=3.9130' , ' ' , ' ' , ' ' , ' ']
[ '2' , 'SER' , 'H=8.8500' , 'HA=4.3370' , 'N=115.7570' , ' ' , ' ' , ' ']
...

but the command:

key, value = row[2].split('=', 1)
value = float(value.strip())
print(value)

gives the value of the row[2] element only:

['1' , ' GLY' , 'HA2=3.7850' , 'HA3=3.9130' , ' ' , ' ' , ' ' , ' ']
3.7850
[ '2' , 'SER' , 'H=8.8500' , 'HA=4.3370' , 'N=115.7570' , ' ' , ' ' , ' ']
8.8500
...

This is not what I want. I want to print the chemical shift values of the same atom from every row together, like this:

1 HA2=3.7850
2 HA2=nil
3 HA2=nil
.
..
13 HA2=nil

Similarly, for atom HA3:

1 HA3=3.9130
2 HA3=nil
3 HA3=nil
...
13 HA3=nil

and so on. So how do I split each item into a key and a numeric value, then search for the same atom in every row and print its chemical shift value along with the residue number?

Thanks,
Amrita

On Mon, Jan 6, 2014 at 6:44 AM, Steven D'Aprano wrote:

> Hi Amrita,
>
> On Sun, Jan 05, 2014 at 10:01:16AM +0800, Amrita Kumari wrote:
>
> > I have saved my data in csv format now it is looking like this:
>
> If you have a file in CSV format, you should use the csv module to read
> the file.
>
> http://docs.python.org/3/library/csv.html
>
> If you're still using Python 2.x, you can read this instead:
>
> http://docs.python.org/2/library/csv.html
>
> I think that something like this should work for you:
>
> import csv
> with open('/path/to/your/file.csv') as f:
>     reader = csv.reader(f)
>     for row in reader:
>         print(row)
>
> Of course, you can process the rows, not just print them. Each row will
> be a list of strings. For example, you show the first row as this:
>
> > 2,ALA,C=178.255,CA=53.263,CB=18.411,,
>
> so the above code should print this for the first row:
>
> ['2', 'ALA', 'C=178.255', 'CA=53.263', 'CB=18.411', '', '', '',
> '', '', '', '', '', '']
>
> You can process each field as needed. For example, to convert the
> first field from a string to an int:
>
> row[0] = int(row[0])
>
> To split the third item 'C=178.255' into a key ('C') and a numeric
> value:
>
> key, value = row[2].split('=', 1)
> value = float(value.strip())
>
> Now you know how to read CSV files. What do you want to do with the data
> in the file?
>
> --
> Steven
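Building on the csv.reader advice, one possible way to line up the same atom across all rows is to load every row into a dict first and only then print one atom at a time. The Python 3 sketch below is an illustration under stated assumptions: the first field is the residue number, the second is the residue name, the remaining non-empty fields are ATOM=value, and the helper names read_shift_table and report are made up. An in-memory io.StringIO stands in for the real file.

```python
import csv
import io

def read_shift_table(csv_file):
    """Return {residue_number: {atom: value_text}} from an open CSV file."""
    table = {}
    for row in csv.reader(csv_file):
        if not row or not row[0].strip():
            continue  # skip empty lines
        residue = int(row[0])
        table[residue] = {}
        for item in row[2:]:  # row[1] is the residue name
            if "=" not in item:
                continue  # empty trailing fields
            key, value = item.split("=", 1)
            # values stay as text so the output matches the file;
            # use float(value) instead if numbers are needed
            table[residue][key.strip()] = value.strip()
    return table

def report(table, atom):
    """One 'residue ATOM=value' line per residue, 'nil' where the atom is absent."""
    return [f"{residue} {atom}={table[residue].get(atom, 'nil')}"
            for residue in sorted(table)]

sample = io.StringIO(
    "2,ALA,C=178.255,CA=53.263,CB=18.411,,\n"
    "3,LYS,H=8.607,C=176.752,CA=57.816,CB=31.751,N=119.081\n"
)
table = read_shift_table(sample)

# every atom name seen anywhere in the file, then one block per atom
all_atoms = sorted({atom for atoms in table.values() for atom in atoms})
for atom in all_atoms:
    print("\n".join(report(table, atom)))
```

Looping over every field of the row, rather than only row[2], is the step that was missing in the attempt above; with a real file the StringIO would be replaced by open('file.csv', newline='').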
Re: [Tutor] Fwd: arrangement of datafile
Hi, Sorry for delay in reply(as internet was very slow from past two days), I tried this code which you suggested (by saving it in a file): import csv with open('19162.csv') as f: reader = csv.reader(f) for row in reader: print(row) row[0] = int(row[0]) key,value = item.split('=', 1) value = float(value) print(value) and I got the output as: C:\Python33>python 8.py ['2', 'ALA', 'C=178.255', 'CA=53.263', 'CB=18.411', '', '', '', '', '', '', '', '', '', ''] Traceback (most recent call last): File "8.py", line 7, in key,value = item.split('=', 1) NameError: name 'item' is not defined my datafile is like this: 2,ALA,C=178.255,CA=53.263,CB=18.411,, 3,LYS,H=8.607,C=176.752,CA=57.816,CB=31.751,N=119.081 4,ASN,H=8.185,C=176.029,CA=54.712,CB=38.244,N=118.255 5,VAL,H=7.857,HG11=0.892,HG12=0.892,HG13=0.892,HG21=0.954,HG22=0.954,HG23=0.954,C=177.259,CA=64.232,CB=31.524,CG1=21.402,CG2=21.677,N=119.998 6,ILE,H=8.062,HG21=0.827,HG22=0.827,HG23=0.827,HD11=0.807,HD12=0.807,HD13=0.807,C=177.009,CA=63.400,CB=37.177,CG2=17.565,CD1=13.294,N=122.474 7,VAL,H=7.993,HG11=0.879,HG12=0.879,HG13=0.879,HG21=0.957,HG22=0.957,HG23=0.957,C=177.009,CA=65.017,CB=31.309,CG1=21.555,CG2=22.369,N=120.915 8,LEU,H=8.061,HD11=0.844,HD12=0.844,HD13=0.844,HD21=0.810,HD22=0.810,HD23=0.810,C=178.655,CA=56.781,CB=41.010,CD1=25.018,CD2=23.824,N=121.098 9,ASN,H=8.102,C=176.695,CA=54.919,CB=38.674,N=118.347 10,ALA,H=8.388,HB1=1.389,HB2=1.389,HB3=1.389,C=178.263,CA=54.505,CB=17.942,N=124.124, -- where 1st element of each row is the residue no. but it is not continuous (some are missing also for example the 1st row is starting from resdiue no. 2 not from 1) second element of each row is the name of amino acid and rest element of each row are the various atom along with chemical shift information corresponding to that particular amino acid for example H=8.388 is showing that atom is H and it has chemical shift value 8.388. 
But the arrangement of these atoms within each row is quite random; some rows have many more atoms and some have fewer. I got these values from the Shiftx2 web server. I just want to align the chemical shift values of the same atom into one column (along with the residue number). For example, for atom C it could be:

    2 C=178.255
    3 C=176.752
    4 C=176.029
    5 C=177.259
    ---
    ---

for atom H it could be:

    2 H=nil
    3 H=8.607
    4 H=8.185
    5 H=7.857
    6 H=8.062
    ---

and so on. So if a row doesn't have that atom (for example, the first row doesn't have an H atom), it should print nil so that I can understand that it is missing for that particular residue. I need this arrangement in order to compare these chemical shift values with values generated by other web server programs.

Thanks,
Amrita

On 1/7/14, Steven D'Aprano wrote:
> On Mon, Jan 06, 2014 at 04:57:38PM +0800, Amrita Kumari wrote:
>> Hi Steven,
>>
>> I tried this code:
>>
>> import csv
>> with open('file.csv') as f:
>>     reader = csv.reader(f)
>>     for row in reader:
>>         print(row)
>>         row[0] = int(row[0])
>>
>> up to this extent it is ok; it is ok it is giving the output as:
>>
>> ['1' , ' GLY' , 'HA2=3.7850' , 'HA3=3.9130' , ' ' , ' ' , ' ' , ' ']
>> [ '2' , 'SER' , 'H=8.8500' , 'HA=4.3370' , 'N=115.7570' , ' ' , ' ' , ' ']
>
> It looks like you are re-typing the output into your email. It is much
> better if you copy and paste it so that we can see exactly what happens.
>
>> but the command :
>>
>> key, value = row[2].split('=', 1)
>> value = float(value.strip())
>> print(value)
>>
>> is giving the value of row[2] element as
>>
>> ['1' , ' GLY' , 'HA2=3.7850' , 'HA3=3.9130' , ' ' , ' ' , ' ' , ' ']
>> 3.7850
>> [ '2' , 'SER' , 'H=8.8500' , 'HA=4.3370' , 'N=115.7570' , ' ' , ' ' , ' ']
>> 8.8500
>
> So far, the code is doing exactly what you told it to do. Take the third
> column (index 2), and split on the equals sign. Convert the part on the
> right of the equals sign to a float, and print the float.
>
>> so this is not what I want I want to print all the chemical shift value of
>>
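The NameError in the message above arises because item is never assigned; only row exists inside the loop. One possible way to get the per-atom columns that were asked for is sketched below. It loops over every field after the residue name and collects the values in a dict keyed by residue number. The function names read_shifts and column, and the in-memory sample, are assumptions made for this sketch. Values are kept as strings so that they print exactly as they appear in the file.

```python
import csv
import io


def read_shifts(lines):
    """Map residue number -> {atom: shift string} from CSV lines."""
    residues = {}
    for row in csv.reader(lines):
        if not row:  # ignore blank lines
            continue
        number = int(row[0])
        shifts = {}
        for item in row[2:]:  # every field after number and name
            item = item.strip()
            if item:  # skip empty fields from trailing commas
                atom, value = item.split('=', 1)
                shifts[atom] = value
        residues[number] = shifts
    return residues


def column(residues, atom):
    """One 'number atom=value' line per residue, 'nil' where the atom is absent."""
    return ["{} {}={}".format(number, atom, residues[number].get(atom, "nil"))
            for number in sorted(residues)]


# Rows copied from the message; io.StringIO stands in for open('19162.csv').
sample = io.StringIO(
    "2,ALA,C=178.255,CA=53.263,CB=18.411,,\n"
    "3,LYS,H=8.607,C=176.752,CA=57.816,CB=31.751,N=119.081\n"
    "4,ASN,H=8.185,C=176.029,CA=54.712,CB=38.244,N=118.255\n"
)
residues = read_shifts(sample)
for atom in ("C", "H"):
    print("\n".join(column(residues, atom)))
    print("---")
```

Calling column() once per atom of interest keeps the reading and the printing separate, so the same dict can be reused for any atom name.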
Re: [Tutor] arrangement of datafile
Hi Peter,

Thank you very much for your kind help. I got the output the way I wanted (as you have also shown in your output). I really appreciate your effort. Thanks for your time.

Amrita

On Thu, Jan 9, 2014 at 8:41 PM, Peter Otten <__pete...@web.de> wrote:
> Amrita Kumari wrote:
>
> > On 17th Dec. I posted one question, how to arrange datafile in a
> > particular fashion so that I can have only residue no. and chemical
> > shift value of the atom as:
> > 1 H=nil
> > 2 H=8.8500
> > 3 H=8.7530
> > 4 H=7.9100
> > 5 H=7.4450
> >
> > Peter has replied to this mail but since I haven't subscribe to the
> > tutor mailing list earlier hence I didn't receive the reply, I
> > apologize for my mistake, today I checked his reply and he asked me to
> > do few things:
>
> I'm sorry, I'm currently lacking the patience to tune into your problem
> again, but maybe the script that I wrote (but did not post) back then is of
> help.
>
> The data sample:
>
> $ cat residues.txt
> 1 GLY HA2=3.7850 HA3=3.9130
> 2 SER H=8.8500 HA=4.3370 N=115.7570
> 3 LYS H=8.7530 HA=4.0340 HB2=1.8080 N=123.2380
> 4 LYS H=7.9100 HA=3.8620 HB2=1.7440 HG2=1.4410 N=117.9810
> 5 LYS H=7.4450 HA=4.0770 HB2=1.7650 HG2=1.4130 N=115.4790
> 6 LEU H=7.6870 HA=4.2100 HB2=1.3860 HB3=1.6050 HG=1.5130 HD11=0.7690 HD12=0.7690 HD13=0.7690 N=117.3260
> 7 PHE H=7.8190 HA=4.5540 HB2=3.1360 N=117.0800
> 8 PRO HD2=3.7450
> 9 GLN H=8.2350 HA=4.0120 HB2=2.1370 N=116.3660
> 10 ILE H=7.9790 HA=3.6970 HB=1.8800 HG21=0.8470 HG22=0.8470 HG23=0.8470 HG12=1.6010 HG13=2.1670 N=119.0300
> 11 ASN H=7.9470 HA=4.3690 HB3=2.5140 N=117.8620
> 12 PHE H=8.1910 HA=4.1920 HB2=3.1560 N=121.2640
> 13 LEU H=8.1330 HA=3.8170 HB3=1.7880 HG=1.5810 HD11=0.8620 HD12=0.8620 HD13=0.8620 N=119.1360
>
> The script:
>
> $ cat residues.py
> def process(filename):
>     residues = {}
>     with open(filename) as infile:
>         for line in infile:
>             parts = line.split()         # split line at whitespace
>             residue = int(parts.pop(0))  # convert first item to integer
>             if residue in residues:
>                 raise ValueError("duplicate residue {}".format(residue))
>             parts.pop(0)                 # discard second item
>
>             # split remaining items at "=" and put them in a dict,
>             # e. g. {"HA2": 3.7, "HA3": 3.9}
>             pairs = (pair.split("=") for pair in parts)
>             lookup = {atom: float(value) for atom, value in pairs}
>
>             # put previous lookup dict in residues dict
>             # e. g. {1: {"HA2": 3.7, "HA3": 3.9}}
>             residues[residue] = lookup
>
>     return residues
>
> def show(residues):
>     atoms = set().union(*(r.keys() for r in residues.values()))
>     residues = sorted(residues.items())
>     for atom in sorted(atoms):
>         for residue, lookup in residues:
>             print "{} {}={}".format(residue, atom, lookup.get(atom, "nil"))
>         print
>         print "---"
>         print
>
> if __name__ == "__main__":
>     r = process("residues.txt")
>     show(r)
>
> Note that converting the values to float can be omitted if all you want to
> do is print them. Finally the output of the script:
>
> $ python residues.py
> 1 H=nil
> 2 H=8.85
> 3 H=8.753
> 4 H=7.91
> 5 H=7.445
> 6 H=7.687
> 7 H=7.819
> 8 H=nil
> 9 H=8.235
> 10 H=7.979
> 11 H=7.947
> 12 H=8.191
> 13 H=8.133
>
> ---
>
> 1 HA=nil
> 2 HA=4.337
> 3 HA=4.034
> 4 HA=3.862
> 5 HA=4.077
> 6 HA=4.21
> 7 HA=4.554
> 8 HA=nil
> 9 HA=4.012
> 10 HA=3.697
> 11 HA=4.369
> 12 HA=4.192
> 13 HA=3.817
>
> ---
>
> [snip]
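The quoted script uses print statements, so it runs under Python 2 only, while the earlier traceback in this thread came from a C:\Python33 installation. A minimal sketch of the show() function with print() calls, which should behave the same way under Python 3, could look like this. The small residues dict at the bottom is made-up sample input in the shape that process() returns; it is an assumption for illustration, not output from the real data file.

```python
def show(residues):
    """Print one block per atom, with 'nil' where a residue lacks that atom."""
    # Collect every atom name that occurs in any residue.
    atoms = set().union(*(r.keys() for r in residues.values()))
    ordered = sorted(residues.items())
    for atom in sorted(atoms):
        for residue, lookup in ordered:
            print("{} {}={}".format(residue, atom, lookup.get(atom, "nil")))
        print()
        print("---")
        print()


# Hypothetical sample in the shape returned by process().
residues = {
    1: {"HA2": 3.785, "HA3": 3.913},
    2: {"H": 8.85, "HA": 4.337, "N": 115.757},
}
show(residues)
```

The process() function in the quoted script contains no print statements, so it should not need any change for Python 3.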