[Tutor] just one question

2009-07-15 Thread amrita
Hi,

I want to ask one thing. Suppose I have a .txt file with content
like this:


 47 8   ALA   H H  7.85 0.02 1
 48 8   ALA   HAH  2.98 0.02 1
 49 8   ALA   HBH  1.05 0.02 1
 50 8   ALA   C C179.39  0.3 1
 51 8   ALA   CAC 54.67  0.3 1
 52 8   ALA   CBC 18.85  0.3 1
 53 8   ALA   N N123.95  0.3 1
10715   ALA   H H  8.05 0.02 1
10815   ALA   HAH  4.52 0.02 1
10915   ALA   HBH  1.29 0.02 1
11015   ALA   C C177.18  0.3 1
11115   ALA   CAC 52.18  0.3 1
11215   ALA   CBC 20.64  0.3 1
11315   ALA   N N119.31  0.3 1
15421   ALA   H H  7.66 0.02 1
15521   ALA   HAH  4.05 0.02 1
15621   ALA   HBH  1.39 0.02 1
15721   ALA   C C179.35  0.3 1
15821   ALA   CAC 54.33  0.3 1

Now I want to make another .txt file in which each line first gives the
position of ALA (say 8, 15, 21), then its name, ALA, and then the
fifth-column value for only three atoms: C, CA and CB.

That is, it should look something like:

8  ALA  C = 179.39  CA = 54.67  CB = 18.85
15 ALA  C = 177.18  CA = 52.18  CB = 20.64
21 ALA  C = 179.35  CA = 54.33  CB =

If some value is not there, it should be left blank. I am new to
Python, but this is what we want, so how can I do it with a Python script?
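
A minimal Python 3 sketch of one way to do this. It assumes the columns sit at the fixed character offsets seen in the sample above (residue position in characters 3-4, residue name in 8-10, atom name in 14-15, shift value in 17-22). Those offsets and the names `collect_shifts` and `format_rows` are assumptions for illustration, so check them against the real file before relying on them.

```python
WANTED = ("C", "CA", "CB")

# A few lines copied from the sample, kept as quoted strings so the
# column positions stay exact.
SAMPLE = [
    " 50 8   ALA   C C179.39  0.3 1",
    " 51 8   ALA   CAC 54.67  0.3 1",
    " 52 8   ALA   CBC 18.85  0.3 1",
    "11015   ALA   C C177.18  0.3 1",
    "11115   ALA   CAC 52.18  0.3 1",
]


def collect_shifts(lines):
    """Group shift values by (position, residue), keyed by atom name."""
    shifts = {}
    for line in lines:
        if len(line) < 23:
            continue  # blank or truncated line
        try:
            position = int(line[3:5])
        except ValueError:
            continue  # not a data line
        residue = line[8:11]
        atom = line[14:16].strip()
        value = line[17:23].strip()
        shifts.setdefault((position, residue), {})[atom] = value
    return shifts


def format_rows(shifts):
    """Build one output line per residue; missing atoms are left blank."""
    rows = []
    for (position, residue), atoms in shifts.items():
        fields = ["%s = %s" % (name, atoms.get(name, "")) for name in WANTED]
        rows.append(("%d %s  %s" % (position, residue, "  ".join(fields))).rstrip())
    return rows


for row in format_rows(collect_shifts(SAMPLE)):
    print(row)
```

Because every residue is collected into a dictionary first and written out once at the end, each position appears on exactly one line even though its atoms are spread over several input lines.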





Amrita Kumari
Research Fellow
IISER Mohali
Chandigarh
INDIA

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] just one question

2009-07-16 Thread amrita

Thank you very much, sir. It is working now and giving the result
I wanted.

Thanks,
Amrita




> Please use reply-all, so that emails go to the list as well.
>
> 2009/7/16  :
>> Thank you for the help. It is working and giving the result, but the
>> only problem is that it makes a very big file, because it searches for
>> each position of ALA and writes its C value, then CA, then CB, over and
>> over. Is it possible to do all of this but have the output give only
>> one line of C, CA and CB for each position of ALA?
>>
>> That is, instead of giving all of these:
>>
>> 23 ALA C =  CA =  CB =
>> 21 ALA C = 179.35 CA = 54.33 CB = 17.87
>> 15 ALA C = 177.18 CA = 52.18 CB = 20.64
>> 8 ALA C = 179.39 CA = 54.67 CB = 18.85
>> 23 ALA C =  CA =  CB =
>> 21 ALA C = 179.35 CA = 54.33 CB = 17.87
>> .
>>
>> it will only give:
>>
>> 8 ALA C = 179.39 CA = 54.67 CB = 18.85
>> 15 ALA C = 177.18 CA = 52.18 CB = 20.64
>> 21 ALA C = 179.35 CA = 54.33 CB = 17.87
>> 23 ALA C = 179.93 CA = 55.84 CB = 17.55
>> 33 ALA C = 179.24 CA = 55.58 CB = 19.75
>> 38 ALA C = 178.95 CA = 54.33 CB = 18.30
>>
>>
>> Thanks,
>> Amrita
>>
>>
>>
>>
>>
>>
>>
>
> Either you're not entering the code correctly, or the input file is
> different to what you've shown us so far.
>
> I think you need to send me a copy of the input file - or at least a
> larger sample than we've had so far so we can see what we're dealing
> with.
>
> The code should be:
>
> from __future__ import with_statement
> from collections import defaultdict
> from decimal import Decimal
>
> atoms = defaultdict(dict)
>
> with open("file1.txt") as f:
>     for line in f:
>         try:
>             n, pos, ala, at, symb, weight, rad, count = line.split()
>         except ValueError:
>             continue
>         else:
>             atoms[int(pos)][at] = Decimal(weight)
>
> #modify these lines to fit your needs:
> positionsNeeded = (8, 15, 21)
> atomsNeeded = ("C", "CA", "CB")
>
> for k, v in atoms.iteritems():
>     print k, "ALA C = %s CA = %s CB = %s" % tuple(v.get(a,"") for a in atomsNeeded)
>
> Check you've got the indentation (the spaces at the start of lines)
> correct, exactly how it is above:  this is VERY important in python.
>
> --
> Rich "Roadie Rich" Lovely
> There are 10 types of people in the world: those who know binary,
> those who do not, and those who are off by one.
>




[Tutor] how to join two different files

2009-07-17 Thread amrita
Hi,

I have two large data files with different columns, and I want to join
them into a single multi-column data file.

I tried this command:

>>> file('ala', 'w').write(file('/home/amrita/alachems/chem2.txt',
'r').read()+file('/home/amrita/pdbfile/pdb2.txt', 'r').read())

but it prints the second file after the first, whereas I want to join them
column-wise, like this:

FileA      FileB      FileC
12         14         12  14
15    +    16    =    15  16
18         17         18  17
20         19         20  19

What command should I use?
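
One possible approach, sketched in Python 3: read the two files in step and glue line 1 to line 1, line 2 to line 2, and so on. The helper names `join_columns` and `join_files` and the two-space separator are assumptions for illustration, not something from the thread.

```python
def join_columns(lines_a, lines_b, sep="  "):
    """Pair up line 1 with line 1, line 2 with line 2, and so on."""
    return [a.rstrip("\n") + sep + b.rstrip("\n")
            for a, b in zip(lines_a, lines_b)]


def join_files(path_a, path_b, path_out):
    """Write the side-by-side rows of two text files to path_out."""
    with open(path_a) as file_a, open(path_b) as file_b, \
            open(path_out, "w") as out:
        for row in join_columns(file_a, file_b):
            out.write(row + "\n")


print(join_columns(["12\n", "15\n"], ["14\n", "16\n"]))
```

Note that `zip` stops at the end of the shorter input, so this sketch silently drops extra lines when the two files differ in length.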

Thanks,
Amrita Kumari
Research Fellow
IISER Mohali
Chandigarh
INDIA



Re: [Tutor] how to join two different files

2009-07-17 Thread amrita
Thank you, sir, it is working. One more thing I want to ask: what if my
files have entries like this:

fileA and fileB
12 10
13 12
14
15

That is, if their numbers of entries do not match, how can I combine them?
(Both input files have more than one column.)
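
For inputs of unequal length, one option in Python 3 is `itertools.zip_longest`, which keeps going until the longer input is exhausted and substitutes a fill value for the missing side. The sketch below is only an illustration; `join_uneven` is a hypothetical helper name, and padding the left column to a common width is a design choice of this sketch so that right-hand values stay aligned.

```python
from itertools import zip_longest


def join_uneven(lines_a, lines_b, sep="  "):
    """Join two line sequences side by side, leaving blanks where one runs out."""
    left = [line.rstrip("\n") for line in lines_a]
    right = [line.rstrip("\n") for line in lines_b]
    # Pad the left column so right-hand values line up even when the
    # left input is the shorter one.
    width = max((len(line) for line in left), default=0)
    return [(a.ljust(width) + sep + b).rstrip()
            for a, b in zip_longest(left, right, fillvalue="")]


for row in join_uneven(["12", "13", "14", "15"], ["10", "12"]):
    print(row)
```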

Thanks,
Amrita

>> Maybe you could break that up a bit? This is the tutor list, not a
>> one-liner competition!
>
> rather than one-liners, we can try to create the most "Pythonic"
> solution. below's my entry. :-)
>
> cheers,
> -wesley
>
> myMac$ cat parafiles.py
> #!/usr/bin/env python
>
> from itertools import izip
> from os.path import exists
>
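> def parafiles(*files):
>     # vec is a generator: izip(*vec) consumes it, opening every file
>     vec = (open(f) for f in files if exists(f))
>     data = izip(*vec)
>     # vec is exhausted by now, so this closes nothing and izip can still read
>     [f.close() for f in vec]
>     return data
>
> for data in parafiles('fileA.txt', 'fileB.txt'):
>     print ' '.join(d.strip() for d in data)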
>
> myMac$ cat fileA.txt
> FileA
> 12
> 15
> 18
> 20
>
> myMac$ cat fileB.txt
> FileB
> 14
> 16
> 18
> 20
> 22
>
> myMac$ parafiles.py
> FileA FileB
> 12 14
> 15 16
> 18 18
> 20 20
>
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> "Core Python Programming", Prentice Hall, (c)2007,2001
> "Python Fundamentals", Prentice Hall, (c)2009
> http://corepython.com
>
> wesley.j.chun :: wescpy-at-gmail.com
> python training and technical consulting
> cyberweb.consulting : silicon valley, ca
> http://cyberwebconsulting.com
>




[Tutor] how to fill zero value and join two column

2009-07-22 Thread amrita

Hi,

I have two text files with entries like these:
fileA
33 ALA H = 7.57 N = 121.52 CA = 55.58 HA = 3.89 C = 179.24
38 ALA H = 8.29 N = 120.62 CA = 54.33 HA = 4.04 C = 178.95
8 ALA H = 7.85  N = 123.95 CA = 54.67 HA =  C =
fileB
8 ALA  helix (helix_alpha, helix1)
21 ALA  helix (helix_alpha, helix2)
23 ALA  helix (helix_alpha, helix2)

Now I want to make another file in which the matching entries from the
two files are printed together, with zero values for those atoms which
do not have any value in fileA. So the result would be something like:

fileC
8 ALA H = 7.85  N = 123.95 CA = 54.67 HA =0.00  C =0.00|8 ALA  helix (helix_alpha, helix1)

I tried to merge these two files using commands like:-

from collections import defaultdict
>>> def merge(sources):
...   if __name__ == "__main__":
...    a = open("/home/amrita/alachems/chem100.txt")
...    c = open("/home/amrita/secstr/secstr100.txt")
...def source(stream):
...return (line.strip() for line in stream)
...for m in merge([source(x) for x in [a,c]]):
...print "|".join(c.ljust(10) for c in m)
...
but it is not giving any value.
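
A hedged Python 3 sketch of the kind of merge described above: index fileB by residue position, then walk fileA, fill empty values with 0.00 and append the matching fileB entry after a "|". The names `fill_zeros` and `merge` and the regular expression are assumptions made for this illustration; the sketch also normalises spacing, which the original request does not require.

```python
import re

# An atom name, an equals sign, and an optional numeric value.
PAIR = re.compile(r"\b([A-Z]+)\s*=\s*([0-9.]*)")


def fill_zeros(line):
    """Rewrite a fileA line so that atoms without a value get 0.00."""
    head = " ".join(line.split()[:2])  # residue position and name
    pairs = ["%s = %s" % (name, value or "0.00")
             for name, value in PAIR.findall(line)]
    return head + " " + " ".join(pairs)


def merge(lines_a, lines_b):
    """Return merged lines for positions present in both inputs."""
    by_position = {}
    for line in lines_b:
        if line.strip():
            by_position[line.split()[0]] = " ".join(line.split())
    merged = []
    for line in lines_a:
        if not line.strip():
            continue
        position = line.split()[0]
        if position in by_position:
            merged.append(fill_zeros(line) + "|" + by_position[position])
    return merged


file_a = [
    "33 ALA H = 7.57 N = 121.52 CA = 55.58 HA = 3.89 C = 179.24",
    "8 ALA H = 7.85  N = 123.95 CA = 54.67 HA =  C =",
]
file_b = [
    "8 ALA  helix (helix_alpha, helix1)",
    "21 ALA  helix (helix_alpha, helix2)",
]
for row in merge(file_a, file_b):
    print(row)
```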






Thanks,
Amrita Kumari
Research Fellow
IISER Mohali
Chandigarh
INDIA



[Tutor] how to get blank value

2009-07-23 Thread amrita

Hi,

I have a file having lines:-

48 ALA H = 8.33 N = 120.77 CA = 55.18 HA = 4.12 C = 181.50
104 ALA H = 7.70 N =  CA =  HA = 4.21 C =
85 ALA H = 8.60 N =  CA =  HA = 4.65 C =

Now I want to make two more files: one for the lines in which C is
missing, and another for the lines in which N, CA and C are all missing.

I tried it this way:
import re
expr = re.compile("C = None")
f = open("helix.dat")
for line in f:
    if expr.search(line):
        print line

but I am not getting the desired output.
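
In the file a missing value is simply nothing after the "=", so there is no literal text "None" to search for. One way to express "C with no value" as a pattern, sketched in Python 3 and assuming C is always the last field on the line as in the sample: match "C =" followed only by optional whitespace up to the end of the line. The names `C_MISSING` and `lines_missing_c` are hypothetical.

```python
import re

# \b stops the pattern from matching the "C" inside a longer name;
# \s*$ allows only whitespace between the "=" and the end of the line.
C_MISSING = re.compile(r"\bC\s*=\s*$")


def lines_missing_c(lines):
    """Return the lines whose final 'C =' has no value after it."""
    return [line for line in lines if C_MISSING.search(line)]


sample = [
    "48 ALA H = 8.33 N = 120.77 CA = 55.18 HA = 4.12 C = 181.50\n",
    "104 ALA H = 7.70 N =  CA =  HA = 4.21 C =\n",
]
for line in lines_missing_c(sample):
    print(line, end="")
```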




Re: [Tutor] how to get blank value

2009-07-24 Thread amrita

> What is the python command for searching blank value of a parameter?

> Please use Reply All to send it to the list as well.
>
>
>> I am trying it in this way also:---
>>
>> import re
>> expr = re.compile("C")
>
> This will find all lines with the letter C in them.
> Which from your data is all of them. Look at the regex documentation
> to see how to represent the end of a line (or, slightly more complex,  a
> non digit).
>
>> f = open('chem.txt')
>> for line in f:
>> expr.search(line)
>> if 'C = '
>
>
> This is invalid Python, the second level of indentation should produce an
> error!
> Also you are not doing anything with the result of your search, you just
> throw
> it away.
>
> You need something like
>
> for line in open('chem.txt'):
>     if expr.search(line):
>         print line
>
>
> HTH,
>
> Alan g.
>
>> > wrote
>> >
>> >> 48 ALA H = 8.33 N = 120.77 CA = 55.18 HA = 4.12 C = 181.50
>> >> 104 ALA H = 7.70 N =  CA =  HA = 4.21 C =
>> >>
>> >> Now i want to make two another file in which i want to put those lines
>> >> for which C is missing and another one for which N,CA and C all are
>> >> missing,
>> >>
>> >> I tried in this way:
>> >> import re
>> >> expr = re.compile("C = None")
>> >
>> > This will search for the literal string 'C = None', which does not exist
>> > in your data.
>> > You need to search for 'C = ' at the end of the line (assuming it is
>> > always there; otherwise you need to search for 'C = ' followed by a
>> > non-number).
>> >
>> > HTH,
>> >
>> > --
>> > Alan Gauld
>> > Author of the Learn to Program web site
>> > http://www.alan-g.me.uk/
>> >
>> >
>
>




[Tutor] how to get blank value

2009-07-25 Thread amrita
Hi,

I have a file having lines:-

48 ALA H = 8.33 N = 120.77 CA = 55.18 HA = 4.12 C = 181.50
104 ALA H = 7.70 N = 121.21 CA = 54.32 HA = 4.21 C =
85 ALA H = 8.60 N =  CA =  HA = 4.65 C =

Now I want to make two more files: one for the lines in which C is
missing, and another for the lines in which N, CA and C are all missing.

With these commands:

import re
f = open('chem.txt')
for line in f:
    if re.search('C = ',line):
        print line

I get the lines in which the C value is present, but how do I get the
ones in which it has no value? I searched on Google but still could not
find out how.



Re: [Tutor] how to get blank value

2009-07-28 Thread amrita
Sorry to say, but I still have not got a solution to my problem. I
tried this:

import re

if __name__ == '__main__':
    data = open('chem.txt').readlines()
    for line in data:
        RE = re.compile('C = (.)',re.M)
        matches = RE.findall(line)
        for m in matches:
            print line

but this also gives me the lines in which the C value is present.






Re: [Tutor] how to get blank value

2009-07-28 Thread amrita
It does not print anything, and there is no error either:

ph08...@sys53:~> python trial.py
ph08...@sys53:~>

It just returns to the shell prompt.

Thanks for help.
Amrita

> amr...@iisermohali.ac.in wrote:
>> Sorry to say, but till now I have not got the solution of my problem, I
>> tried with this command:-
>>
>> import re
>>
>>
> # assuming H = , N = , CA = , HA =  and C = always present in that order
> if __name__ == '__main__':
>     data = open('chem.txt').readlines()
>     for line in data:
>         line = line.split('=')
>         if not line[5]: # C value missing
>             if len(line[2])==1 and len(line[3])==1: # N and CA values missing
>                 print "all missing", line
>             else:
>                 print "C missing", line
>
>>
>>
>
>
> --
> Bob Gailer
> Chapel Hill NC
> 919-636-4239
>




Re: [Tutor] how to get blank value

2009-07-28 Thread amrita
With the embedded data it gives output, but when I tried

if __name__ == '__main__':
    data = open('chem.txt').readlines()
    for line in data:
        line2 = line.split('=')
        if not line2[5]: # C value missing
            if len(line2[2]) <= 5 and len(line2[3]) <= 5:  # N and CA values missing
                print "all missing", line
            else:
                print "C missing", line

with the data in a .txt file, it gives no output. Actually I have a
few large data files. What I want is to put the lines for which only the
C value is missing in one file, and the lines for which the N, CA and C
values are all missing in another.
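
A likely reason the file-based version prints nothing: `readlines()` keeps the newline at the end of each line, so for a line ending in "C =" the last piece after `split('=')` is "\n" rather than an empty string, and `not line2[5]` is never true. The Python 3 sketch below strips each piece before testing it. It keeps the assumption from the quoted reply that H, N, CA, HA and C always appear in that order; `sort_lines`, `write_lines` and the output file names are hypothetical.

```python
def sort_lines(lines):
    """Split lines into (only C missing, N and CA and C all missing)."""
    c_missing, all_missing = [], []
    for line in lines:
        fields = [field.strip() for field in line.split("=")]
        if len(fields) != 6 or fields[5]:
            continue  # malformed line, or C has a value
        # When N and CA have no value, the pieces that would hold them
        # contain only the name of the next atom.
        if fields[2] == "CA" and fields[3] == "HA":
            all_missing.append(line)
        else:
            c_missing.append(line)
    return c_missing, all_missing


def write_lines(path, lines):
    """Write the given lines to path, one per line."""
    with open(path, "w") as out:
        for line in lines:
            out.write(line.rstrip("\n") + "\n")


sample = [
    "48 ALA H = 8.33 N = 120.77 CA = 55.18 HA = 4.12 C = 181.50\n",
    "104 ALA H = 7.70 N = 121.21 CA = 54.32 HA = 4.21 C =\n",
    "85 ALA H = 8.60 N =  CA =  HA = 4.65 C =\n",
]
only_c, everything = sort_lines(sample)
print(only_c)
print(everything)
```

With a real file, something like `only_c, everything = sort_lines(open('chem.txt'))` followed by `write_lines('c_missing.txt', only_c)` and `write_lines('all_missing.txt', everything)` would produce the two output files.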

Thanks for help.
Amrita

> amr...@iisermohali.ac.in wrote:
>> It is not giving any value, without any error
>>
>> ph08...@sys53:~> python trial.py
>> ph08...@sys53:~>
>> it is coming out from shell.
>>
> Try this. I embedded the test data to simplify testing:
>
> data = """48 ALA H = 8.33 N = 120.77 CA = 55.18 HA = 4.12 C = 181.50
> 104 ALA H = 7.70 N = 121.21 CA = 54.32 HA = 4.21 C =
> 85 ALA H = 8.60 N =  CA =  HA = 4.65 C =""".split('\n')
> for line in data:
>     line2 = line.split('=')
>     if not line2[5]: # C value missing
>         if len(line2[2]) <= 5 and len(line2[3]) <= 5: # N and CA values missing
>             print "all missing", line
>         else:
>             print "C missing", line
>
> --
> Bob Gailer
> Chapel Hill NC
> 919-636-4239
>




Re: [Tutor] how to get blank value

2009-07-28 Thread amrita
Thanks for the help, sir, but these commands give an error:

ph08...@sys53:~> python trip.py
Traceback (most recent call last):
  File "trip.py", line 6, in <module>
from pyparsing import *
ImportError: No module named pyparsing
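
pyparsing is a third-party package rather than part of the standard library, so this ImportError means it is not installed for the Python being run. As a standard-library-only alternative, here is a rough Python 3 sketch that parses the same kind of line with `re` and fills in a default for missing values. It is not the pyparsing solution quoted below, and `parse_line` and `PAIR` are names invented for this example.

```python
import re

# An atom name, "=", and an optional number such as 8.33 or 120.
PAIR = re.compile(r"\b([A-Z]+)\s*=\s*(\d+(?:\.\d+)?)?")


def parse_line(line, default=0.0):
    """Return (position, residue, {atom: value}) for one data line."""
    position, residue = line.split()[:2]
    values = {name: float(value) if value else default
              for name, value in PAIR.findall(line)}
    return int(position), residue, values


print(parse_line("85 ALA H = 8.60 N =  CA =  HA = 4.65 C ="))
```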


> Ok, I've seen various passes at this problem using regex, split('='),
> etc.,
> but the solutions seem fairly fragile, and the OP doesn't seem happy with
> any of them.  Here is how this problem looks if you were going to try
> breaking it up with pyparsing:
> - Each line starts with an integer, and the string "ALA"
> - "ALA" is followed by a series of "X = 1.2"-type attributes, where the
> value part might be missing.
>
> And to implement (with a few bells and whistles thrown in for free):
>
> data = """48 ALA H = 8.33 N = 120.77 CA = 55.18 HA = 4.12 C = 181.50
> 104 ALA H = 7.70 N = 121.21 CA = 54.32 HA = 4.21 C =
> 85 ALA H = 8.60 N =  CA =  HA = 4.65 C =""".splitlines()
>
>
> from pyparsing import *
>
> # define some basic data expressions
> integer = Word(nums)
> real = Combine(Word(nums) + "." + Word(nums))
>
> # use parse actions to automatically convert numeric
> # strings to actual numbers at parse time
> integer.setParseAction(lambda tokens:int(tokens[0]))
> real.setParseAction(lambda tokens:float(tokens[0]))
>
> # define expressions for 'X = 1.2' assignments; note that the
> # value might be missing, so use Optional - we'll fill in
> # a default value of 0.0 if no value is given
> keyValue = Word(alphas.upper()) + '=' + \
>     Optional(real|integer, default=0.0)
>
> # define overall expression for the data on a line
> dataline = integer + "ALA" + OneOrMore(Group(keyValue))("kvdata")
>
> # attach parse action to define named values in the returned tokens
> def assignDataByKey(tokens):
>     for k,_,v in tokens.kvdata:
>         tokens[k] = v
> dataline.setParseAction(assignDataByKey)
>
> # for each line in the input data, parse it and print some of the data fields
> for d in data:
>     print d
>     parsedData = dataline.parseString(d)
>     print parsedData.dump()
>     print parsedData.CA
>     print parsedData.N
>     print
>
>
> Prints out:
>
> 48 ALA H = 8.33 N = 120.77 CA = 55.18 HA = 4.12 C = 181.50
> [48, 'ALA', ['H', '=', 8.3300000000000001], ['N', '=', 120.77],
> ['CA', '=', 55.18], ['HA', '=', 4.1200000000000001], ['C', '=', 181.5]]
> - C: 181.5
> - CA: 55.18
> - H: 8.33
> - HA: 4.12
> - N: 120.77
> - kvdata: [['H', '=', 8.3300000000000001], ['N', '=', 120.77],
> ['CA', '=', 55.18], ['HA', '=', 4.1200000000000001], ['C', '=', 181.5]]
> 55.18
> 120.77
>
> 104 ALA H = 7.70 N = 121.21 CA = 54.32 HA = 4.21 C =
> [104, 'ALA', ['H', '=', 7.7000000000000002], ['N', '=', 121.20999999999999],
> ['CA', '=', 54.32], ['HA', '=', 4.21], ['C', '=', 0.0]]
> - C: 0.0
> - CA: 54.32
> - H: 7.7
> - HA: 4.21
> - N: 121.21
> - kvdata: [['H', '=', 7.7000000000000002], ['N', '=', 121.20999999999999],
> ['CA', '=', 54.32], ['HA', '=', 4.21], ['C', '=', 0.0]]
> 54.32
> 121.21
>
> 85 ALA H = 8.60 N =  CA =  HA = 4.65 C =
> [85, 'ALA', ['H', '=', 8.5999999999999996], ['N', '=', 0.0],
> ['CA', '=', 0.0], ['HA', '=', 4.6500000000000004], ['C', '=', 0.0]]
> - C: 0.0
> - CA: 0.0
> - H: 8.6
> - HA: 4.65
> - N: 0.0
> - kvdata: [['H', '=', 8.5999999999999996], ['N', '=', 0.0],
> ['CA', '=', 0.0], ['HA', '=', 4.6500000000000004], ['C', '=', 0.0]]
> 0.0
> 0.0
>
>
> Learn more about pyparsing at http://pyparsing.wikispaces.com.
>
> -- Paul
>
>
>




[Tutor] arrangement of datafile

2013-12-17 Thread Amrita Kumari
Hi,

I am new to programming and want to use Python (which is simple and easy
to learn) to solve one problem. I have several long files like this:

1 GLY HA2=3.7850 HA3=3.9130
2 SER H=8.8500 HA=4.3370 N=115.7570
3 LYS H=8.7530 HA=4.0340 HB2=1.8080 N=123.2380
4 LYS H=7.9100 HA=3.8620 HB2=1.7440 HG2=1.4410 N=117.9810
5 LYS H=7.4450 HA=4.0770 HB2=1.7650 HG2=1.4130 N=115.4790
6 LEU H=7.6870 HA=4.2100 HB2=1.3860 HB3=1.6050 HG=1.5130 HD11=0.7690
HD12=0.7690 HD13=0.7690 N=117.3260
7 PHE H=7.8190 HA=4.5540 HB2=3.1360 N=117.0800
8 PRO HD2=3.7450
9 GLN H=8.2350 HA=4.0120 HB2=2.1370 N=116.3660
10 ILE H=7.9790 HA=3.6970 HB=1.8800 HG21=0.8470 HG22=0.8470 HG23=0.8470
HG12=1.6010 HG13=2.1670 N=119.0300
11 ASN H=7.9470 HA=4.3690 HB3=2.5140 N=117.8620
12 PHE H=8.1910 HA=4.1920 HB2=3.1560 N=121.2640
13 LEU H=8.1330 HA=3.8170 HB3=1.7880 HG=1.5810 HD11=0.8620 HD12=0.8620
HD13=0.8620 N=119.1360

...

where the first column is the residue number. What I want is to print
each atom's chemical shift value, one atom at a time, along with the
residue number. For example, for atom HA2 it should be:

1 HA2=3.7850
2 HA2=nil
3 HA2=nil
...
13 HA2=nil

similarly for atom HA3 it should be same as above:

1 HA3=3.9130
2 HA3=nil
3 HA3=nil
...


13 HA3=nil

while for atom H it should be:
1  H=nil
2  H=8.8500
3  H=8.7530
4  H=7.9100
5  H=7.4450


But in some files the residue numbers are not continuous; some are missing
in between. I want to write Python code to solve this problem but don't know
how to split the data file and print the desired output. This matters
because I need to compare each atom's chemical shift value with chemical
shift values generated by a web-based tool. Since the number of atoms
differs from row to row, and the same atom appears at different positions
in different residues, I don't know how to split them. Please help me solve
this problem.
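
A rough Python 3 sketch of one way to approach this: read every residue into a dictionary of atom names first, then list any one atom across all residues, printing "nil" where a residue lacks it. It assumes that a line starting with a number begins a new residue and that a line starting with a key=value item is a wrapped continuation of the previous residue, as in the sample above. `parse_shifts` and `column` are hypothetical names.

```python
def parse_shifts(lines):
    """Return {residue number: {atom name: shift as text}}."""
    shifts = {}
    current = None
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0].isdigit():
            # New residue: drop the residue number and residue name.
            current = int(tokens[0])
            shifts[current] = {}
            tokens = tokens[2:]
        elif "=" not in tokens[0] or current is None:
            continue  # something else, such as a "..." filler line
        for token in tokens:
            name, _, value = token.partition("=")
            shifts[current][name] = value
    return shifts


def column(shifts, atom):
    """List one atom across all residues, with nil where it is absent."""
    return ["%d %s=%s" % (number, atom, shifts[number].get(atom, "nil"))
            for number in sorted(shifts)]


sample = [
    "1 GLY HA2=3.7850 HA3=3.9130",
    "2 SER H=8.8500 HA=4.3370 N=115.7570",
    "6 LEU H=7.6870 HA=4.2100 HD11=0.7690",
    "HD12=0.7690 HD13=0.7690 N=117.3260",
]
for row in column(parse_shifts(sample), "HA2"):
    print(row)
```

Only residue numbers that occur in the file are listed, so gaps in the numbering are simply skipped. The values are kept as text so that trailing zeros such as 3.7850 are printed unchanged.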

Thanks,
Amrita
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


[Tutor] arrangement of datafile

2013-12-25 Thread Amrita Kumari
Hi,

On 17th Dec. I posted a question about how to arrange a data file in a
particular fashion so that I have only the residue number and the chemical
shift value of the atom, as:
1  H=nil
2  H=8.8500
3  H=8.7530
4  H=7.9100
5  H=7.4450

Peter replied to that mail, but since I had not subscribed to the
tutor mailing list at the time, I did not receive the reply. I
apologize for my mistake. Today I checked his reply, and he asked me to
do a few things:

Can you read a file line by line?
Can you split the line into a list of strings at whitespace occurrences?
Can you extract the first item from the list and convert it to an int?
Can you remove the first two items from the list?
Can you split the items in the list at the "="?

I tried these and here is the code:

f=open('filename')
lines=f.readlines()
new=lines.split()
number=int(new[0])
mylist=[i.split('=')[0] for i in new]

One thing I don't understand is why you asked me to remove the first two
items from the list. And is the above code all right? Can it produce
output like the one you mentioned:
{1: {'HA2': 3.785, 'HA3': 3.913},
 2: {'H': 8.85, 'HA': 4.337, 'N': 115.757},
 3: {'H': 8.753, 'HA': 4.034, 'HB2': 1.808, 'N': 123.238},
 4: {'H': 7.91, 'HA': 3.862, 'HB2': 1.744, 'HG2': 1.441, 'N': 117.981},
 5: {'H': 7.445, 'HA': 4.077, 'HB2': 1.765, 'HG2': 1.413, 'N': 115.479},
 6: {'H': 7.687,
 'HA': 4.21,
 'HB2': 1.386,
 'HB3': 1.605,
 'HD11': 0.769,
 'HD12': 0.769,
 'HD13': 0.769,
 'HG': 1.513,
 'N': 117.326},
 7: {'H': 7.819, 'HA': 4.554, 'HB2': 3.136, 'N': 117.08},
 8: {'HD2': 3.745},
 9: {'H': 8.235, 'HA': 4.012, 'HB2': 2.137, 'N': 116.366},
 10: {'H': 7.979,
  'HA': 3.697,
  'HB': 1.88,
  'HG12': 1.601,
  'HG13': 2.167,
  'HG21': 0.847,
  'HG22': 0.847,
  'HG23': 0.847,
  'N': 119.03},
 11: {'H': 7.947, 'HA': 4.369, 'HB3': 2.514, 'N': 117.862},
 12: {'H': 8.191, 'HA': 4.192, 'HB2': 3.156, 'N': 121.264},
 13: {'H': 8.133,
  'HA': 3.817,
  'HB3': 1.788,
  'HD11': 0.862,
  'HD12': 0.862,
  'HD13': 0.862,
  'HG': 1.581,
  'N': 119.136}}
If not, then please help point out my mistake so that I can get the
correct output.
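
Two observations on the snippet above, with a small Python 3 sketch. `readlines()` returns a list of lines, and a list has no `split` method, so `lines.split()` raises an AttributeError; the splitting has to happen on each line in turn. Also, after splitting a line at whitespace, the first two items are the residue number and residue name rather than key=value pairs, which is presumably why removing them was suggested. `parse_line` is a hypothetical helper, and the sketch assumes each residue sits on a single line.

```python
def parse_line(line):
    """Turn '2 SER H=8.8500 HA=4.3370' into (2, {'H': 8.85, 'HA': 4.337})."""
    tokens = line.split()
    number = int(tokens[0])
    # tokens[1] is the residue name; everything after it is key=value.
    pairs = (token.split("=", 1) for token in tokens[2:])
    return number, {name: float(value) for name, value in pairs}


nested = dict(parse_line(line) for line in [
    "1 GLY HA2=3.7850 HA3=3.9130",
    "2 SER H=8.8500 HA=4.3370 N=115.7570",
])
print(nested)
```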

Thanking you for your help and time.

Thanks,
Amrita


Re: [Tutor] arrangement of datafile

2013-12-27 Thread Amrita Kumari
Hi,

My data file is something like this:

1 GLY HA2=3.7850 HA3=3.9130
2 SER H=8.8500 HA=4.3370 N=115.7570
3 LYS H=8.7530 HA=4.0340 HB2=1.8080 N=123.2380
 4 LYS H=7.9100 HA=3.8620 HB2=1.7440 HG2=1.4410 N=117.9810
5 LYS H=7.4450 HA=4.0770 HB2=1.7650 HG2=1.4130 N=115.4790
6 LEU H=7.6870 HA=4.2100 HB2=1.3860 HB3=1.6050 HG=1.5130 HD11=0.7690
HD12=0.7690 HD13=0.7690 N=117.3260
7 PHE H=7.8190 HA=4.5540 HB2=3.1360 N=117.0800
8 PRO HD2=3.7450
9 GLN H=8.2350 HA=4.0120 HB2=2.1370 N=116.3660
10 ILE H=7.9790 HA=3.6970 HB=1.8800 HG21=0.8470 HG22=0.8470 HG23=0.8470
HG12=1.6010 HG13=2.1670 N=119.0300
11 ASN H=7.9470 HA=4.3690 HB3=2.5140 N=117.8620
12 PHE H=8.1910 HA=4.1920 HB2=3.1560 N=121.2640
13 LEU H=8.1330 HA=3.8170 HB3=1.7880 HG=1.5810 HD11=0.8620 HD12=0.8620
HD13=0.8620 N=119.1360

...

where the first column is the residue number, and I want to print each
atom's chemical shift value, one atom at a time, along with the residue
number. For example, for atom HA2 it should be:

1 HA2=3.7850
2 HA2=nil
3 HA2=nil
...
13 HA2=nil

similarly for atom HA3 it should be same as above:

1 HA3=3.9130
2 HA3=nil
3 HA3=nil
...


13 HA3=nil

while for atom H it should be:

1  H=nil
2  H=8.8500
3  H=8.7530
4  H=7.9100
5  H=7.4450


Can you suggest how to produce nested dicts like this:

{1: {'HA2': 3.785, 'HA3': 3.913},
2: {'H': 8.85, 'HA': 4.337, 'N': 115.757},
3: {'H': 8.753, 'HA': 4.034, 'HB2': 1.808, 'N': 123.238},
4: {'H': 7.91, 'HA': 3.862, 'HB2': 1.744, 'HG2': 1.441, 'N': 117.981},
5: {'H': 7.445, 'HA': 4.077, 'HB2': 1.765, 'HG2': 1.413, 'N': 115.479},
6: {'H': 7.687,
 'HA': 4.21,
 'HB2': 1.386,
 'HB3': 1.605,
 'HD11': 0.769,
 'HD12': 0.769,
 'HD13': 0.769,
 'HG': 1.513,
 'N': 117.326},
7: {'H': 7.819, 'HA': 4.554, 'HB2': 3.136, 'N': 117.08},
8: {'HD2': 3.745},
9: {'H': 8.235, 'HA': 4.012, 'HB2': 2.137, 'N': 116.366},
10: {'H': 7.979,
  'HA': 3.697,
  'HB': 1.88,
  'HG12': 1.601,
  'HG13': 2.167,
  'HG21': 0.847,
  'HG22': 0.847,
  'HG23': 0.847,
  'N': 119.03},
11: {'H': 7.947, 'HA': 4.369, 'HB3': 2.514, 'N': 117.862},
12: {'H': 8.191, 'HA': 4.192, 'HB2': 3.156, 'N': 121.264},
13: {'H': 8.133,
  'HA': 3.817,
  'HB3': 1.788,
  'HD11': 0.862,
  'HD12': 0.862,
  'HD13': 0.862,
  'HG': 1.581,
  'N': 119.136}}

Thanks,
Amrita



On Wed, Dec 25, 2013 at 7:28 PM, Dave Angel  wrote:

> On Wed, 25 Dec 2013 16:17:27 +0800, Amrita Kumari 
> wrote:
>
>> I tried these and here is the code:
>>
>
>
>  f=open('filename')
>> lines=f.readlines()
>> new=lines.split()
>>
>
> That line will throw an exception.
>
>> number=int(new[0])
>> mylist=[i.split('=')[0] for i in new]
>>
>
>
>  one thing I don't understand is why you asked to remove first two
>> items from the list?
>>
>
> You don't show us the data file,  but presumably he would ask that because
> the first two lines held different formats of data. Like your number= line
> was intended to fetch a count from only line zero?
>
>
>
>  and is the above code alright?, it can produce
>> output like the one you mentioned:
>> {1: {'HA2': 3.785, 'HA3': 3.913},
>>  2: {'H': 8.85, 'HA': 4.337, 'N': 115.757},
>>
>
> The code above won't produce a dict of dicts. It won't even get past the
> exception.  Please use copy/paste.
>
> --
> DaveA
>
>


[Tutor] Fwd: arrangement of datafile

2014-01-05 Thread Amrita Kumari
Sorry, I forgot to add the tutor mailing list. Please help with the question below.

-- Forwarded message --
From: Amrita Kumari 
Date: Fri, Jan 3, 2014 at 2:42 PM
Subject: Re: [Tutor] arrangement of datafile
To: Evans Anyokwu 


Hi,

I have saved my data in CSV format; now it looks like this:

2,ALA,C=178.255,CA=53.263,CB=18.411,,
3,LYS,H=8.607,C=176.752,CA=57.816,CB=31.751,N=119.081
4,ASN,H=8.185,C=176.029,CA=54.712,CB=38.244,N=118.255
5,VAL,H=7.857,HG11=0.892,HG12=0.892,HG13=0.892,HG21=0.954,HG22=0.954,HG23=0.954,C=177.259,CA=64.232,CB=31.524,CG1=21.402,CG2=21.677,N=119.998
6,ILE,H=8.062,HG21=0.827,HG22=0.827,HG23=0.827,HD11=0.807,HD12=0.807,HD13=0.807,C=177.009,CA=63.400,CB=37.177,CG2=17.565,CD1=13.294,N=122.474
7,VAL,H=7.993,HG11=0.879,HG12=0.879,HG13=0.879,HG21=0.957,HG22=0.957,HG23=0.957,C=177.009,CA=65.017,CB=31.309,CG1=21.555,CG2=22.369,N=120.915
8,LEU,H=8.061,HD11=0.844,HD12=0.844,HD13=0.844,HD21=0.810,HD22=0.810,HD23=0.810,C=178.655,CA=56.781,CB=41.010,CD1=25.018,CD2=23.824,N=121.098
9,ASN,H=8.102,C=176.695,CA=54.919,CB=38.674,N=118.347
10,ALA,H=8.388,HB1=1.389,HB2=1.389,HB3=1.389,C=178.263,CA=54.505,CB=17.942,N=124.124,
11,ALA,H=8.279,HB1=1.382,HB2=1.382,HB3=1.382,C=179.204,CA=54.298,CB=17.942,N=119.814,
12,SER,H=7.952,C=175.873,CA=60.140,CB=63.221,N=113.303
13,ALA,H=7.924,HB1=1.382,HB2=1.382,HB3=1.382,C=178.420,CA=53.470,CB=18.373,N=124.308,
...

The fields are comma-separated.

I can read the file as

infile = open('inputfile.csv', 'r')

I can read each line through

data = infile.readlines()

I can split each line into a list of strings at the commas with

for line in data:
  csvline = line.strip().split(",")

After this, please guide me on how to proceed. I am new to
programming but want to learn Python.

Thanks,
Amrita


On 12/28/13, Evans Anyokwu  wrote:
> One thing that I've noticed is that there is no structure to your data.
> Some have missing *fields*, so making the use of regex out of the
> question.
>
> Without seeing your code, I'd suggest saving the data as a separated value
> file and parse it. Python has a good csv support.
>
> Get this one sorted out first then we can move on to the nested list.
>
> Good luck.
> Evans
>


Re: [Tutor] Fwd: arrangement of datafile

2014-01-06 Thread Amrita Kumari
Hi Steven,

I tried this code:

import csv
with open('file.csv') as f:
 reader = csv.reader(f)
 for row in reader:
 print(row)
 row[0] = int(row[0])

up to this extent it is ok; it is ok it is giving the output as:

['1' , ' GLY' ,  'HA2=3.7850' ,  'HA3=3.9130' , ' ' , ' ' , ' ' , ' ']
[ '2' ,  'SER' ,  'H=8.8500' ,  'HA=4.3370' ,  'N=115.7570' , ' ' , ' ' , '
']
...
but the command:

key, value = row[2].split('=', 1)
value = float(value.strip())
print(value)

prints the value of the row[2] element after each row, like this:

['1' , ' GLY' ,  'HA2=3.7850' ,  'HA3=3.9130' , ' ' , ' ' , ' ' , ' ']
3.7850
[ '2' ,  'SER' ,  'H=8.8500' ,  'HA=4.3370' ,  'N=115.7570' , ' ' , ' ' , '
']
8.8500

...
This is not what I want. I want to print the chemical shift values of the
same atom from every row at one time,

like this:

1 HA2=3.7850
2 HA2=nil
3 HA2=nil
...
13 HA2=nil

similarly, for atom HA3:

1 HA3=3.9130
2 HA3=nil
3 HA3=nil
...
13 HA3=nil

and so on.

So how do I split each item into a key and a numeric value, then search
for the same atom in every row and print its chemical shift value at one
time along with the residue number?
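
A minimal Python 3 sketch of one way to get from the CSV rows to a per-atom listing: store every row in a dictionary keyed by residue number, then print one atom at a time with "nil" for residues that lack it. The names `read_shifts` and `report` are hypothetical, and the sketch assumes each row looks like the ones shown in this thread (number, residue name, then key=value items with possibly empty trailing fields).

```python
import csv


def read_shifts(lines):
    """Build {residue number: {atom: shift as text}} from CSV lines."""
    shifts = {}
    for row in csv.reader(lines):
        if not row or not row[0].strip().isdigit():
            continue  # blank line or something that is not a data row
        atoms = {}
        for item in row[2:]:
            if "=" in item:  # skips the empty trailing fields
                name, value = item.split("=", 1)
                atoms[name.strip()] = value.strip()
        shifts[int(row[0])] = atoms
    return shifts


def report(shifts, atom):
    """List one atom for every residue, with nil where it is absent."""
    return ["%d %s=%s" % (number, atom, shifts[number].get(atom, "nil"))
            for number in sorted(shifts)]


sample = [
    "2,ALA,C=178.255,CA=53.263,CB=18.411,,",
    "3,LYS,H=8.607,C=176.752,CA=57.816,CB=31.751,N=119.081",
]
shifts = read_shifts(sample)
# Every atom name seen anywhere in the data, one block per atom.
for atom in sorted({name for atoms in shifts.values() for name in atoms}):
    print("\n".join(report(shifts, atom)))
```

`csv.reader` accepts any iterable of lines, so an open file object can be passed to `read_shifts` in place of the sample list.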

Thanks,
Amrita





On Mon, Jan 6, 2014 at 6:44 AM, Steven D'Aprano  wrote:

> Hi Amrita,
>
> On Sun, Jan 05, 2014 at 10:01:16AM +0800, Amrita Kumari wrote:
>
> > I have saved my data in csv format now it is looking like this:
>
> If you have a file in CSV format, you should use the csv module to read
> the file.
>
> http://docs.python.org/3/library/csv.html
>
> If you're still using Python 2.x, you can read this instead:
>
> http://docs.python.org/2/library/csv.html
>
>
> I think that something like this should work for you:
>
> import csv
> with open('/path/to/your/file.csv') as f:
>     reader = csv.reader(f)
>     for row in reader:
>         print(row)
>
> Of course, you can process the rows, not just print them. Each row will
> be a list of strings. For example, you show the first row as this:
>
> > 2,ALA,C=178.255,CA=53.263,CB=18.411,,
>
> so the above code should print this for the first row:
>
> ['2', 'ALA', 'C=178.255', 'CA=53.263', 'CB=18.411', '', '', '',
> '', '', '', '', '', '']
>
>
> You can process each field as needed. For example, to convert the
> first field from a string to an int:
>
> row[0] = int(row[0])
>
> To split the third item 'C=178.255' into a key ('C') and a numeric
> value:
>
> key, value = row[2].split('=', 1)
> value = float(value.strip())
>
>
>
> Now you know how to read CSV files. What do you want to do with the data
> in the file?
>
>
>
> --
> Steven


Re: [Tutor] Fwd: arrangement of datafile

2014-01-09 Thread Amrita Kumari
Hi,

Sorry for the delay in replying (the internet has been very slow for the
past two days). I tried the code you suggested (saving it in a file):

import csv
with open('19162.csv') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
        row[0] = int(row[0])
        key,value = item.split('=', 1)
        value = float(value)
        print(value)

and I got the output as:

C:\Python33>python 8.py
['2', 'ALA', 'C=178.255', 'CA=53.263', 'CB=18.411', '', '', '', '', '', '', '',
'', '', '']
Traceback (most recent call last):
  File "8.py", line 7, in <module>
    key,value = item.split('=', 1)
NameError: name 'item' is not defined
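
The NameError occurs because nothing named `item` is ever assigned: the loop variable is `row`. A minimal sketch of what the loop body was probably meant to do, assuming the intent is to visit every field after the residue number and name. The row is the one printed above (shortened), and the name `shifts` is chosen here for illustration.

```python
# The first row of the file, as printed by csv.reader (trailing fields are empty).
row = ['2', 'ALA', 'C=178.255', 'CA=53.263', 'CB=18.411', '', '', '']

residue_no = int(row[0])
shifts = {}
for item in row[2:]:          # every field after residue number and name
    if '=' not in item:       # empty trailing fields have nothing to split
        continue
    key, value = item.split('=', 1)
    shifts[key] = float(value)

print(residue_no, shifts)
```

Skipping fields without an `=` matters here: calling `''.split('=', 1)` returns a single-element list, so unpacking it into `key, value` would raise a ValueError on the empty fields.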

my datafile is like this:

2,ALA,C=178.255,CA=53.263,CB=18.411,,
3,LYS,H=8.607,C=176.752,CA=57.816,CB=31.751,N=119.081
4,ASN,H=8.185,C=176.029,CA=54.712,CB=38.244,N=118.255
5,VAL,H=7.857,HG11=0.892,HG12=0.892,HG13=0.892,HG21=0.954,HG22=0.954,HG23=0.954,C=177.259,CA=64.232,CB=31.524,CG1=21.402,CG2=21.677,N=119.998
6,ILE,H=8.062,HG21=0.827,HG22=0.827,HG23=0.827,HD11=0.807,HD12=0.807,HD13=0.807,C=177.009,CA=63.400,CB=37.177,CG2=17.565,CD1=13.294,N=122.474
7,VAL,H=7.993,HG11=0.879,HG12=0.879,HG13=0.879,HG21=0.957,HG22=0.957,HG23=0.957,C=177.009,CA=65.017,CB=31.309,CG1=21.555,CG2=22.369,N=120.915
8,LEU,H=8.061,HD11=0.844,HD12=0.844,HD13=0.844,HD21=0.810,HD22=0.810,HD23=0.810,C=178.655,CA=56.781,CB=41.010,CD1=25.018,CD2=23.824,N=121.098
9,ASN,H=8.102,C=176.695,CA=54.919,CB=38.674,N=118.347
10,ALA,H=8.388,HB1=1.389,HB2=1.389,HB3=1.389,C=178.263,CA=54.505,CB=17.942,N=124.124,
--

where the 1st element of each row is the residue no., but it is not
continuous (some are missing; for example, the 1st row starts from
residue no. 2, not from 1). The second element of each row is the name
of the amino acid, and the remaining elements are the various atoms
along with the chemical shift information for that particular amino
acid; for example, H=8.388 shows that the atom is H and that it has
chemical shift value 8.388. The order of these atoms in each row is
quite random, and some rows have many more atoms than others. I got
these values from the Shiftx2 web server. I just want to align the
chemical shift values of the same atom into one column (along with
the residue no.); for example, for atom C it could be:

2 C=178.255
3 C=176.752
4 C=176.029
5 C=177.259
---
---

for atom H, it could be:

2 H=nil
3 H=8.607
4 H=8.185
5 H=7.857
6 H=8.062

---
and so on. If a row doesn't have that atom (for example, the 1st row
doesn't have an H atom), it can print nil so that I can understand
that the atom is missing for that particular residue. I need this
arrangement in order to compare these chemical shift values with those
generated by another web server.

Thanks,
Amrita
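
A rough sketch of the arrangement described above, under a few assumptions: the first three rows of the datafile are inlined through `io.StringIO` so the example is self-contained (with the real file one would pass an open file object instead), values are kept as text rather than converted to float, and the function names `read_shifts` and `by_atom` are invented for this illustration.

```python
import csv
import io

# First three rows of the datafile shown above.
SAMPLE = """\
2,ALA,C=178.255,CA=53.263,CB=18.411,,
3,LYS,H=8.607,C=176.752,CA=57.816,CB=31.751,N=119.081
4,ASN,H=8.185,C=176.029,CA=54.712,CB=38.244,N=118.255
"""


def read_shifts(lines):
    """Map residue number -> {atom: value string} for every CSV row."""
    residues = {}
    for row in csv.reader(lines):
        if not row:                       # ignore completely empty lines
            continue
        pairs = (field.split('=', 1) for field in row[2:] if '=' in field)
        residues[int(row[0])] = {atom.strip(): value.strip() for atom, value in pairs}
    return residues


def by_atom(residues):
    """Yield (atom, lines) with one line per residue and 'nil' for gaps."""
    atoms = sorted(set().union(*(r.keys() for r in residues.values())))
    for atom in atoms:
        yield atom, ["{} {}={}".format(no, atom, shifts.get(atom, "nil"))
                     for no, shifts in sorted(residues.items())]


residues = read_shifts(io.StringIO(SAMPLE))
for atom, lines in by_atom(residues):
    print("\n".join(lines))
    print("---")
```

Because the atoms are collected from all rows first, the varying order and number of atoms per row does not matter; each residue is simply asked whether it has the atom.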




On 1/7/14, Steven D'Aprano  wrote:
> On Mon, Jan 06, 2014 at 04:57:38PM +0800, Amrita Kumari wrote:
>> Hi Steven,
>>
>> I tried this code:
>>
>> import csv
>> with open('file.csv') as f:
>>     reader = csv.reader(f)
>>     for row in reader:
>>         print(row)
>>         row[0] = int(row[0])
>>
>> Up to this point it works; it gives this output:
>>
>> ['1' , ' GLY' ,  'HA2=3.7850' ,  'HA3=3.9130' , ' ' , ' ' , ' ' , ' ']
>> [ '2' ,  'SER' ,  'H=8.8500' ,  'HA=4.3370' ,  'N=115.7570' , ' ' , ' ' , ' ']
>
> It looks like you are re-typing the output into your email. It is much
> better if you copy and paste it so that we can see exactly what happens.
>
>
>> but the command :
>>
>> key, value = row[2].split('=', 1)
>> value = float(value.strip())
>> print(value)
>>
>> is giving the value of row[2] element as
>>
>> ['1' , ' GLY' ,  'HA2=3.7850' ,  'HA3=3.9130' , ' ' , ' ' , ' ' , ' ']
>> 3.7850
>> [ '2' ,  'SER' ,  'H=8.8500' ,  'HA=4.3370' ,  'N=115.7570' , ' ' , ' ' , ' ']
>> 8.8500
>
> So far, the code is doing exactly what you told it to do. Take the third
> column (index 2), and split on the equals sign. Convert the part on the
> right of the equals sign to a float, and print the float.
>
>
>> so this is not what I want I want to print all the chemical shift value
>> of
>>

Re: [Tutor] arrangement of datafile

2014-01-10 Thread Amrita Kumari
Hi Peter,

Thank you very much for your kind help. I got the output the way I
wanted (as you also showed in your output). I really appreciate your
effort.

Thanks for your time.
Amrita


On Thu, Jan 9, 2014 at 8:41 PM, Peter Otten <__pete...@web.de> wrote:

> Amrita Kumari wrote:
>
> > On 17th Dec. I posted a question about how to arrange a datafile in a
> > particular fashion so that I have only the residue no. and chemical
> > shift value of the atom, as:
> > 1  H=nil
> > 2  H=8.8500
> > 3  H=8.7530
> > 4  H=7.9100
> > 5  H=7.4450
> > 
> > Peter replied to this mail, but since I hadn't subscribed to the
> > tutor mailing list earlier, I didn't receive the reply. I
> > apologize for my mistake. Today I checked his reply, and he asked me to
> > do a few things:
>
> I'm sorry, I'm currently lacking the patience to tune into your problem
> again, but maybe the script that I wrote (but did not post) back then is of
> help.
>
> The data sample:
>
> $ cat residues.txt
> 1 GLY HA2=3.7850 HA3=3.9130
> 2 SER H=8.8500 HA=4.3370 N=115.7570
> 3 LYS H=8.7530 HA=4.0340 HB2=1.8080 N=123.2380
> 4 LYS H=7.9100 HA=3.8620 HB2=1.7440 HG2=1.4410 N=117.9810
> 5 LYS H=7.4450 HA=4.0770 HB2=1.7650 HG2=1.4130 N=115.4790
> 6 LEU H=7.6870 HA=4.2100 HB2=1.3860 HB3=1.6050 HG=1.5130 HD11=0.7690 HD12=0.7690 HD13=0.7690 N=117.3260
> 7 PHE H=7.8190 HA=4.5540 HB2=3.1360 N=117.0800
> 8 PRO HD2=3.7450
> 9 GLN H=8.2350 HA=4.0120 HB2=2.1370 N=116.3660
> 10 ILE H=7.9790 HA=3.6970 HB=1.8800 HG21=0.8470 HG22=0.8470 HG23=0.8470 HG12=1.6010 HG13=2.1670 N=119.0300
> 11 ASN H=7.9470 HA=4.3690 HB3=2.5140 N=117.8620
> 12 PHE H=8.1910 HA=4.1920 HB2=3.1560 N=121.2640
> 13 LEU H=8.1330 HA=3.8170 HB3=1.7880 HG=1.5810 HD11=0.8620 HD12=0.8620 HD13=0.8620 N=119.1360
>
> The script:
>
> $ cat residues.py
> from __future__ import print_function  # lets print() work on Python 2 as well
>
> def process(filename):
>     residues = {}
>     with open(filename) as infile:
>         for line in infile:
>             parts = line.split()        # split line at whitespace
>             residue = int(parts.pop(0)) # convert first item to integer
>             if residue in residues:
>                 raise ValueError("duplicate residue {}".format(residue))
>             parts.pop(0)                # discard second item
>
>             # split remaining items at "=" and put them in a dict,
>             # e. g. {"HA2": 3.7, "HA3": 3.9}
>             pairs = (pair.split("=") for pair in parts)
>             lookup = {atom: float(value) for atom, value in pairs}
>
>             # put previous lookup dict in residues dict
>             # e. g. {1: {"HA2": 3.7, "HA3": 3.9}}
>             residues[residue] = lookup
>
>     return residues
>
> def show(residues):
>     # collect every atom name that occurs in any residue
>     atoms = set().union(*(r.keys() for r in residues.values()))
>     residues = sorted(residues.items())
>     for atom in sorted(atoms):
>         for residue, lookup in residues:
>             print("{} {}={}".format(residue, atom, lookup.get(atom, "nil")))
>         print()
>         print("---")
>         print()
>
> if __name__ == "__main__":
>     r = process("residues.txt")
>     show(r)
>
> Note that converting the values to float can be omitted if all you want to
> do is print them. Finally the output of the script:
>
> $ python residues.py
> 1 H=nil
> 2 H=8.85
> 3 H=8.753
> 4 H=7.91
> 5 H=7.445
> 6 H=7.687
> 7 H=7.819
> 8 H=nil
> 9 H=8.235
> 10 H=7.979
> 11 H=7.947
> 12 H=8.191
> 13 H=8.133
>
> ---
>
> 1 HA=nil
> 2 HA=4.337
> 3 HA=4.034
> 4 HA=3.862
> 5 HA=4.077
> 6 HA=4.21
> 7 HA=4.554
> 8 HA=nil
> 9 HA=4.012
> 10 HA=3.697
> 11 HA=4.369
> 12 HA=4.192
> 13 HA=3.817
>
> ---
>
> [snip]
>
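
A small illustration of the remark in the quoted message that the float conversion can be omitted when the values are only printed. This is a sketch using one line from the data sample; the names `as_float` and `as_text` are chosen here. Keeping the text also preserves the trailing zeros exactly as they appear in the file.

```python
line = "2 SER H=8.8500 HA=4.3370 N=115.7570"
parts = line.split()
residue = int(parts[0])

# Variant 1: convert every value to float, as the script does.
as_float = {atom: float(value) for atom, value in (p.split("=") for p in parts[2:])}

# Variant 2: keep the values as the text found in the file.
as_text = dict(p.split("=") for p in parts[2:])

print("{} H={}".format(residue, as_float["H"]))  # float formatting drops trailing zeros
print("{} H={}".format(residue, as_text["H"]))   # text is printed unchanged
```

Converting to float is still the better choice if the values are going to be compared numerically or used in arithmetic later.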