[Tutor] Regular Expression help
Hi

I have a text file that I would like to split up so that I can use it in Excel to filter on a certain field. However, as it is a flat text file, I need to do some processing on it so that Excel can import it correctly.

File Example:

tag desc VR VM
(0012,0042) Clinical Trial Subject Reading ID LO 1
(0012,0050) Clinical Trial Time Point ID LO 1
(0012,0051) Clinical Trial Time Point Description ST 1
(0012,0060) Clinical Trial Coordinating Center Name LO 1
(0018,0010) Contrast/Bolus Agent LO 1
(0018,0012) Contrast/Bolus Agent Sequence SQ 1
(0018,0014) Contrast/Bolus Administration Route Sequence SQ 1
(0018,0015) Body Part Examined CS 1

What I essentially want is to use Python to process this file to give me

(0012,0042); Clinical Trial Subject Reading ID; LO; 1
(0012,0050); Clinical Trial Time Point ID; LO; 1
(0012,0051); Clinical Trial Time Point Description; ST; 1
(0012,0060); Clinical Trial Coordinating Center Name; LO; 1
(0018,0010); Contrast/Bolus Agent; LO; 1
(0018,0012); Contrast/Bolus Agent Sequence; SQ; 1
(0018,0014); Contrast/Bolus Administration Route Sequence; SQ; 1
(0018,0015); Body Part Examined; CS; 1

so that I can import it into Excel using a delimiter.

This file is extremely long and all I essentially want to do is break it into its 'fields'.

Now I suspect that regular expressions are the way to go, but I have only basic experience of using these and I have no idea what I should be doing.

Can anyone help?

Thanks

___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Regular Expression help
I think I have a solution.

File

(0012,0042) Clinical Trial Subject Reading ID LO 1
(0012,0050) Clinical Trial Time Point ID LO 1
(0012,0051) Clinical Trial Time Point Description ST 1
(0012,0060) Clinical Trial Coordinating Center Name LO 1
(0018,0010) Contrast/Bolus Agent LO 1
(0018,0012) Contrast/Bolus Agent Sequence SQ 1
(0018,0014) Contrast/Bolus Administration Route Sequence SQ 1
(0018,0015) Body Part Examined CS 1

Script

#!/usr/bin/python
import re

# matchstr regex flow
# (\(\d+,\d+\))  # (0018,0014)
# \s             # [space]
# (..*)          # Contrast/Bolus Administration Route Sequence
# \s             # [space]
# ([a-z]{2})     # SQ - two letters and no more
# \s             # [space]
# (\d)           # 1 - single digit
# re.I           # case insensitive
matchstr = re.compile(r"(\(\d+,\d+\))\s(..*)\s([a-z]{2})\s(\d)", re.I)

myfile = open('/tmp/file', 'r')
for line in myfile.readlines():
    regex_match = matchstr.match(line)
    if regex_match:
        print(regex_match.group(1) + ";" + regex_match.group(2) + ";" + regex_match.group(3) + ";" + regex_match.group(4))

Output

(0012,0042);Clinical Trial Subject Reading ID;LO;1
(0012,0050);Clinical Trial Time Point ID;LO;1
(0012,0051);Clinical Trial Time Point Description;ST;1
(0012,0060);Clinical Trial Coordinating Center Name;LO;1
(0018,0010);Contrast/Bolus Agent;LO;1
(0018,0012);Contrast/Bolus Agent Sequence;SQ;1
(0018,0014);Contrast/Bolus Administration Route Sequence;SQ;1
(0018,0015);Body Part Examined;CS;1
Re: [Tutor] Beginner Game: Rock, Paper, Scissors
Cairo does do it to files, but -- that's from your source. That's for if you draw something with Python and then want to save it to an svg. It doesn't take something that's already an svg and convert it or display it. PIL will let me take advantage of all kinds of graphical formats, but not any that are vector, to my knowledge. I want to take advantage of the vector, if possible :D!

On 6/26/07, Luke Paireepinart <[EMAIL PROTECTED]> wrote:
> Johnny Jelinek wrote:
> > how would I go about rendering an .svg to another file before showing
> > it on screen? Could you point me to some resources or examples? Thanks!
>
> You just mentioned that Cairo renders svgs to a file, didn't you?
> So can you just use pyCairo?
> Google would tell you of other python modules that can render svg files.
> I don't have any resources for doing this, I just assumed since you
> mentioned Cairo could do it that you knew how to use Cairo to do it.
> -Luke
Re: [Tutor] Regular Expression help
Gardner, Dean wrote:
> Now I suspect that regular expressions are the way to go but I have only
> basic experience of using these and I have no idea what I should be doing.

This seems to work:

data = '''\
(0012,0042) Clinical Trial Subject Reading ID LO 1
(0012,0050) Clinical Trial Time Point ID LO 1
(0012,0051) Clinical Trial Time Point Description ST 1
(0012,0060) Clinical Trial Coordinating Center Name LO 1
(0018,0010) Contrast/Bolus Agent LO 1
(0018,0012) Contrast/Bolus Agent Sequence SQ 1
(0018,0014) Contrast/Bolus Administration Route Sequence SQ 1
(0018,0015) Body Part Examined CS 1'''.splitlines()

import re
fieldsRe = re.compile(r'^(\(\d+,\d+\)) (.*?) (\w+) (\d+)$')

for line in data:
    match = fieldsRe.match(line)
    if match:
        print(';'.join(match.group(1, 2, 3, 4)))

I don't think you want the space after the ; that you put in your example; Excel wants a single-character delimiter.

Kent
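For anyone wanting to reuse the pattern on lines read from a real file, here is a minimal sketch in Python 3 syntax. The function name split_fields and the module-level name FIELDS_RE are made up for illustration and do not come from the thread; the regular expression itself is the one given above. Lines that do not fit the pattern, such as the "tag desc VR VM" header, simply yield None.

```python
import re

# Same pattern as in the message above: tag, description, VR, VM.
FIELDS_RE = re.compile(r'^(\(\d+,\d+\)) (.*?) (\w+) (\d+)$')


def split_fields(line):
    """Return (tag, desc, vr, vm) for a matching line, or None otherwise.

    split_fields is a hypothetical helper name, not part of any library.
    """
    match = FIELDS_RE.match(line.strip())
    if match is None:
        return None
    return match.groups()


if __name__ == '__main__':
    sample = [
        'tag desc VR VM',
        '(0018,0014) Contrast/Bolus Administration Route Sequence SQ 1',
    ]
    for line in sample:
        fields = split_fields(line)
        if fields:
            print(';'.join(fields))
```

The description group is non-greedy, so it is the trailing (\w+) (\d+)$ part that decides where the description ends; a VM value that is not purely digits would make the whole line fail to match.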
Re: [Tutor] Regular Expression help
Argh... My e-mail program really messed up the threads. I didn't notice that there were already multiple replies to this message. Doh!

Mike
Re: [Tutor] Regular Expression help
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Gardner, Dean
> Sent: Wednesday, June 27, 2007 3:59 AM
> To: tutor@python.org
> Subject: [Tutor] Regular Expression help
>
> Now I suspect that regular expressions are the way to go but
> I have only basic experience of using these and I have no
> idea what I should be doing.
>
> Can anyone help.
>
> Thanks
>

H... You might be able to do this without the need for regular expressions. You can split the row on spaces, which will give you a list. Then you can reconstruct the row, inserting your delimiter as needed and joining the rest with spaces again.

In [63]: row = "(0012,0042) Clinical Trial Subject Reading ID LO 1"

In [64]: row_items = row.split(' ')

In [65]: row_items
Out[65]: ['(0012,0042)', 'Clinical', 'Trial', 'Subject', 'Reading', 'ID', 'LO', '1']

In [66]: tag = row_items.pop(0)

In [67]: tag
Out[67]: '(0012,0042)'

In [68]: vm = row_items.pop()

In [69]: vm
Out[69]: '1'

In [70]: vr = row_items.pop()

In [71]: vr
Out[71]: 'LO'

In [72]: desc = ' '.join(row_items)

In [73]: new_row = "%s; %s; %s; %s" %(tag, desc, vr, vm, )

In [74]: new_row
Out[74]: '(0012,0042); Clinical Trial Subject Reading ID; LO; 1'

Someone might think of a better way with them thar fancy lambdas and list comprehensions thingys, but I think this will work.

Mike
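The same split-and-rejoin idea can be written more compactly with split and rsplit and a maximum split count. This is only a sketch in Python 3 syntax; split_row is a hypothetical name, and it assumes every row has a tag, at least one word of description, a VR and a VM separated by single spaces.

```python
def split_row(row):
    """Split 'tag desc... VR VM' into four fields without a regex.

    split_row is an illustrative name, not taken from the thread.
    """
    # The tag is everything before the first space.
    tag, rest = row.strip().split(' ', 1)
    # VR and VM are the last two space-separated items; the
    # description is whatever remains in between.
    desc, vr, vm = rest.rsplit(' ', 2)
    return tag, desc, vr, vm


row = "(0012,0042) Clinical Trial Subject Reading ID LO 1"
print(';'.join(split_row(row)))
```

A row with fewer than four items raises ValueError when unpacking, so a very short or blank line would need to be skipped or caught by the caller.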
Re: [Tutor] Regular Expression help
On Jun 27, 2007, at 10:24 AM, Mike Hansen wrote:
> H... You might be able to do this without the need for regular
> expressions. You can split the row on spaces which will give you a
> list. Then you can reconstruct the row inserting your delimiter as
> needed and joining the rest with spaces again.
>
> Someone might think of a better way with them thar fancy lambdas and
> list comprehensions thingys, but I think this will work.

I sent this to Dean this morning:

Dean,

I would do something like this (if your pattern is always the same.)

foo = ['(0012,0042) Clinical Trial Subject Reading ID LO 1 ',
       '(0012,0050) Clinical Trial Time Point ID LO 1 ',
       '(0012,0051) Clinical Trial Time Point Description ST 1 ',
       '(0012,0060) Clinical Trial Coordinating Center Name LO 1 ',
       '(0018,0010) Contrast/Bolus Agent LO 1 ',
       '(0018,0012) Contrast/Bolus Agent Sequence SQ 1 ',
       '(0018,0014) Contrast/Bolus Administration Route Sequence SQ 1 ',
       '(0018,0015) Body Part Examined CS 1',]

import csv
writer = csv.writer(open('/Users/reed/tmp/foo.csv', 'w'), delimiter=';')
for lin in foo:
    lin = lin.split()
    row = (lin[0], ' '.join(lin[1:-2]), lin[-2], lin[-1])
    writer.writerow(row)

more foo.csv
(0012,0042);Clinical Trial Subject Reading ID;LO;1
(0012,0050);Clinical Trial Time Point ID;LO;1
(0012,0051);Clinical Trial Time Point Description;ST;1
(0012,0060);Clinical Trial Coordinating Center Name;LO;1
(0018,0010);Contrast/Bolus Agent;LO;1
(0018,0012);Contrast/Bolus Agent Sequence;SQ;1
(0018,0014);Contrast/Bolus Administration Route Sequence;SQ;1
(0018,0015);Body Part Examined;CS;1

HTH,
~reed
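A Python 3 flavoured sketch of the same csv approach follows. The function name write_rows is an assumption made for illustration, and the example writes into an in-memory io.StringIO buffer so that it runs without touching the disk.

```python
import csv
import io


def write_rows(lines, out):
    """Write each 'tag desc VR VM' line to out as a ';'-delimited row.

    write_rows is a hypothetical helper name, not from the thread.
    """
    writer = csv.writer(out, delimiter=';')
    for line in lines:
        parts = line.split()
        if len(parts) < 4:
            continue  # too short to hold all four fields
        # First item is the tag, the last two are VR and VM,
        # and everything in between is the description.
        writer.writerow((parts[0], ' '.join(parts[1:-2]), parts[-2], parts[-1]))


lines = [
    '(0012,0042) Clinical Trial Subject Reading ID LO 1 ',
    '(0018,0015) Body Part Examined CS 1',
]
buffer = io.StringIO()
write_rows(lines, buffer)
print(buffer.getvalue())
```

To write to a real file in Python 3, the csv documentation recommends opening it with newline='', for example open(path, 'w', newline=''), and a with block makes sure the file is closed afterwards.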
[Tutor] Removing a file from a tar
I was reading over the documentation for the tarfile module and it occurred to me that there didn't seem to be a way to remove an individual file from the tar.

For example, suppose I did this:

import tarfile
tar = tarfile.open("sample.tar", "w")
tar.add("unwanted")
tar.add("wanted")
tar.close()

At this point, how could I come back and remove "unwanted" from the tar?
Re: [Tutor] Regular Expression help
Tom Tucker wrote:
> #matchstr regex flow
> # (\(\d+,\d+\)) # (0018,0014)
> # \s # [space]
> # (..*)# Contrast/Bolus Administration Route Sequence
> # \s # space
> # ([a-z]{2}) # SQ - two letters and no more
> # \s # [space]
> # (\d)# 1 - single digit
> # re.I) # case insensitive
>
> matchstr = re.compile(r"(\(\d+,\d+\))\s(..*)\s([a-z]{2})\s(\d)",re.I)

You should learn about re.VERBOSE:
http://docs.python.org/lib/node46.html#l2h-414

With this flag, your commented version could be the actual regex, instead of repeating it in code with the whitespace and comments removed:

matchstr = re.compile(r'''
    (\(\d+,\d+\))   # (0018,0014)
    \s              # [space]
    (..*)           # Contrast/Bolus Administration Route Sequence
    \s              # space
    ([a-z]{2})      # SQ - two letters and no more
    \s              # [space]
    (\d)            # 1 - single digit
    ''', re.I|re.VERBOSE)

Kent
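As a quick sanity check, the compact and the verbose spellings can be compiled side by side and compared on one of the sample lines. This is a small sketch in Python 3 syntax; the variable names compact and verbose are illustrative only.

```python
import re

# Tom's original one-line pattern.
compact = re.compile(r"(\(\d+,\d+\))\s(..*)\s([a-z]{2})\s(\d)", re.I)

# The same pattern spread out with re.VERBOSE. Whitespace and
# "# ..." comments inside the pattern are ignored, which is why
# the spaces between fields have to be written as \s.
verbose = re.compile(r'''
    (\(\d+,\d+\))   # tag, such as (0018,0014)
    \s              # single whitespace character
    (..*)           # description (greedy)
    \s
    ([a-z]{2})      # VR: exactly two letters
    \s
    (\d)            # VM: a single digit
    ''', re.I | re.VERBOSE)

line = "(0018,0014) Contrast/Bolus Administration Route Sequence SQ 1"
print(compact.match(line).groups() == verbose.match(line).groups())
```

Note that a flag name such as re.I must stay outside the pattern string; inside it, the text would be treated as part of the regular expression.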
Re: [Tutor] Regular Expression help
Yikes! Sorry about all the duplicate postings. Thunderbird was telling me the send failed so I kept retrying; I guess it was actually sending! Kent ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Removing a file from a tar
* Adam A. Zajac <[EMAIL PROTECTED]> [2007-06-27 11:26]:
> I was reading over the documentation for the tarfile module and it
> occurred to me that there didn't seem to be a way to remove an
> individual file from the tar.
>
> For example, suppose I did this:
>
> import tarfile
> tar = tarfile.open("sample.tar", "w")
> tar.add("unwanted")
> tar.add("wanted")
> tar.close()
>
> At this point, how could I come back and remove "unwanted" from the tar?

Well, it looks like tar's --delete is not supported yet; you would probably have to reopen the tarfile and write it to a new one file-by-file, excluding the ones you don't want. Messy :-(

--
David Rock
[EMAIL PROTECTED]
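The copy-and-exclude approach described above might look roughly like this in Python 3. It is a sketch, not a tested recipe for every archive: copy_tar_without is a made-up name, and the code assumes the archive holds only regular files and directories (links and other special members may need extra handling).

```python
import tarfile


def copy_tar_without(src_path, dst_path, unwanted):
    """Copy every member of src_path into dst_path except names in unwanted.

    copy_tar_without is a hypothetical helper, not part of tarfile.
    """
    with tarfile.open(src_path, 'r') as src, tarfile.open(dst_path, 'w') as dst:
        for member in src.getmembers():
            if member.name in unwanted:
                continue
            # extractfile() returns None for members with no data
            # (directories, for example); addfile() accepts that.
            dst.addfile(member, src.extractfile(member))
```

For the example in the question, a call such as copy_tar_without("sample.tar", "sample_clean.tar", {"unwanted"}) would produce a second archive containing only "wanted"; the new file could then be renamed over the old one.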
Re: [Tutor] Finding all locations of a sequence
Firstly, I'd like to thank everyone for their help. I ended up throwing something together using dictionaries (because I understood those best out of what I had), which was a lot faster than my initial attempt, but I have run into a different problem that I was hoping for help with. So, what I have is all the subsequences that I was looking for in separate entries in the dictionary, with the places where each of them is found as the value. If a subsequence binds to more than one other item, I want to have the locations of the items all together.

The closest I've been able to manage to get to what I want is this:

dict_of_bond_location = {}
dict1 = {'AAA':['UUU'], 'AAU':['UUG', 'UUA'], 'AAC':['UUG'], 'AAG':['UUC', 'UUU'], 'CCC':['GGG']}
dict2 = {'AAA':[1], 'AAU':[2], 'AAC':[3], 'AAG':[0, 4], 'GGG':[10]}
dict3 = {'UUU':[3, 5], 'UUG':[0], 'UUA':[1], 'UUC':[2], 'GGG':[14]}

for key in dict2:
    if key in dict1:
        matching_subseq = dict1.get(key)
        for item in matching_subseq:
            if item in dict3:
                location = dict3.get(item)
                dict_of_bond_location.setdefault(key, []).append(location)
print(dict_of_bond_location)

which gives this:
{'AAU': [[0], [1]], 'AAG': [[2], [3, 5]], 'AAA': [[3, 5]], 'AAC': [[0]]}

but what I want is
'AAU':[0, 1], 'AAG':[2, 3, 5], 'AAA':[3, 5], 'AAC':[0]

The setdefault(key, []).append(location) thing sort of does what I want, but I don't want the result to be a list of lists... just one big list. The production of a new dictionary is not necessary, but it made sense to me a few hours ago. Anyway, is there a fast and dirty way to add lists together, if the lists are not named (I think that's essentially what I want?)
Thanks again,
Lauren
Re: [Tutor] Finding all locations of a sequence
Lauren wrote:
> Firstly, I'd like to thank everyone for their help. I ended up
> throwing something together using dictionaries (because I understood
> those best out of what I had), that was a lot faster than my initial
> attempt, but have run into a different problem, that I was hoping for
> help with. So, what I have is all the subsequences that I was looking
> for in separate entries in the dictionary, and where each of them is
> found as the value. If a subsequence binds to more than one other
> item, I want to have the locations of the items all together.
> The closest I've been able to manage to get to what I want is this:
>
> dict_of_bond_location = {}
> dict1 = {'AAA':['UUU'], 'AAU':['UUG', 'UUA'], 'AAC':['UUG'],
>          'AAG':['UUC', 'UUU'], 'CCC':['GGG']}
> dict2 = {'AAA':[1], 'AAU':[2], 'AAC':[3], 'AAG':[0, 4], 'GGG':[10]}
> dict3 = {'UUU':[3, 5], 'UUG':[0], 'UUA':[1], 'UUC':[2], 'GGG':[14]}
>
> for key in dict2:
>     if key in dict1:
>         matching_subseq = dict1.get(key)
>         for item in matching_subseq:
>             if item in dict3:
>                 location = dict3.get(item)
>                 dict_of_bond_location.setdefault(key, []).append(location)
> print dict_of_bond_location
>
> which gives this:
> {'AAU': [[0], [1]], 'AAG': [[2], [3, 5]], 'AAA': [[3, 5]], 'AAC': [[0]]}
>
> but what I want is
> 'AAU':[0, 1], 'AAG':[2, 3, 5], 'AAA':[3, 5], 'AAC':[0]
>
> the setdefault(key, []).append(location) thing sort of does what I
> want, but I don't want the result to be a list of lists...just one big
> list. The production of a new dictionary is not necessary, but it made
> sense to me a few hours ago. Anyway, is there a fast and dirty way to
> add lists together, if the lists are not named (I think that's
> essentially what I want?)

Lauren,
Try this:

>>> x = ['a']
>>> y = x.extend(['b','c'])
>>> x
['a', 'b', 'c']

But note that

>>> print y
None

because the extend method changes the list in place and doesn't return anything. So call extend() where you currently call append(), and don't assign its result.

HTH,
-Luke
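Applied to the loop from Lauren's post, Luke's suggestion might look like the sketch below. It is written for Python 3 (so print is a function) and reuses the dictionaries from the original message; using dict1.get(key, []) in place of the separate "if key in dict1" test is an editorial simplification, not part of either post.

```python
# Data copied from the original post.
dict1 = {'AAA': ['UUU'], 'AAU': ['UUG', 'UUA'], 'AAC': ['UUG'],
         'AAG': ['UUC', 'UUU'], 'CCC': ['GGG']}
dict2 = {'AAA': [1], 'AAU': [2], 'AAC': [3], 'AAG': [0, 4], 'GGG': [10]}
dict3 = {'UUU': [3, 5], 'UUG': [0], 'UUA': [1], 'UUC': [2], 'GGG': [14]}

dict_of_bond_location = {}
for key in dict2:
    # Keys missing from dict1 (here 'GGG') simply yield no partners.
    for item in dict1.get(key, []):
        if item in dict3:
            # extend() adds each location to the existing list,
            # where append() would add the whole list as one element.
            dict_of_bond_location.setdefault(key, []).extend(dict3[item])

print(dict_of_bond_location)
```

With the sample data this gives one flat list per key: [3, 5] for 'AAA', [0, 1] for 'AAU', [0] for 'AAC' and [2, 3, 5] for 'AAG', which is the result asked for in the question.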
Re: [Tutor] python port scanner
Thanks everyone for your help, I am starting on the project :)

On 6/25/07, János Juhász <[EMAIL PROTECTED]> wrote:

Dear dos,

>>hello i am looking into writing a simple python port scanner but i cant find
>>any good tutorials online if anyone can help or knows of any tutorials that
>>could help it would be great. this would be my first program like this so i
>>might need a little extra help

I recommend taking a look at Twisted:
http://www.oreilly.com/catalog/twistedadn/
It is a nice book. Among the examples in chapter 2 there is one on how to do this with Twisted. You can read it on Safari.

Janos
Re: [Tutor] Regular Expression help
Gardner, Dean wrote:
> Hi
>
> I have a text file that I would like to split up so that I can use it in
> Excel to filter a certain field. However as it is a flat text file I
> need to do some processing on it so that Excel can correctly import it.
>
> File Example:
> tag desc VR VM
> (0012,0042) Clinical Trial Subject Reading ID LO 1
> (0012,0050) Clinical Trial Time Point ID LO 1
> (0012,0051) Clinical Trial Time Point Description ST 1
> (0012,0060) Clinical Trial Coordinating Center Name LO 1
> (0018,0010) Contrast/Bolus Agent LO 1
> (0018,0012) Contrast/Bolus Agent Sequence SQ 1
> (0018,0014) Contrast/Bolus Administration Route Sequence SQ 1
> (0018,0015) Body Part Examined CS 1
>
> What I essentially want is to use python to process this file to give me
>
> (0012,0042); Clinical Trial Subject Reading ID; LO; 1
> (0012,0050); Clinical Trial Time Point ID; LO; 1
> (0012,0051); Clinical Trial Time Point Description; ST; 1
> (0012,0060); Clinical Trial Coordinating Center Name; LO; 1
> (0018,0010); Contrast/Bolus Agent; LO; 1
> (0018,0012); Contrast/Bolus Agent Sequence; SQ ;1
> (0018,0014); Contrast/Bolus Administration Route Sequence; SQ; 1
> (0018,0015); Body Part Examined; CS; 1
>
> so that I can import to excel using a delimiter.
>
> This file is extremely long and all I essentially want to do is to break
> it into its 'fields'
>
> Now I suspect that regular expressions are the way to go but I have only
> basic experience of using these and I have no idea what I should be doing.

This seems to work:

data = '''\
(0012,0042) Clinical Trial Subject Reading ID LO 1
(0012,0050) Clinical Trial Time Point ID LO 1
(0012,0051) Clinical Trial Time Point Description ST 1
(0012,0060) Clinical Trial Coordinating Center Name LO 1
(0018,0010) Contrast/Bolus Agent LO 1
(0018,0012) Contrast/Bolus Agent Sequence SQ 1
(0018,0014) Contrast/Bolus Administration Route Sequence SQ 1
(0018,0015) Body Part Examined CS 1'''.splitlines()

import re
fieldsRe = re.compile(r'^(\(\d+,\d+\)) (.*?) (\w+) (\d+)$')

for line in data:
    match = fieldsRe.match(line)
    if match:
        print ';'.join(match.group(1, 2, 3, 4))

I don't think you want the space after the ; that you put in your example; Excel wants a single-character delimiter.

Kent
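
For a long file it may be more convenient to let the csv module do the joining and to count the lines that did not match, so nothing is dropped silently. The following is only a sketch in Python 3, not Kent's code: the function name convert and the sample list are made up for illustration, and the looser pattern rests on two assumptions that the original post does not confirm, namely that tags may contain hexadecimal digits and that columns may be separated by runs of spaces or tabs.

```python
import csv
import io
import re

# Assumptions (not confirmed by the original post): tags may contain
# hex digits, and columns may be separated by any run of whitespace.
FIELDS_RE = re.compile(
    r'^(\([0-9A-Fa-f]+,[0-9A-Fa-f]+\))\s+(.*?)\s+(\w+)\s+(\S+)$'
)

def convert(lines, out):
    """Write matching lines to `out` as ';'-delimited rows.

    Returns the number of lines that did not match the pattern.
    """
    writer = csv.writer(out, delimiter=';', lineterminator='\n')
    skipped = 0
    for line in lines:
        match = FIELDS_RE.match(line.strip())
        if match:
            writer.writerow(match.groups())
        else:
            skipped += 1  # e.g. the header line or a blank line
    return skipped

# Two lines from the example file, with irregular spacing added.
sample = [
    'tag desc VR VM',
    '(0012,0042) Clinical Trial Subject Reading ID LO 1',
    '(0018,0012)\tContrast/Bolus Agent Sequence   SQ  1',
]

buffer = io.StringIO()
skipped = convert(sample, buffer)
print(buffer.getvalue(), end='')
print('skipped lines:', skipped)
```

To process a real file, open the input and output files and pass them in place of sample and buffer; opening the output with newline='' is what the csv documentation recommends. The writer also quotes any description that happens to contain a semicolon, which a plain ';'.join would not do.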