[Tutor] regular expression question

2009-04-28 Thread Kelie
Hello, The following code returns 'abc123abc45abc789jk'. How do I revise the pattern so that the return value will be 'abc789jk'? In other words, I want to find the pattern 'abc' that is closest to 'jk'. Here the string '123', '45' and '789' are just examples. They are actually quite different in

Re: [Tutor] Working with lines from file and printing to another keeping sequential order

2009-04-28 Thread spir
On Mon, 27 Apr 2009 23:29:13 -0400, Dan Liang wrote: > Hi Bob, Shantanoo, Kent, and tutors, > > Thank you Bob, Shantanoo, Kent for all the nice feedback. Exception > handling, the concept of states in cs, and the use of the for loop with > offset helped a lot. Here is the code I now ha

Re: [Tutor] How to run a .py file or load a module?

2009-04-28 Thread Dayo Adewunmi
David wrote: Norman Khine wrote: On Mon, Apr 27, 2009 at 12:07 AM, Sander Sweers wrote: Here is another one for fun, you run it like python countdown.py 10 #!/usr/bin/env python import sys from time import sleep times = int(sys.argv[1]) # The argument given on the command line def countdow
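The countdown script quoted above is cut off, so what follows is only a minimal sketch of the same idea: read a count from the command line with `sys.argv`, then count down. The names `parse_count` and `countdown`, the default of 3, and the `delay` parameter are assumptions made for illustration and are not taken from the thread.

```python
from time import sleep


def parse_count(argv, default=3):
    """Return the count given as argv[1], or `default` if it is missing or not an integer."""
    try:
        return int(argv[1])
    except (IndexError, ValueError):
        return default


def countdown(times, delay=1.0):
    """Print times, times-1, ..., 1, pausing `delay` seconds between numbers."""
    printed = []
    for remaining in range(times, 0, -1):
        print(remaining)
        printed.append(remaining)
        sleep(delay)
    return printed


# Stand-in for sys.argv after running `python countdown.py 3` (hypothetical invocation).
demo_argv = ["countdown.py", "3"]
result = countdown(parse_count(demo_argv), delay=0)
```

In a real script the list passed to `parse_count` would be `sys.argv`, where `sys.argv[0]` is the script name and `sys.argv[1]` is the first argument typed after it.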

Re: [Tutor] How to run a .py file or load a module?

2009-04-28 Thread Dayo Adewunmi
Denis, this mail was very comprehensive, and went a long way of driving it all home for me. There are several different concepts that are involved in this simple problem that I had, and you guys explaining them has really expanded my pythonic horizon, especially the explanations on the argv mod

Re: [Tutor] regular expression question

2009-04-28 Thread Marek Spociński
> Hello, > > The following code returns 'abc123abc45abc789jk'. How do I revise the pattern > so > that the return value will be 'abc789jk'? In other words, I want to find the > pattern 'abc' that is closest to 'jk'. Here the string '123', '45' and '789' > are > just examples. They are actually q

Re: [Tutor] regular expression question

2009-04-28 Thread Marek Spociński, Poland
On 28 April 2009 at 11:16, Andre Engels wrote: > 2009/4/28 Marek spociń...@go2.pl,Poland : > >> Hello, > >> > >> The following code returns 'abc123abc45abc789jk'. How do I revise the > >> pattern so > >> that the return value will be 'abc789jk'? In other words, I want to find > >> the > >>

Re: [Tutor] regular expression question

2009-04-28 Thread spir
On Tue, 28 Apr 2009 11:06:16 +0200, Marek spociń...@go2.pl, Poland wrote: > > Hello, > > > > The following code returns 'abc123abc45abc789jk'. How do I revise the > > pattern so that the return value will be 'abc789jk'? In other words, I > > want to find the pattern 'abc' that is clos

Re: [Tutor] regular expression question

2009-04-28 Thread Kelie
Andre Engels gmail.com> writes: > > 2009/4/28 Marek Spociński go2.pl,Poland 10g.pl>: > > I suggest using r'abc.+?jk' instead. > > > > That was my first idea too, but it does not work for this case, > because Python will still try to _start_ the match as soon as > possible. yeah, i tried t

Re: [Tutor] regular expression question

2009-04-28 Thread Kent Johnson
2009/4/28 Marek spociń...@go2.pl,Poland : >> import re >> s = 'abc123abc45abc789jk' >> p = r'abc.+jk' >> lst = re.findall(p, s) >> print lst[0] > > I suggest using r'abc.+?jk' instead. > > the additional ? makes the preceding '.+' non-greedy so instead of matching > as long a string as it can it m

Re: [Tutor] regular expression question

2009-04-28 Thread Kent Johnson
On Tue, Apr 28, 2009 at 4:03 AM, Kelie wrote: > Hello, > > The following code returns 'abc123abc45abc789jk'. How do I revise the pattern > so > that the return value will be 'abc789jk'? In other words, I want to find the > pattern 'abc' that is closest to 'jk'. Here the string '123', '45' and '78

Re: [Tutor] regular expression question

2009-04-28 Thread Andre Engels
2009/4/28 Marek spociń...@go2.pl,Poland : >> Hello, >> >> The following code returns 'abc123abc45abc789jk'. How do I revise the >> pattern so >> that the return value will be 'abc789jk'? In other words, I want to find the >> pattern 'abc' that is closest to 'jk'. Here the string '123', '45' and '7

Re: [Tutor] regular expression question

2009-04-28 Thread Kelie
spir free.fr> writes: > To avoid that, use non-grouping parens (?:...). This also avoids the need for parens around the whole format: > p = Pattern(r'abc(?:(?!abc).)+jk') > print p.findall(s) > ['abc789jk'] > > Denis This one works! Thank you Denis. I'll try it out on the actual much longer (m
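The three patterns discussed in this thread can be compared side by side. This is a small sketch using only the sample string and the patterns quoted in the messages. The variable names `greedy`, `lazy` and `nearest` are chosen here for illustration, and `print` is written as a function, which the thread's Python 2 code did not do.

```python
import re

s = 'abc123abc45abc789jk'

# Greedy: starts at the first 'abc' and extends to the last 'jk'.
greedy = re.findall(r'abc.+jk', s)

# Non-greedy: the match still *starts* at the first 'abc', so with a
# single 'jk' in the string the result is the same as the greedy one.
lazy = re.findall(r'abc.+?jk', s)

# Negative lookahead: each '.' is consumed only at a position where
# 'abc' does not start, so the match cannot span a later 'abc'.
nearest = re.findall(r'abc(?:(?!abc).)+jk', s)

print(greedy)   # ['abc123abc45abc789jk']
print(lazy)     # ['abc123abc45abc789jk']
print(nearest)  # ['abc789jk']
```

The non-grouping parentheses `(?:...)` matter with `findall`: with a capturing group it would return the group's contents instead of the whole match.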

[Tutor] finding mismatched or unpaired html tags

2009-04-28 Thread Dinesh B Vadhia
I'm processing tens of thousands of html files and a few of them contain mismatched tags and ElementTree throws the error: "Unexpected error opening J:/F2/663/blahblah.html: mismatched tag: line 124, column 8" I now want to scan each file and simply identify each mismatched or unpaired tags (b

Re: [Tutor] finding mismatched or unpaired html tags

2009-04-28 Thread A.T.Hofkamp
Dinesh B Vadhia wrote: I'm processing tens of thousands of html files and a few of them contain mismatched tags and ElementTree throws the error: "Unexpected error opening J:/F2/663/blahblah.html: mismatched tag: line 124, column 8" I now want to scan each file and simply identify each mismat

Re: [Tutor] finding mismatched or unpaired html tags

2009-04-28 Thread Martin Walsh
A.T.Hofkamp wrote: > Dinesh B Vadhia wrote: >> I'm processing tens of thousands of html files and a few of them >> contain mismatched tags and ElementTree throws the error: >> >> "Unexpected error opening J:/F2/663/blahblah.html: mismatched tag: >> line 124, column 8" >> >> I now want to scan each

Re: [Tutor] finding mismatched or unpaired html tags

2009-04-28 Thread Dinesh B Vadhia
A.T. / Marty I'd prefer that the html parser didn't replace the missing tags as I want to know where and what the problems are. Also, the source html documents were generated by another computer ie. they are not web page documents. My sense is that it is only a few files out of tens of thousa

Re: [Tutor] finding mismatched or unpaired html tags

2009-04-28 Thread Kent Johnson
On Tue, Apr 28, 2009 at 8:54 AM, Dinesh B Vadhia wrote: > I'm processing tens of thousands of html files and a few of them contain > mismatched tags and ElementTree throws the error: > > "Unexpected error opening J:/F2/663/blahblah.html: mismatched tag: line 124, > column 8" > > I now want to scan

Re: [Tutor] finding mismatched or unpaired html tags

2009-04-28 Thread Martin Walsh
Dinesh B Vadhia wrote: > A.T. / Marty > > I'd prefer that the html parser didn't replace the missing tags as I > want to know where and what the problems are. Also, the source html > documents were generated by another computer ie. they are not web page > documents. My sense is that it is only

Re: [Tutor] finding mismatched or unpaired html tags

2009-04-28 Thread Dinesh B Vadhia
This is the error and traceback: Unexpected error opening J:/F2/html: mismatched tag: line 124, column 8 Traceback (most recent call last): File "C:\py", line 492, in raw = extractText(xhtmlfile) File "C:\py", line 334, in extractText tree = make_tree(xhtmlfile) File ".

Re: [Tutor] finding mismatched or unpaired html tags

2009-04-28 Thread spir
On Tue, 28 Apr 2009 07:41:36 -0700, "Dinesh B Vadhia" wrote: > This is the error and traceback: > > Unexpected error opening J:/F2/html: mismatched tag: line 124, column 8 > > Traceback (most recent call last): > File "C:\py", line 492, in > raw = extractText(xhtmlfile)

Re: [Tutor] finding mismatched or unpaired html tags

2009-04-28 Thread Kent Johnson
On Tue, Apr 28, 2009 at 10:41 AM, Dinesh B Vadhia wrote: > This is the error and traceback: > > Unexpected error opening J:/F2/html: mismatched tag: line 124, column 8 > > Traceback (most recent call last): >   File "C:\py", line 492, in >     raw = extractText(xhtmlfile) >   File "C:\...

Re: [Tutor] finding mismatched or unpaired html tags

2009-04-28 Thread Dinesh B Vadhia
Found the mismatched tag on line 94: "My Name in Nelma Lois Thornton-S.S. No. sjn-yz-yokv/p>" should be: "My Name in Nelma Lois Thornton-S.S. No. sjn-yz-yokv" I'll run all the html files through a simple script to identify the mismatches using etree. Thanks. Dinesh From: Kent Johnson Sen
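The "simple script to identify the mismatches using etree" is not shown in the thread. A minimal sketch of one way to do it follows, assuming a current Python 3, where `xml.etree.ElementTree` raises `ParseError` with a `position` attribute (the 2009-era versions used in the thread reported the error differently). The function name `find_parse_error` and the two sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET


def find_parse_error(xhtml_text):
    """Return (line, column, message) for the first parse error, or None if the text parses."""
    try:
        ET.fromstring(xhtml_text)
    except ET.ParseError as err:
        line, column = err.position
        return line, column, str(err)
    return None


# Made-up inputs: the second one never closes its <p> element.
good = "<html><body><p>ok</p></body></html>"
bad = "<html><body><p>oops</body></html>"

print(find_parse_error(good))
print(find_parse_error(bad))
```

The parser stops at the first error, so a file with several problems has to be fixed and rescanned to find the next one. To scan many files, the same function can be called on the contents of each file in a loop.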

Re: [Tutor] finding mismatched or unpaired html tags

2009-04-28 Thread Alan Gauld
"Dinesh B Vadhia" wrote I'm processing tens of thousands of html files and a few of them contain mismatched tags and ElementTree throws the error: "Unexpected error opening J:/F2/663/blahblah.html: mismatched tag: line 124, column 8" IMHO the best way to cleanse HTML files is to use tidy. I

Re: [Tutor] finding mismatched or unpaired html tags

2009-04-28 Thread Stefan Behnel
A.T.Hofkamp wrote: > Dinesh B Vadhia wrote: >> I'm processing tens of thousands of html files and a few of them >> contain mismatched tags and ElementTree throws the error: >> >> "Unexpected error opening J:/F2/663/blahblah.html: mismatched tag: >> line 124, column 8" >> >> I now want to scan each

[Tutor] Add newline's, wrap, a long string

2009-04-28 Thread David
I am getting information from .txt files and posting them in fields on a web site. I need to break up single strings so they are around 80 characters then a new line because when I enter the info to the form on the website it has fields and it errors out with such a long string. here is a samp

Re: [Tutor] Add newline's, wrap, a long string

2009-04-28 Thread vince spicer
first, grabbing output from an external command try: import commands USE = commands.getoutput('grep USE /tmp/comprookie2000/emege_info.txt |head -n1|cut -d\\"-f2') then you can wrap strings, import textwrap Lines = textwrap.wrap(USE, 80) # return a list so in short: import commands, textwrap
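The wrapping step on its own can be sketched as follows. The `commands` module used above exists only in Python 2, so this sketch leaves out the external command and wraps a made-up string instead; `use_flags` and its contents are placeholders, not the real output of the `grep` pipeline.

```python
import textwrap

# Placeholder for the long single-line string read from the .txt file.
use_flags = " ".join("flag%d" % i for i in range(40))

# wrap() returns a list of lines, each at most 80 characters long.
lines = textwrap.wrap(use_flags, 80)

# Join with newlines to get one string suitable for a form field.
wrapped = "\n".join(lines)
print(wrapped)
```

`textwrap.fill(use_flags, 80)` does the wrap and the join in a single call.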

Re: [Tutor] finding mismatched or unpaired html tags

2009-04-28 Thread Dinesh B Vadhia
Stefan / Alan et al Thank you for all the advice and links. A simple script using etree is scanning 500K+ xhtml files and 2 files with mismatched tags have been found so far which can be fixed manually. I'll definitely look into "tidy" as it sounds pretty cool. Because, we are running data

Re: [Tutor] Add newline's, wrap, a long string

2009-04-28 Thread David
vince spicer wrote: first, grabbing output from an external command try: import commands USE = commands.getoutput('grep USE /tmp/comprookie2000/emege_info.txt |head -n1|cut -d\\"-f2') then you can wrap strings, import textwrap Lines = textwrap.wrap(USE, 80) # return a list so in short:

[Tutor] Regular Expressions instances

2009-04-28 Thread Emilio Casbas
Hi, following the example from http://docs.python.org/3.0/howto/regex.html If I execute the following code on the python shell (3.1a1): >>> import re >>> p = re.compile('ab*') >>> p I get the msg: <_sre.SRE_Pattern object at 0x013A3440> instead of the msg from the example: Why I get an SRE_

Re: [Tutor] Regular Expressions instances

2009-04-28 Thread Emile van Sebille
Emilio Casbas wrote: Hi, following the example from http://docs.python.org/3.0/howto/regex.html ...from version 3.0 docs... If I execute the following code on the python shell (3.1a1): import re p = re.compile('ab*') p I get the msg: <_sre.SRE_Pattern object at 0x013A3440> ... is the

Re: [Tutor] Regular Expressions instances

2009-04-28 Thread Emile van Sebille
Emile van Sebille wrote: Emilio Casbas wrote: Hi, following the example from http://docs.python.org/3.0/howto/regex.html ...from version 3.0 docs... If I execute the following code on the python shell (3.1a1): import re p = re.compile('ab*') p I get the msg: <_sre.SRE_Pattern object at
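The replies in this thread are cut off here, so the following is only a hedged illustration rather than the answer given on the list. The text the interpreter prints for a compiled pattern differs between Python versions, but the object behaves the same way whatever it prints. The sample string `'abbbc'` is made up.

```python
import re

p = re.compile('ab*')

# The displayed form of p varies by Python version; its type and
# behaviour are what matter.
print(type(p))

# The source pattern is kept on the compiled object.
print(p.pattern)

# 'ab*' matches an 'a' followed by zero or more 'b' characters.
m = p.match('abbbc')
print(m.group())
```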

Re: [Tutor] finding mismatched or unpaired html tags

2009-04-28 Thread Lie Ryan
Dinesh B Vadhia wrote: A.T. / Marty I'd prefer that the html parser didn't replace the missing tags as I want to know where and what the problems are. Also, the source html documents were generated by another computer ie. they are not web page documents. If the source document was gener

Re: [Tutor] Add newline's, wrap, a long string

2009-04-28 Thread Martin Walsh
David wrote: > I am getting information from .txt files and posting them in fields on a > web site. I need to break up single strings so they are around 80 > characters then a new line because when I enter the info to the form on > the website it has fields and it errors out with such a long string

Re: [Tutor] Add newline's, wrap, a long string

2009-04-28 Thread Martin Walsh
David wrote: > vince spicer wrote: >> first, grabbing output from an external command try: >> >> import commands >> >> USE = commands.getoutput('grep USE /tmp/comprookie2000/emege_info.txt >> |head -n1|cut -d\\"-f2') >> >> then you can wrap strings, >> >> import textwrap >> >> Lines = textwrap.wr

[Tutor] Can not run under python 2.6?

2009-04-28 Thread Jianchun Zhou
Hi there, I am new to Python and have run into a problem. I have an application named canola, written for Python 2.5, which runs normally under Python 2.5. Under Python 2.6, however, it fails with: Traceback (most recent call last): File "/usr/lib/python2.6/site-packages/