[Tutor] Exception Handling and Stack traces
Hello,

I can't work out how to suppress stacktrace printing when exceptions
are thrown. I want the thrown exception to pass a message on the
console, just like Java does when I catch an exception and print
e.getMessage().

I tried some of the examples of controlling traceback through the
traceback module. I have to admit, this is one of the cases where I
wonder why it has to be made so bloody complicated. The other day, I
read an article by Guido in which he was proposing to remove 'map' and
'filter' from the language because he thought the developer shouldn't
have to think too much about using them. But the exception handling is
so non-intuitive that the time saved from not having to think about
using 'map' is spent instead on trying to figure out how to promote a
message to the commandline when a method fails.

It seems that exceptions automatically are thrown up, so even if I put
a 'try/except' clause in a method, when that method fails, the
exception will still be thrown up to the top of the call chain. So,
there seems to be no reason whatever to catch an exception in any
method except the top caller (i.e., 'main').

Example:

    def getData(auth) :
        # create opener with auth headers here
        try :
            data = opener.open(url)
        except urllib2.URLError("Badly formed URL") :
            formatted_lines = traceback.format_exc().splitlines()
            print formatted_lines[0]
            print formatted_lines[-1]
            sys.exit(1)
        return data

This method is called by another method, printOutput(), which
processes the return data (XML from web service). That method is
called in main:

    [printOutput(w) for w in weeks]

All I want to see is that if the exception is thrown in getData(), the
message is printed to stdout and the script exits. Instead, I get the
stacktrace printed back down from main, and I don't get the exception
handled in getData (i.e., the error message and exit).

Now, I'm sure somebody is going to explain to me why it's so
unreasonable to think that I ought to be able to get an error message
from e.getMessage() and a stacktrace from e.printStackTrace() and that
I ought to be able to choose between the two. Because, it seems that
python is determined that I should have a stacktrace whether I want
one or not.

Sorry, I'm more than a little annoyed because even the example of
using the traceback module from the python docs does not provide the
handling that it is supposed to; and I really think this level of
complexity for handling exceptions cleanly is just unwarranted. I need
to be able to give this script to someone who will want to be able to
read the error output without having to be a Python programmer
experienced in reading stack traces, e.g., a "Badly formed URL" message
that tells them they set up the parameters for connecting to the web
service incorrectly.

Hopefully, you can get past the rant and help me solve this problem.

Thanks.

mp

--
Michael Powe    mich...@trollope.org    Naugatuck CT USA
"The secret to strong security: less reliance on secrets."
-- Whitfield Diffie

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor

Re: [Tutor] Exception Handling and Stack traces
On Fri, Sep 10, 2010 at 03:56:51PM +0200, Peter Otten wrote:
> Michael Powe wrote:
>
> > I can't work out how to suppress stacktrace printing when exceptions
> > are thrown.
>
> [snip rant]
>
> It might have been a good idea to read a tutorial like
>
> http://docs.python.org/tutorial/errors.html#handling-exceptions
>
> or ask before you got annoyed enough to write that rant ;)

Hello,

Thanks for the reply. Stupid me, I just read a half dozen articles on
the web about python exception handling, including some at
docs.python. At no point is the 'as' clause discussed as being
required.

Note that in section 8.3 of that article, the statement is made that
if the exception matches the exception type in the following format,
the statements within the except clause are executed.

    except URLError :
        # do something

That, in fact, seems to me to be incorrect. It is not my experience
(e.g., print statements are not executed in the example I gave and the
sys.exit() is not called).

I'll follow up on your suggestions. I appreciate the help.

Thanks.

mp

> To catch an exception you have to put the class into the except clause, not
> an instance. Basic example, using 2.6 syntax:
>
> WRONG:
>
> >>> try:
> ...     1/0
> ... except ZeroDivisionError("whatever"):
> ...     print "caught"
> ...
> Traceback (most recent call last):
>   File "<stdin>", line 2, in <module>
> ZeroDivisionError: integer division or modulo by zero
>
> CORRECT:
>
> >>> try:
> ...     1/0
> ... except ZeroDivisionError as e:
> ...     print "caught", e
> ...
> caught integer division or modulo by zero
>
> Peter

--
Michael Powe    mich...@trollope.org    Naugatuck CT USA

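
Peter's point (name the exception class in the except clause, never an instance) can be sketched as follows. This is a minimal illustration in Python 3 syntax rather than the 2.6 syntax used in the thread, and safe_divide and instance_in_except are made-up names. One difference worth knowing: where Python 2.6 silently fails to match an instance, Python 3 raises a TypeError when it tries to match one.

```python
def safe_divide(a, b):
    """Return a / b, or None after printing a one-line message."""
    try:
        return a / b
    except ZeroDivisionError as e:      # the class itself, no parentheses
        print("caught:", e)
        return None


def instance_in_except():
    """Show what happens when an instance is used in the except clause."""
    try:
        try:
            1 / 0
        except ZeroDivisionError("whatever"):   # WRONG: this is an instance
            return "caught"
    except TypeError as e:
        # Python 3 refuses to match against a non-class and raises TypeError.
        return "TypeError: %s" % e


print(safe_divide(6, 3))
print(safe_divide(1, 0))
print(instance_in_except())
```

The 'as e' part is optional; it is only needed when the handler wants to look at the exception object, for example to print its message.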
Re: [Tutor] Exception Handling and Stack traces
On Fri, Sep 10, 2010 at 03:50:57PM +0200, Evert Rol wrote:
> This is a bit of a guess, but as far as I know, you can catch exceptions like
> that.
> Try:
>
> try:
>     data = opener.open(url)
> except urllib2.URLError as msg:
>     print msg
>     sys.exit(1)
>
> If you're using an older version of Python, you'll need:
>
> try:
>     data = opener.open(url)
> except urllib2.URLError, msg:
>     print msg
>     sys.exit(1)
>
> In your example, you are *creating* an exception (but doing nothing
> with it; I have no idea what happens if you have a line like "except
> ". Perhaps it tries to compare one on one with
> that instance, but if it compares by id, that will not work). In
> this way, you're not catching the exception. So, it will pass
> your except clause, and just do what it always does: print the whole
> exception's traceback. Which is probably what you're seeing.

Hello,

Thanks for the reply. As I indicated in the other message I just
wrote, the format I used is one I took straight from the
documentation. Of course, there may be assumptions in the documented
examples that I am not aware of.

It looks like you and Peter have pulled me out of the ditch and for
that I am grateful.

Thanks.

mp

--
Michael Powe    mich...@trollope.org    Naugatuck CT USA

Re: [Tutor] Exception Handling and Stack traces
On Fri, Sep 10, 2010 at 04:42:35PM +0200, Peter Otten wrote:
> Michael Powe wrote:
>
> > Note that in section 8.3 of that article, the statement is made that
> > if the exception matches the exception type in the following
> > format, the statements within the except clause are executed.
> >
> > except URLError :
> >     # do something
> >
> > That in fact, seems to me to be incorrect. It is not my experience
> > (e.g., print statements are not executed in the example I gave and the
> > sys.exit() is not called).
>
> Sorry, the as-clause is *not* necessary. The relevant difference between the
> correct and the wrong approach is that you must not instantiate the
> exception:
>
> WRONG:
>
> >>> try:
> ...     1/0
> ... except ZeroDivisionError("whatever"):
> ...     print "caught"
> ...
> Traceback (most recent call last):
>   File "<stdin>", line 2, in <module>
> ZeroDivisionError: integer division or modulo by zero
>
> CORRECT:
>
> >>> try:
> ...     1/0
> ... except ZeroDivisionError:
> ...     print "caught"
> ...
> caught
>
> I just put in the as-clause to show an easy way to print the exception. I
> did not anticipate that it would obscure the message.

Hello,

No problem, I am working on getting this sorted out. The
documentation seems to be written as a reminder for people who already
know how this stuff works, rather than as a clear explanation for
anybody working with it.

Eventually, I arrived at a workable conclusion by wrapping only the
call in main and using your suggested 'as' clause. This successfully
suppresses the traceback and gives a useable error message. Although,
in the case of URLError, 'getaddrinfo failed' may not actually mean
much to the end user, it'll have to do.

I don't like the fact that I cannot locate my thrown exception at the
point of throwing -- i.e., I don't necessarily mind catching the
exception in main but I would like to be able to print out exactly
where the exception occurred. This is more useful when
troubleshooting. However, an entire stacktrace is unacceptably
verbose.

Thanks.

mp

--
Michael Powe    mich...@trollope.org    Naugatuck CT USA

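
One possible middle ground between a bare message and a full stacktrace is to report only the innermost frame of the traceback. The sketch below is in Python 3 and uses the standard traceback module. The names describe_exception, get_data, and main are hypothetical, and get_data is a stand-in that fails on purpose rather than the thread's real getData().

```python
import traceback


def describe_exception(exc):
    """Return 'Type: message (raised in function at file:line)'."""
    frames = traceback.extract_tb(exc.__traceback__)
    if frames:
        last = frames[-1]               # innermost frame: where it was raised
        where = "raised in %s at %s:%d" % (last.name, last.filename, last.lineno)
    else:
        where = "location unknown"
    return "%s: %s (%s)" % (type(exc).__name__, exc, where)


def get_data(url):
    # Hypothetical stand-in for getData(); always fails for the demo.
    raise ValueError("Badly formed URL: %r" % url)


def main(url):
    """Catch at the top level only, and print a single line."""
    try:
        get_data(url)
    except ValueError as e:
        print(describe_exception(e))
        return 1
    return 0


status = main("htp:/example")   # a real script would pass this to sys.exit()
```

The exception is still caught only in main, but the printed line names the function, file, and line where it was raised.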
[Tutor] Trapping HTTP Authentication Failure
Hello,

My script to call a web service authenticates. I would like to be
able to trap an exception if the authentication fails. The script
loops over a list of dates and I don't want it to retry for every
element in the list. This could take a long time and be very annoying
when, after the long wait, a stacktrace spews out of the last attempt.
The assumption is that if it fails, it is not a transient network or
some other issue but that the credentials themselves are faulty.

Now, the authentication header is sent with the initial request, so it
does not look to me like the standard process of request, get a 401
and then re-request with credentials is relevant. However, clearly
the opener issues a number of retries after the initial failure.

But, I don't see a mechanism in urllib2 for taking notice of a failure
and acting on it.

Can somebody point me toward a solution?

Thanks.

mp

--
Michael Powe    mich...@trollope.org    Naugatuck CT USA

Re: [Tutor] Trapping HTTP Authentication Failure
On Sat, Sep 11, 2010 at 01:28:26AM +0200, Evert Rol wrote:
> > My script to call a web service authenticates.
>
> Sorry, but where is the (full) script? I missed an attachment or (preferably)
> a link.

Hello,

Sorry, the verb of the sentence is "authenticates," as in, "My script
... authenticates."

But I can show the authentication portion.

8<-- start >8

# Creates an authentication object with the credentials for a given URL
def createPasswordManager(headers) :
    passwordManager = urllib2.HTTPPasswordMgrWithDefaultRealm()
    passwordManager.add_password(None,overview_url,headers[0],headers[1])
    return passwordManager

# Creates an authentication handler for the authentication object created above
def createAuthenticationHandler(passwordManager) :
    authenticationHandler = urllib2.HTTPBasicAuthHandler(passwordManager)
    return authenticationHandler

# Creates an opener that sets the credentials in the Request
def createOpener(authHandler) :
    return urllib2.build_opener(authHandler)

# Retrieves the data
def getData(authHeaders) :
    opener = createOpener(createAuthenticationHandler(createPasswordManager(authHeaders)))
    data = opener.open(overview_url)
    return data

8<--- end >8

So, to restate the question, how can I trap an exception in the cases
in which authentication fails?

Right now, the whole script is complete and working (thanks for your
help with my other exception-handling question). Except for the case
of bad credentials. The use case is that the user misspells a
username or password or puts in wrong account information. Then, I
don't want them to sit for 10 minutes while the script makes 30 data
connections, retries and fails each time.

Thanks.

mp

--
Michael Powe    mich...@trollope.org    Naugatuck CT USA

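
A rough sketch of the behaviour asked for here, assuming the opener does eventually raise an HTTPError with code 401: use a plain for loop instead of a list comprehension and give up at the first rejected request. It is written in Python 3, where urllib2's exceptions live in urllib.error. AuthenticationFailed, fetch_all, fake_fetch, and counting_fetch are hypothetical names, and fake_fetch simulates the 401 so that nothing touches the network.

```python
import io
import urllib.error
from email.message import Message


class AuthenticationFailed(Exception):
    """Hypothetical exception: the service rejected the credentials."""


def fetch_all(weeks, fetch):
    """Call fetch(week) for each week, aborting on the first HTTP 401."""
    results = []
    for week in weeks:          # a plain loop, so one failure stops the rest
        try:
            results.append(fetch(week))
        except urllib.error.HTTPError as e:
            if e.code == 401:
                raise AuthenticationFailed(
                    "credentials rejected for week %s; giving up" % week) from e
            raise               # other HTTP errors are left to the caller
    return results


def fake_fetch(week):
    # Stand-in for getData(): always answers 401 without a real request.
    raise urllib.error.HTTPError(
        "http://example.invalid/%s" % week, 401, "Unauthorized",
        Message(), io.BytesIO(b""))


attempts = []


def counting_fetch(week):
    attempts.append(week)       # record how many requests were attempted
    return fake_fetch(week)


try:
    fetch_all(range(1, 37), counting_fetch)
except AuthenticationFailed as e:
    print(e)
print("requests made:", len(attempts))
```

With 36 weeks queued, only one request is attempted before the loop stops, which is the point of the exercise.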
Re: [Tutor] Trapping HTTP Authentication Failure
On Sat, Sep 11, 2010 at 01:09:31PM +0200, Evert Rol wrote:
> >>> My script to call a web service authenticates.
> >
> >> Sorry, but where is the (full) script? I missed an attachment or
> >> (preferably) a link.
> >
> > Hello,
> >
> > Sorry, the verb of the sentence is "authenticates," as in, "My script
> > ... authenticates."
>
> Sorry, misread that.
> Although code does help :-).

> I assume the rest of the code is just the loop around the items you
> want to fetch, calling getData each time with a new URL.

Yes, it's looping over a list. Specifically, at startup the script
does a date detection and creates a list of identifiers for each week
of the year up to the point of invocation. So, if it is week 36, I
get a list of 36 weeks.

Then, the data retrieval and processing takes place like this:

    [processData(w) for w in weeks]

Where the processData function calls getData() and passes in the
current week to process.

[ code snipped ]

> > So, to restate the question, how can I trap an exception in the cases
> > in which authentication fails?

[ snip ]

> I'm not sure what you're exactly doing here, or what you're getting,
> but I did get curious and dug around urllib2.py. Apparently, there is
> a hardcoded 5 retries before the authentication really fails. So any
> stack trace would be the normal stack trace times 5. Not the 30 you
> mentioned, but annoying enough anyway (I don't see how it would fail
> for every element in the loop though. Once it raises an exception,
> the program basically ends).

It never throws an exception. Or, if it does, something about the way
I'm calling suppresses it. IOW, I can put in a bogus credential and
start the script and sit here for 5 minutes and see nothing. Then ^C
and I get a huge stacktrace that shows the repeated calls. After the
timeout on one element in the list, it goes to the next element, times
out, goes to the next.

> I don't know why it's hard-coded that way, and not just an option
> with a default of 5, but that's currently how it is (maybe someone
> else on this list knows?).

I don't know, but even if I could set it to 1, I'm not helped unless
there's a way for me to make it throw an exception and exit the loop.

> If that's what you're finding, perhaps the quickest way is to
> subclass urllib2.HTTPBasicAuthHandler, and override the
> http_error_auth_reqed method (essentially keeping it exactly the
> same apart from the hard-coded 5).

Now there's a challenge! ;-)

Thanks.

mp

--
Michael Powe    mich...@trollope.org    Naugatuck CT USA

Re: [Tutor] Trapping HTTP Authentication Failure
On Sat, Sep 11, 2010 at 02:25:24PM +0200, Evert Rol wrote:
> > It never throws an exception. Or, if it does, something about the way
> > I'm calling suppresses it. IOW, I can put in a bogus credential and
> > start the script and sit here for 5 minutes and see nothing. Then ^C
> > and I get a huge stacktrace that shows the repeated calls. After the
> > timeout on one element in the list, it goes to the next element, times
> > out, goes to the next.
>
> Ok, now I had to try and recreate something myself. So my processData is:
>
> def processData(f):
>     global overview_url
>     overview_url = baseurl + f
>     getData(authHeaders)
>
> (f being a filename, out of a list of many). Other code same as yours.
>
> It definitely throws a 401 exception after 5 retries. No time-outs,
> no long waits. In fact, a time-out would to me indicate another
> problem (it still should throw an exception, though). So, unless
> you're catching the exception in processData somehow, I don't see
> where things could go wrong.
>
> I assume you have no problem with correct credentials or simply
> using a webbrowser?

Hello,

Yes, I can retrieve data without any problem. I can break the URL and
generate a 404 exception that is trapped and I can break it in other
ways that generate other types of exceptions. And trap them.

I went back and looked at the code in urllib2.py and I see the
timeout counter and that it raises an HTTPError after 5 tries. But I
don't get anything back. If I just let the code run to completion, I
get sent back to the prompt. I put a try/except in the method and I
already have one on the call in main.

> >> I don't know why it's hard-coded that way, and not just an option
> >> with a default of 5, but that's currently how it is (maybe someone
> >> else on this list knows?).
> >
> > I don't know, but even if I could set it to 1, I'm not helped unless
> > there's a way for me to make it throw an exception and exit the loop.

Actually, there's a comment in the code about why it is set to 5 --
it's arbitrary, and allows for the Password Manager to prompt for
credentials while not letting the request be reissued until 'recursion
depth is exceeded.'

I guess I'll have to go back to ground zero and write a stub to
generate the error and then build back up to where it disappears.

Thanks.

mp

--
Michael Powe    mich...@trollope.org    Naugatuck CT USA

Re: [Tutor] Trapping HTTP Authentication Failure
On Sat, Sep 11, 2010 at 10:48:13AM -0400, Michael Powe wrote:
> On Sat, Sep 11, 2010 at 02:25:24PM +0200, Evert Rol wrote:
> > > It never throws an exception. Or, if it does, something about the way
> > > I'm calling suppresses it. IOW, I can put in a bogus credential and
> > > start the script and sit here for 5 minutes and see nothing. Then ^C
> > > and I get a huge stacktrace that shows the repeated calls. After the
> > > timeout on one element in the list, it goes to the next element, times
> > > out, goes to the next.

Hello,

More experimentation revealed that one problem was testing the script
in Idle. Idle does something to suppress the script failure for that
particular case (IOW, it correctly returns HTTPError for things like
'404' and URLError for things like a bad domain name).

When I run the script from the command line (cmd), it actually ignores
the '5' retry limit, seemingly. I added another catch block:

    except Exception as e:
        print "exception: ",e

That prints out "exception: maximum recursion depth exceeded."

I wonder if there is something hinky in Windows that is causing this
to happen.

Thanks.

mp

[ remainder of quoted message snipped ]

--
Michael Powe    mich...@trollope.org    Naugatuck CT USA

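
The extra catch-all block described above works best as the last in a chain ordered from most to least specific, because HTTPError is a subclass of URLError and both are ordinary Exceptions. The following Python 3 sketch shows that ordering. The names classify, raise_401, raise_bad_host, and recurse_forever are hypothetical, and the failures are simulated rather than produced by real requests.

```python
import io
import urllib.error
from email.message import Message


def classify(action):
    """Run action() and report which handler caught its failure."""
    try:
        action()
    except urllib.error.HTTPError as e:     # most specific first
        return "HTTP error %d" % e.code
    except urllib.error.URLError as e:      # parent class of HTTPError
        return "URL error: %s" % e.reason
    except Exception as e:                  # catch-all, e.g. recursion limit
        return "unexpected %s: %s" % (type(e).__name__, e)
    return "ok"


def raise_401():
    raise urllib.error.HTTPError(
        "http://example.invalid/", 401, "Unauthorized",
        Message(), io.BytesIO(b""))


def raise_bad_host():
    raise urllib.error.URLError("getaddrinfo failed")


def recurse_forever():
    return recurse_forever()    # exhausts the interpreter's recursion limit


for action in (raise_401, raise_bad_host, recurse_forever):
    print(classify(action))
```

If the URLError clause came first, it would also swallow every HTTPError, and the status code would never be inspected.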
[Tutor] SOLVED: Re: Trapping HTTP Authentication Failure
Hello,

It is bloody Winblows. The script works as designed and traps the 401
exception on my slackware box ... something in the implementation of
urllib2 on Windoze is broken. This has to be a known issue. I just
did not see it reported anywhere.

Thanks.

mp

On Sat, Sep 11, 2010 at 02:16:07PM -0400, Michael Powe wrote:
> More experimentation revealed that one problem was testing the script
> in Idle. Idle does something to suppress the script failure for that
> particular case (IOW, it correctly returns HTTPError for things like
> '404' and URLError for things like a bad domain name).
>
> When I run the script from the command line (cmd), it actually ignores
> the '5' retry limit, seemingly. I added another catch block:
>
> except Exception as e:
>     print "exception: ",e
>
> That prints out "exception: maximum recursion depth exceeded."
>
> I wonder if there is something hinky in Windows that is causing this
> to happen.

[ remainder of quoted thread snipped ]

[Tutor] Documenting a Module
Hello,

Are there any tools for documenting a module other than Sphinx?

Apparently, I need a full-blown dev box with Visual Studio installed
to get Sphinx up, due to the dependency on Jinja, which comes
source-only and requires VC.

I wrote a module, I'd like to produce a decent document of its
functionality from the comments and doc strings; and I already wasted
a considerable part of my Sunday afternoon trying to get along with
Sphinx. I'm not talking about a huge Python project, nor am I likely
to need that type of documentation tool in the near future.

Thanks.

mp

--
Michael Powe    mich...@trollope.org    Naugatuck CT USA

Re: [Tutor] wierd replace problem
Hello,

In your script, you need to escape each backslash, in order to have it
come out correctly to the interpreter. IOW, in the shell, '\\' is
what is being processed. But in your script, you want to send '\\' to
the shell, and in order to do that, you have to escape each backslash,
or ''.

In a script, '\\' means 'send through a backslash' for shell
processing. The left-most backslash is escaping the one to its right.

Thanks.

mp

On Mon, Sep 13, 2010 at 12:19:23PM +, Roelof Wobben wrote:
>
> Hello,
>
> I have this string called test with the contents of 'het is een wonder \\TIS'
>
> Now I want to get rid of the \\ so I do this : test2 = test.replace ('\\', '')
> And I get at the python prompt this answer : 'het is een wonder TIS'
> So that's right.
>
> Now I try the same in a IDE with this programm :
>
> woorden =[]
> letter_counts = {}
> file = open ('alice_in_wonderland.txt', 'r')
> for line in file:
>     line2 = line.replace ("\\","")
>     line3 = line2.lower()
>     woorden = line3.split()
>     for letter in woorden:
>         letter_counts[letter] = letter_counts.get (letter, 0) + 1
> letter_items = letter_counts.items()
> letter_items.sort()
> print letter_items
>
> But now Im gettting this output :
>
> [('"\'tis', 1),
>
> Why does the \ stays here. It should have gone as the test in the python
> prompt says.
>
> Roelof

--
Michael Powe    mich...@trollope.org    Naugatuck CT USA

Re: [Tutor] Documenting a Module
On Mon, Sep 13, 2010 at 12:03:27PM +0200, r...@schoenian-online.de wrote:
>
> Hi Michael,
>
> I can recommend epydoc. You can find it here:
> http://epydoc.sourceforge.net/ It's a nice tool and you should have no
> problems with the installation.
>
> Ralf

Thank you, much appreciated.

mp

--
Michael Powe    mich...@trollope.org    Naugatuck CT USA

Re: [Tutor] wierd replace problem
On Mon, Sep 13, 2010 at 12:19:23PM +, Roelof Wobben wrote:

> Hello,
>
> I have this string called test with the contents of 'het is een wonder \\TIS'
>
> Now I want to get rid of the \\ so I do this : test2 = test.replace ('\\', '')
> And I get at the python prompt this answer : 'het is een wonder TIS'
> So that's right.
>
> Now I try the same in an IDE with this program :
>
> woorden =[]
> letter_counts = {}
> file = open ('alice_in_wonderland.txt', 'r')
> for line in file:
>     line2 = line.replace ("\\","")
>     line3 = line2.lower()
>     woorden = line3.split()
>     for letter in woorden:
>         letter_counts[letter] = letter_counts.get (letter, 0) + 1
> letter_items = letter_counts.items()
> letter_items.sort()
> print letter_items
>
> But now I'm getting this output :
>
> [('"\'tis', 1),
>
> Why does the \ stay here? It should have gone, as the test in the python
> prompt says.

Hello,

Actually, on closer look I can see the answer.

The original text must look something like this:

    \\"'tis the season to be jolly," said santa.

When you run your process against a string in that format, you get the output shown: ('"\'tis', 1). The appearance of the backslash is fortuitous. It has nothing to do with string.replace(); it's there to escape the single quote, which appears in the middle of a single-quoted string. If the double quote had not also been there, python would have replaced the outer quotes with double quotes, as it did on my system before I got to thinking about that lonely double quote.

Thanks.

mp

--
Michael Powe    mich...@trollope.org    Naugatuck CT USA
"I wrote what I did because as a woman, as a mother, I was oppressed and brokenhearted with the sorrows and injustice I saw, because as a Christian I felt the dishonor to Christianity, -- because as a lover of my country, I trembled at the coming day of wrath." -- H.B. Stowe
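The explanation above can be checked with a few lines. This is only a sketch, written for Python 3 (the thread itself uses Python 2), and the sample word is taken from the output quoted in the post rather than from the real input file.

```python
# The word as it is stored: a double quote, an apostrophe, then "tis".
word = "\"'tis"

# Items inside a printed list are shown with repr(). Because the word holds
# both kinds of quote, repr() picks single quotes and escapes the apostrophe.
shown = repr(word)
print(shown)   # the backslash shows up here ...
print(word)    # ... but it is not part of the string itself

# Without the leading double quote, repr() just switches to double quotes.
plain = repr("'tis")
print(plain)
```

So the backslash belongs to the display form of the string, not to the data that replace() was working on.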
Re: [Tutor] Documenting a Module
On Mon, Sep 13, 2010 at 12:03:27PM +0200, r...@schoenian-online.de wrote:

> Hi Michael,
> I can recommend epydoc. You can find it here:
> http://epydoc.sourceforge.net/
> It's a nice tool and you should have no problems with the installation.
> Ralf

Hello,

I just want to follow up that epydoc is completely awesome and exactly the right tool for the job. I was able to get my doc generated and wrapped up nicely in less time than I spent trying to get Sphinx installed yesterday.

Thanks.

mp

--
Michael Powe    mich...@trollope.org    Naugatuck CT USA
I hate a fellow whom pride, or cowardice, or laziness drives into a corner, and who does nothing when he is there but sit and <growl>; let him come out as I do, and <bark>. -- Samuel Johnson
[Tutor] A Plea for Sensible Quoting
Hello,

I read this list in a linux console using mutt. I dunno, maybe I'm the only idiot left still using a console mail client. I write my emails and email replies in emacs, too, which may make me even weirder.

Few elements make a mail harder to read than 130 lines of nested quoting, into which have been inserted 3 or 5 lines of comment. Mutt does a good job of identifying the quote nesting and highlighting the quoted material; and I can use the `T' command to remove the quoting. But sometimes, "context" in the sense of what was being replied to is useful. When a thread response is quoting 3 other respondents plus the OP, "context" turns into "what everybody is saying," which is more like standing in a crowded train terminal, where everybody is shouting, than having a quiet, private conversation among 4 people.

Then, I have to page down four screens to find the 5 lines of new comment, and try to figure out what portion of the three-level-nested dialog that precedes it was the trigger for the response.

Just a plea to remember to take the time to `C-k' or `dd' or whatever is required to get that extraneous material out of the mail. A little formatting goes a long way.

Thanks.

mp

--
Michael Powe    mich...@trollope.org    Naugatuck CT USA
Fun experiments: Get a can of shaving cream, throw it in a freezer for about a week. Then take it out, peel the metal off and put it where you want... bedroom, car, etc. As it thaws, it expands an unbelievable amount.
Re: [Tutor] re.findall parentheses problem
On Tue, Sep 14, 2010 at 01:09:21PM -0400, Michael Scharf wrote:

> Thank you. I should have figured "groups" were the paren groups. I see it
> clearly now. And your solution will work for the larger thing I'm trying to
> do --- thanks.

> And yes: I know this matches some non-date-like dates, but the data is such
> that it should work out ok.

Hello,

I second the advice to use named groups. This option makes it sooo much easier to retrieve your captures; especially if you have nested groups. No more working out if the capture you want is in group 1 or group 3. Just matcher.group('January').

Thanks.

mp

--
Michael Powe    mich...@trollope.org    Naugatuck CT USA
"We had pierced the veneer of outside things. We had `suffered, starved, and triumphed, groveled down yet grasped at glory, grown bigger in the bigness of the whole.' We had seen God in his splendors, heard the text that Nature renders. We had reached the naked soul of man." -- Sir Ernest Shackleton,
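A minimal sketch of the named-group advice, assuming Python 3. The date pattern, the group names (month, day, year) and the sample text are invented for illustration and are not the original poster's data.

```python
import re

# Hypothetical date pattern; the group names are illustrative only.
date_re = re.compile(r"(?P<month>\d{1,2})/(?P<day>\d{1,2})/(?P<year>\d{4})")

text = "Invoices dated 9/14/2010 and 10/02/2010 are overdue."

# finditer() yields match objects, so each capture can be fetched by name
# instead of by counting parentheses.
dates = [(m.group("year"), m.group("month"), m.group("day"))
         for m in date_re.finditer(text)]
print(dates)

# findall() still returns plain tuples, ordered by group position.
positional = date_re.findall(text)
print(positional)
```

Note that findall() ignores the names, so finditer() is the call to reach for when the names matter.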
[Tutor] Regex comments
Hello,

The re module includes the option to comment a regular expression with the syntax (?#comment). e.g.

    p=r'(.*) (?P<parameter>WT.dl)(?#parameter)=(?P<value>[^&]+)(?#value).*'

Is there a mechanism for extracting these values from the match, in the way that group names are extracted?

I don't see one.

The point would be that in my processing of the match, I could implement the comments as identifiers for the matched value.

Thanks.

mp

--
Michael Powe    mich...@trollope.org    Naugatuck CT USA
"The most likely way for the world to be destroyed, most experts agree, is by accident. That's where we come in. We're computer professionals. We cause accidents." -- Nathaniel Borenstein, inventor of MIME
Re: [Tutor] Regex comments
On Thu, Sep 16, 2010 at 01:31:09PM +0200, Peter Otten wrote:

> Michael Powe wrote:
> > The re module includes the option to comment a regular expression with
> > the syntax (?#comment). e.g.
> > Is there a mechanism for extracting these values from the match, in
> > the way that group names are extracted?
> > I don't see one.
> You could write a regular expression to extract them ;)

;-)

> > The point would be that in my processing of the match, I could
> > implement the comments as identifiers for the matched value.
>
> But that's what the names are for, e. g.:
>
> >>> re.compile(r'(.*) (?P<parameter>WT.dl)=(?P<value>[^&]+).*').search(
> " WTxdl=yadda&ignored").groupdict()
> {'parameter': 'WTxdl', 'value': 'yadda'}

That's right, I forgot about the dictionary. Thanks!

mp

--
Michael Powe    mich...@trollope.org    Naugatuck CT USA
"No provision in our Constitution ought to be dearer to man than that which protects the rights of conscience against the enterprises of the civil authority." -- Thomas Jefferson to New London Methodists, 1809. ME 16:332
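The exchange above can be tried out as follows. This is a sketch for Python 3; the group names parameter and value are the ones that appear in the groupdict() result quoted in the reply, and the sample string is the one used there.

```python
import re

# Named groups plus (?#...) comments, in the shape discussed in the thread.
pattern = re.compile(
    r"(.*) (?P<parameter>WT.dl)(?#parameter)=(?P<value>[^&]+)(?#value).*"
)

match = pattern.search(" WTxdl=yadda&ignored")

# The comments leave no trace on the match; only the group names survive.
captured = match.groupdict()
print(captured)

# groupindex maps each group name to its group number.
numbers = dict(pattern.groupindex)
print(numbers)
```

The comments are discarded when the pattern is compiled, which is why the names are the practical way to label a capture.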
[Tutor] "Overloading" methods
Hello,

Strictly speaking, this isn't overloading a method in the way we do it in Java. But similar. Maybe.

I am writing a module for processing web server log files and one of the methods I provide is to extract a given query parameter and its value. Because there are several types of log files with different line structures, I had the thought to write methods with descriptive names that simply returned a generic method that processed the method arguments. e.g.,

    def setpattern_iis(self,pattern,parameter) :
        type='iis'
        return pattern_generator(self,type,pattern,parameter)

In this case, creating a regular expression to parse the log lines for a query parameter. This is just a bit more "self documenting" than using the generic method with the 'type' argument and requiring the user to enter the type. At the same time, it allows me to put all the parsing code in one method.

My question is, is this a bad thing to do in python?

Thanks.

mp

--
Michael Powe    mich...@trollope.org    Naugatuck CT USA
War is a sociological safety valve that cleverly diverts popular hatred for the ruling classes into a happy occasion to mutilate or kill foreign enemies. - Ernest Becker
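One way the wrapper idea could look, as a rough sketch for Python 3. It uses plain functions rather than methods, and everything inside pattern_generator (the per-type prefixes, the sample log line, the setpattern_apache name) is invented for illustration; only the names setpattern_iis and pattern_generator come from the post.

```python
import re
from functools import partial


def pattern_generator(log_type, parameter):
    """Build a regex that captures `parameter` and its value.

    The per-type prefixes are made up; real log formats would need
    their own expressions.
    """
    prefixes = {"iis": r"^\S+ \S+ .*?", "apache": r'^\S+ .*?"'}
    if log_type not in prefixes:
        raise ValueError("unknown log type: %r" % log_type)
    return re.compile(
        prefixes[log_type]
        + r"[?&](?P<parameter>" + re.escape(parameter)
        + r')=(?P<value>[^&\s"]+)'
    )


# Thin, descriptively named wrappers around the one generic function.
setpattern_iis = partial(pattern_generator, "iis")
setpattern_apache = partial(pattern_generator, "apache")

line = "2010-09-16 12:00:01 GET /page.asp?WT.dl=0&id=7 200"
match = setpattern_iis("WT.dl").search(line)
print(match.group("parameter"), match.group("value"))
```

functools.partial is one option among several; an explicit def per log type, as in the post, does the same job and leaves room for a docstring on each wrapper.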
[Tutor] Comparing two lists
Hello,

I have two lists.

    alist = ['label', 'guid']

    blist = ['column0label', 'column1label', 'dimension0guid',
             'description', 'columnid']

I want to iterate over blist and extract the items that match my substrings in alist; alternatively, throw out the items that aren't in alist (but, I've had bad experiences removing items from lists "in place," so I tend toward the "copy" motif.)

In real life, blist column entries could have embedded column numbers from 0 to 19.

I can do this with excruciatingly painful 'for' loops. Looking for something more efficient and elegant.

Thanks.

mp

--
Michael Powe    mich...@trollope.org    Naugatuck CT USA
"The secret to strong security: less reliance on secrets." -- Whitfield Diffie
Re: [Tutor] Comparing two lists
On Thu, Sep 16, 2010 at 12:59:08PM -0600, Vince Spicer wrote:

> On Thu, Sep 16, 2010 at 12:49 PM, Vince Spicer wrote:
> > On Thu, Sep 16, 2010 at 12:27 PM, Michael Powe wrote:
> >> alist = ['label', 'guid']
> >> blist = ['column0label', 'column1label', 'dimension0guid',
> >> 'description', 'columnid']
> >>
> >> I want to iterate over blist and extract the items that match my
> >> substrings in alist; alternatively, throw out the items that aren't in
> >> alist (but, I've had bad experiences removing items from lists "in
> >> place," so I tend toward the "copy" motif.)
> > One solution is to use list comprehensions.
> > newlist = [x for x in blist if [a for a in alist if a in x]]
> > This works, although there may be more efficient ways to accomplish this
> One major speed up is to make a simple filter that returns as soon as a match
> is found instead of completing the loop over every element in alist
>
> def filter_(x, against):
>     for a in against:
>         if a in x:
>             return True
>     return False
>
> newlist = [x for x in blist if filter_(x, alist)]

Hello,

Very cool, thanks.

I've used list comprehensions before but I just couldn't get the structure right this time, for some reason.

Thanks.

mp

--
Michael Powe    mich...@trollope.org    Naugatuck CT USA
"No provision in our Constitution ought to be dearer to man than that which protects the rights of conscience against the enterprises of the civil authority." -- Thomas Jefferson to New London Methodists, 1809. ME 16:332
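For comparison, the same early-exit filter can be written with the built-in any(), which was not mentioned in the thread. A short sketch using the two lists from the original question:

```python
alist = ['label', 'guid']
blist = ['column0label', 'column1label', 'dimension0guid',
         'description', 'columnid']

# any() stops at the first substring that matches, like filter_() above.
newlist = [x for x in blist if any(a in x for a in alist)]
print(newlist)
```

The original blist is left untouched, which fits the "copy" preference stated in the question.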
Re: [Tutor] Comparing two lists
On Thu, Sep 16, 2010 at 12:59:08PM -0600, Vince Spicer wrote:

> On Thu, Sep 16, 2010 at 12:49 PM, Vince Spicer wrote:
> > On Thu, Sep 16, 2010 at 12:27 PM, Michael Powe wrote:
> >> I have two lists.
> >> alist = ['label', 'guid']
> >>
> >> blist = ['column0label', 'column1label', 'dimension0guid',
> >> 'description', 'columnid']
> >> I want to iterate over blist and extract the items that match my
> >> substrings in alist; alternatively, throw out the items that aren't in
> >> alist (but, I've had bad experiences removing items from lists "in
> >> place," so I tend toward the "copy" motif.)
> One major speed up is to make a simple filter that returns as soon as
> a match is found instead of completing the loop over every element in
> alist
>
> def filter_(x, against):
>     for a in against:
>         if a in x:
>             return True
>     return False

Hello,

Totally awesome. I actually have a dictionary, with the key being an ini file header and the value being one of these lists of ini settings. With your method, I am able to loop through the dictionary and expunge the unwanted settings.

I knew there had to be a way to take advantage of the fact that the 'i in s' object test acts like a substring test for strings.

Thanks.

mp

--
Michael Powe    mich...@trollope.org    Naugatuck CT USA
'Unless we approve your idea, it will not be permitted, it will not be allowed.' -- Hilary Rosen, President, RIAA
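The dictionary case described above might look roughly like this. It is a sketch for Python 3 with made-up section headers and setting names; the real ini data is not shown in the thread.

```python
# Hypothetical ini data: section header -> list of setting names.
settings = {
    'report': ['column0label', 'column1label', 'description'],
    'dimensions': ['dimension0guid', 'columnid'],
}
wanted = ['label', 'guid']

# Build a filtered copy instead of deleting entries in place.
filtered = {
    header: [name for name in names if any(w in name for w in wanted)]
    for header, names in settings.items()
}
print(filtered)
```

Because a new dictionary is built, there is no risk of changing a list while looping over it.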
Re: [Tutor] Best Python Editor
On Sun, Jun 14, 2009 at 12:53:19PM -0700, Emile van Sebille wrote:

> On 6/14/2009 8:04 AM Alan Gauld said...
> >"Tom Green" wrote
> >>Since VIM seems to be the editor of choice and I have been programming in
> >>Python for many years using Pyscripter and Eclipse I was wondering how I
> >>could transition away from the IDE world to VIM.
> >With great difficulty and to be honest I wouldn't bother.
> >If you are used to eclipse then there is little to be gained from moving
> >to vim.
> Seconded. Editors are a personal choice and often the loss of
> productivity while learning a new environment isn't worth it. My
> introduction to coding Turbo Pascal was eased owing to the WordStar
> compatible/modeled editor it used. Nowadays, any editor with
> configurable syntax capability is sufficient. It's all personal
> preference after that.

It's good to see so much common sense prevailing on this topic. An IDE such as eclipse or VS really only becomes a necessity for productivity when (a) you are dealing with multiple code files and proper compilation and linking and so forth becomes complicated; or (b) you are working in a language with which you are not as familiar as you should like to be, and autocompletion becomes a real asset.

However, I will say that while following this thread, it occurred to me that the one feature that VS and even the VBA editor in MS Office has, is the ability to pop you into the debugger on error. This feature is so useful that it surprises me nobody else seems to do it. Most often, simply the ability to jump to the error line is provided and I suppose that must be generally acceptable.

Thanks.

mp
Re: [Tutor] Best Python Editor
On Mon, Jun 15, 2009 at 06:30:50AM -0700, johnf wrote:

> On Sunday 14 June 2009 07:31:53 pm Michael Powe wrote:
> > However, I will say that while following this thread, it occurred to
> > me that the one feature that VS and even the VBA editor in MS Office
> > has, is the ability to pop you into the debugger on error. This
> > feature is so useful that it surprises me nobody else seems to do it.
> > Most often, simply the ability to jump to the error line is provided
> > and I suppose that must be generally acceptable.
> Wing does. When error occurs it stops on the line and the programmer is
> working in the debugger.

Hello,

I'll have to look at that.

I have a kind of collection of editors -- the way I collect books, I guess. TextPad, vbsEdit, UltraEdit, SciTE, XmlCopyEditor, EditPlus, emacs. I never do anything with vi except munge conf files. For actual "projects" I use VS and NetBeans. When I get on a "back to basics" kick, I re-enter emacs. It used to be a joke about emacs not being an editor but an operating system. There is nothing on the linux side that even comes close, IMO.

I don't like GUI-based stuff, though, so right off, any editor built on the assumption that I'm a mouse-oriented user is right out.

Thanks.

mp
Re: [Tutor] Best Python Editor
On Mon, Jun 15, 2009 at 06:34:04AM -0700, Emile van Sebille wrote:

> On 6/15/2009 2:49 AM Tom Green said...
> >Yes, vim or any text editor is suitable for Python, but I
> >prefer having a nice GUI interface while coding. I mean the automobile
> >replaced the horse and buggy, while they both get you to your
> >destination I would still rather travel in a car.
> Anyone know of any studies comparing text based vs GUI IDE based code
> development? As I recall, programming productivity is measured in
> LOC/day and last time I noticed it seemed to be a very small number.
> I'm wondering if there might be documented benefits to migrating from my
> horse and buggy. :)

Are you in a hurry to get somewhere? ;-)

I recently worked on a module for a large existing Java application. The module I wrote had to be plugged in to the existing code base. So of course, I had to have all kinds of tie-ins to existing libraries and classes. First, I couldn't run the full application, so I had to rely on unit testing to verify my functionality. Second, I had to connect to hundreds of classes inside the application. I'm not that smart -- I could not have done it without NetBeans, which has fantastic introspection and can tell me most of the ways I'm violating protocol while I'm working.

I stubbed out a lot of stuff and prototyped in jEdit. But when it was game on, I had to go to NB. It probably comes down to, How much stuff can you carry in your head?

Thanks.

mp
[Tutor] here documents
Hello,

In perl, I create variables of fairly involved text using here documents. For example,

    $msg = <<"EOF";
    a bunch of text here.
    ...
    EOF

Is there an equivalent method in python? I usually use this method when creating help messages for scripts -- put all the text into a variable and then 'print $msg' for the output. I find it an easy way to produce formatted text.

Now, I'm trying to switch over to python and want to recreate or adapt my processes.

Thanks.

mp
Re: [Tutor] here documents
On Mon, Jan 03, 2005 at 11:54:06PM -, Alan Gauld wrote:

> There was a detailed thread on this recently either here
> or on usenet group comp.lang.python...

I checked the archives for this list but didn't see anything. I'll try the ng. Thanks.

> The bottom line was to use string formatting and triple
> quoted strings...
>
> msg = '''
> A very long string that overspills
> onto multiple lines and includes
> my name which is %{name}s
> and a number which is my age: %{age}d
> '''
>
> print msg % vars()

This is great, thanks to you and to Bill. Much appreciated.

mp
Re: [Tutor] Re: here documents
On Mon, Jan 03, 2005 at 10:04:18PM -0200, Jorge Luiz Godoy Filho wrote:

> Alan Gauld, Monday 03 January 2005 21:56, wrote:
>
> > Oops, those should have been () not {}
>
> I always make the same mistake ;-) Using "{}" seems more intuitive to me.

perhaps because of ${var} shell syntax? ;-)

mp
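With the correction above applied, the here-document replacement can be sketched like this for Python 3. The sample values are invented, and an explicit dictionary is used instead of vars() so the snippet does not depend on where it is run.

```python
name = "Michael"   # sample values, not from the thread
age = 42

# Mapping keys go in parentheses: %(name)s, not %{name}s.
msg = '''
A very long string that overspills
onto multiple lines and includes
my name which is %(name)s
and a number which is my age: %(age)d
''' % {"name": name, "age": age}

print(msg)
```

The triple-quoted string keeps its line breaks exactly as typed, which is what makes it a workable stand-in for a perl here document.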
[Tutor] regex problem
Hello,

I'm having erratic results with a regex. I'm hoping someone can pinpoint the problem.

This function removes HTML formatting codes from a text email that is poorly exported -- it is supposed to be a text version of an HTML mailing, but it's basically just a text version of the HTML page. I'm not after anything elaborate, but it has gotten to be a bit of an itch. ;-)

    def parseFile(inFile) :
        import re
        bSpace = re.compile("^ ")
        multiSpace = re.compile(r"\s\s+")
        nbsp = re.compile(r"&nbsp;")
        HTMLRegEx = re.compile(r"(<|&lt;)/?((!--.*--)|(STYLE.*STYLE)|(P|BR|b|STRONG))/?(>|&gt;) ",re.I)

        f = open(inFile,"r")
        lines = f.readlines()
        newLines = []
        for line in lines :
            line = HTMLRegEx.sub(' ',line)
            line = bSpace.sub('',line)
            line = nbsp.sub(' ',line)
            line = multiSpace.sub(' ',line)
            newLines.append(line)
        f.close()
        return newLines

Now, the main issue I'm looking at is with the multiSpace regex. When applied, this removes some blank lines but not others. I don't want it to remove any blank lines, just contiguous multiple spaces in a line.

BTB, this also illustrates a difference between python and perl -- in perl, I can change "line" and it automatically changes the entry in the array; this doesn't work in python. A bit annoying, actually. ;-)

Thanks for any help. If there's a better way to do this, I'm open to suggestions on that regard, too.

mp
Re: [Tutor] regex problem
On Tue, Jan 04, 2005 at 09:15:46PM -0800, Danny Yoo wrote:

> On Tue, 4 Jan 2005, Michael Powe wrote:
>
> > def parseFile(inFile) :
> >     import re
> >     bSpace = re.compile("^ ")
> >     multiSpace = re.compile(r"\s\s+")
> >     nbsp = re.compile(r"&nbsp;")
> >     HTMLRegEx = re.compile(r"(<|&lt;)/?((!--.*--)|(STYLE.*STYLE)|(P|BR|b|STRONG))/?(>|&gt;) ",re.I)
> >
> >     f = open(inFile,"r")
> >     lines = f.readlines()
> >     newLines = []
> >     for line in lines :
> >         line = HTMLRegEx.sub(' ',line)
> >         line = bSpace.sub('',line)
> >         line = nbsp.sub(' ',line)
> >         line = multiSpace.sub(' ',line)
> >         newLines.append(line)
> >     f.close()
> >     return newLines
> >
> > Now, the main issue I'm looking at is with the multiSpace regex. When
> > applied, this removes some blank lines but not others. I don't want it
> > to remove any blank lines, just contiguous multiple spaces in a line.
>
> Hi Michael,
>
> Do you have an example of a file where this bug takes place? As far as I
> can tell, since the processing is being done line-by-line, the program
> shouldn't be losing any blank lines at all.

That is what I thought. And the effect is erratic, it removes some but not all empty lines.

> Do you mean that the 'multiSpace' pattern is eating the line-terminating
> newlines? If you don't want it to do this, you can modify the pattern
> slightly. '\s' is defined to be this group of characters:
>
>     '[ \t\n\r\f\v]'
>
> (from http://www.python.org/doc/lib/re-syntax.html)
>
> So we can adjust our pattern from:
>
>     r"\s\s+"
>
> to
>
>     r"[ \t\f\v][ \t\f\v]+"
>
> so that we don't capture newlines or carriage returns. Regular
> expressions have a brace operator for dealing with repetition:
> if we're looking for at least 2 or more
> of some thing 'x', we can say:

I will take a look at this option. Thanks.

mp
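A small sketch of the difference between the two patterns, for Python 3. The sample lines are invented, and the {2,} form is an assumption about where the truncated remark on the brace operator was heading. It suggests one likely reason the effect looked erratic: a line holding only "\n" has a single whitespace character and is left alone, while a line that ends in a space followed by the newline has two, so the newline is swallowed.

```python
import re

greedy = re.compile(r"\s\s+")                # \s also matches \n and \r
spaces_only = re.compile(r"[ \t\f\v]{2,}")   # two or more, newlines excluded

lines = ["two  spaces here\n", "\n", " \n", "trailing space \n"]

for line in lines:
    # Left: the original pattern. Right: the pattern that spares newlines.
    print(repr(greedy.sub(" ", line)), repr(spaces_only.sub(" ", line)))
```

Only the last two sample lines lose their newline under the original pattern, which would make some blank lines disappear and others survive.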
Re: [Tutor] regex problem
On Wed, Jan 05, 2005 at 07:37:58AM -, Alan Gauld wrote:

> > This function removes HTML formatting codes from a text email
> Using regex to remove HTML is usually the wrong approach unless
> you can guarantee the format of the HTML in advance. The
> HTMLparser is usually better and simpler. I think there's an example
> in the module doc of converting HTML to plain text.

Thanks. This is one of those projects I've had in mind for a long time, decided it was a good way to learn some python. I will look at the HTMLParser module. But then once I get started on one of these projects, it has a way of taking over. ;-)

mp
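A minimal sketch of the parser-based approach suggested above. It assumes Python 3, where the module lives at html.parser (in the Python 2 of this thread it was named HTMLParser). The TextExtractor class and the sample markup are invented for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text between tags, skipping <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self.in_style = False

    def handle_starttag(self, tag, attrs):
        if tag == "style":
            self.in_style = True

    def handle_endtag(self, tag):
        if tag == "style":
            self.in_style = False

    def handle_data(self, data):
        # Called with the text found between tags.
        if not self.in_style:
            self.chunks.append(data)


parser = TextExtractor()
parser.feed("<p>Hello,&nbsp;<b>world</b>!</p><style>p {color: red}</style>")
parser.close()
text = "".join(parser.chunks)
print(repr(text))
```

The parser is fed the markup and calls the handler methods as it goes; the subclass only decides what to keep. Entities such as &nbsp; arrive already converted to the corresponding character.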
Re: [Tutor] regex problem
On Wed, Jan 05, 2005 at 06:33:32AM -0500, Kent Johnson wrote:

> If you search comp.lang.python for 'convert html text', the top four
> results all have solutions for this problem including a reference to this
> cookbook recipe:
> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52297
>
> comp.lang.python can be found here:
> http://groups-beta.google.com/group/comp.lang.python?hl=en&lr=&ie=UTF-8&c2coff=1

Shame on me, I have to get back into that habit. I will check these references, thanks.

mp
Re: [Tutor] Something is wrong in file input output functions.
On Mon, Jan 10, 2005 at 12:15:18PM -0800, kumar s wrote:

> Dear group,
> I have written a small piece of code that takes a file
> and selects the columns that I am interested in and
> checks the value of the column on a condition (value
> that equals 25) and then writes it to another file.
>
> Code:
> import sys
> from string import split
> import string
> print "enter the file name" ### Takes the file name###
> psl = sys.stdin.readline() ### psl has the file object###

I may be wrong but it does not appear to me that you open the files for reading/writing. The variable psl does not contain the file object, it contains the file name. To create a file object, you have to open it. E.g.,

    f = open(psl,"r")
    w = open(out,"w")

Now

    str_psl = f.readlines()

creates an array of strings -- is that what you are trying to do with psl.split?

I don't know what sys.stdout.write returns (and I'm not looking it up), but my guess would be something like the number of characters written.

As a matter of form, I suggest writing all function definitions and then following with execution code (input and function calls) -- it makes it easier to read and follow what you're doing. I think it's unfortunate that python does not allow us to put function defs at the end of the file, so we can put execution code at the top ... but that's the way of it. I put my execution in a main function, and then call that. Seems tidier.

HTH

mp

> f2 = sys.stdout.write("File name to write")
> def extCor(psl):
> ''' This function, splits the file and writes the
> desired columns to
> to another file only if the first column value equals
> 25.'''
>     str_psl = psl.split('\n')
>     str_psl = str_psl[5:]
>     for ele in range(len(str_psl)):
>         cols = split(str_psl[ele],'\t')
>         des_cols = cols[0]+'\t'+cols[1]+'\t'+cols[8]+'\t'+cols[9]+'\t'+cols[11]+'\t'+cols[12]+'\t'+cols[13]+'\t'+cols[15]+'\t'+cols[16]+'\t'+cols[17])
>         if cols[0] == 25:
>             '''This condition checks if the first
>             column value == 25, then it writes it to the file, if
>             not then it does not'''
>             f2.write(des_cols)
>             f2.write("\n")
>
> extCor(psl)
>
> Question:
> when i give it the file name that it should parse, I
> do not get asked for the file name i am interested in;
> it gives me nothing. Please help me.
> Thanks
> K
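Putting the advice above together, a version with explicitly opened files could look something like the following. It is a sketch for Python 3: the function name extract_columns, the shortened column list and the demonstration data are all invented, while the five skipped header lines and the test on the first column follow the quoted code. Values read from a file are strings, so the comparison here is against "25" rather than the number 25.

```python
import os
import tempfile


def extract_columns(in_path, out_path, wanted=(0, 1, 8), skip=5, match="25"):
    """Copy selected tab-separated columns of rows whose first column is `match`."""
    kept = 0
    with open(in_path, "r") as src, open(out_path, "w") as dst:
        for number, line in enumerate(src):
            if number < skip:          # header lines, as in str_psl[5:]
                continue
            cols = line.rstrip("\n").split("\t")
            if cols[0] == match:
                dst.write("\t".join(cols[i] for i in wanted) + "\n")
                kept += 1
    return kept


# Small demonstration with made-up data.
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, "in.psl")
out_path = os.path.join(workdir, "out.txt")
rows = ["header\n"] * 5 + [
    "\t".join(["25"] + ["a%d" % i for i in range(1, 10)]) + "\n",
    "\t".join(["30"] + ["b%d" % i for i in range(1, 10)]) + "\n",
]
with open(in_path, "w") as f:
    f.writelines(rows)

print(extract_columns(in_path, out_path))
with open(out_path) as f:
    print(f.read())
```

The with statement closes both files even if a row turns out to be too short, which keeps the clean-up out of the main logic.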
[Tutor] Clash of the Titans and Mundane Matters
Clash of the Titans

From "Dive into Python":

    __init__ is called immediately after an instance of the class is created. It would be tempting but incorrect to call this the constructor of the class. It's tempting, because it looks like a constructor (by convention, __init__ is the first method defined for the class), acts like one (it's the first piece of code executed in a newly created instance of the class), and even sounds like one ("init" certainly suggests a constructor-ish nature). Incorrect, because the object has already been constructed by the time __init__ is called, and you already have a valid reference to the new instance of the class. But __init__ is the closest thing you're going to get to a constructor in Python, and it fills much the same role.

From Alan's book "Learning to Program":

    One of the methods of this class is called __init__ and it is a special method called a constructor. The reason for the name is that it is called when a new object instance is created or constructed. Any variables assigned (and hence created in Python) inside this method will be unique to the new instance. There are a number of special methods like this in Python, nearly all distinguished by the __xxx__ naming format.

Mundane Matters

I'm having a hard time with classes in python, but it's coming slowly. One thing that I think is generally difficult is to parse a task into "objects." Here's an example: in Java, I wrote an application to track my travelling expenses (I'm a consultant; this tracking of expenses is the itch I am constantly scratching. ;-) I've also written this application as a perl/CGI web application.) It's easy to see the outline of this task: create an abstract class for expense and then extend it for the particular types of expenses -- travel, food, transportation, lodging and so forth. In python, I guess I'd create a class and then "subclass" it. But ... what are reading/writing to files and printing? Of course, I create a "file object" in order to accomplish these tasks -- but how does this object fit into the application design? Do I just create methods within the expense class to accomplish these parts of the task? When I tried this on, it seemed hacky. The other option seemed to be creating an I/O class and passing the expense objects to it. But, should that class be an interface or an object? The one thing you don't see in "how to program" java books is an implementation of I/O in the context of an application.

A similar problem occurs with my HTML-parsing routine that I brought to the list recently. Use of HTMLParser was suggested. I've looked into this and usage means subclassing HTMLParser in order to implement the methods in the way that will accomplish my task. Conceptually, I'm having a hard time with the "object" here. (The fairly poor documentation for HTMLParser doesn't help.) Apparently, I'm creating a "parser" object and feeding it data. At least, that's the closest I can get to understanding this process. How I'm actually feeding data to the "parser" object and retrieving the results are matters open to discussion. I'll be working on that when I get another chance.

Finally, in terms of "understanding python," the question I keep coming up against is: why do we have both functions and methods? What is the rationale for making join() a string method and an os.path function?

Thanks for your time. It's late. ;-) Sometimes, I just have to get these things off my chest.

mp
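The two quoted descriptions of __init__ can be compared against a small experiment. This is a sketch for Python 3; the Expense and Lodging classes and their attributes are invented, loosely following the expense-tracking example above. It shows that the instance already exists when __init__ runs, because __new__ is the step that creates it.

```python
class Expense:
    """Illustrative class; the names are not from the original post."""

    def __new__(cls, *args, **kwargs):
        # __new__ creates and returns the instance.
        instance = super().__new__(cls)
        instance.log = ["constructed"]
        return instance

    def __init__(self, amount):
        # By now `self` already exists; __init__ only fills it in.
        self.log.append("initialised")
        self.amount = amount


class Lodging(Expense):
    """A subclass inherits both steps and can extend __init__."""

    def __init__(self, amount, nights):
        super().__init__(amount)
        self.nights = nights


room = Lodging(240.0, nights=2)
print(room.log, room.amount, room.nights)
```

In everyday code __new__ is rarely written by hand, which is probably why __init__ is so often described as the constructor.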