[Tutor] "words", tags, "nonwords" in xml/text files
I'm developing an application to do interlineal (an extreme type of literal) translations of natural language texts and xml. Here's an example of a text: '''Para eso son los amigos. Para celebrar las gracias del otro.''' and the expected translation with all of the original tags, whitespace, etc intact: '''For that are the friends. For toCelebrate the graces ofThe other.''' I was unable to find (in htmlparser, string or unicode) a way to define words as a series of letters (including non-ascii char sets) outside of an xml tag and whitespace/punctuation, so I wrote the code below to create a list of the words, nonwords, and xml tags in a text. My intuition tells me that its an awful lot of code to do a simple thing, but it's the best I could come up with. I forsee several problems: -it currently requires that the entire string (or file) be processed into memory. if i should want to process a large file line by line, a tab which spans more than one line would be ignored. (that's assuming i would not be able to store state information in the function, which is something i've not yet learned how to do) -html comments may not be supported. (i'm not really sure about this) -it may be very slow as it indexes instead of iterating over the string. what can i do to overcome these issues? Am I reinventing the wheel? Should I be using re? thanks, brian # -*- coding: utf-8 -*- # html2list.py def split(alltext, charset='ñÑçÇáÁéÉíÍóÓúÚ'): #in= string; out= list of words, nonwords, html tags. 
'''builds a list of the words, tags, and nonwords in a text.''' length = len(alltext) str2list = [] url = [] word = [] nonword = [] i = 0 if alltext[i] == '<': url.append(alltext[i]) elif alltext[i].isalpha() or alltext[i] in charset: word.append(alltext[i]) else: nonword.append(alltext[i]) i += 1 while i < length: if url: if alltext[i] == '>':#end url: url.append(alltext[i]) str2list.append("".join(url)) url = [] i += 1 if alltext[i].isalpha() or alltext[i] in charset: #start word word.append(alltext[i]) else: #start nonword nonword.append(alltext[i]) else: url.append(alltext[i]) elif word: if alltext[i].isalpha() or alltext[i] in charset:#continue word word.append(alltext[i]) elif alltext[i] == '<': #start url str2list.append("".join(word)) word = [] url.append(alltext[i]) else: #start nonword str2list.append("".join(word)) word = [] nonword.append(alltext[i]) elif nonword: if alltext[i].isalpha() or alltext[i] in charset:#start word str2list.append("".join(nonword)) nonword = [] word.append(alltext[i]) elif alltext[i] == '<': #start url str2list.append("".join(nonword)) nonword = [] url.append(alltext[i]) else: #continue nonword nonword.append(alltext[i]) else: print 'error', i += 1 if nonword: str2list.append("".join(nonword)) if url: str2list.append("".join(url)) if word: str2list.append("".join(word)) return str2list ## example: text = '''El aguardiente de caña le quemó la garganta y devolvió la botella con una mueca. No se me ponga feo, doctor. Esto mata los bichos de las tripas dijo Antonio José Bolívar, pero no pudo seguir hablando.''' print split(text) ___ Try the all-new Yahoo! Mail. "The New Version is radically easier to use" The Wall Street Journal http://uk.docs.yahoo.com/nowyoucan.html ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] "words", tags, "nonwords" in xml/text files
rio wrote: > I'm developing an application to do interlineal (an extreme type of > literal) translations of natural language texts and xml. Here's an example > of a text: > > '''Para eso son los amigos. Para celebrar las gracias del otro.''' > > and the expected translation with all of the original tags, whitespace, > etc intact: > > '''For that are the friends. For toCelebrate the graces ofThe > other.''' > > I was unable to find (in htmlparser, string or unicode) a way to define > words as a series of letters (including non-ascii char sets) outside of an > xml tag and whitespace/punctuation, so I wrote the code below to create a > list of the words, nonwords, and xml tags in a text. My intuition tells > me that its an awful lot of code to do a simple thing, but it's the best I > could come up with. I forsee several problems: > > -it currently requires that the entire string (or file) be processed into > memory. if i should want to process a large file line by line, a tab which > spans more than one line would be ignored. (that's assuming i would not be > able to store state information in the function, which is something i've > not yet learned how to do) > -html comments may not be supported. (i'm not really sure about this) > -it may be very slow as it indexes instead of iterating over the string. > > what can i do to overcome these issues? Am I reinventing the wheel? Should > I be using re? You should probably be using sgmllib. Here is an example that is pretty close to what you are doing: http://diveintopython.org/html_processing/index.html Kent ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
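As a rough illustration of the `re` route Brian asks about, the character-by-character state machine can be approximated with a single alternation. This is only a sketch, written for Python 3 (where `str` patterns are Unicode-aware, so no explicit charset argument is needed); it is not the sgmllib approach Kent recommends. The name `split_tokens` is made up for the example, and comments, CDATA sections and a `>` inside an attribute value are not handled.

```python
import re

# Three alternatives, tried in order at each position.
TOKEN = re.compile(r"""
    <[^>]*>?              # a tag (an unclosed one runs to the end of the text)
  | [^\W\d_]+             # a word: one or more letters, accented ones included
  | (?:[^\w<]|[\d_])+     # a nonword: anything else except '<'
""", re.VERBOSE)

def split_tokens(text):
    """Return the tags, words and nonwords of text, in order."""
    return TOKEN.findall(text)

sample = 'Para <b>eso</b> son, ñandú!'
tokens = split_tokens(sample)
print(tokens)
# Joining the pieces gives back the original text unchanged.
print(''.join(tokens) == sample)
```

Because every character falls into exactly one of the three alternatives, nothing is dropped, which is what an interlinear translation needs in order to keep tags and whitespace intact.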
Re: [Tutor] Download file from the web and store it locally.
Matt-

Have you tried running this line-by-line in IDLE? I've done a script almost exactly the same as what you're doing (I downloaded a .jpg file from a web server), and when I was trying to learn what commands did what, I quickly changed from trying to write a 'program' to running lines individually in IDLE so I could print out the results of each command and see exactly which lines work, which have syntax errors and which have logic errors.

People often want a compiler which spits out executables, but I realized it's nice to be able to type a few lines of an algorithm into the interpreter and see the results, so I can see if I get the results I expect, then copy that and paste it into my 'program'.

-Dave

Now, for wxPython (a GUI toolkit for Python), my routine was:

---SNIP---
WebImage = urllib2.urlopen("%s/jpg/fullsize.jpg" % netCamAddress)
rawJpg = WebImage.read()
photostream = cStringIO.StringIO(rawJpg)
photoJpg = wx.ImageFromStream(photostream)
photoBmp = wx.BitmapFromImage(photoJpg)
liveVideo.SetBitmap(photoBmp)
---SNIP---

"Give a man a fish; you have fed him for today. Teach a man to fish; and you have fed him for a lifetime. Show a man what a fish looks like so he doesn't try to eat a tire he fishes out of the pond."

So, let me show you what a "fish" looks like. I expect the first two lines are of use to you, whereas the bottom four lines are specific to wxPython.

Do you see what is happening with the string in the first line? I'm replacing part of the web address with the contents of the variable netCamAddress. Can you figure out how that is working?

Hint: you can try printing
    "%s/jpg/fullsize.jpg" % netCamAddress
directly in IDLE after setting netCamAddress to a string value to see what the result is, so you know what urllib2 is opening.

After line two is executed, rawJpg contains the .jpg file I downloaded. Do you know what to do next? Maybe open a file in binary mode, write rawJpg to the file, then close it?

Also, do you realize I'm using a different library for urlopen than you used? Do you know which one, and how to import it?

Lastly, there are a couple of ways you can specify your file in python:
    "D:\\Temp\\file"
but I like using raw strings:
    r"d:\directory\file.ext"
(I hate double-slashing. It's so visually messy, it makes it easy to miss things like commas, I think...)

http://docs.python.org/tut/node9.html#SECTION00920
Maybe sections 7.2 and 7.2.1 are relevant to the rest of your goals?

-Dave
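For readers following along today, the "open, read, write in binary mode" steps Dave hints at might look roughly like this. It is a sketch in Python 3, where `urllib2.urlopen` has become `urllib.request.urlopen`; the function name `download` and the example URL and path in the comment are placeholders, not anything from Matt's script.

```python
import urllib.request

def download(url, path):
    """Fetch url and write the raw bytes to path; return the byte count."""
    with urllib.request.urlopen(url) as response:
        data = response.read()
    # Binary mode ('wb') so that image bytes are written exactly as received.
    with open(path, 'wb') as out:
        out.write(data)
    return len(data)

# Hypothetical usage:
# download('http://example.com/jpg/fullsize.jpg', r'd:\temp\fullsize.jpg')
```

Trying the two `with` blocks one at a time in the interpreter, as suggested above, is a good way to see what each step returns.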
[Tutor] Question on regular expressions
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hi Everyone, I have two Perl expressions If windows: perl -ple "s/([^\w\s])/sprintf(q#%%%2X#, ord $1)/ge" somefile.txt If posix perl -ple 's/([^\w\s])/sprintf("%%%2X", ord $1)/ge' somefile.txt The [^\w\s] is a negated expression stating that any character a-zA-Z0-9_, space or tab is ignored. The () captures whatever matches and throws it into the $1 for processing by the sprintf In this case, %%%2X which is a three character hex value. How would you convert this to a python equivalent using the re or similar module? I've begun reading about using re expressions at http://www.amk.ca/python/howto/regex/ but I am still hazy on implementation. Any help you can provide would be greatly appreciated. - -- Thank you, Andrew Robert Systems Architect Information Technologies MFS Investment Management Phone: 617-954-5882 E-mail: [EMAIL PROTECTED] Linux User Number: #201204 -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.3 (MingW32) iD8DBQFEdIoKDvn/4H0LjDwRAi24AKDFmRohKFfp13z/M9c8O1LQElGzMgCglcRw 3ERK7FxWejsuFcnDSNdOYjM= =28Lx -END PGP SIGNATURE- ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Question on regular expressions
> perl -ple "s/([^\w\s])/sprintf(q#%%%2X#, ord $1)/ge" somefile.txt

Hi Andrew,

Give me a second. I'm trying to understand the command line switches:

(Looking in 'perl --help'...)

    -p              assume loop like -n but print line also, like sed
    -l[octal]       enable line ending processing, specifies line terminator
    -e program      one line of program (several -e's allowed, omit programfile)

and the regular expression modifiers there --- 'g' and 'e' --- mean ...

(reading 'perldoc perlop'...)

    g   Match globally, i.e., find all occurrences.
    e   Evaluate the right side as an expression.

Ok, I have a better idea of what's going on here now. This takes a file, and translates every character that is neither a word character nor whitespace into a percent sign followed by its hex code. That's a dense one-liner.

> How would you convert this to a python equivalent using the re or
> similar module?

The substitution on the right hand side in the Perl code is actually evaluated rather than literally substituted. To get the same effect from Python, we pass a function off as the substituting value to re.sub().

For example, we can translate every word-like character by shifting it one place ('a' -> 'b', 'b' -> 'c', etc...)

###
>>> import re
>>> def rot1(ch):
...     return chr((ord(ch) + 1) % 256)
...
>>> def rot1_on_match(match):
...     return rot1(match.group(0))
...
>>> re.sub(r'\w', rot1_on_match, "hello world")
'ifmmp xpsme'
###

> I've begun reading about using re expressions at
> http://www.amk.ca/python/howto/regex/ but I am still hazy on implementation.

The part in:

    http://www.amk.ca/python/howto/regex/regex.html#SECTION00062

that talks about a "replacement function" is relevant to what you're asking. We need to provide a replacement function to simulate the right-hand-side "evaluation" that's happening in the Perl code.

Good luck!
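Applying the same replacement-function idea to the Perl one-liner itself might look like the sketch below. It is written for Python 3 and assumes ASCII input; the names `percent_hex` and `encode` are invented for the example. One deliberate difference: it uses `%02X` (zero padded) where the Perl uses `%2X` (space padded), so that a character with a code below 0x10 would still produce two hex digits.

```python
import re

def percent_hex(match):
    """Replacement function: '!' becomes '%21', and so on."""
    return '%%%02X' % ord(match.group(0))

def encode(text):
    # [^\w\s] matches anything that is neither a word character nor whitespace.
    return re.sub(r'[^\w\s]', percent_hex, text)

print(encode('hello, world!'))
```

The function is called once per match, and whatever string it returns is spliced into the result in place of the matched character.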
[Tutor] Hex conversion strangeness
Hi Everyone,

I am trying to understand conversion of a value to hex. Could you please confirm that what I am doing is right?

Get the hex value of ! and store it in a:

    a=hex(ord('!'))
    print a
    0x21

I see examples including this kind of statement, but I am not sure why:

    int(a,16)
    33

If the 0X21 is the actual hex value, then why convert to integer? Is this the ASCII table reference to the hex value?

-- 
Thank you,
Andrew Robert
Systems Architect
Information Technologies
MFS Investment Management
Phone: 617-954-5882
E-mail: [EMAIL PROTECTED]
Linux User Number: #201204
[Tutor] cisco router + telnetlib
Hello List,I am rather new to programming and I was wondering y'all think the best way to configure a cisco router using python would be. currently I am using telnetlib. my problem is, I get an error after connecting to the router. here is the error I get when I use IDLE: Enter IP: 205.180.0.3Warning: Problem with getpass. Passwords may be echoed.Router Password: ciscoWarning: Problem with getpass. Passwords may be echoed.Router Secret: class Enter Router hostname: RouterBobTraceback (most recent call last): File "C:\Documents and Settings\bob\My Documents\Administration\Scripts\cisco_config.py", line 20, in ? tn.write("hostname " + hostName + "\n") File "C:\Python24\lib\telnetlib.py", line 292, in write self.sock.sendall(buffer) File "", line 1, in sendallerror: (10053, 'Software caused connection abort')>>> I have also attached the script that I use. could you please point me in the right direction.thank you in advance,-- Daniel McQuayboxster.homelinux.org Dead Possum Productions814.825.0847 import getpass import sys import telnetlib HOST = raw_input("Enter IP: ") password = getpass.getpass('Router Password: ') secret = getpass.getpass('Router Secret: ') tn = telnetlib.Telnet(HOST) tn.write("\n") tn.write("\n") tn.read_until("Router>") #what if it's configured already? 
tn.write(password + "\n") tn.write("enable\n") tn.write(secret + "\n") tn.write("config t\n") #User input for router config hostName = raw_input("Enter Router hostname: ") tn.write("hostname " + hostName + "\n") interface = raw_input("Enter interface to be configured: ") tn.write("int " + interface + "\n") intAddress = raw_input("Enter interface IP address and Subnet Mask: ") tn.write("ip address " + intAddress + "\n") intDesc = raw_input("Enter description for interface: ") tn.write("desc " + intDesc + "\n") bannerMOTD = raw_input("Enter the Message of the day: ") tn.write("banner motd " + bannerMOTD + "\n") lineCON = raw_input("Enter CON password: ") tn.write("line con 0\n") tn.write("password " + lineCON + "\n") tn.write("login\n") lineAUX = raw_input("Enter AUX passowrd: ") tn.write("line aux 0\n") tn.write("password " + lineAUX + "\n") tn.write("login\n") lineVTY = raw_input("Enter VTY password: ") tn.write("line vty 0 4\n") tn.write("password " + lineVTY + "\n") tn.write("login\n") tn.write("exit\n") tn.write("show run\n") confirm = raw_input("Press Enter To Close:") tn.close() ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Hex conversion strangeness
On 25/05/06, Andrew Robert <[EMAIL PROTECTED]> wrote:
> If the 0X21 is the actual hex value, then why convert to integer?
>
> Is this the ASCII table reference to the hex value?

Hi Andrew,

There is a difference between a number and the representation of that number. For example, there is a number that, in base 10, I can represent as "33". In base 8, that number is represented as "41". In base 2, it would be "100001". If I were, say, an ancient Mayan, it might look quite different indeed (http://en.wikipedia.org/wiki/Maya_numerals). But in all cases, the number is the same.

So, in python, the hex() and oct() functions will give you the hexadecimal and octal representations of an integer. Notice that they return strings, not numbers.

If you want to convert from a string to an integer, you can use the int() function with a second parameter. That's what's happening when you type int('21', 16) or int('0x21', 16). You're converting the string to an integer, which happens to be represented internally by a bitstring.

Because you're working at the interactive prompt, python automatically displays the result of every expression you type. In this case, the result of int('0x21', 16) is an integer, and python displays integers in base 10 by default --- hence, you see 33.

By the way, you can enter integers in base 8 or base 16, if you want:

>>> 0x21, 041
(33, 33)
>>> 0x21 == 041 == 33
True

HTH!

-- 
John.
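The number-versus-representation point can be checked directly. The snippet below is a small sketch in Python 3, so two details differ from the Python 2 session in the thread: `print` is a function, and the octal literal is spelled `0o41` rather than `041`.

```python
n = ord('!')            # the number itself (an int)

as_hex = hex(n)         # a string holding the base-16 representation
as_oct = oct(n)         # a string holding the base-8 representation
as_bin = bin(n)         # a string holding the base-2 representation
back = int(as_hex, 16)  # parse the string back into the same number

print(n, as_hex, as_oct, as_bin, back)
print(type(n).__name__, type(as_hex).__name__)
```

All four spellings name the same integer; only `n` and `back` are numbers, the rest are text.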
Re: [Tutor] cisco router + telnetlib
On Wed, 2006-05-24 at 18:18 -0400, Daniel McQuay wrote: > Hello List, > > I am rather new to programming and I was wondering y'all think the > best way to configure a cisco router using python would be. currently > I am using telnetlib. my problem is, I get an error after connecting > to the router. here is the error I get when I use IDLE: > http://mail.python.org/pipermail/tutor/2006-March/045547.html This is part of an earlier thread about configuring routers using a script. I'd recommend using tcpwatch to debug the connection. The router may have sent an error message before closing the connection. Remember that the router software was designed for human interaction. tcpwatch will allow you to see both sides of the conversation. You'll see in the earlier thread that I'd also recommend using tftp to send most of the commands and limit the interactive telnet script to: logging on running tftp logging off > Enter IP: 205.180.0.3 > Warning: Problem with getpass. Passwords may be echoed. > Router Password: cisco > Warning: Problem with getpass. Passwords may be echoed. > Router Secret: class > Enter Router hostname: RouterBob > Traceback (most recent call last): > File "C:\Documents and Settings\bob\My Documents\Administration > \Scripts\cisco_config.py", line 20, in ? > tn.write("hostname " + hostName + "\n") > File "C:\Python24\lib\telnetlib.py", line 292, in write > self.sock.sendall(buffer) > File "", line 1, in sendall > error: (10053, 'Software caused connection abort') > >>> > > I have also attached the script that I use. could you please point me > in the right direction. > > thank you in advance, > > > -- > Daniel McQuay > boxster.homelinux.org > Dead Possum Productions > 814.825.0847 > ___ > Tutor maillist - Tutor@python.org > http://mail.python.org/mailman/listinfo/tutor -- Lloyd Kvam Venix Corp ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Question on regular expressions
On 24 May 2006, [EMAIL PROTECTED] wrote:

> I have two Perl expressions
>
> If windows:
>
> perl -ple "s/([^\w\s])/sprintf(q#%%%2X#, ord $1)/ge" somefile.txt
>
> If posix
>
> perl -ple 's/([^\w\s])/sprintf("%%%2X", ord $1)/ge' somefile.txt
>
> The [^\w\s] is a negated expression stating that any character
> a-zA-Z0-9_, space or tab is ignored.
>
> The () captures whatever matches and throws it into the $1 for
> processing by the sprintf
>
> In this case, %%%2X which is a three character hex value.
>
> How would you convert this to a python equivalent using the re or
> similar module?

python -c "import re, sys;print re.sub(r'([^\w\s])', lambda s: '%%%2X' % ord(s.group()), sys.stdin.read())," < somefile

It's not as short as the Perl version (and might have problems with big files, since it reads the whole input at once). Python has no command line switches as useful as -p (but you don't use Python for one-liners as much as Perl), but it does the same; at least in this special case (Python lacks something like the -l switch).

With bash it's a bit easier, because a quoted argument may span several lines (maybe there's also a way to write multiple lines with cmd.exe?):

$ python -c "import re,sys
for line in sys.stdin: print re.sub(r'([^\w\s])', lambda s: '%%%2X' % ord(s.group()), line)," < somefile

Karl
-- 
Please do *not* send copies of replies to me.
I read the list
[Tutor] class Writer
I've been working my way through an online tutorial and came across the following sample script:

    import sys

    class Writer:
        def __init__(self, filename):
            self.filename = filename
        def write(self, msg):
            f = file(self.filename, 'a')
            f.write(msg)
            f.close()

    sys.stdout = Writer('tmp.log')
    print 'Log message #1'
    print 'Log message #2'
    print 'Log message #3'

The script created the tmp.log file with the following lines:

    Log message #1
    Log message #2
    Log message #3

I understand that the class is taking the strings from stdout (supplied by the print statements) and writing them to a text file. Does the user need to explicitly call the write function? For example:

    sys.stdout = Writer('tmp.log').write(whatever the message is)
Re: [Tutor] Question on regular expressions
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Wow!!.. That awesome! My goal was not to make it a one-liner per-se.. I was simply trying to show the functionality I was trying to duplicate. Boiling your one-liner down into a multi-line piece of code, I did: #!c:\python24\python import re,sys a = open(r'e:\pycode\csums.txt','rb').readlines() for line in a: print re.sub(r'([^\w\s])', lambda s: '%%%2X' % ord(s.group()), line) Breaking down the command, you appear to be calling an un-named function to act against any characters trapped by the regular expression. Not familiar with lamda :). The un-named function does in-place transformation of the character to the established hex value. Does this sound right? If I then saved the altered output to a file and wanted to transform it back to its original form, I would do the following in perl. perl -ple 's/(?:%([0-9A-F]{2}))/chr hex $1/eg' somefiletxt How would you reverse the process from a python point of view? Karl Pflästerer wrote: > python -c "import re, sys;print re.sub(r'([^\w\s])', lambda s: '%%%2X' % > ord(s.group()), sys.stdin.read())," < somefile > > It's not as short as the Perl version (and might have problems with big > files). Python does not have such useful command line switches like -p > (but you doesn't use Python so much for one liners as Perl) but it does > the same ; at least in this special case (Python lacks something like the > -l switch). > > With bash it's a bit easier. (maybe there's also a way with cmd.com to > write multiple lines)? 
> > $ python -c "import re,sys > for line in sys.stdin: print re.sub(r'([^\w\s])', lambda s: '%%%2X' % > ord(s.group()), line)," < somefile > > >Karl - -- Thank you, Andrew Robert Systems Architect Information Technologies MFS Investment Management Phone: 617-954-5882 E-mail: [EMAIL PROTECTED] Linux User Number: #201204 -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.3 (MingW32) iD8DBQFEdPFwDvn/4H0LjDwRAuzuAKCOPja9Js1ueP2GoT+B0hoFubDEegCguzfT QL87gmKUx6znmGQxXqg6V+A= =7MT2 -END PGP SIGNATURE- ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Question on regular expressions (fwd)
[forwarding to tutor, although it looks like Andrew's making some good headway from other messages] -- Forwarded message -- Date: Wed, 24 May 2006 14:59:43 -0400 From: Andrew Robert <[EMAIL PROTECTED]> To: Danny Yoo <[EMAIL PROTECTED]> Subject: Re: [Tutor] Question on regular expressions -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hey Danny, Your code put me right on track. - From that point, I crafted the following code. What is confusing is how to take the captured character and transform it into a 3 digit hex value. Do you know how that might be accomplished? #!/usr/bin/python import re # Evaluate captured character as hex def ret_hex(ch): return chr((ord(ch) + 1 ) % 256 ) # Evaluate the value of whatever was matched def eval_match(match): return ret_hex(match.group(0)) # open file file = open(r'm:\mq\mq\scripts\sigh.txt','r') # Read each line, pass any matches on line to function for # line in file.readlines(): for line in file: a=re.sub('[^\w\s]',eval_match, line) print a - -- Thank you, Andrew Robert Systems Architect Information Technologies MFS Investment Management Phone: 617-954-5882 E-mail: [EMAIL PROTECTED] Linux User Number: #201204 -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.3 (MingW32) iD8DBQFEdK0fDvn/4H0LjDwRAuipAKDFqOeQQkJ+WkaI+veIgC8oEn9/CQCfUfNO xb7AT8W04B/F684i+Lw6kxw= =5mPe -END PGP SIGNATURE- ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] class Writer
On Wed, 24 May 2006, Christopher Spears wrote:

> I've been working my way through an online tutorial and came across the
> following sample script:
>
> import sys
>
> class Writer:
>     def __init__(self, filename):
>         self.filename = filename
>     def write(self, msg):
>         f = file(self.filename, 'a')
>         f.write(msg)
>         f.close()

Just as a side note: the 'logging' module in the Standard Library would probably be the way to handle things like this.

    http://www.python.org/doc/lib/module-logging.html

> I understand that the class is taking the strings from stdout (supplied
> by the print statements) and writing them to a text file.

Not exactly. Something more subtle is happening. Every call to the print statement causes Python to do something like:

    print foo   --->   sys.stdout.write(str(foo) + "\n")

At least, to a first approximation, that's what 'print' does.

We can try it out by, from a clean interpreter, doing:

##
import sys
sys.stdout.write("hello")
sys.stdout.write("world")
##

We should see "helloworld" because we have not told sys.stdout to write a newline to separate the two words.

Going back to the code that you show:

    sys.stdout = Writer('tmp.log')

is explicitly reassigning the standard output file to something else entirely. Future print statements talk to the Writer instance as if it were standard output.

Does this distinction clear things up?
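The "print just calls write() on whatever sys.stdout is" idea can be observed without touching any files. The following is a sketch in Python 3 (so `print` is a function); `ListWriter` and `capture` are names made up for the illustration, and the object only needs a `write` method (a no-op `flush` is included for good measure).

```python
import sys

class ListWriter:
    """A minimal file-like object: remembers every string passed to write()."""
    def __init__(self):
        self.chunks = []
    def write(self, msg):
        self.chunks.append(msg)
    def flush(self):
        pass

def capture(func):
    """Run func() with sys.stdout swapped out; return what was written."""
    writer = ListWriter()
    saved = sys.stdout
    sys.stdout = writer
    try:
        func()
    finally:
        sys.stdout = saved   # always restore the real stdout
    return writer.chunks

chunks = capture(lambda: print('Log message #1'))
print(repr(''.join(chunks)))
```

Nobody calls `write()` explicitly here; `print` does it, including the write of the trailing newline.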
[Tutor] help requested: port not free, under Windows XP
[message re-sent; original seemed not to have been received, according to the archives. Apologies if this is not the case.]

Hi all-

===Preliminaries===

I wrote a new app (Crunchy Frog) which is meant to transform "boring" traditional python tutorials into truly interactive experiences. It is still at an alpha stage but is promising imo; for those interested, you can find it at:
https://sourceforge.net/project/showfiles.php?group_id=125834

It currently requires both CherryPy and Elementtree as additional packages.

Starting the app launches your favourite browser (or opens a new tab/window). This is done through the instruction:

    cherrypy.server.start_with_callback(webbrowser.open, ('http://localhost:8080',),)

===Now the question===

Someone on edu-sig tried to get it working on her computer running Windows XP home edition (just like mine, where it works fine!). However, she gets an error message about "port 8080 not free on local host." This is after she made sure nothing else internet-related was working. [This kind of message can happen if another instance of Crunchy Frog is already running, which she made sure wasn't.]

I am stumped.

I am thinking it might be a firewall issue (I have ZoneAlarm installed myself), but I am really not sure. I thought there were enough smart and friendly people here that I could find an answer ;-)

Thanks in advance for any help, pointers, etc. Of course, you can reply directly on the edu-sig list if you want!

André
Re: [Tutor] Question on regular expressions
Andrew Robert wrote:
> Wow!!..
>
> That awesome!
>
> My goal was not to make it a one-liner per-se..
>
> I was simply trying to show the functionality I was trying to duplicate.
>
> Boiling your one-liner down into a multi-line piece of code, I did:
>
> #!c:\python24\python
>
> import re,sys
>
> a = open(r'e:\pycode\csums.txt','rb').readlines()
>
> for line in a:

You probably want to open the file in text mode, not binary. You don't have to read all the lines of the file; you can iterate, reading one line at a time. Combining these two changes, the above two lines consolidate to

    for line in open(r'e:\pycode\csums.txt'):

>     print re.sub(r'([^\w\s])', lambda s: '%%%2X' % ord(s.group()), line)
>
> Breaking down the command, you appear to be calling an un-named function
> to act against any characters trapped by the regular expression.
>
> Not familiar with lamda :).

It is a way to make an anonymous function, occasionally abused to write Python one-liners. You could just as well spell it out:

    def hexify(match):
        return '%%%2X' % ord(match.group())

    print re.sub(r'([^\w\s])', hexify, line)

> The un-named function does in-place transformation of the character to
> the established hex value.
>
> Does this sound right?

Yes.

Kent
[Tutor] _next
How does this script work? #!/usr/bin/python class IteratorExample: def __init__(self, s): self.s = s self.next = self._next().next self.exhausted = 0 def _next(self): if not self.exhausted: flag = 0 for x in self.s: if flag: flag = 0 yield x else: flag = 1 self.exhausted = 1 def __iter__(self): return self._next() def main(): a = IteratorExample('edcba') for x in a: print x print '=' * 30 a = IteratorExample('abcde') print a.next() print a.next() print a.next() print a.next() print a.next() print a.next() if __name__ == '__main__': main() Here is the output: d b == b d Traceback (most recent call last): File "./python_101_iterator_class.py", line 35, in ? main() File "./python_101_iterator_class.py", line 29, in main print a.next() StopIteration I think a lot of my confusion comes from not understanding what _next is. I got this script from an online tutorial at python.org. Is there a better way to write the script, so I can actually understand it? "I'm the last person to pretend that I'm a radio. I'd rather go out and be a color television set." -David Bowie "Who dares wins" -British military motto "I generally know what I'm doing." -Buster Keaton ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] _next
Christopher Spears wrote: > How does this script work? > > #!/usr/bin/python > > class IteratorExample: > def __init__(self, s): > self.s = s > self.next = self._next().next > self.exhausted = 0 > def _next(self): > if not self.exhausted: > flag = 0 > for x in self.s: > if flag: > flag = 0 > yield x > else: > flag = 1 > self.exhausted = 1 > def __iter__(self): > return self._next() > > def main(): > a = IteratorExample('edcba') > for x in a: > print x > print '=' * 30 > a = IteratorExample('abcde') > print a.next() > print a.next() > print a.next() > print a.next() > print a.next() > print a.next() > > if __name__ == '__main__': > main() > > > Here is the output: > > d > b > == > b > d > Traceback (most recent call last): > File "./python_101_iterator_class.py", line 35, in ? > main() > File "./python_101_iterator_class.py", line 29, in > main > print a.next() > StopIteration > > I think a lot of my confusion comes from not > understanding what _next is. I got this script from > an online tutorial at python.org. Is there a better > way to write the script, so I can actually understand it? _next() is a generator - a function which, when called, returns an iterator. Each time the yield statement is reached, the iterator returns a new value. When the generator returns, the iteration ends. Generators are a very convenient way to package up iteration and state. Here is a simple example of a generator that counts to 2: In [1]: def count2(): ...: yield 1 ...: yield 2 ...: ...: You can iterate over the generator in a for loop: In [2]: for i in count2(): ...: print i ...: ...: 1 2 If you prefer you can explicitly call the next() method, which is a common method of all iterators: In [3]: c=count2() In [4]: c.next() Out[4]: 1 In [5]: c.next() Out[5]: 2 When the iterator is exhausted, it raises StopIteration. Again, this is standard behaviour for all iterators: In [6]: c.next() Traceback (most recent call last): File "", line 1, in ? 
StopIteration Your example seems like a complicated way to wrap an iterable in an iterator which returns every other element. Maybe I am missing something, but I would write it like this: In [7]: def skipper(seq): ...: it = iter(seq) # make sure we have an iterator ...: while True: ...: it.next() # skip a value ...: yield it.next() # return a value ...: ...: In [8]: for a in skipper('edcba'): ...: print a ...: ...: d b In [9]: a = skipper('edcba') In [10]: a.next() Out[10]: 'd' In [11]: a.next() Out[11]: 'b' In [12]: a.next() Traceback (most recent call last): File "", line 1, in ? File "", line 5, in skipper StopIteration You can read more about the iterator protocol and generators here: http://www.python.org/doc/2.2.3/whatsnew/node4.html http://www.python.org/doc/2.2.3/whatsnew/node5.html Read the referenced PEPs for all the juicy details. Hmm, a little Googling finds the tutorial you mention here: http://www.rexx.com/~dkuhlman/python_101/python_101.html#SECTION00446 IMO this is a very confused example. You can write class-based iterators and you can write generator-based iterators, but combining them both to achieve such a simple result makes no sense to me. Kent ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
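For anyone trying Kent's skipper on a current interpreter: in Python 3 the method is spelled `next(it)` instead of `it.next()`, and since Python 3.7 a StopIteration that escapes inside a generator body is turned into a RuntimeError, so the end of the input has to be handled explicitly. The version below is a sketch of that adaptation, not code from the thread.

```python
from itertools import islice

def skipper(seq):
    """Yield every other element of seq, starting with the second."""
    it = iter(seq)          # make sure we have an iterator
    for _ in it:            # skip a value
        try:
            yield next(it)  # return a value
        except StopIteration:
            return          # ran out right after a skipped value

print(list(skipper('edcba')))

# The standard library can express the same selection directly:
print(list(islice('edcba', 1, None, 2)))
```

Both forms stop cleanly on odd-length and even-length input.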
Re: [Tutor] help requested: port not free, under Windows XP
> Someone on edu-sig tried to get it working on her computer running Windows
> XP home edition (just like mine, where it works fine!). However, she gets an
> error message about
> " port 8080 not free on local host." This is after she made sure nothing
> else internet-related was working. [This kind of message can happen if
> another instance of Crunchy Frog is already running, which she made sure
> wasn't.]
> I am stumped

Try

netstat -o

(shows the owning pid for each network port)

Alan
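Alongside netstat, the application itself could test the port before starting the server. The following is only a sketch under stated assumptions: port_is_free is a hypothetical helper (it is not part of CherryPy or Crunchy Frog), and it is written for Python 3, where a failed bind raises OSError.

```python
import socket


def port_is_free(port, host="localhost"):
    """Return True if (host, port) can be bound right now, else False.

    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
    except OSError:
        # Typically "address already in use": another process owns the port.
        return False
    finally:
        sock.close()
    return True
```

Calling port_is_free(8080) just before starting the server would tell you whether the "port 8080 not free" message reflects a real conflict. The result is only a snapshot: another process can still grab the port between the check and the server start.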
Re: [Tutor] class Writer
> I understand that the class is taking the strings from
> stdout (supplied by the print statements) and writing
> them to a text file. Does the user need to explicitly
> call the write function? For example:
>
> sys.stdout = Writer('tmp.log').write(whatever the
> message is)

No, that's what print does. The Writer is replacing the stdout file-like object with your Writer file-like object. print just uses whatever file-like object is connected to stdout.

Alan G.
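To make that concrete, here is a minimal sketch in Python 3 syntax. The Writer below is a hypothetical stand-in, not the class from the original question (which is not shown in this message); the only thing print needs from it is a write() method. The demo function is likewise just for illustration.

```python
import sys


class Writer:
    """Hypothetical file-like object: appends whatever it is given to a file."""

    def __init__(self, filename):
        self.filename = filename

    def write(self, msg):
        # print calls this for us; the user never calls it directly.
        with open(self.filename, "a") as f:
            f.write(msg)

    def flush(self):
        # Nothing is buffered here, but file-like objects usually offer flush().
        pass


def demo(filename):
    """Redirect stdout to a Writer, print once, then restore stdout."""
    saved = sys.stdout
    sys.stdout = Writer(filename)
    try:
        print("hello from print")
    finally:
        sys.stdout = saved
```

After demo('tmp.log'), the file holds the printed line, even though nothing in demo calls write() explicitly.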
Re: [Tutor] Question on regular expressions
> a = open(r'e:\pycode\csums.txt','rb').readlines()
>
> for line in a:
>     print re.sub(r'([^\w\s])', lambda s: '%%%2X' % ord(s.group()), line)

Or just

for line in open(r'e:\pycode\csums.txt','rb'):
    print ...

> Breaking down the command, you appear to be calling an un-named
> function
> to act against any characters trapped by the regular expression.
>
> Not familiar with lambda :).

You are absolutely right. It creates an un-named (or anonymous) function. :-)

> The un-named function does in-place transformation of the character
> to
> the established hex value.

It's actually the call to re.sub() that does the replacing (strictly, it returns a new string with the substitutions made). The lambda only computes the replacement text for each match.

> How would you reverse the process from a python point of view?

Just write a reverse function for the lambda...

Alan G.
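A rough sketch of what such a pair of functions might look like, in Python 3 syntax. The names encode and decode are hypothetical. Two assumptions differ from the quoted code: the format is '%02X' (zero-padded) rather than '%2X' (space-padded), so every escape is exactly two hex digits, and the input is assumed to be ASCII so that every escaped character fits in two digits.

```python
import re


def encode(text):
    """Replace each non-word, non-space character with %XX (uppercase hex)."""
    return re.sub(r'[^\w\s]', lambda m: '%%%02X' % ord(m.group()), text)


def decode(text):
    """Reverse encode(): turn each %XX escape back into its character."""
    return re.sub(r'%([0-9A-F]{2})', lambda m: chr(int(m.group(1), 16)), text)
```

Because '%' is itself neither a word character nor whitespace, encode() escapes it too, so a '%' in the encoded text always starts an escape and decode() has no ambiguity to resolve.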
Re: [Tutor] Question on regular expressions (fwd)
> Your code put me right on track.
>
> From that point, I crafted the following code.
>
> What is confusing is how to take the captured character and
> transform it
> into a 3 digit hex value.

In general I prefer to use string formatting to convert into hex format.

print "%3X" % myValue

You can play around with the length specifier, left/right formatting, etc. Think sprintf in C...

Alan G.
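A few variations on that format string, as a small sketch in Python 3 syntax (the % operator behaves the same way in Python 2). The helper name hex_padded is hypothetical; the format codes themselves are standard printf-style ones.

```python
def hex_padded(value, width=3):
    """Format value as uppercase hex, zero-padded to the given width."""
    # '*' takes the field width from the argument tuple, as in C's printf.
    return "%0*X" % (width, value)


space_padded = "%3X" % 255   # width 3, right-aligned, padded with spaces
zero_padded = "%03X" % 255   # width 3, padded with leading zeros
left_aligned = "%-3X" % 10   # width 3, left-aligned

three_digits = hex_padded(255)
```

For a fixed "3 digit" value, the zero-padded form is probably what is wanted, since the space-padded form leaves a blank in front of short values.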
[Tutor] help requested: port not free, under Windows XP
Hi all-

===Preliminaries===

I wrote a new app (Crunchy Frog) which is meant to transform "boring" traditional python tutorials into truly interactive experiences. It is still at an alpha stage but is promising imo; for those interested, you can find it at:
https://sourceforge.net/project/showfiles.php?group_id=125834

It currently requires both CherryPy and Elementtree as additional packages.

Starting the app launches your favourite browser (or opens a new tab/window). This is done through the instruction:

cherrypy.server.start_with_callback(webbrowser.open, ('http://localhost:8080',),)

===Now the question===

Someone on edu-sig tried to get it working on her computer running Windows XP home edition (just like mine, where it works fine!). However, she gets an error message about
" port 8080 not free on local host."
This is after she made sure nothing else internet-related was working. [This kind of message can happen if another instance of Crunchy Frog is already running, which she made sure wasn't.]

I am stumped.

I am thinking it might be a firewall issue (I have ZoneAlarm installed myself), but I am really not sure. I thought there were enough smart and friendly people here that I could find an answer ;-)

Thanks in advance for any help, pointers, etc. Of course, you can reply directly on the edu-sig list if you want!

André