Re: [Tutor] rstrip in list?
On Wed, 10 Feb 2010 02:28:43 am Ken G. wrote:
> I printed out some random numbers to a list and use 'print mylist'
> and they came out like this:
>
> ['102\n', '231\n', '463\n', '487\n', '555\n', '961\n']
>
> I was using 'print mylist.rstrip()' to strip off the '\n'
>
> but kept getting an error of :
>
> AttributeError: 'list' object has no attribute 'rstrip'

You have to apply rstrip to each item in the list, not the list itself.

Here are two ways to do it:

#1: modify the list in a for-loop

for i, item in enumerate(mylist):
    mylist[i] = item.rstrip()

#2: make a new list with a list comprehension

mylist = [item.rstrip() for item in mylist]

Of the two, I prefer the second.

--
Steven D'Aprano
_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] string to list
On Thu, 11 Feb 2010 12:32:52 am Owain Clarke wrote:
> My son was doing a statistics project in which he had to sort some
> data by either one of two sets of numbers, representing armspan and
> height of a group of children - a boring and painstaking job. I came
> across this piece of code:-
>
> li=[[2,6],[1,3],[5,4]] # would also work with li=[(2,6),(1,3),(5,4)]
> li.sort(key=lambda x:x[1] )
> print li
>
> It occurred to me that I could adapt this so that he could input his
> data at the command line and then sort by x:x[0] or x:x[1]. And I
> have not discovered a way of inputting a list, only strings or
> various number types.

Which command line do you mean? If you are talking about the Python interactive interpreter, then you can input either lists or strings:

>>> li = [1, 2, 3] # a list
>>> li = "[1, 2, 3]" # a string

If you mean the external shell (say, "bash" under Linux or the DOS command line in Windows, or similar) then you can only input strings -- everything is a string in such shells.

There are two ways to convert such strings to lists: the easy, unsafe way; or the slightly more difficult but safer way. Think of it like The Force, where the Dark Side is simpler and more attractive but ultimately more dangerous :)

First, the easy way. If you absolutely trust the source of the data, then you can use the eval function:

>>> s = "[1, 2, 3]" # quote marks make it a string
>>> li = eval(s)

The interpreter will read the string s as a Python expression, and do whatever it says. In this example, it says "make a list with the numbers 1, 2 and 3 in it". But it could say *anything*, like "erase the hard disk", and Python would dutifully do what it is told. This is why eval is easy and fast but dangerous, and you must absolutely trust the source of the string.

Alternatively, you could do this:

>>> s = "1 2 3 4" # numbers separated by spaces
>>> li = s.split() # li now contains the strings "1", "2" etc.
>>> li = map(int, li) # convert the strings to actual ints
>>> print li
[1, 2, 3, 4]

Naturally the string s would come from the shell, otherwise there is no point! If you are typing the data directly into the Python interpreter, you would enter it directly as a list and not a string.

Hope this helps,

--
Steven D'Aprano
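For list-of-pairs input like the armspan/height data in the question, the standard library also offers a middle road that the reply above does not mention: ast.literal_eval, which accepts only Python literals and refuses anything else. What follows is a minimal sketch, assuming Python 3; the helper name parse_pairs and the sample string are made up for illustration.

```python
import ast

def parse_pairs(text):
    """Parse a string such as "[[2, 6], [1, 3]]" into a list of pairs.

    ast.literal_eval only accepts literals (numbers, strings, lists,
    tuples, dicts and so on), so it does not run arbitrary code the
    way eval does.
    """
    data = ast.literal_eval(text)
    if not isinstance(data, (list, tuple)):
        raise ValueError("expected a list of pairs")
    return [list(pair) for pair in data]

# Hypothetical input, as it might arrive from the shell.
s = "[[2, 6], [1, 3], [5, 4]]"
li = parse_pairs(s)
by_first = sorted(li, key=lambda x: x[0])    # sort on the first number
by_second = sorted(li, key=lambda x: x[1])   # sort on the second number
print(by_first)
print(by_second)
```

A string that is not a plain literal, such as a function call, is rejected with ValueError instead of being executed.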
Re: [Tutor] aliasing an imported module
On Sat, 13 Feb 2010 10:51:38 am Garry Willgoose wrote:
> I want to be able to import multiple instances of a module and call
> each by a unique name and it doesn't appear at first glance that
> either import or __import__ have what I need.

No, such a thing is not officially supported by Python.

Python treats modules as singletons -- there is only one instance of any module at any time. It is possible to force Python to break that promise, but that is fighting the language, and there's no guarantee that it won't cause other breakages further along. So if you choose to create multiple instances of a module, you're doing something the language doesn't want you to do, and if you end up shooting yourself in the foot you'll have no one to blame but yourself.

Having said that, the following trick should do what you want. Start with a simple module holding state:

# mymodule.py
state = []

Here's Python's normal behaviour:

>>> import mymodule
>>> mymodule.state
[]
>>> mymodule.state.append(123)
>>> mymodule.state
[123]
>>> import mymodule as something_else
>>> something_else.state
[123]

And here's the trick:

>>> import sys
>>> del sys.modules['mymodule']
>>> import mymodule as another_name
>>> another_name.state
[]
>>> mymodule.state # Check the original.
[123]

This *should* work across all versions and implementations of Python, but remember that it's a hack. You're fighting the language rather than working with it.

Another problem is that there's no guarantee that the module holds all its state inside itself: it might in turn import a second module, and store state in there. Many packages, in particular, may do this.

The best solution is to avoid global state if you possibly can.

You also say:

> The key problem is that
> the module might locally store some partial results ready for the
> next time its called to save CPU time (typically the results for one
> timestep ready for the next timestep).

I'm going to take a wild guess as to what you're doing, and make a suggestion for how you can do something better.

I guess you have functions something like this:

STATE = None

def func():
    global STATE
    if STATE is None:
        # No global state recorded, so we start from scratch.
        step = 1
        partial = 0
    else:
        step, partial = STATE
    step += 1
    partial = do_some_calculations(step, partial)
    STATE = (step, partial)
    return partial

But as you point out, that means all calls to func() use the same global state.

Here are two alternatives. Here's a rather messy one, but it tickles my fancy: use a token to identify the caller, so each caller gets their own state and not somebody else's.

LAST_TOKEN = 0

def get_token():
    global LAST_TOKEN
    LAST_TOKEN += 1
    return LAST_TOKEN

STATE = {}

def func(token):
    global STATE
    if token not in STATE:
        # No global state recorded, so we start from scratch.
        step = 1
        partial = 0
    else:
        step, partial = STATE[token]
    step += 1
    partial = do_some_calculations(step, partial) # Defined elsewhere.
    STATE[token] = (step, partial)
    return partial

Then, before each independent use of func, the caller simply calls get_token() and passes that to the function. I'm sure you can see problems with this:

- the caller has to store their tokens and make sure they pass the right one;
- the function needs to deal with invalid tokens;
- the global STATE ends up storing the result of intermediate calculations long after they are no longer needed;
- the separation is a "gentleman's agreement" -- there is nothing stopping one caller from guessing another valid token.

Although I'm remarkably fond of this solution in theory, in practice I would never use it. A better solution is to write func in a more object-oriented fashion:

class FuncCalculator(object):
    """Class that calculates func"""
    def __init__(self):
        # Start with initial state.
        self.STATE = (1, 0.0)
    def __call__(self):
        step, partial = self.STATE
        step += 1
        partial = self.do_some_calculations(step, partial)
        self.STATE = (step, partial)
        return partial
    def do_some_calculations(self, step, partial):
        return partial + 1.0/step # or whatever

Instances of the class are callable as if they were functions. In C++ terminology, this is called a "functor". In Python, we normally just call it a callable.

So we can create as many independent "functions" as needed, each with their own state:

>>> f = FuncCalculator()
>>> g = FuncCalculator()
>>> h = FuncCalculator()
>>> f()
0.5
>>> f()
0.83333333333333326
>>> f()
1.0833333333333333
>>> g()
0.5
>>> g()
0.83333333333333326
>>> h()
0.5
>>> f()
1.2833333333333332

Hope this helps!

--
Steven D'Aprano
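The same "one state per caller" idea can also be sketched with a closure instead of a class. This is a variation, not something the reply above proposes; it assumes Python 3 (for nonlocal), and make_calculator is a made-up name. The calculation is the same stand-in used in the class version.

```python
def make_calculator():
    """Return a function that remembers its own step and partial result."""
    step = 1
    partial = 0.0

    def func():
        # nonlocal lets the inner function rebind the enclosing variables.
        nonlocal step, partial
        step += 1
        partial = partial + 1.0 / step  # stand-in for the real calculation
        return partial

    return func

# Each call to make_calculator() produces an independent "function".
f = make_calculator()
g = make_calculator()
first = f()
second = f()
other = g()
print(first, second, other)
```

Calling f never disturbs g, because each closure keeps its own step and partial. The class version is probably easier to extend if the state grows beyond a couple of values.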
tutor@python.org
On Sun, 14 Feb 2010 10:58:10 am Alan Gauld wrote:
> "spir" wrote
>
> > PS: in "l>>24 & 255", the & operation is useless, since all 24
> > higher bits are already thrown away by the shift:
>
> They are not gone however there are still 32 bits in an integer so
> the top bits *should* be set to zero.

No, Python ints are not 32 bit native ints. They're not even 64 bit ints. Python has unified the old "int" type with "long", so that ints automatically grow as needed. This is in Python 3.1:

>>> (0).bit_length()
0
>>> (1).bit_length()
1
>>> (2).bit_length()
2
>>> (3).bit_length()
2
>>> (10**100).bit_length()
333

Consequently, if you have an arbitrary int that you don't know where it came from, you can't make any assumptions about the number of bits it uses.

> But glitches can occur from time to time...

If Python had a glitch of the magnitude of right-shifting non-zero bits into a number, that would be not just a bug but a HUGE bug. That would be as serious as having 1+1 return 374 instead of 2. Guarding against (say) 8 >> 1 returning anything other than 4 makes as much sense as guarding against 8//2 returning something other than 4: if you can't trust Python to get simple integer arithmetic right, then you can't trust it to do *anything*, and your guard (ANDing it with 255) can't be trusted either.

> It is good practice to restrict the range to the 8 bits needed by
> and'ing with 255
> even when you think you should be safe.

It is certainly good practice if you are dealing with numbers which might be more than 24 bits to start with:

>>> n = 5**25
>>> n >> 24
17763568394
>>> n >> 24 & 255
10

But *if* you know the int is no more than 32 bits, then adding in a guard to protect against bugs in the >> operator is just wasting CPU cycles and needlessly complicating the code. The right way to guard against "this will never happen" scenarios is with assert:

assert n.bit_length() <= 32 # or "assert 0 <= n < 2**32"
print(n >> 24)

This has two additional advantages:

(1) It clearly signals to the reader what your intention is ("I'm absolutely 100% sure that n will not be more than 32 bits, but since I'm a fallible human, I'd rather find out about an error in my logic as soon as possible").

(2) If the caller cares enough about speed to object to the tiny little cost of the assertion, he or she can disable it by passing the -O (O for Optimise) switch to Python.

(More likely, while each assert is very cheap, a big application might have many, many asserts.)

--
Steven D'Aprano
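To put the two styles side by side, here is a small sketch of "assert the precondition, then shift" next to "shift, then mask". The function names top_byte and top_byte_masked are made up for illustration, and the code assumes Python 3.

```python
def top_byte(n):
    """Return bits 24-31 of n, which is expected to fit in 32 bits."""
    # The assert documents the assumption and can be disabled with -O.
    assert 0 <= n < 2**32, "n does not fit in 32 bits"
    return n >> 24

def top_byte_masked(n):
    """Return bits 24-31 of n, whatever its size, by masking to 8 bits."""
    return (n >> 24) & 255

print(hex(top_byte(0xDEADBEEF)))
print(top_byte_masked(5**25))
```

The first form states the 32-bit assumption and fails loudly if it is wrong; the second silently discards any higher bits, which is the right thing only when truncation is actually wanted.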
Re: [Tutor] Tutor list as pair progamming plush toy
On Sat, 13 Feb 2010 02:33:04 am Mac Ryan wrote:
> whenever I get stuck, I begin to write a message to the
> list, and in the process of explaining what is the intended behaviour
> and outcome of my code, I systematically find the bug by myself.
[...]
> Does anybody else experience the same?

Yes!

--
Steven D'Aprano
Re: [Tutor] asigning class variables
On Sun, 14 Feb 2010 12:08:40 pm the kids wrote:
> Hi, my name is John Paul.
>
> I was trying to define a class but I can't make it asign particular
> objects particular variables at their creation. Please help.
>
> John Paul

I don't quite follow what you mean; it is best if you can show example code and describe what you expect and what you get instead.

But first a brief note on terminology. In some other languages, people talk about "class variables" being variables that are part of the class. To me, that's inconsistent with everything else: an int variable holds an int; a string variable holds a string; a list variable holds a list; so a class variable should hold a class.

(You might argue, why would you ever need a variable that was a class? As an experienced Python programmer, I can tell you such a thing is very, very useful, and far more common than you might think.)

Anyway, in Python circles, it is much more common to describe class variables as attributes. Other languages use the term "members", and people who used Apple's Hypercard will remember the term "properties" (but properties in Python are slightly different).

Python has two sorts of attributes (class variables, a.k.a. members):

* Attributes which are attached to the class itself, and therefore shared by all the instances. These are known as "class attributes".

* Much more common is the ordinary sort of attribute that is attached to instances; these are usually known as "instance attributes", or more often, just plain old attributes.

You create a class attribute (shared by all instances) by assigning to a name inside the class definition:

class K:
    shared = []

This is now shared between all instances.

You create an instance attribute (not shared) by assigning it inside a method, using the usual attribute syntax. The usual place to do this is inside the __init__ method. (Note that __init__ has TWO underscores at both the beginning and end.)

class K:
    shared = []
    def __init__(self):
        self.attribute = []

And in use:

>>> a = K()
>>> b = K()
>>> a.shared.append(23)
>>> b.shared
[23]
>>> a.attribute.append(23)
>>> a.attribute
[23]
>>> b.attribute
[]

Using class attributes is sometimes tricky, because Python makes it really easy to accidentally hide a class attribute with an instance attribute with the same name, but fortunately having shared data like this is quite rare, so it's unusual to be a problem.

Hope this helps,

--
Steven D'Aprano
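The "accidentally hide a class attribute" trap mentioned above is easier to see in a small example. This is a sketch assuming Python 3; the count attribute is made up for illustration and is not part of the original class.

```python
class K:
    shared = []   # class attribute: one list shared by every instance
    count = 0     # class attribute (hypothetical, for illustration)

    def __init__(self):
        self.attribute = []   # instance attribute: one list per instance

a = K()
b = K()

# Mutating through an instance changes the single shared list.
a.shared.append(23)

# Assigning through an instance does NOT change the class attribute.
# It creates a new instance attribute on a that hides K.count.
a.count = 5

print(b.shared)           # the append is visible through b
print(a.count, b.count)   # a sees its own value, b still sees the class's
print(vars(a))            # instance attributes stored on a itself
```

The rule of thumb is that mutation (append and friends) affects the shared object, while assignment through an instance always creates or rebinds an attribute on that instance.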
Re: [Tutor] Getting caller name without the help of "sys._getframe(1).f_code.co_name" ?
On Sun, 14 Feb 2010 10:33:09 pm patrice laporte wrote:
> I got a class that takes a file name in its __init__ method (it
> could be elsewhere, but why not here ?). Then, somewhere in that
> class, a method will do something with that file name, such as "try
> to open that file".
>
> If the file doesn't exist, bing ! I got an exception "I/O Error n°2
> : file doesn't exist".

Are you sure? What version of Python are you using? I get a completely different error message:

>>> open("no such file.txt", "r")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: [Errno 2] No such file or directory: 'no such file.txt'

Note carefully that the exception shows you the file name.

> That's nice, I of course catch this exception, but it's not enough
> for the user : What file are we talking about ? And how to tell the
> user what is that file, and make him understand he tell the app to
> use a file that doesn't exist ?

>>> try:
...     open("no such file.txt", "r")
... except IOError, e:
...     pass
...
>>> e.filename
'no such file.txt'

> And this is not enough for developer : where that error happened ?
> what class ? what method ?

All these things are displayed by the default traceback mechanism.

--
Steven D'Aprano
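The session above uses Python 2 syntax. In Python 3 the same idea might be sketched as follows: the exception still carries the file name for the user, and the traceback module can capture the "where did it happen" details for the developer. The function name describe_open_failure is made up, and the sketch assumes no file called "no such file.txt" exists in the current directory.

```python
import traceback

def describe_open_failure(path):
    """Try to open path; return (user_message, developer_details).

    Both values are None if the file opens without error.
    """
    try:
        with open(path, "r"):
            return None, None
    except OSError as e:  # IOError is an alias of OSError in Python 3
        # e.filename and e.strerror tell the user which file and why.
        user_message = "Cannot open %r: %s" % (e.filename, e.strerror)
        # format_exc() returns the same text the default traceback prints.
        developer_details = traceback.format_exc()
        return user_message, developer_details

msg, details = describe_open_failure("no such file.txt")
print(msg)
print(details)
```

The traceback text names the file, line and function where the error was raised, so there is usually no need to dig the caller's name out of sys._getframe by hand.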
Re: [Tutor] A Stuborn Tab Problem in IDLE
On Mon, 15 Feb 2010 09:19:35 am Wayne Watson wrote:
> When I use F5 to execute a py program in IDLE, Win7, I get a tab
> error on an indented else. I've selected all and untabifed with 4
> spaces several times, and get the same problem. I've tried re-typing
> the line with zero results. What next? I had been modifying the
> program repeatedly over several hours, and executing it without any
> trouble like this.

Can you copy and paste the exact error message displayed?

--
Steven D'Aprano
tutor@python.org
On Sun, 14 Feb 2010 08:16:18 pm Alan Gauld wrote:
> >> But glitches can occur from time to time...
> >
> > If Python had a glitch of the magnitude of right-shifting non-zero
> > bits into a number, that would be not just a bug but a HUGE bug.
>
> Bit shifting is machine specific.

Pardon me, but that's incorrect. Python is not assembly, or C, and the behaviour of bit shifting in Python is NOT machine specific. Python doesn't merely expose the native bit shift operations on native ints; a shift is a high-level object-oriented method with carefully defined semantics.

http://docs.python.org/library/stdtypes.html#bit-string-operations-on-integer-types

In Python, a left shift of n MUST return the equivalent of multiplication by 2**n, and a right shift MUST return the equivalent of integer division by 2**n. Any other result is a SERIOUS bug in Python of the same magnitude (and the same likelihood) as 10/2 returning 18.

So while I bow to your knowledge of bit operations in assembler on obscure four bit processors, Python does not do that. (I'm not even sure if Python runs on any four bit CPUs!) Python is a high-level language, not an assembler, and the behaviour of the bit operators >> and << is guaranteed to be the same no matter what CPU you are using.

(The only low-level ops that Python exposes are floating point ops: Python mostly does whatever the C library on your platform does.)

> > It is certainly good practice if you are dealing with numbers which
> > might be more than 24 bits to start with:
>
> Its more than good practice there, its essential.

Hardly. There are other ways of truncating a number to 8 bits, e.g. by using n % 256. If you're dealing with signed numbers, using & 255 will throw away the sign bit, which may be undesirable. And of course, it isn't desirable (let alone essential) to truncate the number if you don't need an 8 bit number in the first place!

[and discussing the case where you know your input is already 8 bits]
> In the case in point the & 255 keeps the coding style consistent
> and provides an extra measure of protection against unexpected
> oddities so I would keep it in there.

So you add unnecessary operations to be consistent? That's terrible practice.

So if you have an operation like this:

n = 12*i**3 + 7

and later on, you then want n = i+1, do you write:

n = 1*i**1 + 1

instead to be "consistent"? I would hope not!

> > cycles and needlessly complicating the code. The right way to guard
> > against "this will never happen" scenarios is with assert:
> >
> > assert n.bit_length() <= 32 # or "assert 0 <= n < 2**32"
>
> I would accept the second condition but the mask is much faster.

Premature (micro) optimization is the root of all evil. An assert that can be turned off and not executed is infinitely faster than a bit mask which is always executed whether you want it or not.

And either way, the 20 seconds I lose trying to interpret the bit ops when I read the code is far more important than the 0.01 seconds I lose executing the assert :)

> bit_length doesn't seem to work on any of my Pythons (2.5,2.6 and
> 3.1)

It won't work in 2.5 or 2.6. You're probably trying this:

123.bit_length()

and getting a syntax error. That's because the Python parser sees the . and interprets it as a float, and 123.bit_length is not a valid decimal float. You need to either group the int, or refer to it by name:

(123).bit_length()

n = 123
n.bit_length()

--
Steven D'Aprano
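The arithmetic definition of the shift operators is easy to check directly. The sketch below, assuming Python 3, compares shifts against multiplication and floor division by 2**k for a few sample values (chosen arbitrarily, including a negative one), and compares the two ways of truncating to 8 bits mentioned above.

```python
def shift_matches_arithmetic(n, k):
    """True if shifting n by k bits agrees with * and // by 2**k."""
    return (n << k == n * 2**k) and (n >> k == n // 2**k)

# Arbitrary sample values, including a negative int and a large one.
samples = [0, 1, 8, 12345, -7, 5**25]

all_match = all(
    shift_matches_arithmetic(n, k)
    for n in samples
    for k in (0, 1, 3, 24)
)

# Two ways of keeping only the low 8 bits.
low_byte_mask = [n & 255 for n in samples]
low_byte_mod = [n % 256 for n in samples]

print(all_match)
print(low_byte_mask)
print(low_byte_mod)
```

For Python ints the mask and the modulo give the same non-negative result even for negative inputs, so either way the sign of the original number is lost after truncation.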
Re: [Tutor] command error
On Wed, 17 Feb 2010 07:32:44 am Shurui Liu (Aaron Liu) wrote:
> Here is a program I wrote, I don't know why I cannot exit when I
> tried 10 times? Hope somebody can help me. Thank you!

Here is a program that loops ten times, then stops:

for i in range(10):
    print "Loop number", i

Here is another program that loops ten times counting down backwards:

for i in range(10, 0, -1): # 10, 9, 8, ... , 1
    print "Loop number", i

Lastly, here is a program that loops ten times backwards, but exits early when a test becomes true. As a bonus, it prints a message if the test never becomes true:

for i in range(10, 0, -1): # 10, 9, 8, ... , 1
    if i <= 4: # the test
        print "we exit the loop early"
        break
    print "Loop number", i
else: # for...else
    print "we didn't exit the loop early"

--
Steven D'Aprano
Re: [Tutor] pure symbol -- __subtype__
On Thu, 18 Feb 2010 08:13:33 pm spir wrote:
> Hello,
>
> I was lately implementing a kind of "pure symbol" type. What I call
> pure symbols is these kinds of constants that refer to pure "idea",
> so that they have no real obvious value. We usually _arbitrarily_
> give them as value an int, a string, a boolean, an empty object:

If you are interested in this, there are various modules on PyPI for them, such as:

http://pypi.python.org/pypi/enum/

which probably does everything you want.

Unfortunately any enum solution is going to be rather heavyweight compared to (say) Pascal, which can simply give each enum an integer value and have the compiler enforce separation of types. (In other words, all the heavy complexity is in the compiler rather than the runtime environment.)

If you want a lightweight solution, just use constant strings or integers.

> Actually, what I need is a kind of __subtype__ magic method that acts
> for subtyping the same way __init__ does for instanciation.

That's what metaclasses are for.

--
Steven D'Aprano
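Python 3.4 and later ship an enum module in the standard library. It is not the PyPI package linked above, though it serves a similar purpose, so the following is only a sketch of what "pure symbols" might look like with it. The Colour and Direction classes are made-up examples.

```python
from enum import Enum

class Colour(Enum):
    RED = 1
    GREEN = 2

class Direction(Enum):
    NORTH = 1
    SOUTH = 2

# Members are singletons: looking one up by value returns the same object.
same_object = Colour(1) is Colour.RED

# Members of different enums never compare equal, even with equal values,
# and a member does not compare equal to its underlying int either.
cross_equal = Colour.RED == Direction.NORTH
equals_int = Colour.RED == 1

print(same_object, cross_equal, equals_int)
print(Colour.RED.name, Colour.RED.value)
```

The integer values are still arbitrary, but they stay hidden behind the names, and symbols from different enums cannot be confused with one another.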
Re: [Tutor] Reading large bz2 Files
On Fri, 19 Feb 2010 11:42:07 pm Norman Rieß wrote:
> Hello,
>
> i am trying to read a large bz2 file with this code:
>
> source_file = bz2.BZ2File(file, "r")
> for line in source_file:
>     print line.strip()
>
> But after 4311 lines, it stoppes without a errormessage. The bz2 file
> is much bigger though.
>
> How can i read the whole file line by line?

"for line in file" works for me:

>>> import bz2
>>>
>>> writer = bz2.BZ2File('file.bz2', 'w')
>>> for i in xrange(2):
...     # write some variable text to a line
...     writer.write('abc'*(i % 5) + '\n')
...
>>> writer.close()
>>> reader = bz2.BZ2File('file.bz2', 'r')
>>> i = 0
>>> for line in reader:
...     i += 1
...
>>> reader.close()
>>> i
2

My guess is one of two things:

(1) You are mistaken that the file is bigger than 4311 lines.

(2) You are using Windows, and somehow there is a Ctrl-Z (0x1A) character in the file, which Windows interprets as End Of File when reading files in text mode. Try changing the mode to "rb" and see if the behaviour goes away.

--
Steven D'Aprano
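For comparison, here is roughly the same experiment as a self-contained Python 3 sketch, reading in binary mode as suggested above. It uses bz2.open (available from Python 3.3) and a throwaway temporary file; the file name, the helper count_lines and the line contents are made up. One line deliberately contains a Ctrl-Z byte to show that binary mode passes it through.

```python
import bz2
import os
import tempfile

def count_lines(path):
    """Count the lines in a bz2-compressed file, reading in binary mode."""
    with bz2.open(path, "rb") as reader:
        return sum(1 for _ in reader)

# Build a small throwaway file so the sketch is self-contained.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "sample.bz2")
with bz2.open(path, "wb") as writer:
    for i in range(5000):
        # write some variable text to a line
        writer.write(b"abc" * (i % 5) + b"\n")
    # One extra line containing a Ctrl-Z (0x1A) byte.
    writer.write(b"ctrl-z here: \x1a\n")

n = count_lines(path)
print(n)
```

If the count for a real file still falls short of what is expected, comparing it against the output of a command-line decompressor would help tell a short file apart from a reading problem.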
Re: [Tutor] Is it possible for a Python Program to send commands to the Python Interpreter?
On Sat, 20 Feb 2010 06:33:04 am Kent Johnson wrote:
> It sounds like you are looking for eval()
>
> (Standard warning - use eval() only on trusted data)

This is the tutor list, aimed at beginners to Python, many of whom are also beginners to programming as well. Even experienced programmers often get security very, very, very badly wrong. Do you think that a glib eight-word one-line "standard warning" really is sufficient?

James, if you are still reading, I should expand on Kent's warning.

The eval function, like exec, can be very dangerous in the wrong hands. There is a whole class of very, very common security bugs caused by functions like eval:

http://en.wikipedia.org/wiki/Code_injection

The way the bug works is that the programmer writes a function that takes some data, and directly or indirectly applies eval to it:

>>> def mylist(string):
...     # Convert a string to a list.
...     string = string.strip()
...     if string.startswith('[') and string.endswith(']'):
...         return eval(string)
...     else:
...         raise ValueError('not a list')
...
>>> mylist(" [1, 2, 3] ")
[1, 2, 3]

This seems pretty innocuous, but it contains a deadly land-mine.

This function then gets used in an application that uses strings produced by untrusted users. Say, it ends up in a web application, and the user types text into a field and the application ends up calling mylist on the contents of that field.

Then, some malicious user types this into the input form:

"[] or __import__('os').system('echo YOUR SYSTEM IS MINE') or [1,2,3]"

(only imagine something much worse than an echo command) and your web application does this:

>>> s = "[] or __import__('os').system('echo YOUR SYSTEM IS MINE') or [1,2,3]"
>>> mylist(s)
YOUR SYSTEM IS MINE
[1, 2, 3]

You have just had your web server compromised by a code injection bug.

(A web application is only one example of how you can get untrusted data into your app. It is the biggest risk though.)

Now, you might think that you can work around this by clever programming. Well, maybe, but sanitising strings so they are safe is a VERY difficult job. And naturally if you aren't aware they need to be sanitised, you won't do it.

My advice to anyone thinking they need to use eval or exec is:

(1) Don't do it.

(2) If you think you need to use them, you probably don't.

(3) If you really, really need to use them, then use the most restrictive environment possible. Instead of eval(s), use:

eval(s, {'__builtins__': None}, {})

which gives you some protection against naive attackers. The really clever ones will still break it.

(4) Sanitise your data. Don't use Javascript to sanitise it at the browser, because the Bad Guys know how to bypass your Javascript checks. If you're expecting (say) a list of integers, then there is no reason for the list to contain *any* alphabetic characters or underscores, and if there are any, reject the string and report an error:

def sanitise(string):
    safe = "1234567890[], \n\t"
    for c in string:
        if c not in safe:
            raise ValueError('unsafe string')

If your needs are more complicated, then sanitising the string becomes exponentially more difficult. It will probably be less work to write your own safe parser than to sanitise the input.

Have I scared you about using eval? If so, good. Don't let eval or exec anywhere near data provided by untrusted users, and don't confuse authentication with trust.

--
Steven D'Aprano
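For the "list of integers" case, writing a tiny parser of one's own, as suggested above, is not much work. The following is a minimal sketch assuming Python 3; parse_int_list is a made-up name, and the function handles only a flat list of integers, nothing nested.

```python
def parse_int_list(string):
    """Parse text like "[1, 2, 3]" into a list of ints without eval."""
    string = string.strip()
    if not (string.startswith('[') and string.endswith(']')):
        raise ValueError('not a list')
    body = string[1:-1].strip()
    if not body:
        return []
    # int() raises ValueError for anything that is not an integer,
    # so no part of the input is ever executed as code.
    return [int(item) for item in body.split(',')]

print(parse_int_list(" [1, 2, 3] "))

# The malicious string from the example above is rejected, not executed.
attack = "[] or __import__('os').system('echo YOUR SYSTEM IS MINE') or [1,2,3]"
try:
    parse_int_list(attack)
except ValueError as err:
    print("rejected:", err)
```

Because the only conversion applied to the input is int(), the worst a hostile string can do is raise ValueError.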
Re: [Tutor] Superclass call problem
On Sun, 21 Feb 2010 04:50:49 am Alan Harris-Reid wrote:
> Hi,
>
> I am having trouble understanding how superclass calls work. Here's
> some code...

What version of Python are you using?

In Python 2.x, you MUST inherit from object to use super, and you MUST explicitly pass the class and self:

class ParentClass(object):
    def __init__(self, a, b, c):
        pass # do something here

class ChildClass(ParentClass):
    def __init__(self, a, b, c):
        super(ChildClass, self).__init__(a, b, c)

In Python 3.x, all classes inherit from object and you no longer need to explicitly say so, and super becomes a bit smarter about where it is called from:

# Python 3 only
class ParentClass:
    def __init__(self, a, b, c):
        pass # do something here

class ChildClass(ParentClass):
    def __init__(self, a, b, c):
        super().__init__(a, b, c)

I assume you are using Python 3.0 or 3.1. (If you're 3.0, you should upgrade to 3.1: 3.0 is s-l-o-w and no longer supported.)

Your mistake was to pass self as an explicit argument to __init__. This is not needed, because Python methods automatically get passed self:

> def __init__(self):
>     super().__init__(self)

That has the effect of passing self *twice*, when __init__ expects to get self *once*, hence the error message you see:

> When the super().__init__ line runs I get the error "__init__() takes
> exactly 1 positional argument (2 given)"

--
Steven D'Aprano
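A runnable Python 3 sketch of both the correct call and the mistake may make the difference concrete. The total and doubled attributes and the BrokenChild class are made up for illustration.

```python
class ParentClass:
    def __init__(self, a, b, c):
        self.total = a + b + c

class ChildClass(ParentClass):
    def __init__(self, a, b, c):
        # self is passed automatically; only the other arguments are listed.
        super().__init__(a, b, c)
        self.doubled = self.total * 2

class BrokenChild(ParentClass):
    def __init__(self, a, b, c):
        # Wrong: self is passed twice, so the parent receives four
        # positional arguments where it expects three.
        super().__init__(self, a, b, c)

child = ChildClass(1, 2, 3)
print(child.total, child.doubled)

try:
    BrokenChild(1, 2, 3)
except TypeError as err:
    print("TypeError:", err)
```

The exact wording of the TypeError message varies between Python versions, but in each case it complains about one positional argument too many.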
Re: [Tutor] Replacing part of a URL
On Sun, 21 Feb 2010 09:34:34 am Lao Mao wrote:
> Hello,
>
> I need to be able to replace the last bit of a bunch of URLs.
>
> The urls look like this:
>
> www.somesite.com/some/path/to/something.html
>
> They may be of varying lengths, but they'll always end with
> .something_or_other.html
>
> I want to take the "something" and replace it with something else.
>
> My current plan is to simply do a string.split("/")[-1]
>
> and then another .split('.') to result in ['something', 'html'], and
> then replace sometihing, and join them together again.
>
> But - wouldn't it make more sense to do this with re.sub?

Heavens no! Why do you need an 80 pound sledgehammer to crack a peanut???

"Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems." -- Jamie Zawinski

--
Steven D'Aprano
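The split-and-join plan described in the question can be sketched in a few lines, no regular expressions required. This assumes Python 3 and URLs shaped like the example; the function name replace_last_name is made up.

```python
def replace_last_name(url, new_name):
    """Replace 'something' in '.../something.html' with new_name."""
    parts = url.split('/')
    # Split the final component, e.g. 'something.html', on the dots.
    name_parts = parts[-1].split('.')
    name_parts[0] = new_name
    parts[-1] = '.'.join(name_parts)
    return '/'.join(parts)

url = "www.somesite.com/some/path/to/something.html"
print(replace_last_name(url, "other"))
```

Only the final path component is touched, so dots elsewhere in the URL (such as those in the host name) are left alone.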
Re: [Tutor] os.path.basename() issue with path slashes
On Sun, 21 Feb 2010 10:49:19 am Dayo Adewunmi wrote:
> Hi all,
>
> This script I'm working on, should take all the image files in the
> current directory and generate an HTML thumbnails.
>
> import os
> import urllib

You import urllib, but don't appear to use it anywhere.

>
> # Generate thumbnail gallery
> def genThumbs():
>     # Get current directory name
>     absolutePath = os.getcwd()
>     urlprefix = "http://kili.org/~dayo";
>     currentdir = os.path.basename(absolutePath)
>     for dirname, subdirname, filenames in os.walk(absolutePath):
>         for filename in filenames:
>             print ""
> %(currentdir,filename,currentdir,filename)

You don't need to escape quotes, just use the other quote. Instead of:

print ""

use:

print ''

Also, I'm not sure why sometimes you use / as a directory separator and sometimes \. I think you're trying to do too much in one line, leading to repeated code.

# Untested.
for dirname, subdirname, filenames in os.walk(absolutePath):
    for filename in filenames:
        fullpath = os.path.join(currentdir, filename)
        if os.name == 'nt':
            # str.replace returns a new string, so keep the result.
            fullpath = fullpath.replace('\\', '/')
        print '' % (fullpath, fullpath)

Hope that helps.

--
Steven D'Aprano
Re: [Tutor] Functions returning multiple values
On Mon, 22 Feb 2010 03:00:32 am Giorgio wrote: > Hi, > > do you know if there is a way so that i can get multiple values from > a function? > > For example: > > def count(a,b): > c = a + b > d = a - b > > How can I return the value of C and D? Return a tuple of c and d: >>> def count(a, b): ... c = a + b ... d = a - b ... return c, d ... >>> t = count(15, 11) >>> t (26, 4) You can also unpack the tuple immediately: >>> x, y = count(15, 11) >>> x 26 >>> y 4 > Then, i have another question: i've read, some time ago, this guide > http://hetland.org/writing/instant-python.html, skipping the > object-related part. Now i've started reading it, and have found > something strange: just go where it says "Of course, now you know > there is a better way. And why don’t we give it the default value of > [] in the first place? Because of the way Python works, this would > give all the Baskets the same empty list as default contents.". Can > you please help me understanding this part? When you declare a default value in a function like this: def f(a, b, c=SOMETHING): the expression SOMETHING is calculated once, when the function is defined, and *not* each time you call the function. So if I do this: x = 1 y = 2 def f(a, b, c=x+y): return a+b+c the default value for c is calculated once, and stored inside the function: >>> f(0, 0) 3 Even if I change x or y: >>> x = >>> f(0, 0) 3 So if I use a list as a default value (or a dict), the default is calculated once and stored in the function. You can see it by looking at the function's defaults: >>> def f(alist=[]): ... alist.append(1) ... return alist >>> >>> f.func_defaults[0] [] Now, call the function without an argument: >>> f() [1] >>> f() [1, 1] >>> f() [1, 1, 1] >>> f.func_defaults[0] [1, 1, 1] How is this happening? Because every time you call the function, it appends 1 to the argument. If you don't supply an argument, it appends 1 to the default, changing it in place. Why doesn't the same thing happen here? >>> def g(x=0): ... 
x += 1 ... return x ... >>> g.func_defaults[0] 0 >>> g() 1 >>> g() 1 >>> g.func_defaults[0] 0 The answer is that ints are immutable: you can't change their value. When you do x+=1, it doesn't modify the int 0 in place, turning it into 1. It leaves 0 as zero, and gives you a new int equal to one. So the default value stored in the function never changes. The difference boils down to immutable objects, which can't be changed in place, and mutable objects, which can. Immutable: ints, floats, strings, tuples, frozensets Mutable: lists, dicts, sets, most custom classes -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
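The difference between a mutable and an immutable default, as described in the reply above, can be sketched like this. It is only an illustration: the function names append_shared and append_fresh are made up for this example, and it is written for Python 3, where func_defaults is spelled __defaults__.

```python
def append_shared(item, bucket=[]):
    # The default list is built once, when the def statement runs,
    # and the same list object is reused on every call that omits bucket.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # None is immutable, so it is a safe default; a new list is
    # created inside the function each time bucket is omitted.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_shared(1)
second = append_shared(2)
print(first is second)               # both names refer to the stored default
print(append_shared.__defaults__)    # the default has been changed in place
print(append_fresh(1), append_fresh(2))
```

Using None as a sentinel is the usual way to get a fresh list per call when that is what you actually want.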
Re: [Tutor] fast sampling with replacement
On Sun, 21 Feb 2010 03:22:19 am Andrew Fithian wrote: > Hi tutor, > > I'm have a statistical bootstrapping script that is bottlenecking on > a python function sample_with_replacement(). I wrote this function > myself because I couldn't find a similar function in python's random > library. random.choice(list) does sample with replacement. If you need more than one sample, call it in a loop or a list comprehension: >>> mylist = [1, 2, 4, 8, 16, 32, 64, 128] >>> choice = random.choice >>> values = [choice(mylist) for _ in xrange(4)] >>> values [64, 32, 4, 4] > This is the fastest version of the function I could come up > with (I used cProfile.run() to time every version I wrote) but it's > not fast enough, can you help me speed it up even more? Profiling doesn't tell you how *fast* something is, but how much time it is using. If something takes a long time to run, perhaps that's because you are calling it lots of times. If that's the case, you should try to find a way to call it less often. E.g. instead of taking a million samples, can you get by with only a thousand? Do you need to prepare all the samples ahead of time, or do them only when needed? > import random > def sample_with_replacement(list): > l = len(list) # the sample needs to be as long as list > r = xrange(l) > _random = random.random > return [list[int(_random()*l)] for i in r] # using > list[int(_random()*l)] is faster than random.choice(list) Well, maybe... it's a near thing. I won't bother showing my timing code (hint: use the timeit module) unless you ask, but in my tests, I don't get a clear consistent winner between random.choice and your version. Your version is a little bit faster about 2 out of 3 trials, but slower 1 out of 3. 
This isn't surprising if you look at the random.choice function: def choice(self, seq): return seq[int(self.random() * len(seq))] It is essentially identical to your version, the major difference being you use the same algorithm directly in a list comprehension instead of calling it as a function. So I would expect any difference to be small. I would say that you aren't going to speed up the sampling algorithm in pure Python. This leaves you with some options: (1) Live with the speed as it is. Are you sure it's too slow? (2) Re-write the random number generator in C. (3) Find a way to use fewer samples. (4) Use a lower-quality random number generator which is faster. > FWIW, my bootstrapping script is spending roughly half of the run > time in sample_with_replacement() much more than any other function > or method. Thanks in advance for any advice you can give me. You don't tell us whether that actually matters. Is it taking 3 hours in a 6 hour run time? If so, then it is worth spending time and effort to cut that down by a few hours. Or is it taking 0.2 seconds out of 0.4 seconds? If that's the case, then who cares? :) -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
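For comparison, here is a rough sketch of the two approaches discussed in the reply above, written for Python 3. Note the assumption: random.choices() did not exist when this thread was written (it was added in Python 3.6), so it is shown only as a later alternative, not as something the original poster could have used. The function names are invented for the example.

```python
import random


def sample_by_index(population, rng=random):
    # The poster's algorithm: pick a random index for every slot.
    n = len(population)
    return [population[int(rng.random() * n)] for _ in range(n)]


def sample_by_choices(population, rng=random):
    # Python 3.6+ only: choices() samples with replacement directly.
    return rng.choices(population, k=len(population))


data = [1, 2, 4, 8, 16, 32, 64, 128]
rng = random.Random(12345)          # seeded generator, for repeatable runs
resample_a = sample_by_index(data, rng)
resample_b = sample_by_choices(data, rng)
print(resample_a)
print(resample_b)
```

To compare speeds on your own machine, something like timeit.timeit(lambda: sample_by_index(data), number=10000) is a reasonable starting point; the result will depend on the Python version and the size of the data.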
Re: [Tutor] Regex to find files ending with one of a given set of extensions
On Mon, 22 Feb 2010 04:23:04 am Dayo Adewunmi wrote:
> Hi all
>
> I'm trying use regex to match image formats:

Perhaps you should use a simpler way.

import os

def isimagefile(filename):
    # Compare the lower-cased extension against the known image types.
    ext = os.path.splitext(filename)[1]
    return (ext.lower() in
            ('.jpg', '.jpeg', '.gif', '.png', '.tif', '.tiff'))

def findImageFiles():
    someFiles = [
        "sdfinsf.png", "dsiasd.dgf", "wecn.GIF", "iewijiefi.jPg",
        "iasjasd.py"]
    return filter(isimagefile, someFiles)

> $ python test.py
> Traceback (most recent call last):
>   File "test.py", line 25, in <module>
>     main()
>   File "test.py", line 21, in main
>     findImageFiles()
>   File "test.py", line 14, in findImageFiles
>     findImages = imageRx(someFiles)
> TypeError: '_sre.SRE_Pattern' object is not callable

The error is the line

findImages = imageRx(someFiles)

You don't call regexes, you have to use the match or search methods. And you can't call it on a list of file names, you have to call it on each file name separately.

# untested
for filename in someFiles:
    mo = imageRx.search(filename)
    if mo is None:
        # no match
        pass
    else:
        print filename

-- 
Steven D'Aprano
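Both approaches from the reply above can be put side by side in a small Python 3 sketch. The original imageRx pattern was not shown in the thread, so the regular expression below is an assumption made up for illustration, as are the function names.

```python
import os
import re

IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.gif', '.png', '.tif', '.tiff')

# Hypothetical pattern: a dot, one of the extensions, then end of string.
image_rx = re.compile(r'\.(jpe?g|gif|png|tiff?)$', re.IGNORECASE)


def is_image_by_ext(filename):
    # splitext returns (root, ext); ext includes the leading dot.
    return os.path.splitext(filename)[1].lower() in IMAGE_EXTENSIONS


def is_image_by_regex(filename):
    # search() is called on one file name at a time, never on the list.
    return image_rx.search(filename) is not None


some_files = ["sdfinsf.png", "dsiasd.dgf", "wecn.GIF",
              "iewijiefi.jPg", "iasjasd.py"]
by_ext = [name for name in some_files if is_image_by_ext(name)]
by_regex = [name for name in some_files if is_image_by_regex(name)]
print(by_ext)
print(by_regex)
```

The splitext version is easier to extend: adding a format means adding one string to the tuple rather than editing a pattern.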
Re: [Tutor] Python file escaping issue?
On Mon, 22 Feb 2010 05:22:10 am Sithembewena Lloyd Dube wrote:
> Hi all,
>
> I'm trying to read a file (Python 2.5.2, Windows XP) as follows:
>
> assignment_file = open('C:\Documents and Settings\coderoid\My
> Documents\Downloads\code_sample.txt', 'r+').readlines()
> new_file = open(new_file.txt, 'w+')
> for line in assignment_file:
>     new_file.write(line)
>
> new_file.close()
> assignment_file.close()
>
> When the code runs, the file path has the slashes converted to double
> slashes. When try to escape them, i just seemto add more slashes.
> What am i missing?

An understanding of how backslash escapes work in Python.

Backslashes in string literals (but not in text you read from a file, say) are used to inject special characters into the string, just like C and other languages do. These backslash escapes include:

\t   tab
\n   newline
\f   formfeed
\\   backslash

and many others. Any other non-special backslash is left alone.

So when you write a string literal including backslashes and a special character, you get this:

>>> s = 'abc\tz'  # tab
>>> print s
abc     z
>>> print repr(s)
'abc\tz'
>>> len(s)
5

But if the escape is not a special character:

>>> s = 'abc\dz'  # nothing special
>>> print s
abc\dz
>>> print repr(s)
'abc\\dz'
>>> len(s)
6

The double backslash is part of the *display* of the string, like the quotation marks, and not part of the string itself. The string itself only has a single backslash and no quote marks.

So if you write a pathname like this:

>>> path = 'C:\datafile.txt'
>>> print path
C:\datafile.txt
>>> len(path)
15

It *seems* to work, because \d is left as backslash-d. But then you do this, and wonder why you can't open the file:

>>> path = 'C:\textfile.txt'
>>> print path
C:      extfile.txt
>>> len(path)
14

Some people recommend using raw strings.
Raw strings turn off backslash processing, so you can do this:

>>> path = r'C:\textfile.txt'
>>> print path
C:\textfile.txt

But raw strings were invented for the regular expression module, not for Windows pathnames, and they have a major limitation: you can't end a raw string with a backslash.

>>> path = r'C:\directory\'
  File "<stdin>", line 1
    path = r'C:\directory\'
                          ^
SyntaxError: EOL while scanning single-quoted string

The best advice is to remember that Windows allows both forward slashes and backslashes as the path separator, and just write all your paths using the forward slash:

'C:/directory/'
'C:/textfile.txt'

-- 
Steven D'Aprano
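The four ways of spelling the same Windows path discussed in this reply can be compared directly. This is a small Python 3 sketch; the variable names are made up, and ntpath is used (rather than os.path) only so that the Windows rules apply no matter which platform runs the example.

```python
import ntpath  # the Windows flavour of os.path, importable on any platform

plain = 'C:\textfile.txt'       # \t is read as a single tab character
raw = r'C:\textfile.txt'        # raw string: backslash and t are kept
escaped = 'C:\\textfile.txt'    # doubled backslash means one backslash
forward = 'C:/textfile.txt'     # forward slash needs no escaping at all

print(len(plain), len(raw), len(escaped), len(forward))
print(repr(plain))
print(raw == escaped)
# normpath converts forward slashes to the native Windows separator.
print(ntpath.normpath(forward) == raw)
```

Only the first spelling is wrong; the other three all name the same file.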
Re: [Tutor] Drawing faces
On Mon, 22 Feb 2010 12:49:51 pm Olufemi Onanuga wrote: > Hello, > I am trying to write and test a function to meet the following > specifications > drawFace(center,size,win),center is a point,size is an int,and win is > a GraphWin.Draws a simple face of the given size in win. > I want the function to be able to draw three differnet faces in a > single window,when i invoke drawFace(center,size,win) into def > main(). > thanks > kola Do you actually have a question? Please show the code you have already written, and tell us what doesn't work. -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Python file escaping issue?
On Mon, 22 Feb 2010 06:37:21 pm spir wrote: > > It *seems* to work, because \d is left as backlash-d. But then you > > do this, and wonder why you can't open the file: > > I consider this misleading, since it can only confuse newcomers. > Maybe "lonely" single backslashes (not forming a "code" with > following character(s)) should be invalid. Meaning literal > backslashes would always be doubled (in plain, non-raw, strings). > What do you think? Certainly it can lead to confusion for beginners, but it follows the convention of languages like bash and (I think) C++. There are three main ways to deal with an unrecognised escape sequence: * raise an error * ignore the backslash, e.g. \d -> d * keep the backslash, e.g. \d -> \d There are good arguments for all three, so I don't think you'll get consensus for any change. -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Encryption
On Tue, 23 Feb 2010 07:50:12 am Antonio Buzzelli wrote:
> Hi everyone!
> I need a simple way to encrypt and decrypt some strings with a key
>
> someone can help me??
>
> Thanks.

I am the author of this package which might be enough for you:

http://pypi.python.org/pypi/obfuscate/

If all you want is to "hide" some text from casual snoopers (say, to hide strings in a game so that people can't just open the game in a hex editor and read the game messages) then obfuscate may be enough.

I can't emphasise this enough: the encryption algorithms in obfuscate are not up to modern standards and are NOT cryptographically secure. Do not use this where serious security is required.

Otherwise, google for "python encryption". You might also like to ask on the Python newsgroup comp.lang.python for advice.

-- 
Steven D'Aprano
Re: [Tutor] Python 3 Statistics?
On Wed, 24 Feb 2010 07:43:28 am James Reynolds wrote: > For me to progress further though, I need to do some statistics work > in Python. Does anyone know of a python module in Python 3.1 which > will allow me to do statistics work? Otherwise I will have to go back > to Python 2.5? to get numpy and scipy? If you need to go back to the 2.x series, you should use 2.6 not 2.5. The Windows installer for numpy only supports 2.5, but the source install should work with any recent Python 2.x version. -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] ask
On Wed, 24 Feb 2010 02:58:52 pm Shurui Liu (Aaron Liu) wrote: > This time is not my assignment, I promise. > > In python, when we want to list numbers, we use the command "range", > like, if we want to list integer from 0 to 9, we can write: > range(10); if we want to list integer from 10 to 29, we can write: > range(10,30). I was going to show a list of number from 1.0 to 1.9, > and I did this in the same way as integer: range(1.0,2.0,0.1), but it > doesn't work. Can you help me? Thank you! Hope this helps: http://code.activestate.com/recipes/577068-floating-point-range/ -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
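For readers who cannot follow the link, here is one possible way to get a floating point range in Python 3. This is NOT the linked recipe, just a minimal sketch under stated assumptions: the name frange is invented here, only positive steps are handled, and each value is computed as start + count*step so that rounding errors do not build up from one step to the next.

```python
def frange(start, stop, step):
    """Yield start, start+step, ... while the value is below stop."""
    if step <= 0:
        raise ValueError("this sketch only handles positive steps")
    count = 0
    while True:
        # Multiply rather than add repeatedly, to limit rounding drift.
        value = start + count * step
        if value >= stop:
            break
        yield value
        count += 1


values = list(frange(1.0, 2.0, 0.1))
print(values)

# When the step is a simple decimal fraction, scaling integers is simpler:
tenths = [n / 10 for n in range(10, 20)]
print(tenths)
```

Floats are binary fractions, so some of the printed values will not look like exact decimals; that is expected, and it is also why range() itself refuses float arguments.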
Re: [Tutor] strange bidi requirement
On Thu, 25 Feb 2010 03:34:02 am rick wrote: > I'm trying to write a math quiz program that replicates an old book > on arithmetic. > > when it comes to summing a long column, I need to read the answer > from the user, one digit at a time. > > so, if the answer should be something like > > 14238.83 > > I would need to read the .03, then the .8, and so on. I figure all > strings, cast to int (well, for this example, float). Would this be > easier if I learned some GUI programming? Hell no! GUI programming is a whole new lot of stuff to learn! It's worth learning if you want a GUI interface, but not because it makes other things easier. > Can it be done at all in just console? Do you actually need to ask the user for one digit at a time? I don't imagine so, but could be wrong. So you can ask the user for the number at the console, and then process it in reverse: >>> s = raw_input("Please enter a decimal number: ") Please enter a decimal number: 123.456 >>> >>> print s 123.456 >>> >>> for c in reversed(s): ... print c ... 6 5 4 . 3 2 1 Hope this helps. -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] raising number to a power
On Fri, 26 Feb 2010 05:34:39 am Ricardo Aráoz wrote: > So why would the coders of the math module go to the trouble of > creating the pow function? http://docs.python.org/library/math.html#math.pow http://docs.python.org/library/functions.html#pow The math module is mostly a thin wrapper around the native C maths library. The builtin pow function has more capabilities, and came before the ** operator. > Did they create a sum function As a matter of fact they did: http://docs.python.org/library/math.html#math.fsum -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] raising number to a power
On Fri, 26 Feb 2010 04:27:06 am Monte Milanuk wrote: > So... pow(4,4) is equivalent to 4**4, which works on anything - > integers, floats, etc., but math.pow(4,4) only works on floats... and > in this case it converts or interprets (4,4) as (4.0,4.0), hence > returning a float: 256.0. Is that about right? Pretty much, but the builtin pow also takes an optional third argument: >>> pow(4, 4, 65) # same as 4**4 % 65 only more efficient 61 By the way, are you aware that you can get help in the interactive interpreter? help(pow) -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
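The three spellings compared in this reply can be checked side by side. A short Python 3 sketch, with variable names chosen just for the example:

```python
import math

builtin_result = pow(4, 4)       # same as 4**4, stays an int
operator_result = 4 ** 4
math_result = math.pow(4, 4)     # arguments are converted to float
modular_result = pow(4, 4, 65)   # (4**4) % 65 without the big intermediate

print(builtin_result, operator_result, math_result, modular_result)
print(type(builtin_result).__name__, type(math_result).__name__)
```

The three-argument form matters most when the exponent is large, because the full power is never built in memory.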
Re: [Tutor] Verifying My Troublesome Linkage Claim between Python and Win7
On Sun, 28 Feb 2010 05:30:49 am Wayne Watson wrote: > Ok, I'm back after a three day trip. You are correct about the use of > pronouns and a few misplaced words. I should have reread what I > wrote. I had described this in better detail elsewhere, and followed > that description with the request here probably thinking back to it. > I think I was getting a bit weary of trying to find an answer. Try > t;his. > > > Folder1 > track1.py >data1.txt >data2.txt >data3.txt > > Folder2 > track1.py > dset1.txt > dset2.txt > ... > dset8.txt > > data and dset files have the same record formats. track1.py was > copied into Folder2 with ctrl-c + ctrl-v. When I run track1.py from > folder1, it clearly has examined the data.txt files. If I run the > copy of track1.py in folder2, it clearly operates on folder1 (one) > data.txt files. This should not be. Without seeing the code in track1.py, we cannot judge whether it should be or not. I can think of lots of reasons why it should be. For example: if you have hard-coded the path to Folder1 if you call os.chdir(Folder1) if you have changed the PATH so that Folder1 gets searched before the current directory then the behaviour you describe theoretically could happen. How are you calling track1.py? Do you do this? cd Folder2 python track1.py What if you change the second line to: python ./track1.py Are you perhaps using this? python -m track1 If you change the name of the copy from track1.py to copy_of_track1.py, and then call this: python copy_of_track1.py how does the behaviour change? > If I look at the properties of track1.py in folder2 (two), it is > pointing back to the program in folder1 (one). What? "Pointing back", as in a Shortcut? Or a symlink? If you've created a shortcut instead of a copy, I'm not surprised you are executing it in the "wrong" folder. That's what shortcuts do. -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Verifying My Troublesome Linkage Claim between Python and Win7
Hello Wayne, I sympathise with your problem, but please understand you are not making it easy for us when you give us incoherent information. You tell us to "try this" and give a folder structure: Folder1 track1.py data1.txt data2.txt data3.txt Folder2 track1.py dset1.txt dset2.txt ... dset8.txt but then when you send a copy of the actual code you are running, it is called "ReportingToolAwww.py" and it is 417 lines long. What happened to track1.py? What is in that? Does track1.py reproduce the fault? There are five possible faults: 1 A problem in your Python code. 2 A serious bug in Python itself. 3 A serious bug in Windows file system. 4 Disk corruption making Windows confused. 5 A PEBCAK problem. I can confirm that ReportingToolAwww.py doesn't seem to contain any "funny" path manipulations that would cause the problem: it simply calls open on relative path names, which will open files in the current directory. The problem does NOT appear to be in your Python code. A serious bug in either Python or Windows is very unlikely. Not impossible, but unless somebody else reports that they too have seen the fault, we can dismiss them. Disk corruption is possible. If all else fails, you can run the Windows disk utility to see if it finds anything. But the most likely cause of the fault is that you aren't running what you think you are running. When you say: "If I've created a shortcut, it wasn't by design. Ctrl-c to ctrl-v most likely." "Most likely"? Meaning you're not sure? Given that you are talking about the Properties window talking about "pointing to" things, I think it is very likely that in fact you have created a shortcut, or a symlink, and when you think you are running a copy in Folder2 you are actually running a shortcut to Folder1. That would *exactly* explain the problem you are experiencing. Please take a screenshot of the Properties window showing the "pointing to" stuff. 
I think you will find that track1.py in Folder2 is a shortcut back to track1.py in Folder1.

(For the record, Windows does in fact have symlinks, as well as hard links and a third type of link called junction points. They are undersupported by Explorer, and so people hardly ever use them. Most people don't even know they exist, even though some of them go back all the way to Windows NT. But as far as I can tell, there is no way for you to have created a symlink from Explorer.)

-- 
Steven D'Aprano
Re: [Tutor] Over-riding radians as default for trig calculations
On Mon, 1 Mar 2010 06:39:10 am AG wrote: > After importing the math module and running > > math.cos( x ) > > the result is in radians. It certainly is not. The *result* of cos is a unitless number, not an angle. What you mean is that the *input* to cos, x, has to be supplied in radians. No, you can't change that anywhere, but you can do this: >>> math.cos(math.radians(45)) 0.70710678118654757 So of course you can write your own function: def cos(x): return math.cos(math.radians(x)) and call that. -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Any Tutor there ? Removing redundant parameters in a models file having include files.
On Tue, 2 Mar 2010 07:07:57 am Karim Liateni wrote: > Thanks for this precision! > I'm using standard python so this is ok! > Why people use proprietary python ? > To have more trouble ? To be different from the rest of community ? Python is a language, but there can be many different implementations of that language, just like there are different C compilers or different Javascript engines. CPython is the version which was made first, it is the most common version, but it is not the only one. It is called CPython because it is written in C. Jython is a version of Python written in Java, and it was created by people wanting to use Python as a front-end to Java libraries, and to take advantage of Java's garbage collector. IronPython is Microsoft's version of Python written for .Net and Mono. PyPy is an experimental version of Python written in Python, used by people wanting to experiment with Python compilers. "Python for S60" is a version of Python written for Nokia's S60 devices. CapPython is an experimental version of Python designed for security. There are many others, they are all Python, but they have differences. -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Any Tutor there ? Removing redundant par ameters in a models file having include files.
On Tue, 2 Mar 2010 11:25:44 am Andreas Kostyrka wrote: > Furthermore I do not think that most of the "core" community has a > problem with the alternate implementations, as they provide very > useful functions (it helps on the architecture side, because it > limits somewhat what can be done, it helps on the personal side, > because it increases the value of Python skills, ...), ... The Python development team values alternative implementations, as it gives Python the language a much wider user base. It also allows other people to shoulder some of the development burden. For example, people who want Python without the limitations of the C call stack can use Stackless Python, instead of ordinary CPython. Google is sponsoring a highly optimized version of Python with a JIT compiler: Unladen Swallow. It looks likely that Unladen Swallow will end up being merged with CPython too, which will be a great benefit. -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] parsing a "chunked" text file
On Tue, 2 Mar 2010 05:22:43 pm Andrew Fithian wrote: > Hi tutor, > > I have a large text file that has chunks of data like this: > > headerA n1 > line 1 > line 2 > ... > line n1 > headerB n2 > line 1 > line 2 > ... > line n2 > > Where each chunk is a header and the lines that follow it (up to the > next header). A header has the number of lines in the chunk as its > second field. And what happens if the header is wrong? How do you handle situations like missing headers and empty sections, header lines which are wrong, and duplicate headers? line 1 line 2 headerB 0 headerC 1 line 1 headerD 2 line 1 line 2 line 3 line 4 headerE 23 line 1 line 2 headerB 1 line 1 This is a policy decision: do you try to recover, raise an exception, raise a warning, pad missing lines as blank, throw away excess lines, or what? > I would like to turn this file into a dictionary like: > dict = {'headerA':[line 1, line 2, ... , line n1], 'headerB':[line1, > line 2, ... , line n2]} > > Is there a way to do this with a dictionary comprehension or do I > have to iterate over the file with a "while 1" loop? I wouldn't do either. I would treat this as a pipe-line problem: you have a series of lines that need to be processed. 
You can feed them through a pipe-line of filters:

def skip_blanks(lines):
    """Remove leading and trailing whitespace, ignore blank lines."""
    for line in lines:
        line = line.strip()
        if line:
            yield line

def collate_sections(lines):
    """Yield (header, list_of_lines) for each section."""
    current_header = ""
    accumulator = []
    for line in lines:
        if line.startswith("header"):
            # Don't emit an empty section before the very first header.
            if current_header or accumulator:
                yield (current_header, accumulator)
            current_header = line
            accumulator = []
        else:
            accumulator.append(line)
    yield (current_header, accumulator)

Then put them together like this:

fp = open("my_file.dat", "r")
data = {}  # don't shadow the built-in dict
non_blank_lines = skip_blanks(fp)
sections = collate_sections(non_blank_lines)
for (header, lines) in sections:
    data[header] = lines

Of course you can add your own error checking.

-- 
Steven D'Aprano
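As for the error checking, here is one possible policy sketched in Python 3: treat every inconsistency as a hard error. That choice is an assumption made purely for illustration (the thread leaves the policy open), and the names parse_chunks and check_count are invented for this example.

```python
def check_count(name, expected, lines):
    # Compare the count declared in the header with what was collected.
    if len(lines) != expected:
        raise ValueError("%s declared %d lines but has %d"
                         % (name, expected, len(lines)))


def parse_chunks(lines):
    """Map each header name to its lines, rejecting malformed input."""
    data = {}
    name, expected = None, 0
    for raw in lines:
        line = raw.strip()
        if not line:
            continue                      # ignore blank lines
        if line.startswith("header"):
            if name is not None:
                check_count(name, expected, data[name])
            name, count = line.split()    # e.g. "headerA 2"
            expected = int(count)
            if name in data:
                raise ValueError("duplicate header: %s" % name)
            data[name] = []
        elif name is None:
            raise ValueError("data line before any header: %r" % line)
        else:
            data[name].append(line)
    if name is not None:
        check_count(name, expected, data[name])
    return data


sample = ["headerA 2", "line 1", "line 2", "", "headerB 1", "line 1"]
parsed = parse_chunks(sample)
print(parsed)
```

A more forgiving policy (warn and pad, or warn and truncate) would replace the raise statements; the structure of the loop stays the same.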
Re: [Tutor] List comprehension possible with condition statements?
On Wed, 3 Mar 2010 07:46:39 pm Alan Gauld wrote:

> mylist = [irtem for item in aList where item != None]

Comparisons with None almost always should be one of:

item is None
item is not None

The reason is that "item is None" is ONLY ever true if the item actually is the singleton object None (accept no substitutes!).

On the other hand, "item == None" might be true for some custom items. So if you actually *do* want to accept substitutes, you can use ==, but that would be an unusual thing to do, and worthy of a comment explaining that you did mean == and it isn't a mistake.

Likewise for "item != None" versus "item is not None".

-- 
Steven D'Aprano
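To see why the two tests can disagree, consider a deliberately odd class that claims to be equal to everything. This is a contrived Python 3 sketch; the class AlwaysEqual is invented for the example.

```python
class AlwaysEqual:
    """A contrived object that compares equal to anything, even None."""

    def __eq__(self, other):
        return True

    def __ne__(self, other):
        return False


items = [0, None, AlwaysEqual(), "", None]

# Identity test: drops only the real None objects.
by_identity = [x for x in items if x is not None]

# Equality test: also drops the impostor, because it says it equals None.
by_equality = [x for x in items if x != None]

print(len(by_identity), len(by_equality))
```

Note that 0 and the empty string survive both filters; neither test is a truthiness test.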
Re: [Tutor] unittest
On Thu, 4 Mar 2010 04:27:23 am C.T. Matsumoto wrote:
> Hello,
>
> Can someone tell me the difference between unittests assertEqual and
> assertEquals?

assertEqual, assertEquals and failUnlessEqual are three spellings for the same thing. There is no difference.

-- 
Steven D'Aprano
Re: [Tutor] sorting algorithm
On Thu, 4 Mar 2010 04:34:09 am Glen Zangirolami wrote:
> http://www.sorting-algorithms.com/
>
> It is a fantastic website that explains each kind of sort and how it
> works. They also show you visuals how the sorts work and how fast
> they go based on the amount of data.

For actual practical work, you aren't going to beat the performance of Python's built-in sort. Unless you are an expert, don't even think about trying. If you are an expert, you've surely got better things to do.

> Depending on the amount/kind of data I would choose a sorting
> algorithm that fits your needs.
>
> Bubble sorts tends to get worse the larger the data set but can be
> very useful to sort small lists.

Bubble sorts are useless for any real work. They are a shockingly bad way of sorting anything beyond perhaps 4 or 5 items, and even for lists that small there are other algorithms which are barely more complicated but significantly faster.

Bubble sorts not only get slower as the list gets bigger, but they do so at an ever-increasing rate: let's say it takes 1 second to sort 100 items (for the sake of the argument). Then it will take:

4 seconds to sort 200 items
9 seconds to sort 300 items
16 seconds to sort 400 items
25 seconds to sort 500 items
36 seconds to sort 600 items
...

and so forth. In other words, multiplying the number of items by a factor of X will multiply the time taken by X squared.

The only advantage of bubble sort is that the algorithm is easy to code. Otherwise it is horrible in just about every imaginable way.

-- 
Steven D'Aprano
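The X-squared growth described above can be checked by counting comparisons rather than timing anything. This is a plain textbook bubble sort in Python 3, with no early-exit optimisation, written only to illustrate the point; nobody should use it for real work.

```python
def bubble_sort(items):
    """Return (sorted_copy, number_of_comparisons)."""
    data = list(items)
    comparisons = 0
    # After each pass the largest remaining item has "bubbled" to the end.
    for end in range(len(data) - 1, 0, -1):
        for i in range(end):
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data, comparisons


for size in (100, 200, 400):
    # Reversed input, so every comparison also causes a swap.
    result, count = bubble_sort(range(size, 0, -1))
    print(size, count)
```

Without the early exit the count is always n*(n-1)/2, so doubling the list roughly quadruples the work, which matches the 1, 4, 9, 16 pattern in the reply.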
Re: [Tutor] How to wrap ctype functions
On Thu, 4 Mar 2010 04:43:28 am Jan Jansen wrote: > Hi there, > > I wonder what's the best way to wrap given function calls (in this > case ctype function calls but more generally built-in functions and > those kinds). I have a huge c library and almost all functions return > an error code. The library also contains a function, that returns the > corresponding error message to the error code. So, what I need to do > for every call to one of the libraries functions looks like this: > > error_code = my_c_library.SOME_FUNCTION_ > CALL(argument) > if error_code != 0: >error_message = my_c_library.GET_ERROR_TEXT(error_code) >print "error in function call SOME_FUNCTION_CALL" >print error_message >my_c_library.EXIT_AND_CLEAN_UP() Something like this: class MyCLibraryError(ValueError): # Create our own exception. pass import functools def decorate_error_code(func): @functools.wraps(func) def inner(*args, **kwargs): error_code = func(*args, **kwargs) if error_code != 0: msg = my_c_library.GET_ERROR_TEXT(error_code) my_c_library.EXIT_AND_CLEAN_UP() raise MyCLibraryError(msg) return inner # note *no* brackets Then wrap the functions: some_function_call = decorate_error_code( my_c_library.SOME_FUNCTION_CALL) another_function_call = decorate_error_code( my_c_library.ANOTHER_FUNCTION_CALL) > Also, for some function calls I would need to do some preperations > like: > > error_code = my_c_library.LOG_IN_AS_ADMINISTRATOR(admin_user, > admin_password) > error_code = my_c_library.SOME_FUNCTION_CALL(argument) > > I like the decorator idea, but I can't figure out if it's applicable > here. To be able to call the function in a manner like this would be > great, e.g. > > @change_user(admin_user, admin_password) > @error_handling > my_c_library.SOME_FUNCTION_CALL(argument) That's not how the decorator syntax works. You can only use the @decorator syntax immediately above a function definition: @error_handling def some_function(): ... 
Otherwise, you use the decorator like a function, you give the name of another function as argument, and it returns a new function: new_function = error_handling(some_function) -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
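Since my_c_library is not available to readers, here is a self-contained Python 3 sketch of the same idea that can actually be run. Everything standing in for the C library is hypothetical: the ERROR_TEXT table, the raw_call stub and the exception name are invented so the example works on its own.

```python
import functools


class LibraryError(Exception):
    """Raised when a wrapped call reports a non-zero error code."""


# Hypothetical stand-in for the library's error-text lookup.
ERROR_TEXT = {1: "permission denied", 2: "device busy"}


def check_error_code(func):
    @functools.wraps(func)
    def inner(*args, **kwargs):
        code = func(*args, **kwargs)
        if code != 0:
            raise LibraryError(ERROR_TEXT.get(code, "unknown error %d" % code))
    return inner


def raw_call(code):
    # Hypothetical stub: pretends to be a C function returning a status.
    return code


# Wrapping an existing function: call the decorator like a function.
wrapped_call = check_error_code(raw_call)


# The @ syntax does exactly the same thing, but only on a definition.
@check_error_code
def another_call():
    return 0


wrapped_call(0)      # succeeds silently
another_call()       # succeeds silently
try:
    wrapped_call(1)
except LibraryError as err:
    print("caught:", err)
```

Because the functions in a ctypes library already exist, the first form (calling the decorator directly) is the one that applies to them.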
Re: [Tutor] List comprehension possible with condition statements?
On Thu, 4 Mar 2010 05:18:40 am Alan Gauld wrote: > "Steven D'Aprano" wrote > > > Comparisons with None almost always should be one of: > > > > item is None > > item is not None > > Yes, but the reason I changed it (I originally had "is not") is that > != is a more general test for illustrating the use of 'if' within a > LC which seemed to be the real issue within the question. List comps can include *any* comparison: [x+1 for x in data if (3*x+2)**2 > 100*x or x < -5] -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] unittest
On Thu, 4 Mar 2010 05:32:22 pm you wrote:
> Steven D'Aprano wrote:
> > On Thu, 4 Mar 2010 04:27:23 am C.T. Matsumoto wrote:
> >> Hello,
> >>
> >> Can someone tell me the difference between unittests assertEqual
> >> and assertEquals?
> >
> > assertEqual, assertEquals and failUnlessEqual are three spellings for
> > the same thing. There is no difference.
>
> Thanks,
> Okay, does anyone know why unittests have 3 ways to do the same
> thing?

They're not three different ways, they are three names for the same way.

The unittest module is meant to be equivalent to Java's unit test library, so possibly it is because Java has three names for the same thing, and so Python's version tried to be as similar as possible.

Or possibly because the author(s) of the unittest module couldn't agree on what name to give the functions.

Or possibly it was deliberate, because the authors felt that sometimes you want a positive test "assert this is true" and sometimes you want a negative test "fail unless this is true".

-- 
Steven D'Aprano
Re: [Tutor] lazy? vs not lazy? and yielding
On Fri, 5 Mar 2010 07:57:18 am Andreas Kostyrka wrote:

> > A list comprehension builds a whole list at one time. So if the
> > list needed is large enough in size, it'll never finish, and
> > besides, you'll run out of memory and crash. A generator
> > expression builds a function instead which *acts* like a list, but
> > actually doesn't build the values
>
> Well, it act like an iterable. A list-like object would probably have
> __getitem__ and friends. Historically (as in I doubt you'll find a
> version that still does it that way in the wild, I looked it up,
> guess around 2.2 iterators were introduced), the for loop in python
> has been doing a "serial" __getitem__ call.

for loops still support the serial __getitem__ protocol. While it is obsolete, it hasn't been removed, as it is still needed to support old code using it.

When you write "for item in thing", Python first attempts the iterator protocol. It tries to call thing.__iter__() to get an iterator, and if that succeeds, it repeatedly calls the iterator's next() method until it raises StopIteration.

If thing doesn't have an __iter__ method, it tries calling thing.__getitem__(0), __getitem__(1), __getitem__(2), and so on, until it raises IndexError.

If thing doesn't have a __getitem__ method either, then the loop fails with TypeError.

-- 
Steven D'Aprano
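Both protocols can be seen at work in a small Python 3 sketch. The class names are invented for the example, and note one spelling difference from the Python 2 discussed in this thread: in Python 3 the iterator method is called __next__ rather than next.

```python
class OldStyleSequence:
    """Supports only the serial __getitem__ protocol."""

    def __init__(self, values):
        self._values = list(values)

    def __getitem__(self, index):
        # Indexing past the end raises IndexError, which ends the loop.
        return self._values[index]


class CountDown:
    """Supports the iterator protocol."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


class Neither:
    pass


print([item for item in OldStyleSequence("abc")])
print([item for item in CountDown(3)])
try:
    for item in Neither():
        pass
except TypeError as err:
    print("TypeError:", err)
```

The first class has no __iter__ at all, yet the for loop still works by falling back to __getitem__ with 0, 1, 2 and so on.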
Re: [Tutor] object representation
On Fri, 5 Mar 2010 01:22:52 am Dave Angel wrote:
> spir wrote:
[...]
> > PS: Would someone point me to typical hash funcs for string keys,
> > and the one used in python?
>
> http://effbot.org/zone/python-hash.htm

But note that this was written a few years ago, so the details may have changed since.

As for "typical" hash functions, I don't think there is such a thing. I think everyone creates their own, unless there is a built-in hash function provided by the language or the library, which is what Python does. I've seen hash functions as simple as:

    def hash(s):
        if not s:
            return 0
        else:
            return ord(s[0])

but of course that leads to *many* collisions. Slightly better (but not much!) is

    import sys

    def hash(s):
        n = 0
        for c in s:
            n += ord(c)
        return n % sys.maxint

This is a pretty awful hash function though. Don't use it.

You might also like to read this thread, to get an idea of the thought that has been put into Python's hashing:

http://mail.python.org/pipermail/python-dev/2004-April/044244.html

[...]
> I figure every object has exactly three items in it: a ref count, a
> implementation pointer, and a payload.

Not quite. CPython has ref counts. IronPython and Jython don't. Other Pythons may or may not.

I'm not quite sure what you mean by "implementation pointer" -- I think you're talking about a pointer back to the type itself. It's normal just to refer to this as "the type", and ignore the fact that it's actually a pointer. The type data structure (which will itself be an object) is not embedded in every object! And of course, other Pythons may use some other mechanism for linking objects back to their type.

--
Steven D'Aprano
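To see why the two toy hash functions in the message collide so often, here is a small sketch. The function names are invented for the illustration, and sys.maxsize stands in for Python 2's sys.maxint so that the code runs on Python 3.

```python
import sys


def additive_hash(text):
    """Sum of character codes, like the second toy hash in the message."""
    total = 0
    for char in text:
        total += ord(char)
    # sys.maxsize is used here in place of Python 2's sys.maxint.
    return total % sys.maxsize


def first_char_hash(text):
    """Code of the first character only, like the first toy hash."""
    return ord(text[0]) if text else 0


# Anagrams always collide under the additive hash.
anagrams_collide = additive_hash("listen") == additive_hash("silent")
# Any two strings sharing a first letter collide under the first-character hash.
prefix_collide = first_char_hash("python") == first_char_hash("perl")
```

Both flags come out true: reordering letters never changes a sum, and the first-character hash ignores everything after position zero.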
Re: [Tutor] object representation
On Thu, 4 Mar 2010 06:47:04 pm spir wrote:
> Hello,
>
> In python like in most languages, I guess, objects (at least
> composite ones -- I don't know about ints, for instance -- someone
> knows?) are internally represented as associative arrays.

No. You can consider a Python object to be something like a C struct, or a Pascal record. The actual structure of the object depends on the type of the object (ints, strings, lists, etc will be slightly different). See Dave Angel's post for a good description.

And of course this is implementation dependent: CPython, being written in C, uses C structs to implement objects. Other Pythons, written in other languages, will use whatever data structure makes sense for their language. That's almost certainly going to be a struct-like structure, since that is a pretty fundamental data type, but it could be different.

If you want to know exactly how objects are represented internally by Python, I'm afraid you will probably need to read the source code. But a good start is here:

http://effbot.org/zone/python-objects.htm

> Python
> associative arrays are dicts, which in turn are implemented as hash
> tables. Correct? Does this mean that the associative arrays
> representing objects are implemented like python dicts, thus hash
> tables?

"Associative array" is a generic term for a data structure that maps keys to items. There are lots of ways of implementing such an associative array: at least two different sorts of hash tables, about a dozen different types of binary trees, and so on.

In Python, associative arrays are called "dicts", and they are implemented in CPython as hash tables with open addressing.

But objects are not hash tables. *Some* objects INCLUDE a hash table as one of their fields, but not all. For example:

    >>> int.__dict__
    <dictproxy object at 0x...>
    >>> (2).__dict__
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'int' object has no attribute '__dict__'
    >>> class C: pass
    ...
    >>> C.__dict__
    {'__module__': '__main__', '__doc__': None}

That is why you can't add attributes to arbitrary built-in objects:

    >>> {}.x = None
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'dict' object has no attribute 'x'

Because the dict instance has no __dict__, there is nowhere to insert the attribute.

> I was wondering about the question because I guess the constraints
> are quite different: * Dict keys are of any type, including
> heterogeneous (mixed). Object keys are names, ie a subset of strings.

CPython's implementation of hash tables is highly optimized for speed. The biggest bottleneck is the hash function, and that is tuned to be extremely efficient for strings and ints.

[...]
> So, I guess the best implementations for objects and dicts may be
> quite different. i wonder about alternatives for objects, in
> particuliar trie and variants: http://en.wikipedia.org/wiki/Trie,
> because they are specialised for associative arrays which keys are
> strings.

*shrug*

Maybe, maybe not. Tries are a more complicated data structure, which means bigger code and more bugs. They don't fit in very well with CPython's memory management system. And they use a large number of pointers, which can be wasteful.

E.g. a trie needs six pointers just to represent the single key "python":

    '' -> 'p' -> 'y' -> 't' -> 'h' -> 'o' -> 'n'

while a hash table uses just one:

    -> 'python'

--
Steven D'Aprano
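The point that only some objects carry a per-instance dict can be checked directly. The sketch below is written for Python 3 and uses two invented classes, Plain and Slotted; __slots__ is not mentioned in the thread and is included only as another example of an object without a __dict__.

```python
class Plain:
    """Ordinary class: each instance carries its own __dict__."""


class Slotted:
    """Class using __slots__: instances have fixed fields and no __dict__."""
    __slots__ = ("x",)


plain = Plain()
plain.colour = "red"           # stored in the instance's hash table
plain_attrs = plain.__dict__

int_has_dict = hasattr(2, "__dict__")
slotted_has_dict = hasattr(Slotted(), "__dict__")

# A built-in dict instance has no __dict__, so there is nowhere to put 'x'.
target = {}
try:
    target.x = None
    assignment_failed = False
except AttributeError:
    assignment_failed = True
```

The Plain instance accepts the new attribute because it has somewhere to store it; the int, the slotted instance and the dict do not.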
Re: [Tutor] Understanding (Complex) Modules
articular problem. So far, looking at the
> plentiful number of examples of MPL, and probably some of the other
> modules mentioned above have not provided a lot of insight.

Big, complex modules tend to have steep learning curves. There's no magic path to learning how to use a big complex module any more than there is a magic path to learning how to be a brain surgeon, or a car mechanic.

> Is there some relationship between modules and objects that I'm not
> seeing that could be of value?

Modules are themselves objects. Everything in Python is an object: strings, ints, floats, lists, tuples, everything.

Modules are compound objects, in that they contain other objects accessible by name:

    math.sin

means "look up the name 'sin' in the math module, and return whatever you find", which in this case is a function object.

And that's pretty much it.

--
Steven D'Aprano
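A short sketch of "modules are objects that contain other objects by name", using only the standard math and types modules. The variable names are made up for the illustration.

```python
import math
import types

# A module is an ordinary object of type 'module'.
module_is_object = isinstance(math, types.ModuleType)

# math.sin is just an attribute lookup by name on that object.
looked_up = getattr(math, "sin")
same_function = looked_up is math.sin

# The module's names live in a namespace dict that can be inspected.
has_sin_name = "sin" in vars(math)
```

Because the lookup is ordinary attribute access, a module can be passed around, stored in a list, or inspected like any other object.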
Re: [Tutor] Really miss the C preprocessor!!
On Sat, 6 Mar 2010 04:13:49 pm Gil Cosson wrote:
> I have some code to perform an exponential backoff. I am interacting
> with a few different web services. Checking for a specific Exception
> and then re-invoking the failed call after waiting a bit. I don't
> want to code the backoff in ten different places. Really miss the C
> pre-processor.
>
> I am tempted to try something like the following, but I understand
> that eval should be a last resort.
>
> Any pythonic best practice for this situation?:

Functions are first-class objects that can be passed around as data. Use them.

    import time

    def check_some_website(url, x):
        # whatever
        pass

    def exponential_backoff(function, args, exception, numretries):
        # Backoff 1 second, 2 seconds, 4 seconds, 8, 16, ...
        t = 1
        for i in xrange(numretries):
            try:
                return function(*args)
            except exception:
                time.sleep(t)
                t *= 2
        raise exception("no connection after %d attempts" % numretries)

    result = exponential_backoff(check_some_website,
                                 ("http://example.com", 42), HTTPError, 8)

Any time you think you need eval, you almost certainly don't.

--
Steven D'Aprano
Re: [Tutor] Really miss the C preprocessor!!
On Sat, 6 Mar 2010 07:17:43 pm you wrote:
> I thought about suggesting using decorators for this, since I've done
> something similar (not exactly exponential backoff, but retrying a
> few times on exception). However, as I started writing the example, I
> got stuck at expressing a generic way to pass the exception and
> numretries. I was just wondering is there some way to do this sort of
> thing with decorators ??
[...]
> is there a way to decorate a function alongwith additional parameters
> which will be passed to the function ?

Yes, you need a decorator factory -- a function which returns a decorator.

    import functools
    import time

    def factory(exception, numretries):
        def decorator(func):
            @functools.wraps(func)
            def inner(*args, **kwargs):
                t = 1
                for i in range(numretries):
                    try:
                        return func(*args, **kwargs)
                    except exception:
                        time.sleep(t)
                        t *= 2
                msg = "no connection after %d attempts" % numretries
                raise exception(msg)
            return inner
        return decorator

    @factory(HTTPError, 8)
    def check_some_website(url, x):
        ...

I haven't tested the above code, but it should do the trick.

--
Steven D'Aprano
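Since the code above is marked untested, here is a small runnable sketch of the same decorator-factory idea for Python 3. Everything specific in it is an assumption made for the demonstration: the name retry, the extra first_delay parameter (added only so the example finishes quickly), the built-in ConnectionError standing in for HTTPError, and the flaky_fetch function that fakes a web service.

```python
import functools
import time


def retry(exception, numretries, first_delay=1.0):
    """Decorator factory: retry on `exception`, doubling the delay each time.

    `first_delay` is a hypothetical extra parameter so the demo runs quickly.
    """
    def decorator(func):
        @functools.wraps(func)
        def inner(*args, **kwargs):
            delay = first_delay
            for _ in range(numretries):
                try:
                    return func(*args, **kwargs)
                except exception:
                    time.sleep(delay)
                    delay *= 2
            raise exception("no success after %d attempts" % numretries)
        return inner
    return decorator


calls = []


@retry(ConnectionError, numretries=5, first_delay=0.001)
def flaky_fetch(url):
    """Stand-in for a web service call: fails twice, then succeeds."""
    calls.append(url)
    if len(calls) < 3:
        raise ConnectionError("temporary failure")
    return "payload from " + url


result = flaky_fetch("http://example.com")
```

The outer function receives the parameters, the middle one receives the function being decorated, and the innermost one does the work at call time. functools.wraps keeps the decorated function's name and docstring intact.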
Re: [Tutor] recursive generator
On Sun, 7 Mar 2010 11:58:05 pm spir wrote:
> Hello,
>
> Is it possible at all to have a recursive generator?

Yes.

    >>> def recursive_generator(n):
    ...     if n == 0:
    ...         return
    ...     yield n
    ...     for i in recursive_generator(n-1):
    ...         yield i
    ...
    >>> it = recursive_generator(5)
    >>> it.next()
    5
    >>> it.next()
    4
    >>> it.next()
    3
    >>> it.next()
    2
    >>> it.next()
    1
    >>> it.next()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration

> I think at a
> iterator for a recursive data structure (here, a trie). The following
> code does not work: it only yields a single value. Like if
> child.__iter__() were never called.
>
> def __iter__(self):
>     ''' Iteration on (key,value) pairs. '''
>     print '*',
>     if self.holdsEntry:
>         yield (self.key,self.value)
>     for child in self.children:
>         print "<",
>         child.__iter__()
>         print ">",
>     raise StopIteration

__iter__ should be an ordinary function, not a generator. Something like this should work:

    # Untested.
    def __iter__(self):
        ''' Iteration on (key,value) pairs. '''
        def inner():
            print '*',  # Side effects bad...
            if self.holdsEntry:
                yield (self.key,self.value)
            for child in self.children:
                print "<",
                # Calling child.__iter__() on its own only creates an
                # iterator; its items have to be re-yielded explicitly.
                for pair in child:
                    yield pair
                print ">",
            raise StopIteration
        return inner()

This means that your class won't *itself* be an iterator, but calling iter() on it will return a generator object, which of course is an iterator.

If you want to make your class an iterator itself, then you need to follow the iterator protocol. __iter__ must return the instance itself, and next must return (not yield) the next iteration.

    class MyIterator(object):
        def __init__(self, n):
            self.value = n
        def next(self):
            if self.value == 0:
                raise StopIteration
            self.value //= 2
            return self.value
        def __iter__(self):
            return self

See the discussion in the docs about the iterator protocol:

http://docs.python.org/library/stdtypes.html#iterator-types

> Why is no child.__iter__() executed at all? I imagine this can be
> caused by the special feature of a generator recording current
> execution point.

That's exactly what generators do: when they reach a yield, execution is halted, but the state of the generator is remembered. Then when you call the next() method, execution resumes.

> (But then, is it at all possible to have a call in a
> generator? Or does the issue only appear whan the callee is a
> generator itself?) Else, the must be an obvious error in my code.

I don't understand what you mean.

--
Steven D'Aprano
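The reason the original __iter__ yields only one value is that calling a generator method merely creates a generator object; nothing in its body runs until something iterates over it. The sketch below shows the recursive pattern on a made-up Node class (a stand-in for the trie in the question, not the poster's actual code), written for Python 3.

```python
class Node:
    """Hypothetical stand-in for the trie node in the question."""

    def __init__(self, key, value=None, children=()):
        self.key = key
        self.value = value
        self.children = list(children)

    def __iter__(self):
        # Yield this node's own entry, if it holds one.
        if self.value is not None:
            yield (self.key, self.value)
        for child in self.children:
            # A bare child.__iter__() would only create a generator object
            # and throw it away; its items must be re-yielded.
            for pair in child:
                yield pair


tree = Node("", children=[
    Node("a", 1, [Node("ab", 2)]),
    Node("b", 3),
])
pairs = list(tree)
```

In Python 3.3 and later the inner loop can also be spelled `yield from child`, which does the same re-yielding in one statement.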
Re: [Tutor] recursive generator
On Mon, 8 Mar 2010 01:49:21 am Stefan Behnel wrote:
> Steven D'Aprano, 07.03.2010 14:27:
> > __iter__ should be an ordinary function, not a generator. Something
> > like this should work:
[...]
> That's just an unnecessarily redundant variation on the above. It's
> perfectly ok if __iter__() is a generator method.

So it is. Thank you for the correction.

--
Steven D'Aprano
Re: [Tutor] __iter__: one obvious way to do it
On Mon, 8 Mar 2010 02:07:41 am spir wrote:
> [sorry, forgot the code]
>
> Hello,
>
> Below 6 working way to implement __iter__ for a container here
> simulated with a plain inner list. Sure, the example is a bit
> artificial ;-)

> 1. __iter__ returns a generator _expression_
>     def __iter__(self):
>         return (pair(n) for n in self.items)

Seems perfectly acceptable to me. That's syntactic sugar for the next one:

> 2. __iter__ *is* a generator
>     def __iter__(self):
>         for n in self.items:
>             yield pair(n)
>         raise StopIteration

As Stefan pointed out, __iter__ can be a generator, so that's okay too. However, the StopIteration at the end is redundant: generators automatically raise StopIteration when they "fall off the end" at the end of the code. So this is equivalent to the above:

    def __iter__(self):
        for n in self.items:
            yield pair(n)

> 3. __iter__ returns a generator
>    (this one is a bit weird, i guess)
>     def __iter__(self):
>         return self.pairs()

There's nothing weird about it. It's the difference between writing code directly inline, and calling it indirectly:

    def f():
        return [1, 2, 3]

versus:

    def indirect():
        return [1, 2, 3]

    def f():
        return indirect()

If you have a good reason for the indirection, it is perfectly fine. E.g. you might have a class with multiple iterators:

    class X:
        def iter_width(self):
            """Iterate left-to-right"""
            pass
        def iter_depth(self):
            """Iterate top-to-bottom"""
            pass
        def iter_spiral(self):
            pass
        def __iter__(self):
            # Default.
            return self.iter_width()

> 4. __iter__ returns self, its own iterator via next()
>     def __iter__(self):
>         self.i=0
>         return self

That's how you would make the class an iterator directly.

> 5. __iter__ returns an external iterator object
>     def __iter__(self):
>         return Iter(self)

Built-ins such as lists, dicts, tuples and sets use that strategy:

    >>> iter([1,2,3])
    <listiterator object at 0x...>
    >>> iter(dict(a=1,b=2))
    <dictionary-keyiterator object at 0x...>

> 6. __iter__ returns iter() of a collection built just on time
>    (this one is really contrived)
>     def __iter__(self):
>         return iter(tuple([pair(n) for n in self.items]))

Not contrived, but inefficient.

First you build a list, all at once, using a list comprehension. So much for lazy iteration, but sometimes you have good reason for this (see below).

Then you copy everything in the list into a tuple. Why?

Then you create an iterator from the tuple.

If you remove the intermediate tuple, it is a reasonable approach for ensuring that you can modify the original object without changing any iterators made from it. In other words, __iter__ returns a *copy* of the data in self. But the easiest way to do that:

    def __iter__(self):
        return iter([pair(n) for n in self.items])

No need to make a tuple first.

> Also, one can always traverse the collection (already existing or
> built then) itself if it not quasi-infinite (no __iter__ at all).

The point of __iter__ is to have a standard way to traverse data structures, so you can traverse them with for-loops. Otherwise, every data structure needs a different method:

    for item in tree.traverse():
    for item in mapping.next_key():
    for item in sequence.get_next_item():
    for item in obj.walker():

> "There should be one-- and preferably only one --obvious way to do
> it" http://www.python.org/dev/peps/pep-0020/

This doesn't mean that there should be *only* one way to do something. It means that there should be one OBVIOUS way to do it.

--
Steven D'Aprano
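The difference between lazy iteration (option 2) and iterating over a copy (option 6 without the tuple) only shows up when the container changes after the iterator has been created. A minimal sketch for Python 3, with an invented Bag class and a snapshot method that is not part of the thread:

```python
class Bag:
    """Hypothetical container wrapping a plain list."""

    def __init__(self, items):
        self.items = list(items)

    def __iter__(self):
        # Option 2 from the message: __iter__ is itself a generator.
        for item in self.items:
            yield item

    def snapshot(self):
        # Option 6 without the tuple: iterate over a copy built up front.
        return iter([item for item in self.items])


bag = Bag([1, 2, 3])
frozen = bag.snapshot()
live = iter(bag)
bag.items.append(4)      # modify the container after creating both iterators

from_snapshot = list(frozen)
from_live = list(live)
```

The snapshot iterator never sees the appended item, while the generator, which had not started running when the append happened, does.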
Re: [Tutor] Printing time without "if" statement
On Mon, 8 Mar 2010 03:38:49 pm Elisha Rosensweig wrote:
> Hi,
>
> I have an event-based simulator written in Python (of course). It
> takes a while to run, and I want to have messages printed every so
> often to the screen, indicating the simulation time that has passed.
> Currently, every event that occurs follows the following code
> snippet:
[...]
> This seems stupid to me - why should I have to check each round if
> the time has been passed? I was thinking that there might be a
> simpler way, using threading, but I have zero experience with
> threading, so need your help. The idea was to have a simple thread
> that waited X time (CPU cycles, say) and then read the "current_time"
> variable and printed it.
>
> Any simple (simpler?) solutions to this?

That's a brilliant idea. I think I will steal it for my own code :)

(Why didn't I think of it myself? Probably because I almost never use threads...)

Anyway, here's something that should help:

    import time
    import threading

    class Clock(threading.Thread):
        def __init__(self, *args, **kwargs):
            super(Clock, self).__init__(*args, **kwargs)
            self.finished = False
        def start(self):
            print "Clock %s started at %s" % (self.getName(), time.ctime())
            super(Clock, self).start()
        def run(self):
            while 1:
                if self.finished:
                    break
                print "Clock %s still alive at %s" % (
                    self.getName(), time.ctime())
                time.sleep(2)
            print "Clock %s quit at %s" % (self.getName(), time.ctime())
        def quit(self):
            print "Clock %s asked to quit at %s" % (
                self.getName(), time.ctime())
            self.finished = True

    def do_work():
        clock = Clock(name="clock-1")
        clock.start()
        # Start processing something hard.
        for i in xrange(8):
            print "Processing %d..." % i
            # Simulate real work with a sleep.
            time.sleep(0.75)
        clock.quit()

And in action:

    >>> do_work()
    Clock clock-1 started at Mon Mar 8 17:40:42 2010
    Processing 0...
    Clock clock-1 still alive at Mon Mar 8 17:40:43 2010
    Processing 1...
    Processing 2...
    Clock clock-1 still alive at Mon Mar 8 17:40:45 2010
    Processing 3...
    Processing 4...
    Processing 5...
    Clock clock-1 still alive at Mon Mar 8 17:40:47 2010
    Processing 6...
    Processing 7...
    Clock clock-1 still alive at Mon Mar 8 17:40:49 2010
    Clock clock-1 asked to quit at Mon Mar 8 17:40:49 2010
    >>> Clock clock-1 quit at Mon Mar 8 17:40:51 2010

    >>>

There's a bit of a display artifact in the interactive interpreter, when the final quit message is printed: the interpreter doesn't notice it needs to redraw the prompt. But other than that, it should be fine.

--
Steven D'Aprano
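In the transcript above the clock takes up to two seconds to notice that it has been asked to quit, because it is asleep at the time. One possible variation, sketched here for Python 3, replaces the boolean flag and time.sleep with a threading.Event, whose wait method doubles as an interruptible sleep. The Reporter class, its get_status callback and the state dictionary are all invented for this illustration; the intervals are tiny only so the demo finishes quickly.

```python
import threading
import time


class Reporter(threading.Thread):
    """Hypothetical progress reporter: prints a status line every `interval` seconds."""

    def __init__(self, get_status, interval):
        super().__init__(daemon=True)
        self.get_status = get_status
        self.interval = interval
        self.reports = []
        self._stop_requested = threading.Event()

    def run(self):
        # Event.wait returns False on timeout and True once quit() is called,
        # so the thread wakes immediately when asked to stop.
        while not self._stop_requested.wait(self.interval):
            line = "simulation time: %s" % self.get_status()
            self.reports.append(line)
            print(line)

    def quit(self):
        self._stop_requested.set()


state = {"current_time": 0}
reporter = Reporter(lambda: state["current_time"], interval=0.02)
reporter.start()
for step in range(10):
    state["current_time"] = step   # simulate one event
    time.sleep(0.01)
reporter.quit()
reporter.join(timeout=1.0)
```

The simulation loop itself contains no timing checks at all, which is what the original question was after.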
Re: [Tutor] use of __new__
On Fri, 12 Mar 2010 06:03:35 am spir wrote:
> Hello,
>
> I need a custom unicode subtype (with additional methods). This will
> not be directly used by the user, instead it is just for internal
> purpose. I would like the type to be able to cope with either a byte
> str or a unicode str as argument. In the first case, it needs to be
> first decoded. I cannot do it in __init__ because unicode will first
> try to decode it as ascii, which fails in the general case.

Are you aware that you can pass an explicit encoding to unicode?

    >>> print unicode('cdef', 'utf-16')
    摣晥
    >>> help(unicode)
    Help on class unicode in module __builtin__:

    class unicode(basestring)
     |  unicode(string [, encoding[, errors]]) -> object

> So, I
> must have my own __new__. The issue is the object (self) is then a
> unicode one instead of my own type.
>
> class Unicode(unicode):
>     Unicode.FORMAT = "utf8"
>     def __new__(self, text, format=None):
>         # text can be str or unicode
>         format = Unicode.FORMAT if format is None else format
>         if isinstance(text,str):
>             text = text.decode(format)
>         return text
>     ...
>
> x = Unicode("abc")# --> unicode, not Unicode

That's because you return a unicode object :) Python doesn't magically convert the result of __new__ into your class, in fact Python specifically allows __new__ to return something else. That's fairly unusual, but it does come in handy.

"format" is not a good name to use. The accepted term is "encoding". You should also try to match the function signature of the built-in unicode object, which includes unicode() -> u''.

Writing Unicode.FORMAT in the definition of Unicode can't work:

    >>> class Unicode(unicode):
    ...     Unicode.FORMAT = 'abc'
    ...
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 2, in Unicode
    NameError: name 'Unicode' is not defined

So it looks like you've posted something slightly different from what you are actually running.

I have tried to match the behaviour of the built-in unicode as closely as I am able. See here:

http://docs.python.org/library/functions.html#unicode

    class Unicode(unicode):
        """Unicode(string [, encoding[, errors]]) -> object

        Special Unicode class that has all sorts of wonderful
        methods missing from the built-in unicode class.
        """
        _ENCODING = "utf8"
        _ERRORS = "strict"
        def __new__(cls, string='', encoding=None, errors=None):
            # If either encoding or errors is specified, then always
            # attempt decoding of the first argument.
            if (encoding, errors) != (None, None):
                if encoding is None: encoding = cls._ENCODING
                if errors is None: errors = cls._ERRORS
                obj = super(Unicode, cls).__new__(
                    Unicode, string, encoding, errors)
            else:
                # Never attempt decoding.
                obj = super(Unicode, cls).__new__(Unicode, string)
            assert isinstance(obj, Unicode)
            return obj

    >>> Unicode()
    u''
    >>> Unicode('abc')
    u'abc'
    >>> Unicode('cdef', 'utf-16')
    u'\u6463\u6665'
    >>> Unicode(u'abcd')
    u'abcd'

--
Steven D'Aprano
Re: [Tutor] use of __new__
On Fri, 12 Mar 2010 11:26:19 am Alan Gauld wrote:
> "spir" wrote
>
> > The issue is the object (self) is then a unicode one instead of my
> > own type.
>
> I think you need to modify self in __new__

The method signature for __new__ is usually written as:

    def __new__(cls, args):

because when __new__ is called, no instance yet exists. __new__ is the constructor method which creates an instance, so it gets the class as the first argument.

__new__ can then do one of two things:

(1) return a new instance of your class; or

(2) return something else.

If it returns an instance of your class, Python then automatically calls the initializer __init__ with that instance as an argument (plus any other arguments passed to __new__).

    >>> class MyClass(object):
    ...     def __new__(cls):
    ...         print "Calling __new__ on object %s" % cls
    ...         return super(MyClass, cls).__new__(cls)
    ...     def __init__(self):
    ...         print "Calling __init__ on object %s" % self
    ...
    >>> o = MyClass()
    Calling __new__ on object <class '__main__.MyClass'>
    Calling __init__ on object <__main__.MyClass object at 0xb7c6f44c>
    >>> o
    <__main__.MyClass object at 0xb7c6f44c>

For mutable types, you can modify self inside __init__, but that doesn't work for immutable objects like unicode, str, int, etc. For them, you have to do any changes inside __new__ BEFORE creating the instance.

In the second case, where __new__ returns something else, __init__ is never called:

    >>> class AnotherClass(MyClass):
    ...     def __new__(cls):
    ...         ignore = super(AnotherClass, cls).__new__(cls)
    ...         return 42
    ...
    >>> o = AnotherClass()
    Calling __new__ on object <class '__main__.AnotherClass'>
    >>> o
    42

--
Steven D'Aprano
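Both behaviours described above can be reproduced in a few lines of Python 3. The class names Shout and Answer are made up for this sketch; str is used in place of Python 2's unicode.

```python
class Shout(str):
    """Hypothetical immutable subclass: the value must be fixed in __new__."""

    def __new__(cls, text):
        # Change the value BEFORE the instance is created.
        return super().__new__(cls, text.upper())


class Answer:
    """__new__ returns something that is not an Answer, so __init__ is skipped."""
    init_called = False

    def __new__(cls):
        return 42

    def __init__(self):
        Answer.init_called = True


loud = Shout("hello")
result = Answer()
```

Shout has to do its work in __new__ because a str cannot be altered once it exists, and Answer shows that __init__ is only run when __new__ hands back an instance of the class.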
Re: [Tutor] use of __new__
On Fri, 12 Mar 2010 11:53:16 am Steven D'Aprano wrote:
> I have tried to match the behaviour of the built-in unicode as close
> as I am able. See here:
> http://docs.python.org/library/functions.html#unicode

And by doing so, I entirely forgot that you want to change the default encoding from 'ascii' to 'utf-8'! Oops. Sorry about that.

Try changing this bit:

> else:  # Never attempt decoding.
>     obj = super(Unicode, cls).__new__(Unicode, string)

to this:

    # Untested
    else:
        if isinstance(string, unicode):
            # Don't do any decoding.
            obj = super(Unicode, cls).__new__(Unicode, string)
        else:
            if encoding is None: encoding = cls._ENCODING
            if errors is None: errors = cls._ERRORS
            obj = super(Unicode, cls).__new__(
                Unicode, string, encoding, errors)

You can probably clean up the method to make it a bit tidier.

--
Steven D'Aprano
Re: [Tutor] use of __new__
On Fri, 12 Mar 2010 06:03:35 am spir wrote:
> Hello,
>
> I need a custom unicode subtype (with additional methods).
[snip]

Here's my second attempt, and a very simple test function that passes. Obviously you have to add your own additional methods :)

    class Unicode(unicode):
        """Unicode(string [, encoding[, errors]]) -> object

        Special Unicode class that has all sorts of wonderful
        methods missing from the built-in unicode class.
        """
        _ENCODING = "utf8"
        _ERRORS = "strict"
        def __new__(cls, string='', encoding=None, errors=None):
            optional_args = not (encoding is errors is None)
            # Set default encoding and errors.
            if encoding is None: encoding = cls._ENCODING
            if errors is None: errors = cls._ERRORS
            # To match the behaviour of built-in unicode, if either
            # optional argument is specified, we always attempt decoding.
            if optional_args or not isinstance(string, unicode):
                args = (string, encoding, errors)
            else:
                args = (string,)
            return super(Unicode, cls).__new__(Unicode, *args)

    def test():
        assert Unicode() == u''
        assert Unicode('abcd') == u'abcd'
        u = 'cdef'.decode('utf-16')
        assert u == u'\u6463\u6665'
        s = u.encode('utf-8')
        assert Unicode(s) == u
        try:
            unicode(s)
        except UnicodeDecodeError:
            pass
        else:
            assert False, 'failed to fail as expected'

--
Steven D'Aprano
Re: [Tutor] use of __new__
I've taken the liberty of replying back to the list rather than in private. Denis, if you mean to deliberately reply privately, please say so at the start of the email, otherwise I will assume it was an accident.

On Fri, 12 Mar 2010 09:56:11 pm spir wrote:
> Side-question: Why use super() when we know it can only be unicode?

super is necessary for multiple inheritance to work correctly:

    class SpecialString(MyOtherStringClass, Unicode):
        ...

will have hard-to-find bugs if you don't use super. But if you are absolutely sure that you will never directly or indirectly use multiple inheritance, then you could replace the calls to super with:

    unicode.__new__(...)

But why bother? super does the right thing for both single and multiple inheritance.

> And why use cls when we know it can only be Unicode?

Because you might want to subclass Unicode, and if you use cls then everything will just work correctly, but if you hard-code the name of the class, things will break.

Actually, my code has a bug. I wrote:

    return super(Unicode, cls).__new__(Unicode, *args)

in the __new__ method, but that hard-codes the name of the class. Let's try it:

    >>> type(Unicode())  # Unicode class as defined in my previous post.
    <class '__main__.Unicode'>
    >>> class K(Unicode):
    ...     pass
    ...
    >>> type(K())
    <class '__main__.Unicode'>

Broken! I hang my head in shame :(

So you need to replace the above return with:

    return super(Unicode, cls).__new__(cls, *args)

and then it will work correctly:

    >>> class K(Unicode):
    ...     pass
    ...
    >>> type(K())
    <class '__main__.K'>

You might be tempted to change the first reference to Unicode to cls as well, but sadly that does not work. The reason is complicated, and to be honest I don't remember it, but you will probably find it by googling for "python super gotchas".

--
Steven D'Aprano
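The hard-coded-class bug described above is easy to reproduce in isolation. The sketch below is for Python 3, uses str instead of unicode, and the four class names are invented for the comparison.

```python
class Hardcoded(str):
    def __new__(cls, text=""):
        # Bug: always builds a Hardcoded, even when a subclass is requested.
        return super().__new__(Hardcoded, text)


class Flexible(str):
    def __new__(cls, text=""):
        # Passing cls along lets subclasses get instances of themselves.
        return super().__new__(cls, text)


class HardChild(Hardcoded):
    pass


class FlexChild(Flexible):
    pass


hard_type = type(HardChild("x")).__name__
flex_type = type(FlexChild("x")).__name__
```

As for the "gotcha" mentioned at the end of the message: in the Python 2 spelling, writing super(cls, cls) would recurse forever in a subclass, because for cls = K the next class after K in the method resolution order is Unicode, whose __new__ is the very method making the call.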
Re: [Tutor] sorting algorithm
On Fri, 12 Mar 2010 11:04:13 pm C.T. Matsumoto wrote:
[snip 269 lines of quoted text]
> Thanks Jeff. Indeed when I kept the code as is and added a doubled
> element to the input list, it went into an infinite loop. For running
> the swap it doesn't matter if the elements are equal. Catching equal
> elements makes a recursive loop. As I found out when I tested it.

In future, could you please trim your quoting to be a little less excessive? There's no need to include the ENTIRE history of the thread in every post. That was about four pages of quoting to add five short sentences!

Thank you.

--
Steven D'Aprano
Re: [Tutor] %s %r with cutom type
On Fri, 12 Mar 2010 10:29:17 pm spir wrote:
> Hello again,
>
> A different issue. On the custom Unicode type discussed in another
> thread, I have overloaded __str__ and __repr__ to get encoded byte
> strings (here with debug prints & special formats to distinguish from
> builtin forms):
[...]
> Note that Unicode.__str__ is called neither by "print us", nore by
> %s. What happens? Why does the issue only occur when using both
> format %s & %s?

The print statement understands how to directly print strings (byte-strings and unicode-strings) and doesn't call your __str__ method.

http://docs.python.org/reference/simple_stmts.html#the-print-statement

You can demonstrate that with a much simpler example:

    >>> class K(unicode):
    ...     def __str__(self): return "xyz"
    ...     def __repr__(self): return "XYZ"
    ...
    >>> k = K("some text")
    >>> str(k)
    'xyz'
    >>> repr(k)
    'XYZ'
    >>> print k
    some text

print only calls __str__ if the object isn't already a string.

As for string interpolation, I have reported this as a bug:

http://bugs.python.org/issue8128

I have some additional comments on your class below:

> class Unicode(unicode):
>     ENCODING = "utf8"
>     def __new__(self, string='', encoding=None):

This is broken according to the Liskov substitution principle.

http://en.wikipedia.org/wiki/Liskov_substitution_principle

The short summary: subclasses should only ever *add* functionality, they should never take it away.

The unicode type has a function signature that accepts an encoding and an errors argument, but you've missed errors. That means that code that works with built-in unicode objects will break if your class is used instead. If that's intentional, you need to clearly document that your class is *not* entirely compatible with the built-in unicode, and preferably explain why you have done so.

If it's accidental, you should fix it. A good start is the __new__ method I posted earlier.

>         if isinstance(string,str):
>             encoding = Unicode.ENCODING if encoding is None else encoding
>             string = string.decode(encoding)
>         return unicode.__new__(Unicode, string)
>     def __repr__(self):
>         print '+',
>         return '"%s"' %(self.__str__())

This may be a problem. Why are you making your unicode class pretend to be a byte-string?

Ideally, the output of repr(obj) should follow this rule:

    eval(repr(obj)) == obj

For instance, for built-in unicode strings:

    >>> u"éâÄ" == eval(repr(u"éâÄ"))
    True

but for your subclass, us != eval(repr(us)). So again, code that works perfectly with built-in unicode objects will fail with your subclass.

Ideally, repr of your class should return a string like:

    "Unicode('...')"

but if that's too verbose, it is acceptable to just inherit the __repr__ of unicode and return something like "u'...'". Anything else should be considered non-standard behaviour and is STRONGLY discouraged.

>     def __str__(self):
>         print '*',
>         return '`'+ self.encode(Unicode.ENCODING) + '`'

What's the purpose of the print statements in the __str__ and __repr__ methods?

Again, unless you have a good reason to do differently, you are best to just inherit __str__ from unicode. Anything else is strongly discouraged.

> An issue happens in particuliar cases, when using both %s and %r:
>
> s = "éâÄ"

This may be a problem. "éâÄ" is not a valid str, because it contains non-ASCII characters. The result that you get may depend on your external environment. For instance, if I run it in my terminal, with encoding set to UTF-8, I get this:

    >>> s = "éâÄ"
    >>> print s
    éâÄ
    >>> len(s)
    6
    >>> list(s)
    ['\xc3', '\xa9', '\xc3', '\xa2', '\xc3', '\x84']

but if I set it to ISO 8859-1, I get this:

    >>> list("éâÄ")
    ['\xe9', '\xe2', '\xc4']

As far as I know, the behaviour of stuffing unicode characters into byte-strings is not well-defined in Python, and will depend on external factors like the terminal you are running in, if any. It may or may not work as you expect. It is better to do this:

    u = u"éâÄ"
    s = u.encode('utf-8')

which will always work consistently so long as you declare a source encoding at the top of your module:

    # -*- coding: UTF-8 -*-

--
Steven D'Aprano
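The two byte sequences shown in the message are simply the same three characters under two different encodings, which can be confirmed by encoding explicitly. This sketch is written for Python 3, where text and bytes are separate types; the escapes \u00e9, \u00e2 and \u00c4 are used for é, â and Ä so that the result does not depend on the source file's encoding.

```python
text = "\u00e9\u00e2\u00c4"   # the characters é, â, Ä

utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("iso-8859-1")

# Decoding with the matching codec recovers the original text in both cases.
round_trip_utf8 = utf8_bytes.decode("utf-8")
round_trip_latin1 = latin1_bytes.decode("iso-8859-1")
```

Starting from text and choosing the encoding explicitly gives six bytes under UTF-8 and three under ISO 8859-1 regardless of the terminal settings, which is the consistency the message recommends.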
Re: [Tutor] First program
On Sat, 13 Mar 2010 01:11:25 pm Ray Parrish wrote:
> Here's what I get from that, could you please explain why?
>
> >>> print('{0}, is not a valid choice'.format(choice))
> Traceback (most recent call last):
>   File "", line 1, in
> AttributeError: 'str' object has no attribute 'format'

The original poster is using Python 3.0 or 3.1; you are using an earlier version. The format method on strings was added in Python 2.6 and 3.0, so any version older than 2.6 raises this AttributeError.

--
Steven D'Aprano
Re: [Tutor] First program
On Sat, 13 Mar 2010 01:04:42 pm Ray Parrish wrote:
> > print "A %s with dimensions %sx%s has an area of %s." %
> > (choice, height, width, width*height)
>
> Hello,
>
> Isn't it a little more understandable to use a
> construct like the following?
>
> >>> print "The area of a " + Choice + "is " str(Width) + " x " +
> str(Height) + " equals " + str(Width * Height) + "
> square feet"
>
> The area of a rectangle is 12 x 10 equals 120
> square feet.
>
> I find that putting the variables on the end like
> that, when you're not actually applying any special formatting to them
> makes it less readable
> when I'm debugging my stuff, or when someone else
> is reading my code,
> and trying to understand it.

Of course you are welcome to use whatever coding standards you like, but I think you will find that among experienced coders, you are in a vanishingly small minority. As a beginner, I found string interpolation confusing at first, but it soon became second-nature. And of course, there are legions of C coders who are used to it.

I find an expression like:

    "The area of a " + Choice + "is " str(Width) + " x " + str(Height) + "equals " + str(Width * Height) + "square feet"

difficult to follow: too many quotes, too many sub-expressions being added, too many repeated calls to str(), and it isn't clear what is the template and what is being inserted into the template. It is too easy to miss a quote and get a SyntaxError, or to forget to add spaces where needed. To me, this is MUCH easier:

    template = "The area of a %s is %s x %s equals %s square feet"
    print template % (Choice, Width, Height, Width*Height)

One pair of quotes instead of five, no problems with remembering to add spaces around pieces, and no need to explicitly call str().

--
Steven D'Aprano
Re: [Tutor] %s %r with cutom type
ller's responsibility. If the user wants a string "cat" and they pass "C aT \n" instead, you're not responsible for fixing their mistake. If they want the unicode string u"éâÄ" and they pass the byte-string "\xe9\xe2\xc4" instead, that's not your problem either. > One main reason for my Unicode type (that > accepts both str and unicode). If all you want is a subclass of unicode which defaults to UTF-8 instead of ASCII for encoding, then I will agree with you 100%. That's a nice idea. But you seem to be taking a nice, neat, unicode subclass and trying to turn it into a swiss-army knife, containing all sorts of extra functionality to do everything for the user. That is a bad idea. > Anyway, all that source of troubles > disappears with py3 :-) > Then, I only need __str__ to produce nice, clear, unpolluted output. > > > which will always work consistently so long as you declare a source > > encoding at the top of your module: > > > > # -*- coding: UTF-8 -*- > > Yes, this applies to my own code. But what about user code calling my > lib? (This is the reason for Unicode.ENCODING config param). That is their responsibility, not yours. -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Escaping a single quote mark in a triple quoted string.
On Sun, 14 Mar 2010 04:33:57 am Ray Parrish wrote: > Hello, > > I am getting the following - > > >>> String = """http://www.rayslinks.com";>Ray's Links""" > >>> String > 'http://www.rayslinks.com";>Ray\'s Links' > > Note the magically appearing back slash in my result string. You are confusing the printed representation of the string with the contents of the string. The backslash is not part of the string, any more than the leading and trailing quotes are part of the string. They are part of the display of the string. Consider: >>> s = "ABC" # Three characters A B C. >>> s # Looks like five? 'ABC' >>> len(s) # No, really only three. 3 The quotes are not part of the string, but part of the printable representation. This is supposed to represent what you would type to get the string ABC. You have to type (quote A B C quote). Now consider: >>> s = """A"'"B""" # Five chars A double-quote single-quote d-quote B >>> s # Looks like eight? 'A"\'"B' >>> len(s) # But actually only five. 5 When printing the representation of the string, Python always wraps it in quotation marks. If the contents include quotation marks as well, Python will escape the inner quotation marks if needed, but remember this is only for the display representation. If you want to see what the string looks like without the external quotes and escapes: >>> print s A"'"B > >>> NewString = """'""" > >>> NewString > > "'" > Hmmm, no back slash this time... In this case, the string itself only contains a single-quote, no double-quote, so when showing the representation, Python wraps the contents with double-quotes and there is no need to escape the single-quote. 
Python's rules for showing the representation of the string include:

* wrap the string contents in single quotes '
* unless the string contains single quotes, in which case wrap it in
  double quotes " and display the single quotes unescaped
* unless the string contains double quotes as well, in which case wrap
  it in single quotes and escape the inner single quotes.

But remember: this is only the display of the string, not the contents.

--
Steven D'Aprano
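The three cases in the rules above can be checked with a short sketch. The sample strings are the ones used in this thread; the loop is only an illustration, written so that it also runs on Python 3:

```python
# One string per rule: no quotes, a single quote only, and both kinds.
samples = ["ABC", "'", 'A"\'"B']

for s in samples:
    print(len(s))   # number of characters actually in the string
    print(repr(s))  # what the interactive prompt would display
    print(s)        # the contents, without outer quotes or escapes
```

In each case len(s) counts only the contents; the outer quotes and any backslashes belong to the repr() display.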
Re: [Tutor] Declaring a compound dictionary structure.
On Sun, 14 Mar 2010 10:06:49 am Ray Parrish wrote:
> Hello,
>
> I am stuck on the following -
>
> # Define the Dates{}
> dictionary structure as a dictionary
> # containing two dictionaries,
> each of which contains a list.
> Dates =
> {Today:{ThisIPAddress:[]},
> Tomorrow:{ThisIPAddress:[]}}
>
> How do I pass this declaration empty values for
> Today, Tomorrow, and ThisIPAddress to initially
> declare it?

You don't. Once you create a key, you can't modify it. So if you create an empty value for Today etc., it stays empty.

> The idea behind the structure is to sort through a
> server log that contains entries for two dates,
[...]

Just work out the dates beforehand, and populate the dictionary that way:

    today = "2010-03-13"
    tomorrow = "2010-03-14"
    thisIP = "123.456.789.123"
    entries = {today: {thisIP: []}, tomorrow: {thisIP: []}}

Or don't pre-populate the dict at all.

    entries = {}
    for line in logfile:
        # Process the line to get a date, an IP address, and visitor
        date = ...
        address = ...
        visitor = ...
        entry = entries.get(date, {})
        x = entry.get(address, [])
        x.append(visitor)
        entry[address] = x
        entries[date] = entry

The trick is to use get to look up the dictionary: entries.get(date, {}) looks up date in entries, and if it isn't found, it returns an empty dict {} instead of failing. Similarly for looking up the IP address, except that the default there is an empty list [], because that is what the visitors get appended to.

--
Steven D'Aprano
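For comparison, the same grouping can be sketched with collections.defaultdict from the standard library, which creates the missing inner dict and list on first access. This is only an alternative sketch: the function name group_visitors and the (date, address, visitor) tuples are made up for illustration, and parsing the real log lines is left out.

```python
from collections import defaultdict


def group_visitors(records):
    """Group visitors by date, then by IP address.

    records is assumed to be an iterable of (date, address, visitor)
    tuples that have already been parsed out of the log.
    """
    # Outer dict: date -> inner dict; inner dict: address -> list.
    entries = defaultdict(lambda: defaultdict(list))
    for date, address, visitor in records:
        entries[date][address].append(visitor)
    return entries


# Hypothetical, already-parsed log records.
records = [
    ("2010-03-13", "192.0.2.1", "alice"),
    ("2010-03-13", "192.0.2.2", "bob"),
    ("2010-03-14", "192.0.2.1", "alice"),
    ("2010-03-13", "192.0.2.1", "carol"),
]
entries = group_visitors(records)
print(sorted(entries))
print(entries["2010-03-13"]["192.0.2.1"])
```

One design point to be aware of: merely looking up a missing key in a defaultdict creates it, so use "key in entries" when you only want to test for presence.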
Re: [Tutor] raw_input()
On Tue, 16 Mar 2010 09:52:26 am kumar s wrote:
> Dear group:
> I have a large file 3GB. Each line is a tab delim file.
[...]
> Now I dont want to load this entire file. I want to give each line as
> an input and print selective lines.

    datafile = open("somefile.data", "r")
    for line in datafile:
        print line,

will print each line. (The trailing comma stops print from adding a second newline, since each line read from the file already ends with one.)

If you want to print selected lines, you have to explain what the condition is that decides whether to print it or not. I'm going to make something up: suppose you want the line to only print if the eighth column is "A":

    def condition(line):
        line = line.strip()
        fields = line.split('\t')
        return fields[7].strip().upper() == "A"

    datafile = open("somefile.data", "r")
    for line in datafile:
        if condition(line):
            print line,

will print only the lines where column 8 is the letter A.

--
Steven D'Aprano
Re: [Tutor] raw_input()
On Tue, 16 Mar 2010 10:22:55 am kumar s wrote: > thanks Benno. > > supplying 3.6 GB file is over-kill for the script. Then supply a smaller file. > This is the reason I chose to input lines on fly. I don't understand what you are trying to say. -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] raw_input()
On Tue, 16 Mar 2010 11:27:13 am you wrote: > I think he thinks that python is going to read the whole 3.6GB of > data into memory in one hit, rather than using a small amount of > memory to process it line by line. But "for line in datafile" in your > code above uses a generator, right? So I don't think it's a problem - > correct me if I'm wrong. No, you are correct -- "for line in file" reads one line at a time. Beware, though, if the file isn't line-oriented, then each "line" (separated with a newline character) could be huge. -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] os.popen4 help!
On Wed, 17 Mar 2010 07:56:40 am Jeff Peery wrote: > Hello, > I'm trying to run an executable file from a python script. the > executable is "opcenum.exe". [...] > currently I do the below: > > import os > cmd = 'opcenum.exe/IOPCServerList/EnumClassesofCategory > 63D5F432-CFE4-11d1-B2C8-0060083BA1FB' > fin,fout = os.popen4(cmd) > result = fout.read() > > this doesn't work, and it hangs up on fout.read(). What happens if you run the exact same command from the shell? I assume you're using Windows. Open a DOS Window (command.com or cmd.exe or whatever it is called) and run: opcenum.exe/IOPCServerList/EnumClassesofCategory 63D5F432-CFE4-11d1-B2C8-0060083BA1FB and see what it does. Once you get the syntax right in the shell, then use the exact same syntax in popen4. -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] difflib context to string-object?
On Fri, 19 Mar 2010 09:55:06 pm Karjer Jdfjdf wrote: > With difflib.context_diff it is possible to write the context to > files. > > difflib.context_diff(a, b[, fromfile][, tofile][, fromfiledate][, > tofiledate][, n][, lineterm]) > > Is it also possible to do this to seperate string-objects instead of > writing them to files? Use StringIO objects, which are fake files that can be read as strings. Untested and entirely from memory, use: from StringIO import StringIO fromfile = StringIO("some text goes here") tofile = StringIO("some different text goes here") -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
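A minimal sketch, assuming the goal is simply to get the diff as one string. difflib.context_diff takes two sequences of lines and yields the output lines one at a time; the fromfile and tofile arguments are only labels printed in the header, so no files need to exist. The helper name context_diff_text and the labels "before" and "after" are made up for this example.

```python
import difflib


def context_diff_text(old_text, new_text, old_name="before", new_name="after"):
    """Return the context diff of two strings as a single string."""
    # context_diff wants sequences of lines; keep the line endings so
    # the pieces can be joined back together unchanged.
    old_lines = old_text.splitlines(True)
    new_lines = new_text.splitlines(True)
    diff = difflib.context_diff(old_lines, new_lines,
                                fromfile=old_name, tofile=new_name)
    return "".join(diff)


result = context_diff_text("a\nb\n", "a\nc\n")
print(result)
```

If the two inputs are identical, the generator yields nothing and the function returns an empty string.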
Re: [Tutor] using pythnn to open a password protected website
On Fri, 19 Mar 2010 11:33:36 pm richard west wrote: > Hi, > > Im trying to use python to open up a password protected website(e.g. > facebook / gmail) in Firefox. supplying the login and password > automatically at runtime - so that I can interface my code with > fingerprint recognition code. So far I have only found urllib, > urllib2 and web browser, none of which I have been able to use. > > urllib2 can log me in, but then I cant pass this authenicated login > to the browser. At the very least I need to be able to send the form > post information to the browser via my code, and preferably > auto-submit the form. You might like to look at Mechanize, which is a third-party Python project for dealing with just that sort of problem. -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Finding duplicates entry in file
On Sun, 21 Mar 2010 03:34:01 am Ken G. wrote:
> What is a method I can use to find duplicated entry within a sorted
> numeric file?
>
> I was trying to read a file reading two lines at once but apparently,
> I can only read one line at a time.

    f = open("myfile")
    while True:
        first = f.readline()   # Read one line.
        second = f.readline()  # And a second.
        process(first)
        process(second)
        if second == '':
            # If the line is empty, that means we've passed the
            # end of the file and we can stop reading.
            break
    f.close()

Or if the file is small (say, less than a few tens of megabytes) you can read it all at once into a list:

    lines = open("myfile").readlines()

> Can the same file be opened and read two times within a program?

You can do this:

    text1 = open("myfile").read()
    text2 = open("myfile").read()

but why bother? That's just pointlessly wasteful. Better to do this:

    text1 = text2 = open("myfile").read()

which is no longer wasteful, but probably still pointless. (Why do I need two names for the same text?)

> For example, a file has:
>
> 1
> 2
> 2
> 3
> 4
> 4
> 5
> 6
> 6
>
> The newly revised file should be:
>
> 1
> 2
> 3
> 4
> 5
> 6

Unless the file is huge, something like this should do:

    # Untested
    lines = open("myfile").readlines()
    f = open("myfile", "w")
    previous_line = None
    for line in lines:
        if line != previous_line:
            f.write(line)
        previous_line = line
    f.close()

--
Steven D'Aprano
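The duplicate-dropping loop above can be tried without touching any file. This is a small in-memory sketch of the same idea; the function name drop_adjacent_duplicates is made up for the example, and it relies on the input being sorted so that duplicates sit next to each other:

```python
def drop_adjacent_duplicates(lines):
    """Return lines with each run of identical neighbours cut to one."""
    result = []
    previous = None
    for line in lines:
        if line != previous:
            result.append(line)
        previous = line
    return result


# The sample data from the question, as it would come from readlines().
lines = ["1\n", "2\n", "2\n", "3\n", "4\n", "4\n", "5\n", "6\n", "6\n"]
cleaned = drop_adjacent_duplicates(lines)
print("".join(cleaned))
```

On unsorted input this only removes duplicates that happen to be adjacent, which is why the sorted file matters.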
Re: [Tutor] Efficiency and speed
On Sat, 20 Mar 2010 03:41:11 am James Reynolds wrote:
> I've still been working towards learning the language, albeit slowly
> and I've been working on a project that is somewhat intense on the
> numerical calculation end of things.
>
> Running 10,000 trials takes about 1.5 seconds and running 100,000
> trials takes 11 seconds. Running a million trials takes about 81
> seconds or so. I don't think 1M trials is needed for accuracy, but
> 100K probably is.

That's between roughly 80 and 150 microseconds per trial (1.5 s / 10,000 = 150 µs, 11 s / 100,000 = 110 µs, 81 s / 1,000,000 = 81 µs). Why do you think that's necessarily slow? How much work are you doing per trial?

> I've made a few other optimizations today that I won't be able to
> test until I get home, but I was wondering if any of you could give
> some general pointers on how to make python run a little more
> quickly.

Read this:

http://www.python.org/doc/essays/list2str.html

Google on "Python optimizations".

And you can watch this talk from the recent Python conference:

http://pycon.blip.tv/file/3322261/

Unfortunately the video is only available as a 1GB file, so I haven't watched it myself. But if you have a fast Internet link with lots of quota, go ahead.

[snip]

> def mcrange_gen(self, sample):#, lensample):
>     lensample = len(sample) # this section is just for speed. All of
> > these are renames from the globals to bring calc times down at the
> > expense of memory. I haven't tested these yet.
>
>     nx2 = self.nx1
>     nx2_append = nx2.append
>     nx2_sort = nx2.sort
>     nx2_reverse = nx2.reverse
>     nx2_index = nx2.index
>     nx2_remove = nx2.remove

All this becomes white noise. That's the problem with optimizations, particularly micro-optimizations: so often they make the code less readable.

Unless you have profiled your code and determined that these micro-optimizations make a significant difference, I'd say drop them, they're not worth it. This sort of micro-optimization is the last thing you should be adding.
> for s in range(lensample): > q = sample[s] #takes the next randomly generated number from > > the sample list > nx2_append(q) # and appends it to nx list. > nx2_sort() # and sorts it in place > nx2_reverse() # reverses the list, because this was the > > original order > i = nx2_index(q) #get the index of that element > nx2_remove(q) # and remove the element. > yield i # send the position of that element back to the main > > program. Your naming conventions are ... weird. "s" for an integer index, "q" for a sample -- what's with that? And what's nx1 and nx2? Writing comments is a good thing, but only when the comments actually add something to the code. Nearly every one of the above comments is useless. For instance, you write: nx2_append(q) # and appends it to nx list. Really? Calling "append" appends? Who would have guessed! nx2_sort() # and sorts it in place Sort sorts. Wow. I trust I've made my point. There's no need to repeat the code in plain English as a comment. It just becomes noise. Since s is never used except to get the sample, I'd do this: nx = self.nx1 # This is more a typing optimization than for speed. for q in sample: nx.append(q) nx.sort(reverse=True) i = nx.index(q) nx.remove(q) yield i Now, let's look at what happens in the first four lines of the loop. Appending to a list is fast. You almost certainly don't need to care about that. Sorting is fast-ish. Python's sort is amazingly fast, as far as sorts go, but sorting is a naturally expensive operation, and you don't sort the data once, you sort it every time through the loop. Tick this as something that *maybe* you want to get rid of. index() is an O(N) operation -- slow but not painfully slow. Unless your list is short, this is another "maybe" to remove. remove() is another expensive operation. Unless your list is very short, this is something else you want to avoid when possible. 
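To make the cost discussion concrete, here is a hedged sketch of one way to get the same index without appending, sorting and removing on every pass. It assumes the bins never change between samples and that only the position is wanted; the function names and sample values are made up. In a descending list, the position where q would land is the number of bins strictly greater than q, and bisect can find that in an ascending copy.

```python
import bisect


def positions_naive(bins, sample):
    """Mirror the loop above: append, sort descending, index, remove."""
    nx = sorted(bins, reverse=True)
    result = []
    for q in sample:
        nx.append(q)
        nx.sort(reverse=True)
        result.append(nx.index(q))
        nx.remove(q)
    return result


def positions_bisect(bins, sample):
    """Same indexes, using one sort up front and a binary search per sample."""
    ascending = sorted(bins)
    n = len(ascending)
    # bisect_right counts the bins <= q, so n minus that is the number
    # of bins strictly greater than q, which is q's descending position.
    return [n - bisect.bisect_right(ascending, q) for q in sample]


bins = [10.0, 9.0, 7.0, 3.0, 2.0, 1.0]
sample = [9.8, 0.5, 7.0, 11.0]
print(positions_naive(bins, sample))
print(positions_bisect(bins, sample))
```

Whether this is actually faster for your data is something to measure, for example with the timeit module, rather than assume.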
-- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Efficiency and speed
On Sat, 20 Mar 2010 05:47:45 am James Reynolds wrote:
> This is a monte-carlo simulation.
>
> The simulation measures the expiration of something and those
> somethings fall into bins that are not evenly dispersed. These bins
> are stored in the nx list mentioned previously.
>
> So let's say you have the bins, a, b,c,d,e,f and you have the value z
> from the sample list where z >b and <= a. In this case, it should
> return the index value at position (a).

I'm not sure I understand completely. An example might help. I *think* you have a list like this:

    nx = [10.0, 9.0, 7.0, 3.0, 2.0, 1.0]

and if you have a value like z = 9.8 you want to return the index 0. Correct?

That seems a bit funny. In my experience it is normal to have the bins in the opposite direction. I suppose it probably doesn't matter that much, but it does seem a bit unusual.

If nx is fairly short (say, less than 40 or 50 items), then the fastest way is probably a linear search, something like this:

    def search_bins(nx, z):
        """Search bins nx for item z.

        >>> bins = [5.0, 4.0, 2.0, 1.0]
        >>> search_bins(bins, 1.2)
        2

        If z is not in the bins, returns -1:

        >>> search_bins(bins, 5.1)
        -1

        """
        for i, value in enumerate(nx):
            if z > value:
                return i-1
        return -1

--
Steven D'Aprano
Re: [Tutor] Finding duplicates entry in file
On Sun, 21 Mar 2010 09:10:52 am Luke Paireepinart wrote: > On Sat, Mar 20, 2010 at 4:50 PM, Ken G. wrote: > > Thanks for the info. I already adopted a program from another > > person and it works like a charm. As for your question, I had no > > idea of if I had duplicate or more as there was some 570 line > > items. I whittled it down to 370 line entries. Whew. > > Can you please try to post to the list in plain-text rather than > HTML? it is very hard to read your font on my system. What client are you using? Ken's post includes both a plain text part and a HTML part. Any decent mail client I know of gives you the choice of which to view. -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] parsing pyc files
On Tue, 23 Mar 2010 10:01:11 pm Jojo Mwebaze wrote: > Researchers at our university are allowed to checkout code from CVS, > make modifications, change variables/parameters and run experiments.. > After experiments are run, results are published. (However we don't > allow them to commit the changes, till changes are approved) > > take an example, two researchers can run two experiments on the same > data set but get different results depending on what someone > did/changed. > > So the problem is how to compare both results, We need to know how > the results were generated. e.g methods invoked, parameters/variables > passed to that method, and probably changes made to the classes and > probably store this information as part of the data. Then ask the scientists for the source code. If you suspect that they will give you a version of code which is different from the version they actually ran, then run it on the same data. If it doesn't give the exact same results, send it back with this link: http://en.wikipedia.org/wiki/Scientific_misconduct -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] parsing pyc files
On Tue, 23 Mar 2010 12:16:03 pm Jojo Mwebaze wrote:
> Hello Tutor
>
> I have two problems, any help will be highly appreciated.
>
> How is possible to trace the all method calls, object instantiations,
> variables used in running an experiment dynamically, without putting
> print - or log statements in my code? - some sort of provenance!

Look at the profile module for starters.

>>> def test(x):
...     y = -42
...     for i in xrange(10000):
...         y += func(i)
...     return x + y
...
>>> def func(x):
...     return x
...
>>>
>>> import profile
>>> profile.run("test(3)")
         10004 function calls in 0.286 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.002    0.002    0.002    0.002 :0(setprofile)
    10000    0.145    0.000    0.145    0.000 :1(func)
        1    0.139    0.139    0.284    0.284 :1(test)
        1    0.000    0.000    0.284    0.284 :1()
        0    0.000             0.000          profile:0(profiler)
        1    0.000    0.000    0.286    0.286 profile:0(test(3))

> I would like to create a tool that can look into pyc files to find
> classes/methods that was executed without looking the the source
> code. Is this possible in python.

Look at the dis module.

>>> import dis
>>> dis.dis(test)
  2           0 LOAD_CONST               1 (-42)
              3 STORE_FAST               1 (y)

  3           6 SETUP_LOOP              36 (to 45)
              9 LOAD_GLOBAL              0 (xrange)
             12 LOAD_CONST               2 (10000)
             15 CALL_FUNCTION            1
             18 GET_ITER
        >>   19 FOR_ITER                22 (to 44)
             22 STORE_FAST               2 (i)

  4          25 LOAD_FAST                1 (y)
             28 LOAD_GLOBAL              1 (func)
             31 LOAD_FAST                2 (i)
             34 CALL_FUNCTION            1
             37 INPLACE_ADD
             38 STORE_FAST               1 (y)
             41 JUMP_ABSOLUTE           19
        >>   44 POP_BLOCK

  5     >>   45 LOAD_FAST                0 (x)
             48 LOAD_FAST                1 (y)
             51 BINARY_ADD
             52 RETURN_VALUE

--
Steven D'Aprano
Re: [Tutor] Press Enter to quit. Silently maybe.
On Wed, 24 Mar 2010 07:47:40 am Wayne Watson wrote:
> I use this code to quit a completed program.

What on earth for? If the program is complete, just quit.

In my opinion, there is very little worse than setting up a chain of long-running programs to run overnight, then coming back in the morning expecting that they will all be finished, only to discover that one of those programs is stupidly sitting there with a "Press any key to quit" message, stopping all the rest from running.

In my opinion, such behaviour should be a shooting offense. *wink*

> If no is selected for
> the yes/no prompt, warning messages appear in the shell window. I'm
> executing from IDLE. Is there a way to just return to the >>> prompt
> there?
>
> def finish():
>     print; print "Bye"
>     print
>     raw_input('Press Enter to quit')
>     sys.exit()

What yes/no prompt? How do you select No?

--
Steven D'Aprano
Re: [Tutor] Interpolation function
On Fri, 26 Mar 2010 07:41:35 am Armstrong, Richard J. wrote: > Hello all, > > > > Does anyone have a suggestion for a good interpolation function in > numpy or scipy. I have an earthquake time history with a time step of > 0.005 sec and want to convert it to a time history with a time step > of say 0.01. The interpolation function numpy.interp is too "coarse" > and modifies the characteristic of the time history too much. You probably should take that question to a dedicated numpy or scipy mailing list, where the folks are more likely to be numerically sophisticated. Unless there happens to be an experienced numpy/scipy user hanging around here, any answer we give would be just guessing. -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Odds and even exercise
On Sat, 27 Mar 2010 11:55:01 pm TJ Dack wrote: > Hi, just started python at Uni and think i am in for a rough ride > with zero prior experience in programming. > Anyway my problem that i can't fix my self after googling. > > The exercise is to generate a list of odd numbers between 1-100 and > the same for even numbers. > > So far this is what i haveCODE: SELECT > ALL<http://python-forum.org/pythonforum/viewtopic.php?f=3&t=17610#>#A > way to display numbers 1 - 100 > numbers = range(100) > #A way to display all odd numbers > odd = numbers[::2] > #A way to display all even numbers Nice try! Sadly you *just* missed getting it though. Here's an example, using 1-10 instead of 1-100: >>> numbers = range(10) >>> odd = numbers[::2] >>> print numbers [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] >>> print odd [0, 2, 4, 6, 8] So you have two small errors: you have the numbers 0-99 instead of 1-100, and consequently the list you called "odd" is actually even. This is what programmers call an "off by one" error. Don't worry, even the best programmers make them! The first thing to remember is that Python's range function works on a "half-open" interval. That is, it includes the starting value, but excludes the ending value. Also, range defaults to a starting value of 0, but you need a starting value of 1. So you need: numbers = range(1, 101) # gives [1, 2, 3, ... 99, 100] Now your odd list will work: odd = numbers[::2] # gives [1, 3, 5, ... 99] How to get even numbers? Consider: Position: 0, 1, 2, 3, 4, 5, ... # Python starts counting at zero Number: 1, 2, 3, 4, 5, 6, ... # The values of list "numbers" Odd/Even: O, E, O, E, O, E, ... Every second number is even, just like for the odd numbers, but instead of starting at position zero you need to start at position one. Do you know how to do that? Hint: numbers[::2] means: start at 0, finish at the end of the list, and return every 2nd value. You want: start at 1, finish at the end of the list, and return every 2nd value. 
> I can't find a way to easily list the even numbers, i really need a > easier way to find answers my self but using the help docs in idle > didn't get me far, any tips that you guys have when you come across > something you can't solve? So far you seem to be doing well: think about the problem, google it, think about it some more, ask for help. Also, try solving the problem with pencil and paper first, then repeat what you did using Python. Finally, experiment! Open up the Python interpreter, and try things. See what they do. See if you can predict what they will do before you do them. For example, what would these give? range(1, 101, 3) range(2, 101, 4) Try it yourself and see if you predicted correctly. -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Odds and even exercise
On Sun, 28 Mar 2010 09:33:23 am yd wrote: > I find it easy to do all this stuff with list comprehensions, but > i am a beginner so this might not be the most efficient way to do it > > numbers=[] > for x in range(1,101): > numbers.append(x) That certainly isn't efficient! In Python 2.x, this is what it does: (1) Create an empty list and call it "numbers". (2) Create a list [1, 2, ... 100] (3) Set up a for-loop. (4) Read the first number from the list (1) and call it x. (5) Add x to the end of numbers. (6) Read the second number from the list (2) and call it x. (7) Add x to the end of numbers. (8) Read the third number from the list and call it x. (9) Add x to the end of numbers. ... (202) Read the 100th number from the list and call it x. (203) Add x to the end of numbers. (204) Finish up the for-loop. Better to just say: (1) Create a list [1, 2, ... 100] and call it "numbers". numbers = range(1, 101) In Python 3.x, it is exactly the same except for step 2, which creates a lazy range-object which only stores one item at a time. So the solution in Python 3.x is to convert it into a list: numbers = list(range(1, 101)) > > #A way to display all odd numbers > > odd = numbers[::2] > > instead i do this: > odd=[] > for y in range(1,101,2): > odd.append(y) This does just as much unnecessary work as above. -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
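A short sketch of the point above, written so that it runs unchanged on Python 2 and 3: range can produce the whole list, or every second number, directly, with no loop and no append. The variable names follow the thread; nothing else is assumed.

```python
# All the numbers 1..100 in one step. list() is needed on Python 3,
# where range is lazy, and is harmless on Python 2.
numbers = list(range(1, 101))

# The third argument to range is the step, so this starts at 1 and
# takes every second number up to (but not including) 101.
odd = list(range(1, 101, 2))

print(numbers[:5])
print(odd[:5])
print(odd == numbers[::2])
```

The last line confirms that stepping in range and slicing the full list with [::2] give the same result.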
Re: [Tutor] inter-module global variable
ot; should _receive_ as > general parameter a pointer to 'w', before they do anything. Yes, this is better than "really global" globals, but not a lot better. > In other > words, the whole "code" module is like a python code chunk > parameterized with w. If it would be a program, it would get w as > command-line parameter, or from the user, or from a config file. > Then, all instanciations should be done using this pointer to w. > Meaning, as a consequence, all code objects should hold a reference > to 'w'. This could be made as follows: If every code object has a reference to the same object w, that defeats the purpose of passing it as an argument. It might be local in name, but in practice it is "really global", which is dangerous. > # main module > import code > code.Code.w = w Why not just this? code.w = w And where does w come from in the first place? Shouldn't it be defined in code.py, not the calling module? > from code import * This is generally frowned upon. You shouldn't defeat Python's encapsulation of namespaces in that way unless you absolutely have to. > # "code" module > class Code(object): > w = None ### to be exported from importing module That sets up a circular dependency that should be avoided: Code objects are broken unless the caller initialises the class first, but you can't initialise the class unless you import it. Trust me, you WILL forget to initialise it before using it, and then spend hours trying to debug the errors. > def __init__(self, w=Code.w): > # the param allows having a different w eg for testing > self.w = w This needlessly gives each instance a reference to the same w that the class already has. Inheritance makes this unnecessary. You should do this instead: class Code(object): w = None # Better to define default settings here. def __init__(self, w=None): if w is not None: self.w = w If no w is provided, then lookups for instance.w will find the shared class attribute w. [...] 
> But the '###' line looks like an ugly trick to me. (Not the fact > that it's a class attribute; as a contrary, I often use them eg for > config, and find them a nice tool for clarity.) The issue is that > Code.w has to be exported. It is ugly, and fragile. It means any caller is *expected* to modify the w used everywhere else, in strange and hard-to-predict ways. -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
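The class-attribute fallback recommended above can be seen in a minimal sketch. The string values stand in for whatever w really is, which the thread never shows, so treat them as placeholders.

```python
class Code(object):
    w = None  # Shared default, defined in one place.

    def __init__(self, w=None):
        # Only store w on the instance when the caller overrides it.
        if w is not None:
            self.w = w


Code.w = "shared setting"         # placeholder for the real shared object
default_obj = Code()              # no override: falls back to Code.w
test_obj = Code("test setting")   # override, e.g. for testing

print(default_obj.w)
print(test_obj.w)

# Changing the class attribute is seen by instances without an override.
Code.w = "new shared setting"
print(default_obj.w)
print(test_obj.w)
```

Instances that were not given their own w hold no copy of it at all; attribute lookup simply falls through to the class.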
Re: [Tutor] help
On Mon, 29 Mar 2010 01:00:45 pm Oshan Modi wrote: > i am only a novice and just started programming.. i am having trouble > running a .py file in the command prompt.. if anyone of you could > help? Are you running Linux or Mac? At the command prompt, run: python myfile.py and report any errors. You can also run: which python and see what it says. -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Script Feedback
On Wed, 31 Mar 2010 01:27:43 am Damon Timm wrote:
[...]
> My initial questions are:
>
> 1. Is there a better way to implement a --quiet flag?

I usually create a function "print_" or "pr", something like this:

    def print_(obj, verbosity=1):
        if verbosity > 0:
            print obj

and then have a variable "verbosity" which defaults to 1 and is set to 0 if the user passes the --quiet flag. Then in my code, I write:

    print_("this is a message", verbosity)

> 2. I am not very clear on the use of Exceptions (or even if I am
> using it in a good way here) -- is what I have done the right
> approach?

Hmmm... perhaps. Usually, you want to catch an exception in order to try an alternative approach, e.g.:

    try:
        result = somefunction(x)
    except ValueError:
        # Fall back on some other approach.
        result = "something else"

Occasionally, you want to catch an exception just to ignore it. It's generally *not* a good idea to catch an exception just to print an error message and then exit, as you do. Just let the exception propagate, and Python will print a rich and detailed traceback together with your error message.

However, a reasonable exception (pun intended) to that rule is to hide the traceback from the users, who probably can't do anything about it, and would only get confused by the details. So you want to have your functions and classes raise exceptions, and the application layer (the user interface) catch them.

So I would do something like this (untested). In your function code:

    ...
    if os.path.exists(tar_name):
        msg = "A tar file already exists with this directory name." \
              " Move or rename it and try again."
        raise DestinationTarFileExists(msg)

Then your application layer looks something like:

    if __name__ == '__main__':
        try:
            ...
        except KeyboardInterrupt:
            sys.exit(1)
        except DestinationTarFileExists, e:
            print e.message
            sys.exit(2)
        # Any other exception is unexpected, so we allow Python
        # to print the full traceback as normal.

> 3. Finally, in general: any feedback on how to improve
> this? (I am thinking, just now, that the script is only suitable for
> a command line usage, and couldn’t be imported by another script, for
> example.)

Separate the underlying functionality from the application-level code. These functions should NEVER print anything: they do all communication through call-backs, or by returning a value, or raising an exception. E.g.:

    def tar_bzip2_directories(directories, callback=None):
        for directory in directories:
            file_name = '-'.join(directory.split(' '))
            tar_name = file_name.replace('/','').lower() + ".tar.bz2"
            if os.path.exists(tar_name):
                # errmsg is the error message, built as shown earlier.
                raise DestinationTarFileExists(errmsg)
            if callback is not None:
                callback(directory, file_name, tar_name)
            ...

Create a main function that runs your application:

    def main(argv=None, callback=None):
        if argv is None:
            argv = sys.argv
        process_command_line_options(argv)
        if callback is None:
            def callback(dirname, filename, tarname):
                print "Processing ..."
        tar_bzip2_directories(...)

    if __name__ == '__main__':
        try:
            main()
        except ...  # as above

Now anyone can import the module and call individual functions, or even call main, or they can run it as a script.

Hope this helps,

-- Steven D'Aprano
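The split between library code that never prints and an application layer that does can be sketched in runnable form. This is only an illustration written for Python 3 (the thread itself uses Python 2): DestinationExists, make_archives and the existing parameter are hypothetical stand-ins, and no tar file is actually created.

```python
import sys


class DestinationExists(Exception):
    """Hypothetical error raised when a target name is already taken."""


def make_archives(names, existing, callback=None):
    """Library layer: return the archive names it would create; never prints."""
    created = []
    for name in names:
        tar_name = name.replace(' ', '-').lower() + ".tar.bz2"
        if tar_name in existing:
            raise DestinationExists("%s already exists" % tar_name)
        if callback is not None:
            callback(name, tar_name)   # progress is reported through the callback
        created.append(tar_name)
    return created


def main(names, existing=(), quiet=False):
    """Application layer: decides what to print and which exit code to return."""
    def report(name, tar_name):
        print("Processing %s -> %s" % (name, tar_name))

    try:
        make_archives(names, existing, callback=None if quiet else report)
    except DestinationExists as err:
        # Hide the traceback from the user; show only the message.
        print(err, file=sys.stderr)
        return 2
    return 0
```

Here the --quiet behaviour falls out of the design: a quiet run simply passes no callback, so the library code needs no verbosity flag of its own.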
Re: [Tutor] Script Feedback
On Wed, 31 Mar 2010 10:54:34 am Damon Timm wrote:
> > Separate the underlying functionality from the application-level
> > code. These functions should NEVER print anything: they do all
> > communication through call-backs, or by returning a value, or
> > raising an exception.
>
> I tried to implement this, however, I am not sure how the 'callback'
> works ... is that just a function that a user would pass to *my*
> function that gets called at the end of the script?

Not necessarily at the end of the script.

You usually find callbacks used a lot in coding for graphical user interfaces (GUIs). For instance, the GUI framework might provide a Button class. The caller creates a new Button object, and provides a callback function which gets called automatically by the framework whenever the user clicks the button.

The built-in function map is very similar. map is more or less equivalent to the following:

    def map(callback, sequence):
        result = []
        for item in sequence:
            result.append(callback(item))
        return result

except the real map is more efficient, and accepts multiple sequences. It's not common to describe the function argument to map as a callback, but that's essentially what it is. In this example, the callback function needs to be a function that takes a single arbitrary argument. The caller can pass any function they like, so long as it takes a single argument, and map promises to call that function with every item in the sequence.

There's nothing special about callbacks, except that they have to take the arguments which the library function promises to supply.

-- Steven D'Aprano
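Both flavours of callback mentioned in this reply can be tried out with a short sketch. The Button class below is a made-up stand-in for a GUI toolkit class, not a real framework API, and apply_to_each is simply the map equivalent under a different name so that it does not shadow the built-in.

```python
def apply_to_each(callback, sequence):
    """Rough Python-level equivalent of the one-sequence form of map()."""
    result = []
    for item in sequence:
        result.append(callback(item))   # the library promises one argument per call
    return result


class Button:
    """Hypothetical stand-in for a GUI button; not a real toolkit class."""

    def __init__(self, on_click):
        self.on_click = on_click

    def click(self):
        # A real framework would call this when the user clicks the button.
        return self.on_click()


clicks = []
button = Button(on_click=lambda: clicks.append("clicked"))
button.click()

print(apply_to_each(len, ["a", "bb"]), clicks)
```

In both cases the caller supplies the function and the library decides when to call it and with which arguments.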
Re: [Tutor] Help with simple text book example that doesn't work!!!
On Sun, 4 Apr 2010 03:40:57 pm Brian Drwecki wrote:
> Hi all... I am working from the Learning Python 3rd edition published
> by O'Reily... FYI I am trying to learn Python on my own (not for
> course credit or anything).. I am a psychologist with very limited
> programming experience.. I am anal, and this example code doesn't
> work.. I am using IDLE to do everything (ni ni ni ni ni)
>
> So here is the code the book give me..
>
> while True:
>     reply = raw_input('Enter text:')
>     if reply == 'stop':
>         break
>     elif not reply.isdigit( ):
>         print 'Bad!' * 8
>     else:
>         print int(reply) ** 2
> print 'Bye'
>
>
> Idle gives me this error SyntaxError: invalid syntax (it highlights
> the word print in the print 'bye' line..

Please do an exact copy and paste of the error and post it, rather than paraphrasing the error.

In the meantime, a couple of guesses...

Are you sure you are using Python 2.6? If you are using 3.1, that would explain the failure. In Python 3, print stopped being a statement and became an ordinary function that requires parentheses.

In Python 2.6, one way to get that behaviour is with the special "from __future__ import" statement:

    >>> from __future__ import print_function
    >>> print "Hello world"
      File "<stdin>", line 1
        print "Hello world"
                          ^
    SyntaxError: invalid syntax
    >>> print("Hello world")
    Hello world

Alternatively, sometimes if you have an error on one line, the interpreter doesn't see it until you get to the next, and then you get a SyntaxError on one line past the actual error. E.g. if you forgot to close the bracket:

    ...
    else:
        print int(reply ** 2
    print 'Bye'

then you would (probably) get a SyntaxError on the line with the print.

-- Steven D'Aprano
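If the cause does turn out to be Python 3, the book's loop needs print written as a function call (and input in place of raw_input). The sketch below is one way to express the same logic for Python 3; run_loop is a hypothetical helper that takes the replies as a list instead of prompting, purely so the example can be run without typing anything.

```python
def run_loop(replies):
    """Process replies as the book's loop does; return the lines it would print."""
    lines = []
    for reply in replies:              # stands in for repeated input('Enter text:')
        if reply == 'stop':
            break
        elif not reply.isdigit():
            lines.append('Bad!' * 8)
        else:
            lines.append(str(int(reply) ** 2))
    lines.append('Bye')
    return lines


for line in run_loop(['12', 'abc', 'stop']):
    print(line)    # print is a function in Python 3, so the parentheses are required
```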
Re: [Tutor] Simple bank account oriented object
loat instead of an int) and then it will calculate as you expect in all versions.

> def affiche_solde(self):#Method that displays
> the balance and the accumulated interest
> print "Le solde du compte bancaire de %s est de %d $CAD"
> %(self.nom,self.solde)
> print "Vous avez récolté %d $CDN en intérêt"%(self.interet)

Now we look at the Étudiant (student) account, and add the extra functionality.

> class CompteEtudiant(CompteBancaire):
> "definition of the student bank account, derived from the standard
> bank account"
> def __init__(self, nom='', solde=0, margeCre=0):
> CompteBancaire.__init__(self, nom='Nom', solde=0, interet=0)
> self.nom, self.solde, self.margeCre = nom, solde, margeCre

The CompteEtudiant class can let the CompteBancaire do some of the work. This is called "inheritance" -- the subclass inherits code from the parent class.

    def __init__(self, nom='', solde=0, margeCre=0):
        # Call the parent class method.
        CompteBancaire.__init__(self, nom, solde)
        # Do the extra work this class needs.
        self.margeCre = margeCre

Calling CompteBancaire.__init__ does everything the CompteBancaire understands. It sets nom, solde and interet, but not margeCre because CompteBancaire does not know anything about margeCre. Then the subclass sets margeCre itself.

> def affiche_solde(self, somme=0): #Constructor method
> that redefines the affiche_solde function for the student account
> print "%s--Votre solde bancaire est de %d $CAD"
> %(self.nom,self.solde)
> print "Le solde de votre marge de crédit est de %d $CAD"
> %(self.margeCre)

The CompteEtudiant should print the same information as the CompteBancaire class, plus extra. Again, inheritance makes it easy:

    def affiche_solde(self):
        # Display everything the super class understands.
        CompteBancaire.affiche_solde(self)
        # Display the parts that only the subclass understands.
        print "Le solde de votre marge de crédit est de %d $CAD" % (self.margeCre)

And one more method needs to be defined: we have to change the retrait method to allow negative balance, but only up to a maximum of margeCre.

    def retrait(self, somme):
        if somme > self.solde + self.margeCre:
            raise ValueError('fonds sont insuffisants')
        else:
            self.solde = self.solde - somme

-- Steven D'Aprano
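Put together, the pieces above might look like the following runnable sketch, written for Python 3. The base class shown here is an assumption (the original CompteBancaire is not reproduced in this message), and lignes_solde is a hypothetical helper that returns the balance lines instead of printing them, so the result is easy to inspect.

```python
class CompteBancaire:
    """Assumed minimal base class; the poster's real definition is not shown here."""

    def __init__(self, nom='', solde=0, interet=0):
        self.nom, self.solde, self.interet = nom, solde, interet

    def retrait(self, somme):
        if somme > self.solde:
            raise ValueError('fonds sont insuffisants')
        self.solde -= somme

    def lignes_solde(self):
        return ["Le solde du compte bancaire de %s est de %d $CAD"
                % (self.nom, self.solde)]


class CompteEtudiant(CompteBancaire):
    def __init__(self, nom='', solde=0, margeCre=0):
        CompteBancaire.__init__(self, nom, solde)   # parent sets nom, solde, interet
        self.margeCre = margeCre                    # subclass adds its own attribute

    def retrait(self, somme):
        # Allow a negative balance, but only down to -margeCre.
        if somme > self.solde + self.margeCre:
            raise ValueError('fonds sont insuffisants')
        self.solde -= somme

    def lignes_solde(self):
        lignes = CompteBancaire.lignes_solde(self)  # everything the parent knows
        lignes.append("Le solde de votre marge de crédit est de %d $CAD"
                      % self.margeCre)
        return lignes


compte = CompteEtudiant('Ana', solde=100, margeCre=50)
compte.retrait(120)          # allowed: dips into the credit margin
print('\n'.join(compte.lignes_solde()))
```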
Re: [Tutor] ask-why I cannot run it, and I am so confused about the traceback
On Wed, 7 Apr 2010 11:15:35 pm bob gailer wrote:
> You have the solution. Good.
>
> I beg you to avoid colored text. I find it hard to read.
>
> Just use good old plain text. No fancy fonts, sizes, colors.

I don't see any of those. Can't you tell your mail client to ignore the "rich text" (HTML) attachment and just display the plain text?

-- Steven D'Aprano
Re: [Tutor] Declaring methods in modules.
On Sun, 11 Apr 2010 10:30:54 pm Ray Parrish wrote:
> Hello,
>
> I am working on some stuff, and I would like to be able to write a
> module which can be imported, and after it's been imported I would
> like to be able to access it's functions as methods.
>
> In other words, if I do the import of module ISPdetector, I want to
> then be able to make calls like the following -
>
> ipAddress = "123.123.123.123"
> emails = ipAddress.GetEmailAddresses()

Can't be done -- in Python, built-in types like strings can't have new methods added to them, and thank goodness for that! The ability to monkey-patch builtins is more dangerous than useful.

But you can subclass builtins, as Alan suggested:

    # module ISPdetector
    class MyString(str):
        def GetEmailAddresses(self):
            pass

    # another module
    import ISPdetector
    ipAddress = ISPdetector.MyString("123.123.123.123")
    emails = ipAddress.GetEmailAddresses()

But that is a strange API design. IP addresses aren't strings, they're integers, and although they are commonly written in quad-dotted form as a string, you don't want to treat them as strings. For example, something like ipAddress.replace('2', 'P') makes no sense.

Also, making GetEmailAddresses a method of an IP address implies that addresses *have* email addresses, which is nonsense. People have email addresses. Computer accounts have email addresses. Particular email *messages* come from an IP address, but IP addresses don't have email addresses. A better name would be ipaddress.find_email_from().

So I would suggest the best API is either to create an IP address class (or better still, don't re-invent the wheel, use one of the fine existing IP address modules already written), or write functions in the module and just call them:

    import ISPdetector
    ipAddress = "123.123.123.123"
    emails = ISPdetector.find_email_from(ipAddress)

Remember, in Python methods are just syntactic sugar for function calls. obj.method(arg) is just another way of spelling method(obj, arg).

-- Steven D'Aprano
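The function-based API suggested above can be sketched with the standard library's ipaddress module. Note the assumptions: ipaddress only exists in Python 3.3 and later (it was not available when this message was written), and find_email_from, known_senders and the example address are hypothetical placeholders rather than part of any real ISPdetector module.

```python
import ipaddress


def find_email_from(ip_text, known_senders):
    """Hypothetical lookup: return the addresses recorded as sending from this IP."""
    ip = ipaddress.ip_address(ip_text)      # raises ValueError on malformed input
    return known_senders.get(ip, [])


senders = {ipaddress.ip_address("123.123.123.123"): ["alice@example.com"]}
emails = find_email_from("123.123.123.123", senders)

# An IP address really is an integer underneath the dotted-quad spelling.
as_integer = int(ipaddress.ip_address("123.123.123.123"))

# A method call is sugar for calling the function stored on the class.
same = "abc".upper() == str.upper("abc")

print(emails, as_integer, same)
```

The last line of the example shows the "syntactic sugar" point concretely: the method is looked up on the object's class and called with the object as its first argument.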
Re: [Tutor] Move all files to top-level directory
On Tue, 13 Apr 2010 12:11:30 am Dotan Cohen wrote:
> I use this one-liner for moving photos nested a single folder deep
> into the top-level folder:
> find * -name "*.jpg" | awk -F/ '{print "mv "$0,$1"-"$2}' | sh
>
> I would like to expand this into an application that handles
> arbitrary nesting and smart rename, so I figure that Python is the
> language that I need. I have googled file handling in Python but I
> simply cannot get something to replicate the current functionality of
> that lovely one-liner.

"Lovely"??? What on earth does it do? It's worse than Perl code!!! *half a wink*

> What fine manual should I be reading? I am not
> asking for code, rather just a link to the right documentation.

See the shell utilities module:

    import shutil

-- Steven D'Aprano
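For readers who do want a starting point, here is one possible sketch of the one-liner's behaviour using os.walk and shutil.move, extended to arbitrary nesting. The function name flatten and the "folder-file" naming scheme are assumptions modelled on the shell command above; it moves every file, not only *.jpg, and refuses to overwrite rather than attempting any smart rename.

```python
import os
import shutil


def flatten(top):
    """Move every file below `top` into `top`, prefixing each name with its folder path."""
    moved = []
    for dirpath, dirnames, filenames in os.walk(top):
        if dirpath == top:
            continue                      # files already at the top level stay put
        rel = os.path.relpath(dirpath, top)
        prefix = rel.replace(os.sep, '-')  # 'a/b' becomes 'a-b'
        for name in filenames:
            target = os.path.join(top, prefix + '-' + name)
            if os.path.exists(target):
                raise OSError('refusing to overwrite %s' % target)
            shutil.move(os.path.join(dirpath, name), target)
            moved.append(target)
    return moved
```

Calling flatten('photos') on a tree containing photos/a/b/x.jpg would leave photos/a-b-x.jpg behind; the emptied sub-folders are not removed.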
Re: [Tutor] Sequences of letter
On Tue, 13 Apr 2010 02:46:39 am Dave Angel wrote:
> Or more readably:
>
> from string import lowercase as letters
> for c1 in letters:
>     for c2 in letters:
>         for c3 in letters:
>             print c1+c2+c3

Here's another solution, for those using Python 2.6 or better:

    >>> import itertools
    >>> for x in itertools.product('abc', 'abc', 'abc'):
    ...     print ''.join(x)
    ...
    aaa
    aab
    aac
    aba
    abb
    abc
    [many more lines...]
    ccc

If you don't like the repeated 'abc' in the call to product(), it can be written as itertools.product(*['abc']*3) instead.

-- Steven D'Aprano
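itertools.product also accepts a repeat keyword argument, which avoids spelling out the alphabet several times. A small sketch for Python 3 follows; the helper name sequences is made up for the example.

```python
import itertools
from string import ascii_lowercase


def sequences(alphabet, length):
    """Lazily yield every string of the given length over the alphabet, in order."""
    return (''.join(chars)
            for chars in itertools.product(alphabet, repeat=length))


# Only the first few of the 26**3 three-letter sequences are materialised here.
first = list(itertools.islice(sequences(ascii_lowercase, 3), 3))
print(first)
```

Because the result is a generator, the full set of combinations is never held in memory unless the caller asks for it.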
Re: [Tutor] "value" ~ "data" ~ "object"
On Thu, 15 Apr 2010 09:37:02 pm spir ☣ wrote:
> Hello,
>
> I have been recently thinking at lexical distinctions around the
> notion of data. (--> eg for a starting point
> http://c2.com/cgi/wiki?WhatIsData) Not only but especially in Python.
> I ended up with the following questions: Can one state "in Python
> value=data=object"?
> Can one state "in Python speak value=data=object"?

I don't think so -- this is mistaking the map for the territory.

Objects are the concrete representation ("the map") in the memory of a computer of some data ("the territory"). The Python object 42 is not the integer 42, which is an abstract mathematical concept. The Python object my_list_of_countries is not an actual list of countries, it is a representation of a list containing representations of countries.

The dictionary definition of value generally describes it as a number, e.g. from WordNet:

    a numerical quantity measured or assigned or computed; "the value assigned was 16 milliseconds"

or Webster's:

    9. (Math.) Any particular quantitative determination; as, a function's value for some special value of its argument.

In modern terms, we extend this to other types of data: we might say the value of the variable A is the string 'elephant'.

So I would say things like:

    a variable HAS a value;
    the bit pattern 1001 REPRESENTS the integer nine;
    an object REPRESENTS a value;
    a name REFERS TO (points to, is bound to) an object;

and similar. To say that (say) the variable x IS 9 is verbal shorthand. Who has never pointed at a spot on the map and said "That's where we have to go"? We all treat the map as the territory sometimes.

> What useful distinctions are or may be done, for instance in
> documentation? What kind of difference in actual language semantics
> may such distinctions mirror?

I think that, 99% of the time, it is acceptable to treat the map as the territory and treat variables as BEING values. Just write the documentation in the most natural way:

    class Elephant:
        def walk(self, direction):
            """Cause the elephant to walk in the direction given."""
            ...

        def _move_leg(self, leg):
            """Private method that operates the elephant's given leg."""
            ...

Trying to be pedantic about values, data and in-memory representations is rarely necessary.

> Denis
>
> PS: side-question on english:
> I am annoyed by the fact that in english "data" is mainly used &
> understood as a collective (uncountable) noun. "datum" (singular) &
> "datas" (plural) seem to be considered weird.

"Datas" has never been the plural of data. Datum is the singular, data the plural.

Over the last few decades, data is slowly shifting to be an uncountable noun, like "water", "air" or "knowledge", and datum has become very old-fashioned. I would still expect most people would recognise it, or at least understand it from context.

Collective noun is incorrect. Collective nouns are things like "a FLOCK of geese", "a PILE of books", "a SET of spanners", or "a GROUP of people". The term you are thinking of is mass noun. See:

http://en.wikipedia.org/wiki/Collective_noun
http://en.wikipedia.org/wiki/Mass_noun
http://en.wikipedia.org/wiki/Data

> How to denote a single
> unit of data wothout using the phrase "piece of data"?

You can still use datum; although it is becoming obsolete, it is not gone yet. But if you treat data as uncountable, then like any uncountable noun you have to supply a unit or amount:

    Item of data.
    Element of data.
    Great big lump of data.
    42 bytes of data.

-- Steven D'Aprano
Re: [Tutor] Creating class instances through iteration
On Fri, 16 Apr 2010 12:03:52 pm Tim Goddard wrote:
> For example each row would look like [name, value1, value2, value3],
> I planned on passing the three values as a tuple
>
> The code would simply be:
>
> for row in csvimport:
>     tuple = (row[1],row[2],row[3])
>     instancename = Classname(tuple)
>
> How could I create different instances inside an iteration loop for
> each row ?

Let me see if I have it right... you want a CSV file that looks something like this:

    a,1,2,3
    b,2,4,8
    c,3,6,9

and you want to end up with:

    a = Classname((1,2,3))
    b = Classname((2,4,8))
    c = Classname((3,6,9))

Correct?

That's the wrong question. It's easy to create new names, using exec (but beware of the security risks!). But the question you should be asking is, what on earth do I do with them once I've created them?

Later in your code, suppose you want to do something like this:

    print c.some_method()

(say). But how do you know that c will exist? Maybe the CSV file doesn't have rows named a,b,c, but instead has x,y,z. Or a,b,d,e, or even fred,wilma,betty,barney. Since you don't know what the name of the variable will be, you can't refer to it in code. You then have to jump through all sorts of flaming hoops by writing code like this:

    # Untested
    list_of_names = []
    for row in csvimport:
        list_of_names.append(row[0])
        t = (row[1],row[2],row[3])
        # WARNING -- the next line is DANGEROUS and has a HUGE security
        # hole which could cause data loss. Only use it on data you
        # trust with your life.
        exec '%s = Classname(t)' % row[0]
    # more code goes here
    # ...
    # much later
    if 'c' in list_of_names:
        print c.some_method()
    else:
        # Oh I don't know, just pick some random variable and hope it
        # is the right one...
        exec 'print %s.some_method()' % list_of_names[0]

Whew! Ugly, insecure, slow, dangerous code.

Is there an alternative? Of course -- dictionaries.

    objects = {}
    for row in csvimport:
        tuple = (row[1],row[2],row[3])
        objects[row[0]] = Classname(tuple)
    # more code goes here
    # ...
    # much later
    try:
        print objects['c'].some_method()
    except KeyError:
        # No name 'c' exists...
        ...

But more likely, you don't need to operate on a single object c, you need to operate on all of them. So what is the purpose of the names? Just drop the names, and work on a collection of objects:

    objects = []
    for row in csvimport:
        tuple = (row[1],row[2],row[3])
        objects.append(Classname(tuple))
    # more code goes here
    # ...
    # much later
    for obj in objects:
        print obj.some_method()

Hope this helps,

-- Steven D'Aprano
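The dictionary approach can be tried end to end with the standard csv module. In this Python 3 sketch the class Reading and its total method are hypothetical stand-ins for the poster's Classname, and the CSV text is read from a string rather than a file so the example is self-contained.

```python
import csv
import io


class Reading:
    """Hypothetical stand-in for the poster's Classname."""

    def __init__(self, values):
        self.values = values

    def total(self):
        return sum(self.values)


csv_text = "a,1,2,3\nb,2,4,8\nc,3,6,9\n"

objects = {}
for row in csv.reader(io.StringIO(csv_text)):
    name = row[0]
    values = tuple(int(v) for v in row[1:4])   # csv yields strings, so convert
    objects[name] = Reading(values)            # the row name becomes a key, not a variable

# Look up one object by name, or loop over all of them.
print(objects['c'].total())
for name in sorted(objects):
    print(name, objects[name].values)
```

A missing name simply raises KeyError (or can be checked with the in operator), and no untrusted text is ever executed as code.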
Re: [Tutor] Loop comparison
On Fri, 16 Apr 2010 06:29:40 pm Alan Gauld wrote:
> "Stefan Behnel" wrote
>
> > import cython
> >
> > @cython.locals(result=cython.longlong, i=cython.longlong)
> > def add():
> >     result = 0
> >     for i in xrange(10):
> >         result += i
> >     return result
> >
> > print add()
[...]
> Or is cython doing the precalculation optimisations you mentioned?
> And if so when does it do them? Because surely, at some stage, it
> still has to crank the numbers?
>
> (We can of course do some fancy math to speed this particular
> sum up since the result for any power of ten has a common pattern,
> but I wouldn't expect the compiler optimiser to be that clever)

No fancy maths needed, although I'd be amazed (in a good way) to learn that compiler optimizers recognised this case! Are optimizing compilers really that clever these days?

The sum of 1,2,3,4,...,N is given by a simple formula: 1/2*N*(N+1). An anecdote about the great mathematician Carl Gauss is that while still a very young child in primary school, his teacher assigned the class the problem of adding the numbers 1 through 100 in order to keep them occupied for an hour or so. To his astonishment, Gauss gave him the answer within seconds. He reasoned like this:

    sum = 1 + 2 + 3 + 4 + ... + 99 + 100
    sum = 100 + 99 + 98 + 97 + ... + 2 + 1
    2*sum = 101 + 101 + 101 + ... 101
          = 100*101

    so sum = 1/2 * 100 * 101
           = 5050

While it is uncertain whether this specific story is actually true, it is certain that Gauss was a child prodigy and a genius of the first degree.

-- Steven D'Aprano
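The closed form is easy to check against a brute-force sum. In this small Python 3 sketch, triangular is a name chosen for the example; note that a loop over range(n) adds the numbers 0 to n-1, so it corresponds to triangular(n - 1).

```python
def triangular(n):
    """Sum of 1 + 2 + ... + n using Gauss's closed form n*(n+1)/2."""
    return n * (n + 1) // 2   # integer division: the product is always even


print(triangular(100))                                # Gauss's classroom problem
print(triangular(999) == sum(range(1000)))            # matches the loop over range(1000)
```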
Re: [Tutor] Loop comparison
On Fri, 16 Apr 2010 06:25:54 pm Stefan Behnel wrote:
> Alan Gauld, 16.04.2010 10:09:
> > Even the built in sum() will be faster than a while loop:
> >
> > result = sum(range(10))
> >
> > although it still took 10 minutes on my PC.
>
> Did you mean to say "minutes" or rather "seconds" here? And did you
> really mean to use "range" or rather "xrange" (or "range" in Py3)?
>
> sum(xrange(10))
>
> clearly runs in 12 seconds for me on Py2.6,

7 minutes for me in Python 2.5. The joys of low end hardware!

Are you sure you got the right number of zeroes?

> whereas the same with
> "range" gives an error due to insufficient memory.

I'm not even going to try...

-- Steven D'Aprano
Re: [Tutor] Raw string
On Mon, 19 Apr 2010 07:49:31 am Neven Goršić wrote:
> Hi!
>
> When I get file path from DirDialog, I get in a (path) variable.
> Sometimes that string (path) contains special escape sequences, such
> as \x, \r and so on.
>
> 'C:\Python25\Programs\rating'

That creates a string containing a \r character, which is a carriage return.

You could write it as a raw string

    r'C:\Python25\Programs\rating'

but that will fail if the string ends with a backslash. Or you could escape your backslashes:

    'C:\\Python25\\Programs\\rating'

The best solution is to remember that Windows will accept either backslash or forward slash in paths, and so write:

    'C:/Python25/Programs/rating'

Another solution is to construct the path programmatically, e.g.:

    parts = ['C:', 'Python25', 'Programs', 'rating']
    path = '\\'.join(parts)

but frankly I would consider any solution except "use forward slashes" to be a waste of time -- CPU time *and* programmer time.

> When I try to open that file (whose name contains escape sequences)
> it doesn't work.

Only because the file doesn't exist. If you actually have a file called Programs\rating (with a carriage return in place of the \r) in the C:/Python25/ directory, you will open it.

-- Steven D'Aprano
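The different spellings can be compared directly in the interpreter. This short sketch only builds strings and never touches the file system; the variable names are chosen for the example.

```python
escaped = 'C:\\Python25\\Programs\\rating'   # every backslash doubled
raw = r'C:\Python25\Programs\rating'         # raw string: backslashes kept as typed
forward = 'C:/Python25/Programs/rating'      # forward slashes need no escaping

broken = 'Programs\rating'                   # \r here is ONE carriage-return character

print(escaped == raw)          # the two spellings produce the same string
print(repr(broken))            # shows the carriage return hiding in the name
print(len(broken))             # one character shorter than it looks
```

Escape sequences only matter in string literals typed into source code. A path that arrives at run time, for example from a dialog box, already contains real backslash characters and needs no escaping.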
Re: [Tutor] the binary math "wall"
On Wed, 21 Apr 2010 02:58:06 am Lowell Tackett wrote:
> I'm running headlong into the dilemma of binary math representation,
> with game-ending consequences, e.g.:
> >>> 0.15
>
> 0.14999
>
> Obviously, any attempts to manipulate this value, under the misguided
> assumption that it is truly "0.15" are ill-advised, with inevitable
> bad results.

That really depends on what sort of manipulation you are doing.

    >>> x = 0.15
    >>> x
    0.14999
    >>> x*100 == 15
    True

Seems pretty accurate to me. However:

    >>> 18.15*100 == 1815
    False

The simplest, roughest way to fix these sorts of problems (at the risk of creating *other* problems!) is to hit them with a hammer:

    >>> round(18.15*100) == 1815
    True

[...]
> What I'm shooting for, by the way, is an algorithm that converts a
> deg/min/sec formatted number to decimal degrees. It [mostly] worked,
> until I stumbled upon the peculiar cases of 15 minutes and/or 45
> minutes, which exposed the flaw.

I'm afraid that due to the nature of floating point, this is a hard problem. Even the professionals at Hewlett-Packard's scientific calculator division don't always get it right, and they are *extremely* careful:

http://www.hpmuseum.org/cgi-sys/cgiwrap/hpmuseum/archv018.cgi?read=132690

The best result I can suggest is, change the problem! Don't pass degrees-minutes-seconds around using a floating point value, but as a tuple with distinct (DEG, MIN, SEC) integer values. Or create a custom class.

But if you really need D.MMSS floats, then something like this should be a good start:

    import math

    def dms2deg(f):
        """Convert a floating point number formatted as D.MMSS
        into degrees.
        """
        mmss, d = math.modf(f)
        assert d == int(f)
        if mmss >= 0.60:
            raise ValueError(
                'bad fractional part, expected < .60 but got %f' % mmss)
        mmss *= 100
        m = round(mmss)
        if m >= 60:
            raise ValueError('bad minutes, expected < 60 but got %d' % m)
        s = round((mmss - m)*100, 8)
        if not 0 <= s < 60.0:
            raise ValueError('bad seconds, expected < 60.0 but got %f' % s)
        return d + m/60.0 + s/3600.0

    >>> dms2deg(18.15)
    18.25
    >>> dms2deg(18.1515)
    18.2541666

which compares well to my HP-48GX:

    18.15 HMS-> gives 18.25, and:
    18.1515 HMS-> gives 18.254167.

Note though that this still fails with some valid input. I will leave fixing it as an exercise (or I might work on it later, time permitting).

-- Steven D'Aprano
Re: [Tutor] set and sets.Set
On Thu, 22 Apr 2010 02:28:03 am Bala subramanian wrote:
> Friends,
> Someone please write me the difference between creating set with
> set() and a sets.Set().

The sets module, including sets.Set(), was first introduced in Python 2.3 and is written in Python.

The built-in set object was introduced in Python 2.4 and is rewritten in C for speed.

Functionally, they behave the same for most purposes. Speed-wise, the built-in version is much faster, so unless you need to support Python 2.3, always use the built-in set type.

-- Steven D'Aprano
Re: [Tutor] the binary math "wall"
On Thu, 22 Apr 2010 01:37:35 am Lowell Tackett wrote: > Recalling (from a brief foray into college Chem.) that a result could > not be displayed with precision greater than the least precise > component that bore [the result]. So, yes, I could accept my input > as the arbitrator of accuracy. Unfortunately, you can't distinguish the number of supplied digits of accuracy from a float. Given as floats, all of the following are identical: 0.1 0.1 0.09 0.11 as are these two: 0.08 0.099985 Perhaps you should look at the Decimal class, not necessarily to use it, but to see what they do. For instance, you create a Decimal with a string, not a float: >>> from decimal import Decimal >>> Decimal('0.1') Decimal("0.1") >>> Decimal('0.1') Decimal("0.1") which allows you to distinguish the number of digits of precision. > A scenario: > > Calculating the coordinates of a forward station from a given base > station would require [perhaps] the bearing (an angle from north, > say) and distance from hither to there. Calculating the north > coordinate would set up this relationship, e.g.: > > cos(3° 22' 49.6") x 415.9207'(Hyp) = adjacent side(North) > > My first requirement, and this is the struggle I (we) are now engaged > in, is to convert my bearing angle (3° 22' 49.6") to decimal degrees, > such that I can assign its' proper cosine value. This is MUCH MUCH MUCH easier than trying to deal with DMS as a float. Your data already separates the parts for you, so it is just a matter of: >>> d = 3 + 22/60.0 + 49.2/3600.0 >>> import math >>> angle = math.radians(d) >>> math.cos(angle) 0.9982601259166638 Then the only problem you have is whether or not the formula you are using is numerically stable, or whether it is subject to catastrophically growing errors. I hope we're not frightening you off here. For nearly anything people are going to want to do, their input data will be in single-precision. 
One of the simplest things they can do to improve the accuracy of floating point calculations is to do their intermediate calculations in double-precision.

The good news is, Python floats are already in double-precision. For most modern systems, single-precision floats have 24 binary digits of precision (approximately 6 decimal digits) and double-precision floats have 53 binary digits (15 decimal) of precision. More than sufficient for dealing with an angle measured to a tenth of a second.

Some resources for you to read:

http://en.wikipedia.org/wiki/Floating_point
http://www.cs.princeton.edu/introcs/91float/
http://www.cs.berkeley.edu/~wkahan/
http://docs.sun.com/source/806-3568/ncg_goldberg.html

> Were I to accumulate many of these "legs" into perhaps a 15 mile
> traverse-accumulating little computer errors along the way-the end
> result could be catastrophically wrong.

YES!!! And just by being aware of this potential problem, you are better off than 90% of programmers who are blithely unaware that floats are not real numbers.

-- Steven D'Aprano
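Keeping degrees, minutes and seconds as separate values makes the conversion a one-liner. The sketch below is a minimal illustration under stated assumptions: the function name dms_to_degrees is made up, only non-negative angles are handled, and the bearing and distance are the figures from the quoted scenario.

```python
import math


def dms_to_degrees(degrees, minutes, seconds):
    """Convert separate degree, minute and second values to decimal degrees."""
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError('minutes and seconds must be in the range [0, 60)')
    return degrees + minutes / 60.0 + seconds / 3600.0


# Bearing 3 deg 22' 49.6" and distance 415.9207 ft, as in the quoted scenario.
bearing = dms_to_degrees(3, 22, 49.6)
north = math.cos(math.radians(bearing)) * 415.9207

print(bearing, north)
```

Because the three parts never share a single float, the awkward cases of 15 and 45 minutes cannot be misread as 14.999... or 44.999... minutes.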
Re: [Tutor] sys.path and the path order
On Fri, 23 Apr 2010 10:34:02 am Garry Willgoose wrote:
> My question is so simple I'm surprised I can't find an answer
> somewhere.

Did you read the Fine Manual?

> I'm interested if I can rely on the order of the
> directories in the sys.path list.
[...]
> The question is can I rely on entry [0] in sys.path always being the
> directory in which the original file resides (& across linux, OSX and
> Windows)?

The documentation says:

    sys.path

    A list of strings that specifies the search path for modules. Initialized from the environment variable PYTHONPATH, plus an installation-dependent default.

    As initialized upon program startup, the first item of this list, path[0], is the directory containing the script that was used to invoke the Python interpreter. If the script directory is not available (e.g. if the interpreter is invoked interactively or if the script is read from standard input), path[0] is the empty string, which directs Python to search modules in the current directory first. Notice that the script directory is inserted before the entries inserted as a result of PYTHONPATH.

http://docs.python.org/library/sys.html#sys.path

So the answer to your question is, Yes.

-- Steven D'Aprano
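A small helper can wrap the documented behaviour, including the empty-string case. This is only a sketch: script_directory is a made-up name, and it assumes nothing has modified sys.path since startup, because the documentation quoted above only describes the list "as initialized upon program startup".

```python
import os
import sys


def script_directory():
    """Return the directory sys.path[0] refers to, or the current directory if it is empty."""
    first = sys.path[0]
    # An empty string means "search the current directory first".
    return os.path.abspath(first) if first else os.getcwd()


print(script_directory())
```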
Re: [Tutor] Class Inheritance
On Fri, 23 Apr 2010 03:11:36 pm David Hutto wrote: > Hello List! > > While experimenting with Tkinter(python2.6), when from Tkinter > import* is used I came across the following error: > > C:\Users\ascent>c:\python26/Script3.py > Traceback (most recent call last): > File "C:\python26\Script3.py", line 1, in > from Tkinter import * > File "C:\Python26\lib\lib-tk\Tkinter.py", line 44, in > from turtle import * > File "C:\Python26\lib\lib-tk\turtle.py", line 374, in > class ScrolledCanvas(Tkinter.Frame): > AttributeError: 'module' object has no attribute 'Frame' Something is screwy there. I believe you have broken your installation by making changes to files without having any understanding of what you are doing. The turtle module does this: import Tkinter as TK so that line should be "class ScrolledCanvas(TK.Frame)", and in fact that's what I find when I go and look at the source code: class ScrolledCanvas(TK.Frame): The line number is different too. How many breakages have you introduced to the library? > Which stems from the below in turtle.py: > > class ScrolledCanvas(TK.Frame) Note the difference between Tkinter.Frame and TK.Frame. > I know that ScrolledCanvas is trying to use class TK.Frame as it's > base class to build from, and the class Frame is what is trying to be > called from the Tkinter module. > > So I tried to alter the turtle.py. Please don't try to "fix" library files when you don't understand what they are doing. Confine your experiments to your own code, so that when things break, you know that the cause is in your code, not some random change you have done to the library. > When I try to just 'from Tkinter > import *, such as: > > from Tkinter import * > class ScrolledCanvas(Tkinter.Frame): That won't work, because there is no Tkinter defined. One wrong way to do it is: from Tkinter import * class ScrolledCanvas(Frame): but that risks stomping over the top of other variables with great big hob-nailed boots. 
Better to avoid import *, and do this: import Tkinter class ScrolledCanvas(Tkinter.Frame): but that's still a silly thing to do, because the turtle module has already imported Tkinter. The right way is to leave the module alone; it was working before you changed it: import Tkinter as TK class ScrolledCanvas(TK.Frame): > I get: > * > C:\Users\ascent>c:\python26/Script3.py > Traceback (most recent call last): > File "C:\python26\Script3.py", line 1, in > from Tkinter import * > File "C:\Python26\lib\lib-tk\Tkinter.py", line 44, in > from turtle import * > File "C:\Python26\lib\lib-tk\turtle.py", line 373, in > class ScrolledCanvas(Tkinter.Frame): > NameError: name 'Tkinter' is not defined Now you have two errors. (1) You have introduced a circular import dependency, where turtle tries to import Tkinter which tries to import turtle which tries to import Tkinter... (2) You have no Tkinter object, since you import its contents, not the module itself. > I know pretty much what is going on there. I doubt it, or else you wouldn't have done what you did. > But when I try to use: > > import Tkinter > from Tkinter import * Why would you do that when turtle has already imported Tkinter under the name TK? > class ScrolledCanvas(Tkinter.Frame): > > It takes me back to the first error. Which means > in both instances both directly called by me, and > when called from the original turtle.py call, > it's not finding the Frame class. I suspect you've broken it. I recommend you re-install the Tkinter library, including turtle.py, in order to get it back to a known working state. > From the docs (9.5. Inheritance) it states: > > "The name BaseClassName must be defined in a > scope containing the derived class definition. > In place of a base class name, other arbitrary > expressions are also allowed. 
This can be useful, > for example, when the base class is defined in another module: > > class DerivedClassName(modname.BaseClassName) > " > > > So why does the above, from turtle.py, a standard module, > not allow this, or is their something > the module writer got wrong, or more likely, that I'm not > understanding about what it's doing? I don't have this problem with an unmodified version of turtle and Tkinter in Python 2.6. I get: >>> import turtle >>> turtle.TK.Frame >>> turtle.ScrolledCanvas > As a sidenote, I ended up removing the from turtle import * > line from Tkinter which resolved the problem(the example I was using > didn't h
Re: [Tutor] Class Inheritance
On Fri, 23 Apr 2010 04:54:11 pm David Hutto wrote: [...] > > Something is screwy there. I believe you have broken your > > installation by making changes to files without having any > > understanding of what you are doing. > > My original post was incorrect: the first error should be: > > C:\Users\ascent>c:\python26/Script3.py > Traceback (most recent call last): > File "C:\python26\Script3.py", line 1, in > from Tkinter import * > File "C:\Python26\lib\lib-tk\Tkinter.py", line 44, in > from turtle import * > File "C:\Python26\lib\lib-tk\turtle.py", line 374, in > class ScrolledCanvas(TK.Frame): > AttributeError: 'module' object has no attribute 'Frame' For anyone else reading this, this is a good example of why retyping error messages is a bad, bad idea. That just wastes everybody's time, and sends us on wild-goose chases trying to diagnose problems that didn't actually occur. > > Please don't try to "fix" library files when you don't understand > > what they are doing. Confine your experiments to your own code, so > > that when things break, you know that the cause is in your code, > > not some random change you have done to the library. > > I know this, but I reinstall regularly, so evereything is 'pristine'. You know it but continue to do it anyway? If you want to experiment with library modules, make a copy of them into your home directory, with a different name, and experiment to your heart's desire. That gives you all the benefits of experimentation with none of the disadvantages. When you mangle a standard library module, then post long complicated posts suggesting that the Python development team don't understand their own language, you waste our time as well as yours. If you want to waste your time, go right ahead, but don't send us on wild goose chases with nonsense about the turtle module not allowing inheritance when the breakage was due to your tinkering. Next time you do something like this, be up front about it. 
Tell us right from the beginning that you've been editing the modules, so we don't have to spend our time discovering this for ourselves. If you had, this conversation would have taken a completely different turn. > If I get an error, I don't wait for mailing list responses, I usually > try to analyze, from what I've learned so far, what's wrong and > figure out why on my own All that is good practice. What's not good practice is making random changes to complicated libraries. A good exercise in programming discipline is to try to reproduce the fault in the smallest amount of code possible. When asking for help, you'll often be asked to do this, because when asking for volunteers to spend their time solving your problems for free, it is only fair that you reduce the amount of effort needed as much as possible. As an exercise, I've reproduced your error in minimal form: >>> import mytkinter Traceback (most recent call last): File "<stdin>", line 1, in <module> File "mytkinter.py", line 1, in <module> from myturtle import * File "myturtle.py", line 3, in <module> class ScrolledCanvas(TK.Frame): AttributeError: 'module' object has no attribute 'Frame' and here's the minimal implementation: # mytkinter.py from myturtle import * class Frame: pass # myturtle.py import mytkinter as TK class ScrolledCanvas(TK.Frame): pass > P.S. I bet you've been waiting since you got your first condescending > response to a similar question, to lay it on someone about touching > the Almighty Library. > > Way to keep up the cycle. Don't try to psychoanalyse people you've never met, you aren't any good at it. -- Steven D'Aprano
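For anyone who wants to try the minimal reproduction above without creating the files by hand, here is a sketch that writes the two modules into a temporary directory and imports them. The function name reproduce_circular_import is invented for this example, and the code is written for Python 3 (the thread itself used Python 2.6), so the exact wording of the error message will differ from the traceback shown in the message.

```python
import importlib
import os
import sys
import tempfile

# The two-module reproduction from the message: mytkinter imports
# myturtle before it has defined Frame, and myturtle needs Frame.
MODULES = {
    "mytkinter.py": "from myturtle import *\n\nclass Frame:\n    pass\n",
    "myturtle.py": (
        "import mytkinter as TK\n\n"
        "class ScrolledCanvas(TK.Frame):\n    pass\n"
    ),
}


def reproduce_circular_import():
    """Import mytkinter from a temp dir; return the error text, or None."""
    with tempfile.TemporaryDirectory() as tmp:
        for filename, source in MODULES.items():
            with open(os.path.join(tmp, filename), "w") as handle:
                handle.write(source)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            importlib.import_module("mytkinter")
        except AttributeError as exc:
            # myturtle ran while mytkinter was only half initialised,
            # so TK.Frame did not exist yet.
            return str(exc)
        finally:
            # Leave the interpreter as we found it.
            sys.path.remove(tmp)
            for name in ("mytkinter", "myturtle"):
                sys.modules.pop(name, None)
    return None
```

The failure comes from import order alone: moving the `from myturtle import *` line below the Frame definition in mytkinter.py (or dropping it) makes the import succeed, which is why the unmodified library has no such problem.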
Re: [Tutor] Binary search question
On Sat, 24 Apr 2010 07:21:13 am Alan Gauld wrote: > "Emile van Sebille" wrote > > > It's expensive enough that for a list this size I'd convert it to a > > dict and use in on that. eg, > > > > a = range(10) > > d = dict(zip(a,a)) > > Surely that would depend on how often you do the search? > If its a one off occurence I'd expect the overhead of zipping > and converting to a dict would outweight the savings? It absolutely has to, because zipping it has to iterate over the entire list, then calling dict has to iterate over the entire zipped version. That's iterating over the entire list *twice* before you even START doing a search! In Python 3.x, zip is a lazy iterator so that will reduce the excess iterations from twice to once, but for a one-off test it entirely negates the point of converting. > If the search was inside a loop however then I'd definitely > agree. Although I'd opt for a set rather than a dict... Yes, there's no point in making a dict {a:a} just for membership testing when you can just use a set. > Another option would be to use the bisect module on a > sorted version of the list. But keep in mind that unless the list is HUGE, or your Python version includes the C version of bisect, a linear search using in may end up being faster than a slow pure-Python version of bisect. Also, bisect on its own doesn't do membership testing: >>> data = range(0,100,2) >>> 7 in data False >>> bisect.bisect(data, 7) 4 So you have to weigh up the extra complication needed versus the optimization. Another strategy is to move items closer to the front of the list each time they are accessed, so that commonly used items eventually bubble up to the front of the list and searches for them are fast. And finally, if the list is very large, and your searches tend to be clustered, it becomes wasteful to do a fresh binary search each time you look something up. (E.g. consider looking up these three words in the dictionary, one after the other: "caravan", "cat", "cap".) 
In this case, a good strategy is sometimes called "hunt and search". The algorithm is something like this: Each search takes a second argument, i, the place to start searching from. This will usually be the place the previous search ended at. First you *hunt* for the item: try to bracket the item you want between i and i+1, then i and i+2, then i and i+4, i+8, and so forth, doubling the size of the bracket each time. (This assumes that the value at index i was too small. If it were too large, you hunt in the opposite direction, with i-1 to i, i-2 to i, etc.) Once you have bracketed the item you are searching for, you *search* for it within those limits, using binary search or even a linear search. If your searches are clustered, most of the hunt phases will be short and even linear search will be fast, rather than doing a binary search over the full list each time. -- Steven D'Aprano
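The hunt-and-search idea described above can be sketched roughly as follows. This is one possible reading of the description, not a reference implementation: the function name hunt_and_search, the start parameter, and the choice to return -1 for a missing item are all assumptions made for the example. It uses bisect_left for the search phase and then compares the item at the returned position, since bisect on its own only reports an insertion point.

```python
from bisect import bisect_left


def hunt_and_search(data, target, start=0):
    """Return an index of target in the sorted list data, or -1.

    start is a hint, usually the index where the previous search ended.
    """
    n = len(data)
    if n == 0:
        return -1
    i = min(max(start, 0), n - 1)  # clamp the hint into range
    step = 1
    if data[i] < target:
        # Hunt upwards: try i+1, i+2, i+4, ... until we pass the target.
        lo, hi = i, i + step
        while hi < n and data[hi] < target:
            lo = hi
            step *= 2
            hi = i + step
        hi = min(hi, n - 1)
    else:
        # Hunt downwards: try i-1, i-2, i-4, ... until we pass the target.
        hi, lo = i, i - step
        while lo >= 0 and data[lo] > target:
            hi = lo
            step *= 2
            lo = i - step
        lo = max(lo, 0)
    # Search phase: binary search restricted to the bracket [lo, hi].
    pos = bisect_left(data, target, lo, hi + 1)
    if pos < n and data[pos] == target:
        return pos
    return -1
```

A caller would keep the index from the last successful search and pass it back in as start, so that clustered lookups such as "caravan", "cat", "cap" each begin their hunt close to where the answer is.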