On Fri, Jan 29, 2016 at 04:29:14PM -0500, Ryan Smith wrote:
> Hi all,
>
> I am new to programming and python and had a question. I hope I
> articulate this well enough so here it goes... I was following along
> on a thread on this mailing list discussing how to test file I/O
> operations. In one of the suggested solutions the following code was
> used:
>
>
> def _open_as_stringio(self, filename):
>     filelike = StringIO(self.filecontents[filename])
>     real_close = filelike.close
>     def wrapped_close():
>         self.filecontents[filename] = filelike.getvalue()
>         real_close()
>     filelike.close = wrapped_close
>     return filelike

This function (actually a method) creates and returns a new object which
behaves like an actual file. When that fake file's "close" method is
called, it has to run some code which accesses the instance which
created it (self).

How does this fake file know which instance created it? It has to
"remember" the value of self, somehow.

There are two usual ways for a function to be given access to a variable
(in this case, self):

- pass it as a parameter to the function;
- access it as a global variable.

The first can't work, because fake_file.close() takes no arguments.

The second won't work safely, unless you can guarantee that there is
only ever one fake file at a time. If there are two, they will both try
to keep different values of self in the same global, and bad things will
happen. ("Global Variables Considered Harmful.")

But there's a third way, and that's called "a closure". And this is
exactly what happens here. Let's look at the inner function more
carefully:

    def wrapped_close():
        self.filecontents[filename] = filelike.getvalue()
        real_close()

"wrapped_close" takes no arguments, but it references four variables
inside the body:

- self
- filename
- filelike
- real_close

Each of those variables is defined in the surrounding function. So when
Python creates the inner function, it records a reference to those outer
variables inside the inner function. These references are called
"closures", and we say "the inner function closes over these four
variables". (I don't know why the word "close" is used.)

So the word "closure" can be used either for the inner function itself,
or the internal hidden links between it and the variables.

In other words: the inner function "wrapped_close" remembers those four
variables from the particular call in which it was defined. (Strictly
speaking it remembers the variables themselves rather than a snapshot of
their values, but none of them is rebound after the inner function is
defined here, so it comes to the same thing.)

So by the time you eventually get hold of the fake file, and call its
close method, that close method will remember which instance created the
fake file, etc, and use those variables as needed.

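To make that concrete, here is a minimal sketch that wraps the quoted
method in a made-up class. The class name FakeFS, the file name "a.txt"
and its contents are my own assumptions for illustration only; the
method body is the one from the quoted code. It shows which names the
inner function closes over, and that each fake file saves its contents
back to the instance that created it.

```python
from io import StringIO


class FakeFS:
    """Hypothetical stand-in for the class that owns _open_as_stringio."""

    def __init__(self):
        # Maps file names to contents (attribute name from the quoted code;
        # the entry itself is invented for this sketch).
        self.filecontents = {"a.txt": "hello"}

    def _open_as_stringio(self, filename):
        filelike = StringIO(self.filecontents[filename])
        real_close = filelike.close

        def wrapped_close():
            # Save the final contents on the creating instance, then
            # really close the StringIO.
            self.filecontents[filename] = filelike.getvalue()
            real_close()

        filelike.close = wrapped_close
        return filelike


fs1 = FakeFS()
fs2 = FakeFS()

f1 = fs1._open_as_stringio("a.txt")
f2 = fs2._open_as_stringio("a.txt")

# The names the inner function closes over, as recorded by Python.
closed_over = sorted(f1.close.__code__.co_freevars)
print(closed_over)

f1.seek(0, 2)       # move to the end of the fake file
f1.write(" world")
f1.close()          # saves back into fs1 only
f2.close()          # saves back into fs2 only

print(fs1.filecontents["a.txt"])
print(fs2.filecontents["a.txt"])
```

The first print shows the four names listed above. After both files are
closed, only fs1 sees the extra text, because each wrapped_close holds
its own self.
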
Complicated? Well, a bit. Here's a simpler demonstration of a closure:

    def factory(num):
        """Factory function that returns a new function that adds
        num to whatever it is given."""
        def adder(x):
            return x + num
        return adder

    add1 = factory(1)
    add2 = factory(2)
    add17_25 = factory(17.25)

    print(add1(100))  # => prints 101
    print(add2(50))  # => prints 52
    print(add17_25(5000))  # => prints 5017.25

Each of the "add" functions remembers the value that "num" had, *at the
time of creation*. So as far as add1 is concerned, num=1, even though
add2 considers that num=2.

(Warning: be careful with mutable values like lists, since they can
change value if you modify them. If this makes no sense to you, please
ask, and I'll demonstrate what I mean.)

> In trying to understand the logic behind the code for my own
> edification, I was wondering what is the reasoning of defining a
> function in side another function (for lack of a better phrase)?

There are two common reasons for defining inner functions.

The first is a boring reason: you want to hide that function from
everyone else, so that only one specific function can use it. So you put
it inside that function:

    def example(n):
        results = []
        def calc(x):
            # Pretend that calc is actually more complex.
            return x*100
        for i in range(1, n+1):
            results.append(calc(i))
        return results

    example(3)  # => returns [100, 200, 300]

That's a fairly boring reason, and people rarely do that. Normally, you
would just make calc an external function, perhaps named "_calc" so
people know it is a private function that shouldn't be used.

The second reason is to make the inner function a closure, as shown
above.


--
Steve
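
P.S. A minimal sketch of the warning about mutable values above. The
names make_summer, total and values are made up for illustration. The
inner function closes over a list, and the list is then modified in
place after the inner function has been created.

```python
def make_summer(numbers):
    """Hypothetical factory: returns a function that sums `numbers`."""
    def total():
        return sum(numbers)
    return total


values = [1, 2, 3]
total = make_summer(values)

before = total()
values.append(10)   # modify the very same list object in place
after = total()

print(before, after)
```

The closure still refers to the same list object, so the second call
sees the appended item: the two results are 6 and 16.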