Re: [Tutor] walk registry using _winreg
> From: Dave Angel
> To: tutor@python.org
> Sent: Wednesday, May 29, 2013 2:11 PM
> Subject: Re: [Tutor] walk registry using _winreg
>
> On 05/29/2013 04:11 AM, Albert-Jan Roskam wrote:
>> Hello,
>>
>> I created a program to go through the windows registry and look for a
>> certain key ("file_locations", though in the example I am using a key that
>> every windows user has on his/her computer). If found, I want to replace the
>> data associated with value "temp_dir" in that key. I have chosen this
>> strategy because the exact registry keys may have changed from version to
>> version. Also, multiple versions of the program may be installed on a given
>> computer. I pasted the code below this mail, but also here:
>> http://pastebin.com/TEkyekfi
>>
>> Is this the correct way to do this? I would actually prefer to specify only
>> "valueNameToSearch" and not also "keyToSearch". As in: start walking through
>> the registry starting at , return every key where a temp_dir is
>> defined.
>>
>> Thank you in advance!
>>
>> Regards,
>> Albert-Jan
>
> Please specify Python version. I'll assume 2.7. Obviously this is
> Windows, though it's also conceivable that it matters which version of
> Windows (XP, Win8, whatever).

Hi Dave, sorry, Python 2.7 on Windows 7 Enterprise.

> First comment is that I'd be writing walkRegistry as a generator, using
> yield for the items found, rather than building a list. It's generally
> easier to reuse that way, and won't get you in trouble if there are tons
> of matches. See os.walk for an example.

;-) I removed the "yield" statements at the last moment. I had indeed been looking at os.walk. I put them back now. Also studied os.walk again.

> A generator also would naturally eliminate the need to know KeyToSearch.
>
>> import _winreg
>> import os
>>
>> global __debug__
>> __debug__ = True
>>
>> def walkRegistry(regkey, keyToSearch="file_locations",
>>                  valueNameToSearch="temp_dir", verbose=False):
>>     """Recursively search the Windows registry (HKEY_CURRENT_USER),
>>     starting at top . Return a list of three tuples that contain
>>     the registry key, the value and the associated data"""
>>     if verbose:
>>         print regkey
>>     aReg = _winreg.OpenKey(_winreg.HKEY_CURRENT_USER, regkey)
>>     i, keys, founds = 0, [], []
>>     try:
>>         while True:
>>             i += 1
>>             key = _winreg.EnumKey(aReg, i)
>
> I believe these are zero-based indexes, so you're skipping the first one.

Good catch! Thanks!

I wrote a new version (sorry, I don't have access to pastebin in the office, it's qualified as "online storage"). Here's the code. As far as I can tell, it does exactly what I want now. I hope I used all your feedback.

CAUTION to anyone who runs this code: it replaces a registry entry.

    import _winreg
    import os

    def walkRegistry(regkey, keyToSet="file_locations",
                     valueToSet="temp_dir",
                     HKEY=_winreg.HKEY_CURRENT_USER, verbose=False):
        """Recursively search the Windows registry, starting at top .
        Return a list of three tuples that contain the registry key, the
        value and the associated data"""
        if verbose:
            print regkey
        i = 0
        aReg = _winreg.OpenKey(HKEY, regkey)
        try:
            while True:
                key = _winreg.EnumKey(aReg, i)
                i += 1
                if key:
                    new_regkey = os.path.join(regkey, key)
                    if key == keyToSet:
                        if verbose:
                            print "---> FOUND!!", new_regkey, key, keyToSet, valueToSet
                        with _winreg.OpenKey(HKEY, new_regkey) as anotherReg:
                            value_data = _winreg.QueryValueEx(anotherReg, valueToSet)[0]
                        yield new_regkey, valueToSet, value_data
                    for x in walkRegistry(new_regkey, keyToSet, valueToSet, HKEY, verbose):
                        yield x
        except WindowsError:
            pass

    def setRegistry(regkey, value, data, HKEY=_winreg.HKEY_CURRENT_USER):
        """Set (subkey) of with """
        hkey = [item for item in dir(_winreg) if getattr(_winreg, item) == HKEY][0]
        print "setting value '%s' with data '%s' in regkey\n'%s\\%s'\n" % \
            (value, data, hkey, regkey)
        try:
            aReg = _winreg.OpenKey(HKEY, regkey, 0, _winreg.KEY_ALL_ACCESS)
        except WindowsError:
            aReg = _winreg.CreateKey(HKEY, regkey)
        try:
            _winreg.SetValueEx(aReg, value, 0, _winreg.REG_SZ, data)
        finally:
            _winreg.CloseKey(aReg)

    if __name__ == "__main__":
        regkey = u"Software\\Microsoft\\Windows\\CurrentVersion\\Explorer"
        args = regkey, u"Shell Folders", u"Cookies"
        regdata = [(regkey, value, data) for regkey, value, data in walkRegistry(*args)]
        if len(regdata) == 1:
            regkey, value, existing_data = regdata[0]
            new_data = os.getenv("temp")
            setRegistry(regkey, value, new_data)
[Tutor] google spreadsheet
Hey Hi if possible can u guys please help me regarding accessing Google spreadsheet with the help of python i have gone through many blog but have not found any valid solution i trying to access google spreadsheet with help of python in eclipse if you can suggest it would really be a lot of help Thanks and regards, SUMEET SINGH___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] google spreadsheet
On 30/05/2013 17:35, Sumeet Singh wrote:
> Hey Hi
> if possible can u guys please help me regarding accessing Google spreadsheet
> with the help of python
> i have gone through many blog but have not found any valid solution
>
> i trying to access google spreadsheet with help of python in eclipse if you
> can suggest it would really be a lot of help
>
> Thanks and regards,
> SUMEET SINGH

Start here
https://developers.google.com/google-apps/documents-list/v1/developers_guide_python
I guess!!!

--
If you're using GoogleCrap™ please read this
http://wiki.python.org/moin/GoogleGroupsPython.

Mark Lawrence
Re: [Tutor] google spreadsheet
> From: Sumeet Singh
> To: "tutor@python.org"
> Sent: Thursday, May 30, 2013 6:35 PM
> Subject: [Tutor] google spreadsheet
>
> Hey Hi
> if possible can u guys please help me regarding accessing Google spreadsheet
> with the help of python
> i have gone through many blog but have not found any valid solution
>
> i trying to access google spreadsheet with help of python in eclipse if you
> can suggest it would really be a lot of help
>
> Thanks and regards,
> SUMEET SINGH

Perhaps: https://developers.google.com/drive/quickstart-python
Re: [Tutor] walk registry using _winreg
In addition to my previous reply: here's a colour-coded version of the code:
http://pastebin.com/bZEezDSG ("Readability counts")

Regards,
Albert-Jan

~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us?
~~
Re: [Tutor] a little loop
Sending again to the list (sorry boB)...

On 29 May 2013 17:51, boB Stepp wrote:
>> I don't know exactly how str.join is implemented but it does not use
>> this quadratic algorithm. For example if str.join would first compute
>> the length of the resulting string first then it can allocate memory
>> for exactly one string of that length and copy each substring to the
>> appropriate place (actually I imagine it uses an exponentially
>> resizing buffer but this isn't important).
>>
>
> ...str.join gets around these issues?

As I said I don't know how this is implemented in CPython (I hoped Eryksun might chime in there :) ).

> In the linked article it was
> discussing increasing memory allocation by powers of two instead of
> trying to determine the exact length of the strings involved,
> mentioning that the maximum wasted memory would be 50% of what was
> actually needed. Is Python more clever in its implementation?

Actually the maximum memory wastage is 100% of what is needed or 50% of what is actually used. This is if the amount needed is one greater than a power of two and you end up doubling to the next power of two. I don't see how CPython could be much cleverer in its implementation. There aren't that many reasonable strategies here (when implementing strings as linear arrays like CPython does).

>> * CPython actually has an optimisation that can append strings in
>> precisely this situation. However it is an implementation detail of
>> CPython that may change and it does not work in other interpreters
>> e.g. Jython. Using this kind of code can damage portability since your
>> program may run fine in CPython but fail in other interpreters.
>>
> You are speaking of "appending" and not "concatenation" here?

In this case I was just talking about single characters so you could think of it as either. However, yes, the optimisation is for concatenation and in particular the '+' and '+=' operators.

> I had not even considered other Python interpreters than CPython. More
> complexity to consider for the future...

It's only a little bit of complexity. Just bear in mind the distinction between a "language feature" that is true in any conforming implementation and an "implementation detail" that happens to be true in some interpreter or other but is not a specified part of the language. In practice this means not really thinking too hard about how CPython implements things and just using the recommended idioms, e.g. str.join.

I don't know if it is documented anywhere that str.join is linear rather than quadratic but I consider that to be a language feature. Exactly how it achieves linear behaviour (precomputing, resizing, etc.) is an implementation detail. If your code relies only on language features then it should not have problems when changing interpreters.

Oscar
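The advice in this message can be sketched roughly as follows. The two function names and the sample data are made up for illustration; both functions produce the same string, and only the second relies purely on the recommended idiom rather than on any interpreter-specific optimisation.

```python
def build_with_concat(pieces):
    """Build a string by repeated concatenation (potentially quadratic)."""
    result = ""
    for piece in pieces:
        # Each += may copy everything accumulated so far, unless the
        # interpreter happens to optimise it (an implementation detail).
        result += piece
    return result


def build_with_join(pieces):
    """Build the same string with the recommended idiom."""
    return "".join(pieces)


pieces = [str(n) for n in range(10)]
concatenated = build_with_concat(pieces)
joined = build_with_join(pieces)
```

For a handful of pieces the difference is irrelevant; it only matters when a string is built up from many repeated concatenations.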
Re: [Tutor] Random Number Game: Returns always the same number - why?
[Reordered response to after quote]

Thomas Murphy wrote:
> > There are a few issues here:
> > * variable names should be lower case
> > * for this case it's best to use for loop with range()
> > * you calculate random number only once, outside of loop
> >
> > Try something like:
> >
> > for count in range(100):
> >     print random.randint(1, 100)
> >
> > -m
>
> Mitya,
> Why is it best in this situation to use range() rather than a while
> loop? Curious about best practices for the various iterating
> functions. Thanks!

This is not really a case of "range" vs. "while" loops, it is really "while" vs "for" loops. In general, you should use "for" loops when you know the number of times to loop up front. "While" loops should be used when you are unsure how many times you need to loop. "For" loops are good for use with iterables (lists, strings, sequences). "While" loops are good for processing until some state is achieved.

In C, it was very easy to interchange "for" and "while" loops, and while it can probably be done in Python it may require a bit more work.

~Ramit
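A small sketch of the distinction drawn above, assuming nothing beyond the standard random module. The seed and the "roll until a six" stopping condition are invented for the example; the seed is there only so that a run is repeatable.

```python
import random

rng = random.Random(0)  # seeded only so the run is repeatable

# Number of iterations known up front: a for loop.
rolls = []
for count in range(5):
    rolls.append(rng.randint(1, 100))

# Number of iterations not known up front: loop until a state is reached.
attempts = 0
while True:
    attempts += 1
    if rng.randint(1, 6) == 6:
        break
```

Note that the random number is drawn inside each loop, which is the fix for the "always the same number" problem in the original question.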
Re: [Tutor] a little loop
On 30/05/13 02:51, boB Stepp wrote:
> On Wed, May 29, 2013 at 11:07 AM, Oscar Benjamin wrote:
>> I don't know exactly how str.join is implemented but it does not use
>> this quadratic algorithm. For example if str.join would first compute
>> the length of the resulting string first then it can allocate memory
>> for exactly one string of that length and copy each substring to the
>> appropriate place (actually I imagine it uses an exponentially
>> resizing buffer but this isn't important).

I have not actually read the str.join source code, but what I understand is that it has two cases:

1) If you pass a sequence of sub-strings to join, say, a list, it can look at each sub-string, calculate the total space required, and allocate a string buffer of exactly that amount of memory, and only then copy the characters into the buffer.

2) If you pass an iterator, join cannot go over the sub-strings twice, it has to do so in one pass. It probably over-allocates the buffer, then when finished, shrinks it back down again.

Sure enough, ''.join(list-of-substrings) is measurably faster than ''.join(iterator-of-substrings).

> ...str.join gets around these issues? In the linked article it was
> discussing increasing memory allocation by powers of two instead of
> trying to determine the exact length of the strings involved,
> mentioning that the maximum wasted memory would be 50% of what was
> actually needed. Is Python more clever in its implementation?

In the case of lists, CPython will over-allocate. I believe that up to some relatively small size, lists are initially quadrupled in size, after which time they are doubled. The exact cut-off size is subject to change, but as an illustration, we can pretend that it looks like this:

- An empty list is created with, say, 20 slots, all blank.
- When all 20 slots are filled, the next append or insert will increase the size of the list to 80 slots, 21 of which are used and 59 are blank.
- When those 80 slots are filled, the next append or insert will increase to 320 slots.
- When those are filled, the number of slots is doubled to 640.
- Then 1280, and so forth.

So small lists "waste" more memory, up to 75% of the total size, but who cares, because they're small. Having more slots available, they require even fewer resizes, so they're fast.

However, I emphasise that the exact memory allocation scheme is not guaranteed, and is subject to change without notice. The only promise made, and this is *implicit* and not documented anywhere, is that appending to a list will be amortised to constant time, on average.

(Guido van Rossum, Python's creator, has said that he would not look kindly on anything that changed the basic performance characteristics of lists.)

When creating a string, Python may be able to determine the exact size required, in which case no over-allocation is needed. But when it can't, it may use a similar over-allocation strategy as for lists, except that the very last thing done before returning the string is to shrink it down so there's no wasted space.

>> * CPython actually has an optimisation that can append strings in
>> precisely this situation. However it is an implementation detail of
>> CPython that may change and it does not work in other interpreters
>> e.g. Jython. Using this kind of code can damage portability since your
>> program may run fine in CPython but fail in other interpreters.
>
> You are speaking of "appending" and not "concatenation" here?

Yes. Because strings are immutable, under normal circumstances, concatenating two strings requires creating a third. Suppose you say:

    A = "Hello "
    B = "World!"
    C = A + B

So Python can see that string A has 6 characters, and B has 6 characters, so C requires space for 12 characters:

    C = ""

which can then be filled in:

    C = "Hello World!"

and now string C is ready to be used.

But, suppose we have this instead:

    A = A + B  # or A += B

The *old* A is used, then immediately discarded, and replaced with the new string. This leads to a potential optimization: instead of having to create a new string, Python can resize A in place:

    A = "Hello --"
    B = "World!"

then copy B into A:

    A = "Hello World!"

But note that Python can only do this if A is the one and only reference to the string. If any other name, list, or other object is pointing to the string, this cannot be done. Also, you can't do it for the reverse:

    B = A + B

since memory blocks can generally only grow from one side, not the other.

Finally, it also depends on whether the operating system allows you to grow memory blocks in this fashion. It may not.

So the end result is that you cannot really rely on this optimization. It's nice when it is there, but it may not always be there.

And just a reminder, none of this is important for one or two string concatenations. It's only when you build up a string from repeated concatenations that this becomes an issue.

--
Steven
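One way to watch list over-allocation happen, as a rough sketch: append items one at a time and count how often the size reported by sys.getsizeof changes. The helper name count_resizes is made up here, and the approach assumes CPython, where sys.getsizeof reflects the allocated slots. The exact number of resizes is an implementation detail and is deliberately not shown.

```python
import sys


def count_resizes(n):
    """Append n items one by one; count changes in the reported size."""
    items = []
    last_size = sys.getsizeof(items)
    resizes = 0
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:
            # The underlying array was reallocated with spare slots.
            resizes += 1
            last_size = size
    return resizes


resizes = count_resizes(10000)
```

If every append forced a reallocation, resizes would equal the number of appends; with over-allocation it comes out far smaller, which is what makes appending amortised constant time.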
Re: [Tutor] walk registry using _winreg
On Thu, May 30, 2013 at 10:47 AM, Albert-Jan Roskam wrote:
>
> def walkRegistry(regkey, keyToSet="file_locations",
>                  valueToSet="temp_dir",
>                  HKEY=_winreg.HKEY_CURRENT_USER, verbose=False):

I suppose you don't need the "sam" option in your case, but in general it's needed for 64-bit Windows in order to handle both native and WOW64 keys.

For a WOW64 process, the native 64-bit keys can be read with sam=KEY_READ | KEY_WOW64_64KEY. For a 64-bit process, the WOW64 keys can be read with sam=KEY_READ | KEY_WOW64_32KEY.

A WOW64 process will have "PROCESSOR_ARCHITEW6432" defined in os.environ.

>     aReg = _winreg.OpenKey(HKEY, regkey)

You should use a "with" statement here instead of depending on the garbage collection of the generator frame.

>     try:
>         while True:
>             key = _winreg.EnumKey(aReg, i)
>             i += 1
>             if key:
>                 new_regkey = os.path.join(regkey, key)

There's too much code here under the banner of one "try" suite. OpenKey and QueryValueEx in the subsequent statements may raise a WindowsError for various reasons.

Also, as you're currently doing things it leaves several open handles as you recursively create generators. It's likely not an issue (the registry isn't deeply nested), but in general I prefer to close a resource as immediately as is possible.

I'd enumerate the subkeys in a list and only yield a key/value match for the current regkey (that's basically how os.walk traverses the file system). This can match on the initial key. If you don't want that, it can be worked around (e.g. a flag, or a helper function), but I don't think the additional complexity is worth it.

    import os
    import _winreg

    def walkRegistry(regkey, keyToSet="file_locations",
                     valueToSet="temp_dir",
                     HKEY=_winreg.HKEY_CURRENT_USER,
                     sam=_winreg.KEY_READ,
                     onerror=None, verbose=False):
        try:
            # pass sam through so the WOW64 views can be selected
            aReg = _winreg.OpenKey(HKEY, regkey, 0, sam)
        except WindowsError as e:
            if onerror is not None:
                onerror(e)
            return

        i = 0
        subkeys = []
        with aReg:
            while True:
                try:
                    subkeys.append(_winreg.EnumKey(aReg, i))
                except WindowsError:
                    break
                i += 1

            # check the key name; not the key path
            if os.path.basename(regkey) == keyToSet:
                if verbose:
                    print "---> FOUND KEY:", regkey
                try:
                    data = _winreg.QueryValueEx(aReg, valueToSet)[0]
                except WindowsError:  # value not found
                    pass
                else:
                    if verbose:
                        print "---> FOUND KEY,VALUE PAIR"
                    yield regkey, valueToSet, data

        for key in subkeys:
            new_regkey = os.path.join(regkey, key)
            for item in walkRegistry(
                    new_regkey, keyToSet, valueToSet,
                    HKEY, sam, onerror, verbose):
                yield item

Minimally tested (sorry):

    >>> HKEY = _winreg.HKEY_LOCAL_MACHINE
    >>> regkey = r'Software\Python\PythonCore\2.7'
    >>> res = list(
    ...     walkRegistry(regkey, 'PythonPath', '', HKEY, verbose=True))
    ---> FOUND KEY: Software\Python\PythonCore\2.7\PythonPath
    ---> FOUND KEY,VALUE PAIR

    >>> res[0][2]
    u'C:\\Python27\\Lib;C:\\Python27\\DLLs;C:\\Python27\\Lib\\lib-tk'
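The traversal order described above (list the children first, report a match for the current node, and only then recurse) can be tried without touching a real registry. The sketch below is an analogy only: walk_tree, the nested-dict layout with "values" and "subkeys", and the fake_registry contents are all invented for illustration and are not part of _winreg.

```python
def walk_tree(tree, path, key_to_find, value_to_find):
    """Yield (path, value_name, data) for each node named key_to_find
    that defines value_to_find.

    `tree` stands in for a registry key: a dict with a "values" dict
    and a "subkeys" dict (a layout made up for this sketch).
    """
    # Collect the child names first, as os.walk does with directories.
    subkeys = sorted(tree.get("subkeys", {}))

    # Check the name of the current node, not its full path.
    name = path.rsplit("\\", 1)[-1]
    if name == key_to_find and value_to_find in tree.get("values", {}):
        yield path, value_to_find, tree["values"][value_to_find]

    # Recurse only after the current node has been dealt with.
    for child in subkeys:
        child_path = path + "\\" + child
        for item in walk_tree(tree["subkeys"][child], child_path,
                              key_to_find, value_to_find):
            yield item


fake_registry = {
    "values": {},
    "subkeys": {
        "AppV1": {"values": {}, "subkeys": {
            "file_locations": {"values": {"temp_dir": "C:\\Temp1"},
                               "subkeys": {}}}},
        "AppV2": {"values": {}, "subkeys": {
            "file_locations": {"values": {"temp_dir": "C:\\Temp2"},
                               "subkeys": {}}}},
    },
}

found = list(walk_tree(fake_registry, "Software",
                       "file_locations", "temp_dir"))
```

Because each node is fully handled before its children are visited, nothing from the parent needs to stay open during the recursion, which is the point of closing the registry handle early in the real code.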
Re: [Tutor] a little loop
On Wed, May 29, 2013 at 12:51 PM, boB Stepp wrote:
> On Wed, May 29, 2013 at 11:07 AM, Oscar Benjamin wrote:
>
>> I don't know exactly how str.join is implemented but it does not use
>> this quadratic algorithm. For example if str.join would first compute
>> the length of the resulting string first then it can allocate memory
>> for exactly one string of that length and copy each substring to the
>> appropriate place (actually I imagine it uses an exponentially
>> resizing buffer but this isn't important).
>
> ...str.join gets around these issues? In the linked article it was
> discussing increasing memory allocation by powers of two instead of
> trying to determine the exact length of the strings involved,
> mentioning that the maximum wasted memory would be 50% of what was
> actually needed. Is Python more clever in its implementation?

CPython computes the exact length required. Nothing clever. It first expands the iterable into a list. It joins the strings in two passes. In the first pass it computes the total size (3.3 also has to determine the 'kind' of unicode string in this loop, i.e. ASCII, 2-byte, etc). Then it allocates a new string and copies in the data in a second pass.

>> * CPython actually has an optimisation that can append strings in
>> precisely this situation. However it is an implementation detail of
>> CPython that may change and it does not work in other interpreters
>> e.g. Jython. Using this kind of code can damage portability since your
>> program may run fine in CPython but fail in other interpreters.
>>
> You are speaking of "appending" and not "concatenation" here?

In terms of sequence methods, it's inplace concatenation. On their own, immutable string types only support regular concatenation, but the interpreter can evaluate the concatenation inplace for special cases. Specifically, it can resize the target string in an INPLACE_ADD if it's not interned and has only *one* reference. Also, the reference has to be a local variable; it can't be a global (unless at module level), an attribute, or a subscript.

Here's an example (tested in 2.7 and 3.3).

Interned strings:

    >>> s = 'abcdefgh' * 128

CPython code objects intern their string constants that are all name characters (ASCII alphanumeric and underscore). But that's not an issue if you multiply the string to make it longer than 20 characters. A sequence length of 20 is the cutoff point for compile-time constant folding. This keeps the code object size under wraps. The number 20 was apparently chosen for obvious reasons (at least to someone). Anyway, if the string is determined at runtime, it won't be interned.

But really I'm concatenating the base string with itself so many times to avoid using the Pymalloc object allocator (see the note below). Its block sizes are fine grained at just 8 bytes apart. Depending on your system I don't know if adding even one more byte will push you up to the next block size, which would defeat an example based on object id(). I'll take my chances that the stdlib realloc() will be able to grow the block, but that's not guaranteed either. Strings should be treated as immutable at all times. This is just a performance optimization.

The reference count must be 1:

    >>> sys.getrefcount(s)
    2

Hmmm. The reference count of the string is incremented when it's loaded on the stack, meaning it will always be at least 2. As such, the original variable reference is deleted before in-place concatenation. By that I mean that if you have s += 'spam', then mid operation s is deleted from the current namespace. The next instruction stores the result back.

Voilà:

    >>> id_s = id(s)
    >>> s += 'spam'
    >>> id(s) == id_s
    True

Note on object reallocation:

The following assumes CPython is built with the Pymalloc small-object allocator, which is the default configuration in 2.3+.

Pymalloc requests memory from the system in 256 KiB chunks called arenas. Each arena is partitioned into 64 pools. Each pool has a fixed block size, and block sizes increase in steps of 8, from 8 bytes up to 256 bytes (up to 512 bytes in 3.3).

Resizing the string eventually calls PyObject_Realloc. If the object isn't managed by Pymalloc, the call to PyObject_Realloc punts to the C stdlib realloc. Otherwise if the new size maps to a larger block size, or if it's shrinking by more than 25% to a smaller block size, the allocation punts to PyObject_Malloc. This allocates a block from the first available pool. If the requested size is larger than the maximum block size, it punts to the C stdlib malloc.
Re: [Tutor] google spreadsheet
On Thu, May 30, 2013 at 12:50 PM, Mark Lawrence wrote:
>>
>> i trying to access google spreadsheet with help of python
>
> Start here
> https://developers.google.com/google-apps/documents-list/v1/developers_guide_python
> I guess!!!
>
> --
> If you're using GoogleCrap™ please read this

:D
Re: [Tutor] google spreadsheet
On 30/05/13 17:35, Sumeet Singh wrote:
> if possible can u guys please help me regarding accessing Google
> spreadsheet with the help of python

This list is for people learning Python and its standard library. You will likely get more help on the main Python mailing list/newsgroup. Or possibly a Google programming forum, assuming such a thing exists.

To improve your chances try asking specific questions and providing information about your OS, Python version and what you've tried and how it went. Include any error messages. That way people won't have to guess.

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
Re: [Tutor] To error statement or not to error statement
Jim Mooney wrote:
> Sent: Wednesday, May 22, 2013 11:28 AM
> To: tutor@python.org
> Subject: [Tutor] To error statement or not to error statement
>
> >> "I find it amusing when novice programmers believe their main job is
> >> preventing programs from crashing. ... More experienced programmers realize
> >> that correct code is great, code that crashes could use improvement, but
> >> incorrect code that doesn't crash is a horrible nightmare."
>
> Then am I right to assume that rather than put in error statements I
> barely understand at this point, the best thing would be to work the
> hell out of the program in hope of seeing an error? Does Python have
> something that would do this automatically since I can't see running a
> program a hundred times by hand?
>
> Mainly, I'm just learning all this stuff for future reference. I
> really doubt I'll need to use nose to find errors in twenty-line
> programs. Print, assert, and staring at it for a long time should be
> enough for now - and the Wing debugger now and then.
>
> From the varied replies so far, it sounds to me that debugging is more
> of an art than a science. So far the books I've looked at just mention
> the basics but don't get into the philosophy of when and how.
>
> Jim
>

If you mostly just write small programs and use them from shell/debugger, then I would just never catch any error. All raised exceptions will automatically be printed by the interpreter when they kill your code. That will automatically show the error and the appropriate stack trace to fix the error. If you combine that with proper logging (print can be fine for single threads) you should have everything you need to solve the problem.

Eventually you get better at logging and anticipating possible problems.

~Ramit
Re: [Tutor] a little loop
On 30 May 2013 21:35, eryksun wrote:
> In terms of sequence methods, it's inplace concatenation. On their
> own, immutable string types only support regular concatenation, but
> the interpreter can evaluate the concatenation inplace for special
> cases. Specifically, it can resize the target string in an INPLACE_ADD
> if it's not interned and has only *one* reference.

It's also for BINARY_ADD in the form a = a + b:

    $ python
    Python 2.7.3 (default, Sep 26 2012, 21:51:14)
    [GCC 4.7.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> s = 'abcdefgh' * 128
    >>> id_s = id(s)
    >>> s = s + 'spam'
    >>> print(id(s) == id_s)
    True

A rare case of me actually using the dis module:

    >>> def f():
    ...     s = s + 'spam'
    ...
    >>> import dis
    >>> dis.dis(f)
      2           0 LOAD_FAST                0 (s)
                  3 LOAD_CONST               1 ('spam')
                  6 BINARY_ADD
                  7 STORE_FAST               0 (s)
                 10 LOAD_CONST               0 (None)
                 13 RETURN_VALUE

Oscar
Re: [Tutor] a little loop
On Thu, May 30, 2013 at 6:35 PM, Oscar Benjamin wrote:
>
> It's also for BINARY_ADD in the form a = a + b:

Right you are. It sees that the next operation is a store back to "a". It wouldn't work the other way around, i.e. a = b + a.
Re: [Tutor] a little loop
On Thu, May 30, 2013 at 3:16 PM, Steven D'Aprano wrote:
>
> Sure enough, ''.join(list-of-substrings) is measurably faster than
> ''.join(iterator-of-substrings).

A tuple or list is used directly. Otherwise join() has to create an iterator and build a new list.

This isn't directly related to the discussion on string concatenation. But in relation to the offshoot discussion on lists, I've put together some examples that demonstrate how different ways of creating the same list lead to different allocated sizes.

First, here's a function to get the allocated length of a list's item array:

    from ctypes import sizeof, c_void_p, c_ssize_t

    alloc_offset = sizeof(c_void_p * 2) + sizeof(c_ssize_t * 2)

    def allocated(alist):
        addr = id(alist)
        alloc = c_ssize_t.from_address(addr + alloc_offset)
        return alloc.value

It uses ctypes to peek into the object, but you can also use a list object's __sizeof__() method to calculate the result. First get the array's size in bytes by subtracting the size of an empty list from the size of the list. Then divide by the number of bytes in a pointer:

    import struct

    pointer_size = struct.calcsize('P')
    empty_size = [].__sizeof__()

    def allocated(alist):
        size_bytes = alist.__sizeof__() - empty_size
        return size_bytes // pointer_size

Example 0:

    >>> allocated([0,1,2,3,4,5,6,7,8,9,10,11])
    12

In this case the constants are pushed on the stack, and the interpreter evaluates BUILD_LIST(12), which in CPython calls PyList_New(12). The same applies to using built-in range() in 2.x (not 3.x):

    >>> allocated(range(12))
    12

Example 1:

    >>> allocated([i for i in xrange(12)])
    16

This starts at 0 and grows as follows:

    1 + 1//8 + 3 = 4
    5 + 5//8 + 3 = 8
    9 + 9//8 + 6 = 16

Example 2:

    >>> allocated(list(xrange(12)))
    19

This also applies to range() in 3.x. Some iterators have a "__length_hint__" method for guessing the initial size:

    >>> iter(xrange(12)).__length_hint__()
    12

The guess is immediately resized as follows:

    12 + 12//8 + 6 = 19

Example 3:

    >>> allocated(list(i for i in xrange(12)))
    12

The initializer here is a generator, compiled from the generator expression. A generator doesn't have a length hint. Instead, the list uses a default guess of 8, which is over-allocated as follows:

    8 + 8//8 + 3 = 12

If the generator continues to a 13th item, the list resizes to 13 + 13//8 + 6 = 20:

    >>> allocated(list(i for i in xrange(13)))
    20
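The resize arithmetic quoted in this message can be replayed in pure Python. This is a sketch that models only the numbers shown above (n + n//8 + 3 for n below 9, otherwise + 6); the function names are made up, and other CPython versions may use a different growth rule.

```python
def grown_size(newsize):
    """Over-allocated capacity for a requested size, following the
    arithmetic in the message: n + n//8 + (3 if n < 9 else 6)."""
    return newsize + newsize // 8 + (3 if newsize < 9 else 6)


def capacity_after_appends(count):
    """Simulate appending `count` items one by one to an empty list."""
    allocated = 0
    for length in range(1, count + 1):
        if length > allocated:  # no spare slot left, so resize
            allocated = grown_size(length)
    return allocated


comprehension_capacity = capacity_after_appends(12)  # Example 1
hinted_capacity = grown_size(12)                     # Example 2
default_guess_capacity = grown_size(8)               # Example 3
```

Appending twelve items one at a time resizes at lengths 1, 5 and 9, which reproduces the 4, 8, 16 sequence from Example 1, while a single resize from a length hint of 12 gives the 19 of Example 2.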