[Tutor] Python Variables Changing in Other Functions
Hello,

I am having trouble determining when Python is passing by reference and when by value, and how to fix it to do what I want.

I am writing a program that will take in a list of book titles, allow many people to rank them in terms of popularity, and export the results to Excel. I'll include the whole code below, but the function I'm having trouble with is rankRandom(). I want it to take in a list of titles, randomize the list (so that I can be certain the order isn't influencing the results of the rankings), get a person's rankings, and then re-sort the rankings to match the original order of the title list (that way I can match up different people's rankings to the correct title).

The issue is this: random.shuffle() mutates the list in place rather than creating a new copy. This is fine, but rather than modifying just the local copy of my titles, it is modifying it in the other functions, too. For instance, rankRandom() is called by main(), which passes it listOfTitles. When rankRandom() returns, listOfTitles has been changed to the randomized version of titles.

To fix this, I tried copying the original title list and then assigning the copy back to titles right before rankRandom() returns. The local version of titles in rankRandom() does indeed regain its original value, but listOfTitles in main() still holds the randomized version, not the original.

This boggles me, since it seems like shuffle is mutating titles as if it were a global variable, but assignment is treating it only as a local. What exactly is going on here, and how do I avoid this problem?

Many thanks!
Rachel

import xlwt as excel
import random
import copy

def getTitleList():
    """ Makes a list of all the lines in a file """
    filename = raw_input("Name and Extension of File: ")
    myFile = open( filename )
    titles = []
    title = "none"
    while title != "":
        title = myFile.readline()
        if title not in ["\n",""]:
            titles.append(title)
    return titles

def rank( titles ):
    """ Gets a user-input ranking for each line of text.
    Returns those rankings """
    ranks = []
    for t in titles:
        rank = raw_input(t + " ")
        ranks.append(rank)
    return ranks

def rankRandom( titles ):
    """ Takes a list of titles, puts them in random order, gets ranks, and then
    returns the ranks to their original order (so that the rankings always
    match the correct titles). """
    finalRanks = [0]*len(titles)
    origTitles = copy.copy(titles)
    #print "Orign: ", origTitles
    random.shuffle(titles)  # Shuffle works in-place
    ranks = rank(titles)
    i = 0
    for t in titles:
        finalRanks[ origTitles.index(t) ] = ranks[i]
        i += 1
    titles = origTitles  # Must restore, since python passes by reference, not
                         # value, and the original structure was changed by
                         # shuffle
    #print "t: ", titles
    return finalRanks

def writeToExcel(titles, allRanks):
    # Open new workbook
    mydoc = excel.Workbook()
    # Add a worksheet
    mysheet = mydoc.add_sheet("Ranks")
    # Write headers
    header_font = excel.Font()  # Make a font object
    header_font.bold = True
    header_font.underline = True
    # Header font needs to be style, actually
    header_style = excel.XFStyle(); header_style.font = header_font
    # Write Headers: write( row, col, data, style )
    row = 0
    col = 0
    for t in titles:
        # Write data. Indexing is zero based, row then column
        mysheet.write(row, col, t, header_style)
        col += 1
    # Write Data
    row += 1
    for ranks in allRanks:
        col = 0
        for r in ranks:
            mysheet.write(row, col, r)
            col += 1
        row += 1
    # Save file. You don't have to close it like you do with a file object
    mydoc.save("r.xls")

def main():
    listOfTitles = getTitleList()
    allRanks = []
    done = raw_input("Done?: ")
    while done != "y":
        allRanks.append( rankRandom( listOfTitles ) )
        #print listOfTitles
        done = raw_input("Done?: ")
    writeToExcel(listOfTitles, allRanks )

if __name__ == "__main__" : main()

___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
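The behaviour described in the question comes down to the difference between mutating an object and rebinding a name. The following is only a minimal sketch of that difference, written for Python 3 rather than the Python 2 of the original program; the function names (rebind_only, shuffled_copy, shuffle_then_restore) are made up for illustration and do not appear in the posted code.

```python
import random

def rebind_only(titles):
    # Rebinds the local name to a new list; the caller's list is untouched.
    titles = ["something", "else"]
    return titles

def shuffled_copy(titles):
    # Shuffles a copy, so the caller's list keeps its order.
    result = list(titles)
    random.shuffle(result)
    return result

def shuffle_then_restore(titles):
    # random.shuffle mutates the caller's list object in place.
    saved = list(titles)
    random.shuffle(titles)
    # Slice assignment mutates that same object back to its old contents,
    # which plain assignment to the parameter name cannot do.
    titles[:] = saved

original = ["A", "B", "C", "D"]
snapshot = list(original)

copy_result = shuffled_copy(original)
rebind_only(original)
unchanged_after_copy = (original == snapshot)

shuffle_then_restore(original)
restored = (original == snapshot)
```

Under these assumptions, shuffling a copy is probably the simplest route for rankRandom(), since the caller's list is never touched in the first place.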
Re: [Tutor] Python Variables Changing in Other Functions
I'm not quite certain I understand. When you say sections, do you mean different worksheets? If so, you should finish writing on one worksheet first, and then move to another. If you're talking about writing to row 5 and then jumping to row 50, enumerate lets you do that by allowing you to choose where to start indexing.

By the way, my improved code is below. Maybe it will offer some clarification?

Rachel

---
import xlwt as excel
import random
import copy

# Note: When python takes in arguments to a function, it passes in the entire
# object, so any modifications made to that object will be retained after the
# function terminates without you having to explicitly return the object. You
# only have to return an object if it wasn't passed in as an argument to the
# function and you need to use it in another function.

def getTitleList():
    """ Makes a dict with one key for each line in a file """
    filename = raw_input("Name and Extension of File: ")
    myFile = open( filename )
    dictOfTitles = {}
    for title in myFile:
        if title not in ["\n",""]:
            dictOfTitles[title] = []
    return dictOfTitles

def rank( randomizedTitles, dictOfTitles ):
    """ Gets a user-input ranking (0-10) for each line of text.
    Appends each ranking to that title's list in dictOfTitles """
    for title in randomizedTitles:
        while True:
            rank = raw_input(title + " ")
            if not rank.isdigit():
                continue
            elif ( int(rank) > 10 ):
                continue
            dictOfTitles[title].append(rank)
            break

def rankRandom( dictOfTitles ):
    """ Takes a dict of titles, puts the titles in random order, and gets
    ranks. The ranks are stored under their title, so the rankings always
    match the correct titles. """
    randomizedTitles = dictOfTitles.keys()
    random.shuffle(randomizedTitles)  # Shuffle works in-place.
    rank(randomizedTitles, dictOfTitles)

def writeToExcel(dictOfTitles):
    """ Writes the titles and ranks to Excel """
    # Open new workbook
    mydoc = excel.Workbook()
    # Add a worksheet
    mysheet = mydoc.add_sheet("Ranks")
    # Make header style
    header_font = excel.Font()  # Make a font object
    header_font.bold = True
    header_font.underline = True
    header_style = excel.XFStyle(); header_style.font = header_font
    # Write headers and ranks to Excel. Indexing is 0-based
    # write( row, col, data, style )
    for col, title in enumerate(dictOfTitles):
        mysheet.write(0, col, title, header_style)
        for row, rank in enumerate( dictOfTitles[title], 1 ):
            mysheet.write(row, col, rank)
    # Save file. You don't have to close it like you do with a file object
    mydoc.save("r.xls")

def main():
    dictOfTitles = getTitleList()
    done = ""
    while done.lower() != "y":
        rankRandom( dictOfTitles )
        done = raw_input("Done? (y/[n]): ")
    writeToExcel(dictOfTitles)

if __name__ == "__main__" : main()

On May 25, 2011, at 1:49 PM, Prasad, Ramit wrote:

> >> Having lots of += hanging around is a perfect example of a code smell
> >> (i.e. something in this code stinks, and we should change it). Part of
> >> being a good programmer is learning to recognize those bad smells and
> >> getting rid of them. Turns out, Python has a lot of nice built-in
> >> functions for the elimination of code smells. In this case, it's the
> >> enumerate function:
>
> What happens if you are trying to write to an excel document with several
> different sections and need to keep track of things like last written row /
> current row? I could keep track of enumerations and then save them to a local
> variable and then append it but that seems about as funky. Is there a better
> way?
>
> # pseudocode-ish
> # Is this better?
> blah = enumerate(raw_data)
> For enumerate_value, data in blah:
>     sheet.write (base_row + enumerate_value , column, data)
> base_row = blah[-1][0]
>
> Ramit
>
> Ramit Prasad | JPMorgan Chase Investment Bank | Currencies Technology
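One way to keep track of the current row across several sections is to let each write return the next free row and to pass that row to enumerate as its start value. The sketch below is an illustration under assumptions: FakeSheet is a hypothetical stand-in that only records write() calls (it is not part of xlwt), and write_section is an invented helper name.

```python
class FakeSheet:
    """Hypothetical stand-in for a worksheet; it only records write() calls."""
    def __init__(self):
        self.cells = {}

    def write(self, row, col, value):
        self.cells[(row, col)] = value

def write_section(sheet, first_row, col, values):
    """Write values down one column, starting at first_row.

    Returns the next free row so the caller can chain sections.
    """
    # enumerate's second argument sets the starting index.
    for row, value in enumerate(values, first_row):
        sheet.write(row, col, value)
    return first_row + len(values)

sheet = FakeSheet()
next_row = 0
for section in (["a", "b", "c"], ["d", "e"]):
    next_row = write_section(sheet, next_row, 0, section)
    next_row += 1  # leave one blank row between sections
```

Returning the next row avoids indexing into the enumerate object, which is an iterator and does not support blah[-1].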
Re: [Tutor] Python Variables Changing in Other Functions
You asked for the traceback. All I get is this:

-
python a2.py
  File "a2.py", line 20
    titles = [title in myFile if title not in ["\n",""]]
                                                       ^
SyntaxError: invalid syntax
--

(In case the spaces don't come through in this email, the caret ^ is pointing to the last ])

The function was this:

--
def getTitleList():
    """ Makes a list of all the lines in a file """
    filename = raw_input("Name and Extension of File: ")
    myFile = open( filename )
    titles = [title in myFile if title not in ["\n",""]]
    return titles
--

Rachel

On May 25, 2011, at 6:29 PM, Wayne Werner wrote:

> On Wed, May 25, 2011 at 3:59 PM, Rachel-Mikel ArceJaeger wrote:
> Thank you so much for taking the time to comment this all out. It was very
> very helpful and showed me improvements to some coding styles I have been
> doing for years. I have a couple of questions, though:
>
> 1. I tried the following line:
>        titles = [title in myFile if title not in ["\n",""]]
> as you suggested, but I'm getting a syntax error pointing to the last ].
> Later on I ended up changing titles from a list to a dict, so I'm not sure if
> this is even applicable anymore, but since I use this sort of structure a
> lot, I'm curious as to why it is not working.
>
> You'll have to copy/paste the traceback, and the code snippet - otherwise
> it's just a guess!
>
> 2. I am curious as to how the amended for-loop (for title in myFile) knows
> to assign title to a line of text. I can see that it does, but I'm not sure
> why it acts that way.
>
> It's magic! Well, not really. In C-style languages, your for loop usually
> takes the form of
>
> for(int x = 0; x < sizeOfSomething; x++){
>     somehow_use(something[x]);
> }
>
> But x is usually unused - what you really want to say is "for each item in
> this collection, do something with that item". So Guido van Rossum, in his
> Dutch-y wisdom, blessed us with this type called an iterable. Which is
> basically anything that you can think of in separate parts. Letters in a
> string, lines in a file, items in a list, and so on and so forth. Rather than
> wasting the extra "int x = 0; x < size; x++", you simply have to tell the
> loop what variable you want to use, and what iterable you want to iterate
> over, and Python takes care of the details.
>
> Iterables really allow for some super neat programming.
>
> 3. I've never used zip before and I'm a little confused about why your
> amended for-loop works the way it does. As far as I can tell,
>
> a = [1,2,3]
> b = ['a','b','c']
> d = zip(a,b)
>
> means d is [(1, 'a'), (2, 'b'), (3, 'c')]
>
> So how is it that if I say
>
> for c,d in zip(a,b):
> ...     print [c,d]
>
> I get:
>
> [1, 'a']
> [2, 'b']
> [3, 'c']
>
> It seems to me we should have to unzip the zipped list or something to get
> the tuple first, but it immediately gets the elements of the tuple. Why?
>
> This looks like magic, but it really isn't. Consider the following:
> >>> a = (1,2)
> >>> x, y = a
> >>> x
> 1
> >>> y
> 2
> >>> b = [(4,5), (6,7)]
> >>> x, y = b[0]
> >>> x
> 4
> >>> y
> 5
>
> Python has this nifty little feature called unpacking, that allows you to use
> a collection of data on the right side and a collection of variables on the
> left side, and if the numbers of arguments match, then assignment happens. As
> for what happens when the numbers don't match up, I'll leave that experiment
> to you ;)
>
> 4. Regarding my previous question about passing in arguments, is the
> following surmise correct?: When python takes in arguments to a function, it
> passes in the entire object, so any modifications made to that object will be
> retained after the function terminates without you having to explicitly return
> the object. You only have to return an object if it wasn't passed in as an
> argument to the function and you need to use it in another function.
>
> No. It might appear that way at times, but that surmise is based on an
> incorrect premise. If you read this paper:
> http://effbot.org/zone/python-objects.htm it explains what Python objects
> really are. Then read http://effbot.org/zone/call-by-object.htm and it
> explains how Python passes arguments.
>
> HTH,
> Wayne

R.M. ArceJaeger
Author/Publisher, Platypus Press
Contact: arcejae...@gmail.com
Website: http://rmarcejaeger.com
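The unpacking that Wayne describes can be tried out directly, including the experiment he leaves to the reader. This is a small sketch in Python 3 syntax; the variable names are chosen only for illustration.

```python
a = [1, 2, 3]
b = ["a", "b", "c"]

# zip pairs the items up; the for loop unpacks each pair into two names.
pairs = []
for number, letter in zip(a, b):
    pairs.append([number, letter])

# The same unpacking works in a plain assignment.
x, y = (4, 5)

# When the counts differ, Python raises ValueError instead of guessing.
try:
    p, q = (1, 2, 3)
    mismatch_error = None
except ValueError as err:
    mismatch_error = err
```

So the for loop is not doing anything special with zip: each tuple it yields is simply assigned to "number, letter" in the same way as "x, y = (4, 5)".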
Re: [Tutor] Python Variables Changing in Other Functions
Yes it does! Thank you.

Rachel

On May 25, 2011, at 8:34 PM, Wayne Werner wrote:

> On Wed, May 25, 2011 at 10:17 PM, Rachel-Mikel ArceJaeger wrote:
> You asked for the traceback. All I get is this:
> -
> python a2.py
>   File "a2.py", line 20
>     titles = [title in myFile if title not in ["\n",""]]
>                                                        ^
> SyntaxError: invalid syntax
>
> Ahah. You're missing the important part:
>
> titles = [title for title in myFile if title not in ["\n",""]]
>
> you're missing the "title for" part. That should fix it.
>
> HTH,
> Wayne
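For reference, a list comprehension has the shape [expression for item in iterable if condition], and the leading expression is the part that was missing. The sketch below uses io.StringIO as a stand-in for the opened file so that it runs on its own in Python 3; the sample titles are invented.

```python
import io

# io.StringIO stands in for the opened file so the sketch is self-contained.
my_file = io.StringIO("Dune\n\nEmma\nUlysses\n")

# [expression for item in iterable if condition]
titles = [title for title in my_file if title not in ["\n", ""]]

# Equivalent explicit loop, for comparison.
my_file.seek(0)
titles_loop = []
for title in my_file:
    if title not in ["\n", ""]:
        titles_loop.append(title)
```

Both versions keep the trailing newline on each title, just as the original getTitleList() did.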
Re: [Tutor] Python Extensions in C
A couple small things that will help improve memory management:

Rather than

    avg = sumall / count;
    return avg;

just return sumall/count instead. Then you don't have to waste a register or assignment operation.

Division is expensive. Avoid it when you can.

Here,

    for (a=0; a != count; a++) {
        temp = PyFloat_AsDouble(PySequence_Fast_GET_ITEM(seq,a));
        sumall += temp;

again, save variables and operations. Write this as:

    for (a=0; a != count; a++) {
        sumall += PyFloat_AsDouble(PySequence_Fast_GET_ITEM(seq,a));

Similar corrections in var().

It's cheaper when you're using powers of two to just right or left-shift: >> or <<. Since you want to increase by a power of two, do:

    (avg - PyFloat_AsDouble(PySequence_Fast_GET_ITEM(seq,a) << 1; // This means (...)^(2^1)

Division by powers of two is >>. Note that these only work for powers of two.

Now I haven't worked with pointers in a long time and didn't fully trace this out so I'm probably wrong, but it doesn't seem like you ever have your pointers in stat_avg() point to an object. Therefore wouldn't they always be NULL?

R.M. ArceJaeger

On May 26, 2011, at 8:22 AM, James Reynolds wrote:

> Hello All:
>
> As an intellectual exercise, I wanted to try my hand at writing some
> extensions in C.
>
> I was wondering if you all could look over my code and give some feedback.
>
> Here is the link for the code: http://pastebin.com/jw3ihfsN
>
> I have zero experience coding in C (and not much more coding in Python!).
> Being a kinetic learner, I thought this would be a good exercise to teach me
> some of the underpinnings of Python, how it works, why it works the way it
> does, and as an added bonus, skills to actually write my own extensions if I
> ever wanted to.
>
> I had to learn about pointers to do this, and I'm still not 100% on if I used
> them correctly herein.
>
> I am also very concerned with memory management because I am not sure when I
> should be calling the memory allocation macros to decref or incref when
> needed.
>
> I would also like to get feedback on how I am constructing C algorithms.
>
> As far as the module itself goes, I was able to compile and use it on a
> windows machine compiling with mingw (I use distutils to do the work, so for
> me I do "python setup.py build" in my CMD).
>
> There are three functions, stats.mean, stats.var, stats.stdev (and they do
> what you would expect). One thing though, these are the "population"
> statistics and not "sample" in case you want to test it out.
>
> Also, anything else that you think would be worthwhile pointing out, tips and
> tricks, common pitfalls, etc.
>
> Thanks in advance for your feedback.
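When testing an extension like the one James describes, a pure-Python reference for the population statistics can be useful to compare against. The following is only a sketch under assumptions: it targets Python 3.4 or later (for the standard statistics module), and the function names merely mirror stats.mean, stats.var and stats.stdev; it is not James's code.

```python
import math
import statistics

def mean(values):
    return sum(values) / len(values)

def var(values):
    # Population variance: divide by N, not N - 1.
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def stdev(values):
    return math.sqrt(var(values))

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# The standard library's population functions give an independent check.
agrees_with_stdlib = (
    math.isclose(var(data), statistics.pvariance(data))
    and math.isclose(stdev(data), statistics.pstdev(data))
)
```

The sample versions (statistics.variance and statistics.stdev) divide by N - 1 and would give different numbers for the same data.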
Re: [Tutor] Python Extensions in C
I suppose it's up to the programmer. Personally, I find something like this:

    variable += something

a lot easier to read than

    temp = something
    variable += temp

For me, it's just another variable I have to worry about allocating/deallocating/seeing if it's used anywhere else/accidentally using when I shouldn't.

And I'm sure you're right and the memory tweaks I mentioned aren't that necessary. I only brought them up because James asked about memory management, and depending on the program you're using and the size of your inputs, it actually can make a difference. I recall one program I wrote where changing little things that we don't think of as being time-expensive, like division and extra variables, drastically lessened the runtime.

Rachel

On May 26, 2011, at 10:40 AM, Alan Gauld wrote:

> "Rachel-Mikel ArceJaeger" wrote
>
>> avg = sumall / count;
>> return avg;
>> Just return sumall/count instead.
>> Then you don't have to waste a register or assignment operation.
>
> Readability counts in C too. And premature optimisation is even
> more of an evil since C is harder to read to start with.
> This code doesn't need to be saving microseconds (the Python
> object creation will likely consume far more resource and
> speed than these tweaks).
>
>> Division is expensive. Avoid it when you can.
>
> That is entirely dependent on your processor.
> Even on an Intel chip with a math processor it's
> only very slightly more expensive than the other operators,
> compared to memory allocation it's lightning fast.
> And readability counts.
>
>> for (a=0; a != count; a++) {
>>     temp = PyFloat_AsDouble(PySequence_Fast_GET_ITEM(seq,a));
>>     sumall += temp;
>> Again, save variables and operations. Write this as:
>>
>> for (a=0; a != count; a++) {
>>     sumall += PyFloat_AsDouble(PySequence_Fast_GET_ITEM(seq,a));
>
> Again the two Python calls will vastly outweigh the savings.
> And in combining the lines you lose a debug and instrumentation
> opportunity and the chance to introduce a safety check of
> the return values. Reliable code beats fast code.
>
>> It's cheaper when you're using powers of two to just
>> right or left-shift: >> or <<. Since you want to increase
>> by a power of two, do:
>>
>> (avg - PyFloat_AsDouble(PySequence_Fast_GET_ITEM(seq,a) << 1;
>
> While this is sufficiently idiomatic to be understood by
> most C programmers it's still likely to be swamped in effect
> by the Python calls. Readability still counts, make the
> code express the algorithm not the workings of the CPU.
>
> C can be tweaked to be unreadable or it can be made as
> clear as most other 3GLs. The more unreadable it is the
> more likely it is to harbour bugs. If there is a genuine need
> to be as small and fast as possible then optimise, but hide
> those optimisations in a function if possible. If there is no
> need to optimise at the expense of readability or reliability
> then don't.
>
> Consider these the ravings of an ex maintenance programmer
> who spent far too much of his life deciphering other folks'
> "clever" C code... It wasn't clever and it wasn't working!
>
> Alan G.
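One caveat about the shift suggestion quoted above: a left shift by one doubles an integer, it does not square it, and shifts are defined only for integer operands (in C as well, so the operator cannot be applied to the double that PyFloat_AsDouble returns). A quick sketch in Python shows the same semantics; the variable names are chosen only for illustration.

```python
x = 3

doubled = x << 1   # same as x * 2
squared = x * x    # what a variance calculation needs
halved = 12 >> 1   # same as 12 // 2

# Shifts are only defined for integers; a float operand is rejected.
try:
    1.5 << 1
    float_shift_error = None
except TypeError as err:
    float_shift_error = err
```

For the squared deviation in var(), multiplying the difference by itself is the straightforward and correct choice.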
Re: [Tutor] checking if a variable is an integer?
Isn't one of the unsolved Millennium Prize problems one that includes the ability to find all of the prime numbers? I'm not sure if your program is possible if the input number is large.

But to check if a number x is an int, just do this:

    x == int(x)

Rachel

On May 31, 2011, at 2:23 PM, Hans Barkei wrote:

> I want to make a program that finds all the prime numbers up to a number
> inputted by the user.
> I want to know if it is an integer because that will tell me if it is
> divisible by that number or not.
> -Hans-
Re: [Tutor] checking if a variable is an integer?
Isn't one of the unsolved Millennium Prize problems one that includes the ability to find all of the prime numbers? I'm not sure if your program is possible if the input number is large.

But to check if a number x is an int, just do this:

    x == int(x)

Rachel

On Tue, May 31, 2011 at 2:38 PM, Hugo Arts wrote:

> On Tue, May 31, 2011 at 11:30 PM, Joel Goldstick wrote:
> >
> > http://stackoverflow.com/questions/1265665/python-check-if-a-string-represents-an-int-without-using-try-except
> >
> > def RepresentsInt(s):
> >     try:
> >         int(s)
> >         return True
> >     except ValueError:
> >         return False
> >
> > >>> print RepresentsInt("+123")
> > True
> > >>> print RepresentsInt("10.0")
> > False
> >
>
> For strings, that works, but not for integers:
>
> >>> int(10.0)
> 10
>
> And if you want to check if one number is divisible by another, you're
> not going to call it on strings.
>
> A better way is to use the modulo operator, which gives the remainder
> when dividing:
>
> >>> a = 6
> >>> a % 3
> 0
> >>> a % 4
> 2
>
> So, if the remainder is zero the left operand is divisible by the right
> one.
>
> Hugo
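Putting Hugo's modulo test to work, listing the primes up to a given limit is quite feasible for modest inputs. The following is a minimal sketch in Python 3 using simple trial division; the function names are invented for illustration, and faster methods (such as a sieve) exist for large limits.

```python
def is_divisible(n, d):
    # The remainder is zero exactly when d divides n.
    return n % d == 0

def primes_up_to(limit):
    """Return all primes <= limit by trial division (fine for small limits)."""
    primes = []
    for candidate in range(2, limit + 1):
        # Only earlier primes up to sqrt(candidate) need checking.
        if all(not is_divisible(candidate, p)
               for p in primes if p * p <= candidate):
            primes.append(candidate)
    return primes

def is_whole(x):
    # True for values such as 10 or 10.0, False for 10.5.
    return x == int(x)
```

With the modulo test there is no need to divide and then check whether the result is an integer, which also sidesteps floating-point rounding.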