[Tutor] [OT] Secure coding guidelines
Hi,

This is a little off-topic, but I thought I might put this question in.

Since I am learning Python, I was wondering if there are any good references on secure coding practices. Books, guides or even any howtos would suffice. Security seems to be almost always an afterthought rather than being ingrained into any course that I have come across, including the ones they have in college degrees.

If this question is inappropriate for this list then please let me know and accept my apologies (EAFP) ;-)

Regards,
Didar

___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] If you don't close file when writing, do bytes stay in memory?
xbmuncher wrote:

> Which piece of code will conserve more memory? I think that code #2 will because I close the file more often, thus freeing more memory by closing it. Am I right in this thinking... or does it not save me any more bytes in memory by closing the file often? Sure I realize that in my example it doesn't save much if it does... but I'm dealing with writing large files.. so every byte freed in memory counts. Thanks.
>
> CODE #1:
> def getData(): return '12345' #5 bytes
> f = open('file.ext', 'wb')
> for i in range(2000):
>     f.write(getData())
> f.close()
>
> CODE #2:
> def getData(): return '12345' #5 bytes
> f = open('file.ext', 'wb')
> for i in range(2000):
>     f.write(getData())
>     if i == 5:
>         f.close()
>         f = open('file.ext', 'ab')
>         i = 1
>     i = i + 1
> f.close()

You don't save a noticeable amount of memory by closing and immediately reopening the file. The amount that the system buffers probably wouldn't depend on file size, in any case. When dealing with large files, the thing to watch is how much of the data you've got in your own lists and dictionaries, not how much the file subsystem and OS are using.

But you have other issues in your code.

1) You don't say what version of Python you're using, so I'll assume it's version 2.x. If so, then range() is unnecessarily using a lot of memory: it builds a list of ints when an iterator would do just as well. Use xrange(). (In Python 3.x, xrange() was renamed to range().) This may not matter for small values, but as the number gets bigger, so does the amount of wastage.

2) By using the same local variable for the for loop as for your "should I close" counter, you're defeating the logic. As it stands, it'll only do the close() once. Either rename one of these, or do the simpler test:

    if i % 5 == 0:
        f.close()
        f = open('file.ext', 'ab')

3) Close and re-open has three other effects. One, it's slow. Two, append mode isn't guaranteed by the C standard to always position at the end (!). And three, it flushes the data. That can be a very useful result, in case the computer crashes while spending a long time updating a file.

I'd suggest sometimes doing a flush() call on the file, if you know you'll be spending a long time updating it. But I wouldn't bother closing it.

DaveA
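[Editor's note: Dave's suggestion of an occasional flush(), combined with his modulo-counter fix, can be sketched as follows. This is modern Python 3 syntax (bytes literals, print()); the helper name and the chunk generator are illustrative, not from the thread.]

```python
import os

def write_with_periodic_flush(path, chunks, every=1000):
    """Write each chunk to path, flushing every `every` writes."""
    with open(path, 'wb') as f:
        for i, chunk in enumerate(chunks, 1):
            f.write(chunk)
            if i % every == 0:  # modulo test, instead of reusing the loop index
                f.flush()

# 2000 writes of 5 bytes each, flushing every 500 writes
write_with_periodic_flush('file.ext', (b'12345' for _ in range(2000)), every=500)
size = os.path.getsize('file.ext')
print(size)  # 10000
os.remove('file.ext')
```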
Re: [Tutor] If you don't close file when writing, do bytes stay in memory?
What does flush do technically? "Flush the internal buffer, like stdio's fflush(). This may be a no-op on some file-like objects."

The reason I thought that closing the file after I've written about 500MB of data to it was smart was because I thought that Python stores that data in memory, or keeps info about it somehow, and only frees this memory when I close the file.

When I write to a file in 'wb' mode at 500 bytes at a time, I see that the file size changes as I continue to add more data, maybe not in exact 500-byte steps as my code logic would suggest, but it becomes bigger as I make more iterations.

Seeing this, I know that the data is definitely being written pretty immediately to the file and not being held in memory for very long. Or is it...? Does it still keep it in this "internal buffer" if I don't close the file? If it does, then flush() is exactly what I need to free the internal buffer, which is what I was trying to do when I closed the file anyway...

However, from your replies I take it that Python doesn't store this data in an internal buffer and DOES immediately dispose of the data into the file itself (of course it still exists in the variables I put it in). So closing the file doesn't free up any more memory.

On Sat, Oct 10, 2009 at 7:02 AM, Dave Angel wrote:
> [snip]
Re: [Tutor] If you don't close file when writing, do bytes stay in memory?
Oh yea, it's Python 2.6.

On Sat, Oct 10, 2009 at 10:32 AM, Xbox Muncher wrote:
> [snip]
Re: [Tutor] If you don't close file when writing, do bytes stay in memory?
2009/10/10 Xbox Muncher:
> What does flush do technically?
> [snip]

Python file I/O is buffered. That means there is a memory buffer that is used to hold a small amount of the file as it is read or written.

Your original example writes 5 bytes at a time. With unbuffered I/O, this would write to the disk on every call to write(). (The OS also has some buffering; I'm ignoring that.) With buffered writes, there is a memory buffer allocated to hold the data. The write() call just puts data into the buffer; when the buffer is full, it is written to the disk. This is a flush. Calling flush() forces the buffer to be written.

So, a few points about your questions:
- Calling flush() after each write() will cause a disk write. This is probably not what you want; it will slow down the output considerably.
- Calling flush() does not de-allocate the buffer, it just writes its contents. So calling flush() should not change the amount of memory used.
- The buffer is pretty small, maybe 8K or 32K. You can specify the buffer size as an argument to open(), but really you probably want the system default.

Kent
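[Editor's note: Kent's points can be demonstrated directly. A small sketch in modern Python 3; the exact on-disk size before the flush depends on the OS and buffer size, so only the post-flush state is certain.]

```python
import os

path = 'buffer_demo.bin'
f = open(path, 'wb', buffering=8192)  # explicit 8K buffer via open()'s argument
f.write(b'x' * 100)                   # 100 bytes: sits in Python's buffer
size_before = os.path.getsize(path)   # often still 0, the buffer hasn't filled
f.flush()                             # force the buffer out to the OS
size_after = os.path.getsize(path)    # now reflects all 100 bytes
print(size_before, size_after)
f.close()
os.remove(path)
```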
Re: [Tutor] for loop issue
2009/10/9 Oxymoron:
> On Fri, Oct 9, 2009 at 11:02 PM, Kent Johnson wrote:
>> On Fri, Oct 9, 2009 at 3:54 AM, Stefan Lesicnik wrote:
>>
>> You can easily keep track of the previous item by assigning it to a
>> variable. For example, this shows just the increasing elements of a
>> sequence:
>>
>> In [22]: items = [0, 1, 3, 2, 8, 5, 9]
>> In [23]: last = None
>> In [24]: for item in items:
>>    ....:     if last is None or item > last:
>>    ....:         print item
>>    ....:     last = item
>
> Darn... this is what happens when you're stuck on one solution
> (referring to my index-only ways in the last post) - you miss other
> obvious ways, duh * 10. Apologies for the unnecessarily complicated
> exposition on iterables above. :-(

Another way, if you just want to track two consecutive items (which probably won't help in this case, but is useful to know), is to zip together two iterations over the list, one omitting the first item, the other omitting the last:

>>> lst = range(5)
>>> for v1, v2 in zip(lst[:-1], lst[1:]):
...     print v1, v2
...
0 1
1 2
2 3
3 4

If you're concerned about memory usage (zip generates an intermediate list, twice the size of the original), you can use itertools.izip, which is the same except it doesn't generate the list: it's xrange to zip's range.

Also, rather than mucking around with xrange() and len(), I'd always recommend using enumerate. Your example of

    for i in xrange(0, len(x)-1):
        print x[i], x[i+1]

becomes

    for i, v in enumerate(x[:-1]):  # omitting last value in list to avoid IndexError
        print v, x[i+1]

I've got to say that of the two, I prefer the zip method: it looks cleaner, at least to my eyes.

--
Rich "Roadie Rich" Lovely

There are 10 types of people in the world: those who know binary, those who do not, and those who are off by one.
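[Editor's note: in modern Python 3, zip() is already lazy, so itertools.izip is gone, and itertools.pairwise (3.10+) does Rich's slicing trick without copying the list. A sketch using the documented itertools recipe so it runs on any Python 3.]

```python
from itertools import tee

def pairwise(iterable):
    """itertools recipe (and itertools.pairwise in 3.10+): s -> (s0,s1), (s1,s2), ..."""
    a, b = tee(iterable)
    next(b, None)       # advance the second iterator by one
    return zip(a, b)

lst = list(range(5))
pairs_zip = list(zip(lst, lst[1:]))   # zip is lazy in Python 3; no izip needed
pairs_pw = list(pairwise(lst))        # avoids even the lst[1:] copy
print(pairs_zip)                      # [(0, 1), (1, 2), (2, 3), (3, 4)]
print(pairs_pw == pairs_zip)          # True
```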
Re: [Tutor] If you don't close file when writing, do bytes stay in memory?
Kent Johnson wrote:
> [snip]

What Kent said. I brought up flush(), not because you should do it on every write, but because you might want to do it on a file that's open a long time, either because it's very large, or because you're doing other things while keeping the file open. A flush() pretty much assures that this portion of the file is recoverable in case of a subsequent crash.

The operating system itself is also doing some buffering. After all, the disk drive writes sectors in multiples of at least 512 bytes, so if you write 12 bytes and flush, it needs at least to know about the other 500 bytes in that sector. The strategy of this buffering varies depending on lots of things outside the control of your Python program. For example, a removable drive can be mounted either for "fast access" or for "most likely to be recoverable if removed unexpectedly." These parameters (OS-specific, and even version-specific) will do much more buffering than Python or the C runtime library will ever do.

Incidentally, just because you can see that the file has grown doesn't mean the disk drive itself has been updated. It just means that the in-memory version of the directory entries has been updated. Those are buffered as well, naturally; if they weren't, writing performance would be truly horrendous.

Anyway, don't bother closing and re-opening, unless it's to let some other process get access to the file. And use flush() judiciously, if at all, considering the tradeoffs.

Did you follow my comment about using the modulo operator to do something every nth time through a loop?

DaveA
Re: [Tutor] for loop issue
On Sun, Oct 11, 2009 at 4:07 AM, Rich Lovely wrote:
> for i, v in enumerate(x[:-1]):  # omitting last value in list to avoid IndexError
>     print v, x[i+1]

Thanks for the tip on enumerate, it escaped me. Much like Kent's simply using a temporary variable escaped me, despite having done similar things often... never reply on a tiring Friday. On the bright side, this blunder with indexes, iterators, and lengths has made me more aware of other contexts for using additional facilities (zip, enumerate). I hope the original poster learnt as much as I did from feebly attempting to answer!

> I've got to say that of the two, I prefer the zip method: it looks
> cleaner, at least to my eyes.

It's an elegant usage :-), zipping up slices of the same list to compare consecutive elems in it, hmm neat.

--
Kamal

There is more to life than increasing its speed. -- Mahatma Gandhi
Re: [Tutor] [OT] Secure coding guidelines
On Sat, Oct 10, 2009 at 4:31 AM, Didar Hossain wrote:
> Since I am learning Python, I was wondering if there are any good
> references on secure coding practices. Books, guides or even any howtos
> would suffice.

I'm not sure of any references, but I know of a few things.

First, for versions < 3.0 use raw_input (ref: http://docs.python.org/library/functions.html#raw_input ). It's a lot more secure than input().

Data validation is also a good thing. Rather than a function like this:

    def mysum(n1, n2):
        return n1 + n2

validate your data:

    def mysum(n1, n2):
        try:
            n1 = int(n1)
            n2 = int(n2)
        except ValueError:
            print "Error! Cannot convert values to int!"
            return
        return n1 + n2

Or do something similar.

HTH,
Wayne

-- To be considered stupid and to be told so is more painful than being called gluttonous, mendacious, violent, lascivious, lazy, cowardly: every weakness, every vice, has found its defenders, its rhetoric, its ennoblement and exaltation, but stupidity hasn't. - Primo Levi
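[Editor's note: the raw_input() advice is Python 2 specific. There, input() was equivalent to eval(raw_input()), so a user could type an arbitrary expression and have it executed. In Python 3, input() already returns a plain string, and validation looks like this sketch; the function name is ours.]

```python
def read_int(text, default=None):
    """Validate that text parses as an int, instead of eval-ing it."""
    try:
        return int(text.strip())
    except (ValueError, AttributeError):  # not a number, or not a string at all
        return default

print(read_int("42"))              # 42
print(read_int("os.system('x')"))  # None: rejected as data, never evaluated
```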
Re: [Tutor] [OT] Secure coding guidelines
"Wayne" wrote:
> Data validation is also a good thing:

I agree with this bit but...

> def mysum(n1, n2):
>     try:
>         n1 = int(n1)
>         n2 = int(n2)
>     except ValueError:
>         print "Error! Cannot convert values to int!"
>     return n1 + n2
>
> Or do something similar.

In a dynamic language like Python this kind of data validation - which is actually type validation - is not necessary. It would be better to do:

    def mysum(n1, n2):
        try:
            return n1 + n2
        except TypeError:
            print "Cannot add %s and %s" % (n1, n2)

One of the most powerful features of Python is that you can use "Duck Typing" to create powerful polymorphic functions like this that can add two objects, regardless of type, provided they support addition. Limiting it to integers would be a big limitation.

In Python, data validation should normally be restricted to catching invalid data *values*, not invalid data types.

HTH,

--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/
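[Editor's note: Alan's duck-typed version works unchanged on anything that supports +. A quick sketch in Python 3 syntax (only the print call changes); the explicit `return None` is ours, making the failure path visible.]

```python
def mysum(n1, n2):
    try:
        return n1 + n2          # works for any pair of objects supporting +
    except TypeError:
        print("Cannot add %r and %r" % (n1, n2))
        return None

print(mysum(2, 3))          # 5
print(mysum("ab", "cd"))    # abcd
print(mysum([1], [2, 3]))   # [1, 2, 3]
print(mysum(1, "x"))        # prints the error message, returns None
```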
Re: [Tutor] [OT] Secure coding guidelines
On Sat, Oct 10, 2009 at 5:31 AM, Didar Hossain wrote:
> Since I am learning Python, I was wondering if there are any good
> references on secure coding practices. Books, guides or even any howtos
> would suffice.

I don't know any references, but a few tips:

- Don't use eval or exec on untrusted code.
- Don't unpickle data from an untrusted source.
- Don't use string formatting to create SQL statements; use the two-argument form of execute() to pass args as a sequence.
- AFAIK there is no generally accepted, secure sandbox for running untrusted Python code (other than Google App Engine, I guess), so don't run untrusted code.

Kent
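[Editor's note: the SQL point deserves a concrete illustration. With the standard library's sqlite3 module, the two-argument form of execute() passes values separately from the statement text, so hostile input cannot change the query's structure. A sketch; the table and column names are ours.]

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")

hostile = "x'; DROP TABLE users; --"

# BAD: string formatting splices the input into the SQL text itself:
#   conn.execute("INSERT INTO users VALUES ('%s', 30)" % hostile)

# GOOD: placeholders; the driver treats the value as pure data
conn.execute("INSERT INTO users VALUES (?, ?)", (hostile, 30))

row = conn.execute("SELECT name, age FROM users").fetchone()
print(row)  # ("x'; DROP TABLE users; --", 30) -- stored verbatim, never executed
conn.close()
```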
Re: [Tutor] Python 3 and tkinter Radiobuttons
Thanks for the reply. Unfortunately, even when I include a variable, all of the buttons start out selected. I noticed that this is the case when I create a StringVar (rather than an IntVar, where the buttons start out correctly unselected). So, here's another simple example where all of the radio buttons start out incorrectly selected:

    from tkinter import *

    root = Tk()
    root.grid()
    v = StringVar()
    Radiobutton(root, text = "Test RadioButton 1", variable=v, value="1").grid(row = 0, column = 0, sticky = W)
    Radiobutton(root, text = "Test RadioButton 2", variable=v, value="2").grid(row = 1, column = 0, sticky = W)
    root.mainloop()

Any ideas on how to have a StringVar() associated with a group of Radiobutton objects where all of the radio buttons start off unselected?

--Bob

> Date: Thu, 8 Oct 2009 20:43:21 -0400
> Subject: Re: [Tutor] Python 3 and tkinter Radiobuttons
> From: ken...@tds.net
> To: bobsmith...@hotmail.com
> CC: tutor@python.org
>
> On Thu, Oct 8, 2009 at 6:04 PM, bob smith wrote:
> > Hi. I'm using Tkinter to create a new Radiobutton in Python 3. However,
> > when I create the button, it starts off looking selected instead of
> > unselected (though it behaves correctly like an unselected Radiobutton).
> > So this means when I create a group of Radiobuttons they all look
> > selected when my program begins.
>
> You have to associate the Radiobuttons with a variable, for example:
>
> from tkinter import *
>
> root = Tk()
> root.grid()
> v = IntVar()
> button = Radiobutton(root, text = "Test RadioButton", variable=v, value=1)
> button.grid(row = 0, column = 0, sticky = W)
>
> button = Radiobutton(root, text = "Test RadioButton2", variable=v, value=2)
> button.grid(row = 1, column = 0, sticky = W)
>
> root.mainloop()
>
> Kent
Re: [Tutor] Python 3 and tkinter Radiobuttons
"bob smith" wrote:
> So, here's another simple example where all of the radio buttons start
> out incorrectly selected:
>
> v = StringVar()
> Radiobutton(root, text = "Test RadioButton 1", variable=v, value="1").grid(row = 0, column = 0, sticky = W)
> Radiobutton(root, text = "Test RadioButton 2", variable=v, value="2").grid(row = 1, column = 0, sticky = W)
> root.mainloop()
>
> Any ideas on how to have a StringVar() associated with a group of
> Radiobutton objects where all of the radio buttons start off unselected?

Don't you need to assign a value to the StringVar? Otherwise how does Python know which of your buttons is supposed to be the selected option? But I'm no expert, I rarely use radio buttons...

--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/