[Tutor] fast sampling with replacement
On Sat, Feb 20, 2010 at 11:55 AM, Luke Paireepinart wrote: > > > On Sat, Feb 20, 2010 at 1:50 PM, Kent Johnson wrote: > >> On Sat, Feb 20, 2010 at 11:22 AM, Andrew Fithian >> wrote: >> > can >> > you help me speed it up even more? >> > import random >> > def sample_with_replacement(list): >> > l = len(list) # the sample needs to be as long as list >> > r = xrange(l) >> > _random = random.random >> > return [list[int(_random()*l)] for i in r] >> >> You don't have to assign to r, just call xrange() in the list comp. >> You can cache int() as you do with random.random() >> Did you try random.randint(0, l) instead of int(_random()*i) ? >> You shouldn't call your parameter 'list', it hides the builtin list >> and makes the code confusing. >> >> You might want to ask this on comp.lang.python, many more optimization >> gurus there. >> >> Also the function's rather short, it would help to just inline it (esp. > with Kent's modifications, it would basically boil down to a list > comprehension (unless you keep the local ref's to the functions), I hear the > function call overhead is rather high (depending on your usage - if your > lists are huge and you don't call the function that much it might not > matter.) > > The code is taking a list of length n and randomly sampling n items with replacement from the list and then returning the sample. I'm going to try the suggestion to inline the code before I make any of the other (good) suggested changes to the implementation. This function is being called thousands of times per execution, if the "function call overhead" is as high as you say that sounds like the place to start optimizing. Thanks guys. ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
[Tutor] Using and
Can anyone tell me if I can have my program check to see if something is true and then add to a count? For example, something like

if c_1 and c_2 and c_3 true:
    count + 1

Or can I use it after splitting an input? The problem is that the input is variable, so I don't know if I can do such a thing without knowing the input ahead of time. If anyone can tell me whether this is possible, or whether I should be doing something else, that would be great. Thanks.
Re: [Tutor] Using and
On Sun, Feb 21, 2010 at 7:44 AM, jim serson wrote: > Can anyone tell me if I can have my program check to see if something is > true the add to count > > For example something like > > if c_1 and c_2 and c_3 true: > count + 1 > > or if I can use it after splitting an input the problem is the input is > variable so I don't know if I can do such a thing without knowing the input > ahead of time. if anyone can tell me if this is possible or if I should be > doing something else that would be grate thanks. > Your explanation is a little ambiguous. Try writing out some pseudo-code of what you want to do, i.e. get input, assign to c1 assign random number to c2 get another input, assign to c3 check if c1, c2, and c3 are all non-empty strings if yes, do something otherwise do something else If you can do that, someone can probably help you. Otherwise we're just shooting in the dark, and no one likes that. HTH, Wayne -- To be considered stupid and to be told so is more painful than being called gluttonous, mendacious, violent, lascivious, lazy, cowardly: every weakness, every vice, has found its defenders, its rhetoric, its ennoblement and exaltation, but stupidity hasn’t. - Primo Levi ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Using and
On 21 February 2010 22:44, jim serson wrote:
> Can anyone tell me if I can have my program check to see if something is
> true the add to count
>
> For example something like
>
> if c_1 and c_2 and c_3 true:
>     count + 1

Your code currently throws the result of the count + 1 expression away. You probably want this instead:

    count = count + 1

> or if I can use it after splitting an input the problem is the input is
> variable so I don't know if I can do such a thing without knowing the input
> ahead of time.

I'm not really sure what you're trying to say here.

HTH,
benno
Re: [Tutor] Using and
On Sun, Feb 21, 2010 at 8:44 AM, jim serson wrote:
> Can anyone tell me if I can have my program check to see if something is
> true the add to count
>
> For example something like
>
> if c_1 and c_2 and c_3 true:
>     count + 1

This will almost work as written. Try

    if c_1 and c_2 and c_3:
        count = count + 1  # or count += 1

Kent
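Kent's fix covers a fixed set of names. For the part of the question about input whose length is only known after splitting, one possible sketch uses the built-in all(). It is written for a current Python 3, and the helper name count_if_all_true and the sample input are made up for illustration, not taken from the thread:

```python
def count_if_all_true(values, count=0):
    """Return count + 1 if values is non-empty and every item is truthy."""
    # all([]) is True, so an empty input is rejected explicitly.
    if values and all(values):
        return count + 1
    return count


fields = "red green blue".split()                  # three non-empty strings
count = count_if_all_true(fields)                  # every field is truthy
count = count_if_all_true(["a", "", "c"], count)   # "" is falsy, count unchanged
print(count)
```

Whether an empty input should count is a design choice; here it is assumed that it should not.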
Re: [Tutor] fast sampling with replacement
Luke Paireepinart wrote: Can you explain what your function is doing and also post some test code to profile it? On Sat, Feb 20, 2010 at 10:22 AM, Andrew Fithian wrote: Hi tutor, I'm have a statistical bootstrapping script that is bottlenecking on a python function sample_with_replacement(). I wrote this function myself because I couldn't find a similar function in python's random library. This is the fastest version of the function I could come up with (I used cProfile.run() to time every version I wrote) but it's not fast enough, can you help me speed it up even more? import random def sample_with_replacement(list): l = len(list) # the sample needs to be as long as list r = xrange(l) _random = random.random return [list[int(_random()*l)] for i in r] # using list[int(_random()*l)] is faster than random.choice(list) FWIW, my bootstrapping script is spending roughly half of the run time in sample_with_replacement() much more than any other function or method. Thanks in advance for any advice you can give me. -Drew ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor list and l are poor names for locals. The former because it's a built-in type, and the latter because it looks too much like a 1. You don't say how big these lists are, but I'll assume they're large enough that the extra time spent creating the 'l' and 'r' variables is irrelevant. I suspect you could gain some speed by using random.randrange instead of multiplying random.random by the length. And depending on how the caller is using the data, you might gain some by returning a generator expression instead of a list. Certainly you could reduce the memory footprint. I wonder why you assume the output list has to be the same size as the input list. Since you're sampling with replacement, you're not using the whole list anyway. So I'd have defined the function to take a second argument, the length of desired array. 
And if you could accept a generator instead of a list, you don't care how long it is, so let it be infinite. (untested) def sample(mylist): mylistlen = len(mylist) randrange = random.randrange while True: yield mylist[ randrange(0, mylistlen)] DaveA ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
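DaveA's two suggestions (an explicit sample size and an endless generator) can be combined. The following is only a sketch for a current Python 3; the names sample_forever and sample_with_replacement and the default for k are assumptions. Note that random.randrange(n) never returns n, whereas random.randint(0, n) can, which would index past the end of the list.

```python
import random
from itertools import islice


def sample_forever(population):
    """Yield random items from population, with replacement, without end."""
    size = len(population)
    randrange = random.randrange      # local alias, as suggested in the thread
    while True:
        yield population[randrange(size)]


def sample_with_replacement(population, k=None):
    """Return a list of k random draws; k defaults to len(population)."""
    if k is None:
        k = len(population)
    return list(islice(sample_forever(population), k))


data = [1, 2, 4, 8, 16]
draws = sample_with_replacement(data, 3)
print(draws)
```

A caller that only needs to loop over the draws can use islice(sample_forever(data), k) directly and skip building the list.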
[Tutor] Functions returning multiple values
Hi,

do you know if there is a way to get multiple values from a function?

For example:

def count(a,b):
    c = a + b
    d = a - b

How can I return the values of c and d?

Then I have another question: some time ago I read this guide, http://hetland.org/writing/instant-python.html, skipping the object-related part. Now I've started reading that part and have found something strange: just go where it says "Of course, now you know there is a better way. And why don’t we give it the default value of [] in the first place? Because of the way Python works, this would give all the Baskets the same empty list as default contents.". Can you please help me understand this part?

Thank you

Giorgio
-- 
-- AnotherNetFellow Email: anothernetfel...@gmail.com
Re: [Tutor] Functions returning multiple values
On Mon, 22 Feb 2010 03:00:32 am Giorgio wrote: > Hi, > > do you know if there is a way so that i can get multiple values from > a function? > > For example: > > def count(a,b): > c = a + b > d = a - b > > How can I return the value of C and D? Return a tuple of c and d: >>> def count(a, b): ... c = a + b ... d = a - b ... return c, d ... >>> t = count(15, 11) >>> t (26, 4) You can also unpack the tuple immediately: >>> x, y = count(15, 11) >>> x 26 >>> y 4 > Then, i have another question: i've read, some time ago, this guide > http://hetland.org/writing/instant-python.html, skipping the > object-related part. Now i've started reading it, and have found > something strange: just go where it says "Of course, now you know > there is a better way. And why don’t we give it the default value of > [] in the first place? Because of the way Python works, this would > give all the Baskets the same empty list as default contents.". Can > you please help me understanding this part? When you declare a default value in a function like this: def f(a, b, c=SOMETHING): the expression SOMETHING is calculated once, when the function is defined, and *not* each time you call the function. So if I do this: x = 1 y = 2 def f(a, b, c=x+y): return a+b+c the default value for c is calculated once, and stored inside the function: >>> f(0, 0) 3 Even if I change x or y: >>> x = >>> f(0, 0) 3 So if I use a list as a default value (or a dict), the default is calculated once and stored in the function. You can see it by looking at the function's defaults: >>> def f(alist=[]): ... alist.append(1) ... return alist >>> >>> f.func_defaults[0] [] Now, call the function without an argument: >>> f() [1] >>> f() [1, 1] >>> f() [1, 1, 1] >>> f.func_defaults[0] [1, 1, 1] How is this happening? Because every time you call the function, it appends 1 to the argument. If you don't supply an argument, it appends 1 to the default, changing it in place. Why doesn't the same thing happen here? >>> def g(x=0): ... 
x += 1 ... return x ... >>> g.func_defaults[0] 0 >>> g() 1 >>> g() 1 >>> g.func_defaults[0] 0 The answer is that ints are immutable: you can't change their value. When you do x+=1, it doesn't modify the int 0 in place, turning it into 1. It leaves 0 as zero, and gives you a new int equal to one. So the default value stored in the function never changes. The difference boils down to immutable objects, which can't be changed in place, and mutable objects, which can. Immutable: ints, floats, strings, tuples, frozensets Mutable: lists, dicts, sets, most custom classes -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
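The usual way around the shared-default problem is to use None as the default and build the list inside the function. This is a minimal sketch for a current Python 3 (where func_defaults is spelled __defaults__); the Basket class below is a simplified stand-in, not the code from the guide:

```python
class Basket:
    def __init__(self, contents=None):
        # None is immutable, so it is safe as a default; a fresh list is
        # created on each call instead of once at definition time.
        if contents is None:
            contents = []
        self.contents = contents

    def add(self, item):
        self.contents.append(item)


a = Basket()
b = Basket()
a.add("apple")
print(a.contents)   # ['apple']
print(b.contents)   # []
```

Each Basket created without an argument gets its own list, so adding to one no longer shows up in the other.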
[Tutor] Using Python with a Mac
Hi everyone, I would like to know how to use python with a mac. For now, I go to spotlight, open terminal then type IDLE and a window pops up but its like the window that opens when you run your programs already saved and I'm not able to open another window to write a script from scratch. Could someone help me please please I have the latest Macbook Pro so 2,88ghz 15 inches screen. Thank you in advance Marchoes ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Using Python with a Mac
Macs have Python 2.5 and 2.6 installed by default. If you use vi, you are all set. If you want an IDE, you can try the PyDev plugin for Eclipse; I have heard good things about it. You can also try bpython (a fancy Python interpreter), which you can install via MacPorts.

~l0nwlf

On Sun, Feb 21, 2010 at 10:36 PM, Marco Rompré wrote:
> Hi everyone, I would like to know how to use python with a mac.
>
> For now, I go to spotlight, open terminal then type IDLE and a window pops
> up but its like the window that opens when you run your programs already
> saved and I'm not able to open another window to write a script from
> scratch.
>
> Could someone help me please please
>
> I have the latest Macbook Pro so 2,88ghz 15 inches screen.
>
> Thank you in advance
>
> Marchoes
[Tutor] Regex to find files ending with one of a given set of extensions
Hi all I'm trying use regex to match image formats: import re def findImageFiles(): imageRx = re.compile('\.jpe?g$|\.png$|\.gif$|\.tiff?$', re.I) someFiles = ["sdfinsf.png","dsiasd.dgf","wecn.GIF","iewijiefi.jPg","iasjasd.py"] findImages = imageRx(someFiles) print "START: %s" %(findImages.start()) print "GROUP: %s" %(findImages.group()) def main(): findImageFiles() if __name__ == "__main__": main() here's the traceback: $ python test.py Traceback (most recent call last): File "test.py", line 25, in main() File "test.py", line 21, in main findImageFiles() File "test.py", line 14, in findImageFiles findImages = imageRx(someFiles) TypeError: '_sre.SRE_Pattern' object is not callable i'm new with regexing, please help. Thanks Dayo ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] fast sampling with replacement
On Sun, 21 Feb 2010 03:22:19 am Andrew Fithian wrote: > Hi tutor, > > I'm have a statistical bootstrapping script that is bottlenecking on > a python function sample_with_replacement(). I wrote this function > myself because I couldn't find a similar function in python's random > library. random.choice(list) does sample with replacement. If you need more than one sample, call it in a loop or a list comprehension: >>> mylist = [1, 2, 4, 8, 16, 32, 64, 128] >>> choice = random.choice >>> values = [choice(mylist) for _ in xrange(4)] >>> values [64, 32, 4, 4] > This is the fastest version of the function I could come up > with (I used cProfile.run() to time every version I wrote) but it's > not fast enough, can you help me speed it up even more? Profiling doesn't tell you how *fast* something is, but how much time it is using. If something takes a long time to run, perhaps that's because you are calling it lots of times. If that's the case, you should try to find a way to call it less often. E.g. instead of taking a million samples, can you get by with only a thousand? Do you need to prepare all the samples ahead of time, or do them only when needed? > import random > def sample_with_replacement(list): > l = len(list) # the sample needs to be as long as list > r = xrange(l) > _random = random.random > return [list[int(_random()*l)] for i in r] # using > list[int(_random()*l)] is faster than random.choice(list) Well, maybe... it's a near thing. I won't bother showing my timing code (hint: use the timeit module) unless you ask, but in my tests, I don't get a clear consistent winner between random.choice and your version. Your version is a little bit faster about 2 out of 3 trials, but slower 1 out of 3. 
This isn't surprising if you look at the random.choice function: def choice(self, seq): return seq[int(self.random() * len(seq))] It is essentially identical to your version, the major difference being you use the same algorithm directly in a list comprehension instead of calling it as a function. So I would expect any difference to be small. I would say that you aren't going to speed up the sampling algorithm in pure Python. This leaves you with some options: (1) Live with the speed as it is. Are you sure it's too slow? (2) Re-write the random number generator in C. (3) Find a way to use fewer samples. (4) Use a lower-quality random number generator which is faster. > FWIW, my bootstrapping script is spending roughly half of the run > time in sample_with_replacement() much more than any other function > or method. Thanks in advance for any advice you can give me. You don't tell us whether that actually matters. Is it taking 3 hours in a 6 hour run time? If so, then it is worth spending time and effort to cut that down by a few hours. Or is it taking 0.2 seconds out of 0.4 seconds? If that's the case, then who cares? :) -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
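For anyone who wants to repeat the comparison, here is one way the timing could be set up with timeit. It is a sketch for a current Python 3, not the timing code referred to above; the function names, list size, and repeat count are arbitrary assumptions. random.choices (available since Python 3.6) draws with replacement in a single call and is included for comparison.

```python
import random
import timeit


def sample_inline(population):
    """The approach from the thread: index with int(random() * n)."""
    n = len(population)
    _random = random.random
    return [population[int(_random() * n)] for _ in range(n)]


def sample_choice(population):
    """The same kind of draw, made through random.choice."""
    choice = random.choice
    return [choice(population) for _ in population]


def sample_choices(population):
    """One call to random.choices with k equal to the population size."""
    return random.choices(population, k=len(population))


data = list(range(1000))
for func in (sample_inline, sample_choice, sample_choices):
    seconds = timeit.timeit(lambda: func(data), number=200)
    print("%-15s %.4f s" % (func.__name__, seconds))
```

The numbers vary from run to run and between Python versions, so it is worth repeating the measurement a few times before drawing conclusions.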
Re: [Tutor] Regex to find files ending with one of a given set of extensions
On Mon, 22 Feb 2010 04:23:04 am Dayo Adewunmi wrote:
> Hi all
>
> I'm trying use regex to match image formats:

Perhaps you should use a simpler way.

    import os

    def isimagefile(filename):
        ext = os.path.splitext(filename)[1]
        return (ext.lower() in
                ('.jpg', '.jpeg', '.gif', '.png', '.tif', '.tiff'))

    def findImageFiles():
        someFiles = [
            "sdfinsf.png", "dsiasd.dgf", "wecn.GIF", "iewijiefi.jPg", "iasjasd.py"]
        return filter(isimagefile, someFiles)

> $ python test.py
> Traceback (most recent call last):
>   File "test.py", line 25, in <module>
>     main()
>   File "test.py", line 21, in main
>     findImageFiles()
>   File "test.py", line 14, in findImageFiles
>     findImages = imageRx(someFiles)
> TypeError: '_sre.SRE_Pattern' object is not callable

The error is the line

    findImages = imageRx(someFiles)

You don't call regexes, you have to use the match or search methods. And you can't call it on a list of file names, you have to call it on each file name separately.

    # untested
    for filename in someFiles:
        mo = imageRx.search(filename)
        if mo is None:
            # no match
            pass
        else:
            print filename

-- 
Steven D'Aprano
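If the regex route is still preferred, the per-filename search described above could look like the following. This is a sketch for a current Python 3; the pattern is a condensed form of the one in the question, and the names IMAGE_RX and find_image_files are made up here.

```python
import re

# One alternation group, anchored at the end of the name, case-insensitive.
IMAGE_RX = re.compile(r'\.(jpe?g|png|gif|tiff?)$', re.IGNORECASE)


def find_image_files(filenames):
    """Return the names whose extension matches IMAGE_RX."""
    return [name for name in filenames if IMAGE_RX.search(name)]


some_files = ["sdfinsf.png", "dsiasd.dgf", "wecn.GIF", "iewijiefi.jPg", "iasjasd.py"]
images = find_image_files(some_files)
print(images)   # ['sdfinsf.png', 'wecn.GIF', 'iewijiefi.jPg']
```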
[Tutor] Python file escaping issue?
Hi all, I'm trying to read a file (Python 2.5.2, Windows XP) as follows: assignment_file = open('C:\Documents and Settings\coderoid\My Documents\Downloads\code_sample.txt', 'r+').readlines() new_file = open(new_file.txt, 'w+') for line in assignment_file: new_file.write(line) new_file.close() assignment_file.close() When the code runs, the file path has the slashes converted to double slashes. When try to escape them, i just seemto add more slashes. What am i missing? -- Regards, Sithembewena Lloyd Dube http://www.lloyddube.com ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Using Python with a Mac
On Sun, Feb 21, 2010 at 10:36 PM, Marco Rompré wrote: > Hi everyone, I would like to know how to use python with a mac. > > For now, I go to spotlight, open terminal then type IDLE and a window pops > up but its like the window that opens when you run your programs already > saved and I'm not able to open another window to write a script from > scratch. > By default IDLE seems to open up a python interpreter window (regardless of platform). If you hit Apple-N to open a 'New' window, or select the equivalent command from the menu, it will open a new window for entering a script; from there you can perform various actions (save, run, debug, etc.) If you click on 'Python' to the left of 'File' and select 'Preferences' and then pick the 'General' tab you can opt to have IDLE open up with an editor window instead of an interpreter window the next time you start. HTH, Monte ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Python file escaping issue?
On Sun, Feb 21, 2010 at 12:22 PM, Sithembewena Lloyd Dube wrote: > Hi all, > > I'm trying to read a file (Python 2.5.2, Windows XP) as follows: > > assignment_file = open('C:\Documents and Settings\coderoid\My > Documents\Downloads\code_sample.txt', 'r+').readlines() > new_file = open(new_file.txt, 'w+') > for line in assignment_file: > new_file.write(line) > > new_file.close() > assignment_file.close() > > When the code runs, the file path has the slashes converted to double > slashes. When try to escape them, i just seemto add more slashes. What am i > missing? try using the r to declare it as a raw string: >>> filename = r'C:\Documents and Settings\coderoid\otherstuff' >>> filename 'C:\\Documents and Settings\\coderoid\\otherstuff' that should work. HTH, Wayne -- To be considered stupid and to be told so is more painful than being called gluttonous, mendacious, violent, lascivious, lazy, cowardly: every weakness, every vice, has found its defenders, its rhetoric, its ennoblement and exaltation, but stupidity hasn’t. - Primo Levi ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Python file escaping issue?
@Wayne, sorry to have replied to you directly.. On Sun, Feb 21, 2010 at 9:23 PM, Sithembewena Lloyd Dube wrote: > Solved by moving the file just under C:\ > > Must be an issue with directory name spaces. > > By the way, my code was riddled with bugs. readlines() returns a list > object, which does not have a close method - and i had not placed quotes > around the new_file argument. > > > > > On Sun, Feb 21, 2010 at 9:01 PM, Sithembewena Lloyd Dube < > zebr...@gmail.com> wrote: > >> Hi Wayne, >> >> Thank you for responding. I did try declaring it as a raw string, and what >> i get is pretty much the same result (slashes in the path become double >> slashes, so naturally the file is not found). >> >> I did try googling around for how to handle/ escape file paths, but i >> cannot seem to dig up anything useful. >> >> Regards, >> Sithembewena >> >> >> On Sun, Feb 21, 2010 at 8:41 PM, Wayne Werner wrote: >> >>> On Sun, Feb 21, 2010 at 12:22 PM, Sithembewena Lloyd Dube < >>> zebr...@gmail.com> wrote: >>> Hi all, I'm trying to read a file (Python 2.5.2, Windows XP) as follows: assignment_file = open('C:\Documents and Settings\coderoid\My Documents\Downloads\code_sample.txt', 'r+').readlines() new_file = open(new_file.txt, 'w+') for line in assignment_file: new_file.write(line) new_file.close() assignment_file.close() When the code runs, the file path has the slashes converted to double slashes. When try to escape them, i just seemto add more slashes. What am i missing? >>> >>> >>> try using the r to declare it as a raw string: >>> >>> filename = r'C:\Documents and Settings\coderoid\otherstuff' >>> >>> filename >>> 'C:\\Documents and Settings\\coderoid\\otherstuff' >>> >>> that should work. 
>>> >>> HTH, >>> Wayne >>> >>> >>> -- >>> To be considered stupid and to be told so is more painful than being >>> called gluttonous, mendacious, violent, lascivious, lazy, cowardly: every >>> weakness, every vice, has found its defenders, its rhetoric, its ennoblement >>> and exaltation, but stupidity hasn’t. - Primo Levi >>> >> >> >> >> -- >> Regards, >> Sithembewena Lloyd Dube >> http://www.lloyddube.com >> > > > > -- > Regards, > Sithembewena Lloyd Dube > http://www.lloyddube.com > -- Regards, Sithembewena Lloyd Dube http://www.lloyddube.com ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
[Tutor] working with email module
I subscribe to an email list that distributes image files. I'd like to automate the process of saving the images to individual files. I've got it mostly figured out with two exceptions.

1) Sometimes msg.filename returns 'None' even though msg.get_content_type returns 'image/jpeg' and the actual message (looking at it with a file viewer) *does* have a filename parameter. When I use the 'v' command in 'mutt' the filename shows so why does python fail to find it?

2) Is it possible to 'detach' a single subpart of a message? The docs show msg.attach() to create an email but I can't find anything that would allow deleting an attachment. I know how to delete an entire message from a mailbox.

-- 
"There is no medicine like hope, no incentive so great, and no tonic so powerful as expectation of something tomorrow." -- Orison Marden
Rick Pasotto r...@niof.net http://www.niof.net
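A sketch of how both points might be approached with the standard email package, written for a current Python 3. The message is built in the script so the example is self-contained, and the helper names image_parts and drop_images are made up. Message.get_filename() reads the filename parameter of Content-Disposition and falls back to the name parameter of Content-Type; whether that explains the None seen above depends on the actual messages, which are not shown.

```python
import email
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Build a small stand-in message (an assumption, not a real list post).
outer = MIMEMultipart()
outer["Subject"] = "picture of the day"
outer.attach(MIMEText("See attachment."))
image = MIMEImage(b"not really a jpeg", _subtype="jpeg")
image.add_header("Content-Disposition", "attachment", filename="photo.jpg")
outer.attach(image)

msg = email.message_from_string(outer.as_string())


def image_parts(message):
    """Return (filename, bytes) for every image/* part; filename may be None."""
    found = []
    for part in message.walk():
        if part.get_content_maintype() == "image":
            found.append((part.get_filename(), part.get_payload(decode=True)))
    return found


def drop_images(message):
    """Remove the top-level image parts of a multipart message, in place."""
    if message.is_multipart():
        kept = [p for p in message.get_payload()
                if p.get_content_maintype() != "image"]
        message.set_payload(kept)


saved = image_parts(msg)
drop_images(msg)
print([name for name, _ in saved], len(msg.get_payload()))
```

drop_images only looks at the top level; images nested inside further multipart containers would need a recursive version.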
Re: [Tutor] Superclass call problem
Hi, I am having trouble understanding how superclass calls work. Here's some code... What version of Python are you using? In Python 2.x, you MUST inherit from object to use super, and you MUST explicitly pass the class and self: class ParentClass(object): def __init__(self, a, b, c): do something here class ChildClass(ParentClass): def __init__(self, a, b, c): super(ChildClass, self).__init__(a, b, c) In Python 3.x, all classes inherit from object and you no longer need to explicitly say so, and super becomes a bit smarter about where it is called from: # Python 3 only class ParentClass: def __init__(self, a, b, c): do something here class ChildClass(ParentClass): def __init__(self, a, b, c): super().__init__(a, b, c) I assume you are using Python 3.0 or 3.1. (If you're 3.0, you should upgrade to 3.1: 3.0 is s-l-o-w and no longer supported.) Your mistake was to pass self as an explicit argument to __init__. This is not needed, because Python methods automatically get passed self: def __init__(self): super().__init__(self) That has the effect of passing self *twice*, when __init__ expects to get self *once*, hence the error message you see: When the super().__init__ line runs I get the error "__init__() takes exactly 1 positional argument (2 given)" Hi Steven, thanks for the reply. Fortunately I am using Python 3.1, so I can use the super().__init__(a, b, c) syntax. Regards, Alan ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
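A runnable version of the Python 3 form, as a minimal sketch. The method bodies are placeholders standing in for "do something here", and the attribute names total and label are made up:

```python
class ParentClass:
    def __init__(self, a, b, c):
        self.total = a + b + c        # placeholder body (an assumption)


class ChildClass(ParentClass):
    def __init__(self, a, b, c):
        # self is supplied automatically; pass only the remaining arguments.
        super().__init__(a, b, c)
        self.label = "child"


child = ChildClass(1, 2, 3)
print(child.total)   # 6
```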
Re: [Tutor] Python file escaping issue?
On Mon, 22 Feb 2010 05:22:10 am Sithembewena Lloyd Dube wrote:
> Hi all,
>
> I'm trying to read a file (Python 2.5.2, Windows XP) as follows:
>
> assignment_file = open('C:\Documents and Settings\coderoid\My
> Documents\Downloads\code_sample.txt', 'r+').readlines()
> new_file = open(new_file.txt, 'w+')
> for line in assignment_file:
>     new_file.write(line)
>
> new_file.close()
> assignment_file.close()
>
> When the code runs, the file path has the slashes converted to double
> slashes. When try to escape them, i just seemto add more slashes.
> What am i missing?

An understanding of how backslash escapes work in Python.

Backslashes in string literals (but not in text you read from a file, say) are used to inject special characters into the string, just like C and other languages do. These backslash escapes include:

    \t  tab
    \n  newline
    \f  formfeed
    \\  backslash

and many others. Any other non-special backslash is left alone.

So when you write a string literal including backslashes and a special character, you get this:

    >>> s = 'abc\tz'  # tab
    >>> print s
    abc     z
    >>> print repr(s)
    'abc\tz'
    >>> len(s)
    5

But if the escape is not a special character:

    >>> s = 'abc\dz'  # nothing special
    >>> print s
    abc\dz
    >>> print repr(s)
    'abc\\dz'
    >>> len(s)
    6

The double backslash is part of the *display* of the string, like the quotation marks, and not part of the string itself. The string itself only has a single backslash and no quote marks.

So if you write a pathname like this:

    >>> path = 'C:\datafile.txt'
    >>> print path
    C:\datafile.txt
    >>> len(path)
    15

It *seems* to work, because \d is left as backslash-d. But then you do this, and wonder why you can't open the file:

    >>> path = 'C:\textfile.txt'
    >>> print path
    C:      extfile.txt
    >>> len(path)
    14

Some people recommend using raw strings. Raw strings turn off backslash processing, so you can do this:

    >>> path = r'C:\textfile.txt'
    >>> print path
    C:\textfile.txt

But raw strings were invented for the regular expression module, not for Windows pathnames, and they have a major limitation: you can't end a raw string with a backslash.

    >>> path = r'C:\directory\'
      File "<stdin>", line 1
        path = r'C:\directory\'
                              ^
    SyntaxError: EOL while scanning single-quoted string

The best advice is to remember that Windows allows both forward and backwards slashes as the path separator, and just write all your paths using the forward slash:

    'C:/directory/'
    'C:/textfile.txt'

-- 
Steven D'Aprano
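The spellings discussed above can be compared side by side. This sketch is for a current Python 3 and uses the ntpath module (the Windows flavour of os.path, importable on any platform) so that it behaves the same everywhere. Unknown escapes such as \d are deliberately left out, because recent Python versions warn about them.

```python
import ntpath  # Windows path rules, regardless of the platform running this

plain = 'C:\textfile.txt'       # \t is turned into a tab character
raw = r'C:\textfile.txt'        # raw string: the backslash is kept
doubled = 'C:\\textfile.txt'    # escaped backslash: same value as raw
forward = 'C:/textfile.txt'     # forward slash: nothing to escape

print(len(plain), len(raw))               # 14 15
print(raw == doubled)                     # True
print(ntpath.normpath(forward) == raw)    # True

# Building the path from pieces avoids typing separators at all.
joined = ntpath.join('C:\\', 'directory', 'textfile.txt')
print(joined)                             # C:\directory\textfile.txt
```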
[Tutor] "Two" Card Monty with File Operations--Reading from Wrong Folder (W7?)
I have a program called TrackStudy.py and another called ReportTool.py. Track runs above an Events folder that contains txt files that it examines. Report runs in an Events folder on the same txt files. Neither is operated when the other is operating. Both only read the same files.

I've been operating Report in another Events folder at another folder level. Both work fine that way. I decided to copy Report into the Events folder below Track. If I run Track, it sees every txt file in Events. However, if I run Report, it refers back to the other Events folder, and works fine even though on the wrong set of files.

I've pretty much assumed, by observation, that these programs read the txt files in alphabetical order. It still seems the case. There is one difference in Track on this. It uses:

    paths = glob.glob(final_string)

to locate files. They always occur in the proper low to high sequence. Report produces the wrong files in alpha order.

The question is: why does Report look in the wrong folder? Although I think I've verified matters, I could be off. Is there a way to ensure I'm really getting to the right folder? There may be a Win7 problem here. See below.

Here's a diagram that might help. Cap names means folder.

BINGO EVENTS a1.txt a2.txt report.py CARDS track.py EVENTS b1.txt b2.txt b3.txt

Now copy report.py to CARDS

BINGO EVENTS a1.txt a2.txt report.py CARDS track.py EVENTS b1.txt b2.txt b3.txt report.py

While working on this problem, I came up with a Win7 puzzler. It amounts to this. If I search for files in EVENTS of CARDS for "b", it only finds one of the b-files. I had a long 1 hour talk with HP tech support. They had no answer, but will take it up with MS. It may be related to the first problem. Probably not, but curious. I suspect that Win7's new folder search has somehow used a filter. I had no idea of any filter use until I accidentally found it in W7 Help. Perhaps the filter was somehow set. Not by me.

Here's a stab in the dark.
Maybe the copied report.py really is a pointer to the other one?

-- 
"There is nothing so annoying as to have two people talking when you're busy interrupting." -- Mark Twain
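One way to check which folder a script is really reading is to build the search path from the script's own location rather than from the current working directory, and to print both. This is only a sketch for a current Python 3; the function name event_files and its parameters are made up, and it assumes the txt files sit next to the script (or in a named subfolder of it).

```python
import glob
import os
import sys


def event_files(script_path, folder="", pattern="*.txt"):
    """List files matching pattern in folder, resolved next to script_path."""
    base = os.path.dirname(os.path.abspath(script_path))
    search = os.path.join(base, folder, pattern)
    return sorted(glob.glob(search))     # sorted: glob gives no fixed order


if __name__ == "__main__":
    # A relative glob pattern is resolved against the working directory,
    # which is not necessarily the folder the script lives in.
    print("working directory:", os.getcwd())
    for path in event_files(sys.argv[0]):
        print(path)
```

If the working directory printed here is the old Events folder, a relative pattern would explain why the copied script still reads the old files.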
[Tutor] Extracting comments from a file
Hi,

I have an html file with XML-style comments in it (<!-- ... -->). I'd like to extract only the comments. My sense of smell suggests that there's probably a library (maybe an xml library) that does this already.

Otherwise, my current algorithm looks a bit like this:

* Iterate over file
* If current line contains <!--
  - Toggle 'is_comment' to yes
* If current line contains -->
  - Toggle 'is_comment' to no

This feels crude, but is it effective, or ok?

Thanks,
Laomao
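The standard library does have a parser that reports comments. A minimal sketch for a current Python 3, where the module is html.parser (in Python 2 it was called HTMLParser); the class name CommentCollector and the sample document are made up:

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Collect the text of every <!-- ... --> comment fed to the parser."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # data is the text between the comment delimiters.
        self.comments.append(data.strip())


html = """<html><body>
<!-- first comment -->
<p>text</p>
<!-- a comment
     over two lines -->
</body></html>"""

collector = CommentCollector()
collector.feed(html)
collector.close()
print(collector.comments)
```

Unlike a line-by-line toggle, this copes with comments that start and end on the same line or span several lines.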
Re: [Tutor] Python file escaping issue?
Just a little complement to Steven's excellent explanation: On Mon, 22 Feb 2010 10:01:06 +1100 Steven D'Aprano wrote: [...] > So if you write a pathname like this: > > >>> path = 'C:\datafile.txt' > >>> print path > C:\datafile.txt > >>> len(path) > 15 > > It *seems* to work, because \d is left as backlash-d. But then you do > this, and wonder why you can't open the file: I consider this misleading, since it can only confuse newcomers. Maybe "lonely" single backslashes (not forming a "code" with following character(s)) should be invalid. Meaning literal backslashes would always be doubled (in plain, non-raw, strings). What do you think? > But if the escape is not a special character: > > >>> s = 'abc\dz' # nothing special > >>> print s > abc\dz > >>> print repr(s) > 'abc\\dz' > >>> len(s) > 6 > > The double backslash is part of the *display* of the string, like the > quotation marks, and not part of the string itself. The string itself > only has a single backslash and no quote marks. This "display" is commonly called "representation", thus the name of the function repr(). It is a string representation *for the programmer* only, both on input and output: * to allow one writing, in code itself, string literal constants containing special characters, in a practical manner (eg file pathes/names) * to allow one checking the actual content of string values, at testing time The so-called interactive interpreter outputs representations by default. An extreme case: >>> s = "\\" >>> s '\\' >>> print s, len(s) \ 1 >>> print repr(s), len(repr(s)) '\\' 4 >>> The string holds 1 char; its representation (also a string, indeed) holds 4. > The best advice is to remember that Windows allows both forward and > backwards slashes as the path separator, and just write all your paths > using the forward slash: > > 'C:/directory/' > 'C:textfile.txt' Another solution is to take the habit to always escape '\' by doubling it. 
Denis

la vita e estrany
http://spir.wikidot.com/