Sorry for not providing all the required info. I am running Python 3.2 on Windows Vista. I determined that the files are double spaced by viewing a random sample of them in Notepad++. They are not double spaced on the server; I checked by downloading one in the browser. Can I use the 'Wu' flag when writing? I might just be 'w'-ing. I will look when I get home.
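
A minimal sketch of what plain 'w' may be doing here, assuming the downloaded text already contains CRLF line endings. In Python 3, text-mode writes translate each "\n" into the platform line separator unless open() is given a newline argument. The sample string and file names below are made up for illustration, and newline="\r\n" stands in for the Windows default so the effect shows on any platform:

```python
import os
import tempfile


def write_text(path, text, newline=None):
    """Write text to path, passing `newline` straight through to open()."""
    with open(path, mode="w", encoding="utf-8", newline=newline) as f:
        f.write(text)


def read_raw(path):
    """Return the exact bytes on disk, with no newline translation."""
    with open(path, "rb") as f:
        return f.read()


# Hypothetical sample: decoded download data that already uses CRLF endings.
data = "line one\r\nline two\r\n"

with tempfile.TemporaryDirectory() as tmp:
    translated = os.path.join(tmp, "translated.txt")
    untouched = os.path.join(tmp, "untouched.txt")

    # newline="\r\n" mimics the Windows default: every "\n" written becomes "\r\n".
    write_text(translated, data, newline="\r\n")
    # newline="" switches translation off, so the text is written as-is.
    write_text(untouched, data, newline="")

    translated_bytes = read_raw(translated)
    untouched_bytes = read_raw(untouched)

print(repr(translated_bytes))
print(repr(untouched_bytes))
```

If the first result shows "\r\r\n", an editor that treats a lone "\r" as a line break would display the file double spaced. Passing newline="" when writing (or writing the undecoded bytes in 'wb' mode) would avoid the extra "\r" under these assumptions.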
Thanks

Mark

Also a little bummed that the subprocess module doesn't appear to work on Windows. I probably (hopefully) won't need it, but it still bums me.

On Nov 28, 2011 4:27 AM, "Dave Angel" <d...@davea.name> wrote:

> On 11/28/2011 04:28 AM, Mark Lybrand wrote:
>
>> Okay, so I just started to learn Python. I have been working through Dive
>> Into Python 3 and the Google stuff (great exercises IMHO, totally fun).
>> However, with Dive, I had an issue with him referencing the files in the
>> example directory, which from the website seem very unhandy. Although I
>> have since stumbled upon his GitHub, I made a Python script to grab those
>> files for me and it works great, with the exception of doubling the line
>> spacing. So here is my code. I hope you critique the heck out of me and
>> that you point out what I did wrong to introduce double line-spacing.
>> Thanks a bunch:
>>
>> import os
>> import urllib.request
>> import re
>>
>> url_root = 'http://diveintopython3.ep.io/examples/'
>> file_root = os.path.join(os.path.expanduser("~"), "diveintopython3",
>>                          "examples")
>>
>> main_page = urllib.request.urlopen(url_root).read()
>> main_page = main_page.decode("utf-8")
>>
>> pattern = 'href="([^"].*?.)(py|xml)"'
>> matches = re.findall(pattern, main_page)
>> for my_tuple in matches:
>>     this_file = my_tuple[0] + my_tuple[1]
>>     data = urllib.request.urlopen(url_root + this_file).read()
>>     data = data.decode("utf-8")
>>     with open(os.path.join(file_root, this_file), mode='w',
>>               encoding='utf-8') as a_file:
>>         a_file.write(data)
>>
> You don't tell what your environment is, nor how you decide that the
> file is double-spaced. You also don't mention whether you're using Python
> 2.x or 3.x
>
> My guess is that you are using a Unix/Linux environment, and that the Dive
> author(s) used Windows. And that your text editor is interpreting the
> cr/lf pair (hex 0d 0a) as two line-endings. I believe emacs would have
> ignored the redundant cr. Python likewise probably won't care, though I'm
> not positive about things like lines that continue across newline
> boundaries.
>
> You can figure out what is actually in the file by using repr() on bytes
> read from the file in binary mode. Exactly how you do that will differ
> between Python 2.x and 3.x
>
> As for fixing it, you could either just use one of the dos2unix utilities
> kicking around (one's available on my Ubuntu from the Synaptic package
> manager), or you could make your utility manage it. On a regular file
> open, there's a mode parameter that you can use "u", or better "ru" to say
> Universal. It's intended to handle any of the three common line endings,
> and use a simple newline for all 3 cases. I don't know whether urlopen()
> also has that option, but if not, you can always copy the file after you
> have it locally.
>
> --
>
> DaveA
>
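
The repr() check and the universal-newline read suggested in the quoted message could look roughly like this in Python 3. This is only a sketch: the sample file and its contents are hypothetical stand-ins for a downloaded example, and in Python 3 the role of the old "U" mode is played by open()'s newline parameter, whose default (None) already translates "\r\n", "\r" and "\n" to "\n" on reading:

```python
import os
import tempfile


def peek_bytes(path, count=80):
    """Return repr() of the first `count` bytes, read in binary mode."""
    with open(path, "rb") as f:
        return repr(f.read(count))


def read_universal(path):
    """Read text with universal newlines (newline=None is the default)."""
    with open(path, "r", encoding="utf-8", newline=None) as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for one of the downloaded example files.
    sample = os.path.join(tmp, "sample.py")
    with open(sample, "wb") as f:
        f.write(b"import os\r\nprint(os.name)\r\n")

    raw_view = peek_bytes(sample)   # carriage returns and newlines made visible
    text = read_universal(sample)   # every line ending collapsed to a newline

print(raw_view)
print(repr(text))
```

Running peek_bytes() on one of the real downloaded files would show whether the bytes on disk contain "\r\n", "\r\r\n", or something else, which is the information the quoted reply is asking for.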
_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor