Re: [Tutor] weather scraping with Beautiful Soup
> The posts basically say go and look at the HTML and find the
> right tags for the data you need. This is fundamental to any kind of web
> scraping, you need to understand the HTML tree well enough to identify
> where your data exists.
>
> How familiar are you with HTML and its structures?

Reasonably familiar.

> Can you view the source in your browser and identify the hierarchy
> of tags to the place where your data lives?

I can view the source, and have made my own web pages (HTML, CSS). I am less sure about the hierarchy of tags. For example, here is the section around the current temperature:

West of Town, Jamestown, Pennsylvania (PWS)
Updated: 3:00 AM EDT on July 17, 2009
http://icons-pe.wxug.com/i/c/a/nt_clear.gif" width="42" height="42" alt="Clear" class="condIcon" />
60.3 °F

The 60.3 is the value I want to extract. It appears to be down within a hierarchy something like:

But I am far from sure I got all that right; it is not easy to look at HTML and match with . Unless I am missing something? Do I have to use all of the above in my Beautiful Soup?

CM

___
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] weather scraping with Beautiful Soup
Che M wrote: The 60.3 is the value I want to extract. soup.find("div",id="curcondbox").findNext("span","b").renderContents() -- "The ability of the OSS process to collect and harness the collective IQ of thousands of individuals across the Internet is simply amazing." - Vinod Valloppillil http://www.catb.org/~esr/halloween/halloween4.html ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
[Tutor] How to limit logs
Hi,

I'm writing a log client that receives logs over a socket connection and displays them in a text box. I have a few questions.

I used Pmw.ScrolledText, but I don't know its buffer size. How can I limit it?

I also want to create a log file with the logging module, but I want to limit the log size. For example, if the user sets the log file limit to 1000 lines, I have to keep the last 1000 lines on the fly and refresh the file until the program is closed. Any ideas?

Thanks...
Re: [Tutor] weather scraping with Beautiful Soup
On Thu, Jul 16, 2009 at 11:21 PM, Che M wrote:
> Hi,
>
> I am interested in gathering simple weather data using Beautiful Soup, but
> am having trouble understanding what I'm doing. I have searched the
> archives and so far haven't found enough to get me moving forward.
>
> Basically I am trying to start off this example:
>
> Grabbing Weather Underground Data with BeautifulSoup
> http://flowingdata.com/2007/07/09/grabbing-weather-underground-data-with-beautifulsoup/
>
> But I get to the exact same problem that this other person got to in this
> post:
> http://groups.google.com/group/beautifulsoup/browse_thread/thread/13eb3dbf713b8a4a
>
> Unfortunately, that post never gives enough help for me to understand how to
> solve that person's or my problem.
>
> What I want to understand is how to find the bits of data you want--in this
> case, say, today's average temperature and whether it was clear or
> cloudy--within a web page, and then indicate that to Beautiful Soup.

One thing that might help is to use the Lite page, if you are not already. It has much less formatting and extraneous information to wade through.

You might also look for a site that has weather data formatted for computer. For example the NOAA has forecast data available as plain text:
http://forecast.weather.gov/product.php?site=NWS&issuedby=BOX&product=CCF&format=txt&version=1&glossary=0

Kent
Re: [Tutor] weather scraping with Beautiful Soup
Che M wrote: > > > West of Town, Jamestown, Pennsylvania > (PWS) > Updated: pwsid="KPAJAMES1" pwsunit="english" pwsvariable="lu" value="1247814018">3:00 > AM EDT on July 17, 2009 > > > > > >src="http://icons-pe.wxug.com/i/c/a/nt_clear.gif"; width="42" height="42" > alt="Clear" class="condIcon" /> > >pwsid="KPAJAMES1" pwsunit="english" pwsvariable="tempf" english="°F" > metric="°C" value="60.3"> > 60.3 °F > > > The 60.3 is the value I want to extract. It appears to be down within a > hierarchy > something like: > > > > > > You may consider using lxml's cssselect module: from lxml import html doc = html.parse("http://some/url/to/parse.html";) spans = doc.cssselect("div.bluebox > #curcondbox span.b") print spans[0].text However, I'd rather go for the other "60.3" value using XPath: print doc.xpath('//sp...@pwsvariable="tempf"]/@value') Stefan ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] How to limit logs
On Fri, Jul 17, 2009 at 2:12 AM, barış ata wrote: > hi, > i'm writing a log client to recieve logs from socket connection and display > on a textbox > now i have a few question > i used Pmw.ScrolledText but i don't now its buffer size. how can i limit > it? > and i want to create a log file with logging module but i want to limit log > size > for example if user sets log file limit to 1000 lines, i have to save last > 1000 line on the fly and refresh it until program closed. Any idea ? I don't know if this is the "best" solution, but while the program is open if you simply truncate the file each time and just keep a list of no larger than 1000 lines and write each of those, that would work. Another option is to keep those 1000 lines and just seek to the beginning and rewrite the file each time. This could get pretty expensive as far as disk access goes, though. I don't think there's really a way in python to implement the disk access part, but a circular array (or file write) would work perfect in this situation. But with python, I'd probably just open the file in read/write mode (assuming you want to preserve the log from the previous run), seek to the beginning and write whatever is in the lines. Each time I add a line I'd check len(lines) > 1000: del(lines[0]), then write the lines and seek back to the front of the file. I don't know if there's a more efficient way to do it in python, but that's what I'd do. HTH, Wayne ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
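The "keep a list of no more than 1000 lines and rewrite the file" idea can be sketched with collections.deque, whose maxlen argument discards the oldest item automatically. This is only a minimal sketch and not something posted in the thread: the class name BoundedLog, its parameters, and the rewrite-on-every-append policy are assumptions made for illustration.

```python
from collections import deque


class BoundedLog(object):
    """Keep only the most recent max_lines lines (hypothetical helper)."""

    def __init__(self, path, max_lines=1000):
        self.path = path
        # A deque with maxlen silently drops the oldest item when it is full.
        self.lines = deque(maxlen=max_lines)

    def append(self, line):
        self.lines.append(line.rstrip("\n"))
        # Rewrite the whole file on every append: simple, but it costs
        # disk I/O, as noted in the reply above.
        with open(self.path, "w") as f:
            for kept in self.lines:
                f.write(kept + "\n")
```

If a size limit in bytes is acceptable instead of a line count, the standard library's logging.handlers.RotatingFileHandler (with its maxBytes argument) may be worth a look; it does not count lines, so it may or may not fit the requirement here.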
[Tutor] sftp get single file
Hello All. I need to use paramiko to sftp get a single file from a remote server. The remote file's base name will be today's date (%Y%m%d) dot tab. I need help joining the today with the .tab extension. Do I need globbing? example: 20090716.tab #!/usr/bin/env python import paramiko import glob import os import time hostname = 'sftp.booboo.com' port = 22 username = 'booboo' password = '07N4219?' # glob_pattern='*.tab' today = time.strftime("%Y%m%d") remotepath = today.tab localpath = '/home/data/text' if __name__ == "__main__": t = paramiko.Transport((hostname, port)) t.connect(username=username, password=password) sftp = paramiko.SFTPClient.from_transport(t) sftp.get(remotepath, localpath) t.close() -- I fear you speak upon the rack, Where men enforced do speak anything. - William Shakespeare ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] sftp get single file
2009/7/17 Matt Herzog : > Hello All. > > I need to use paramiko to sftp get a single file from a remote server. > The remote file's base name will be today's date (%Y%m%d) dot tab. > I need help joining the today with the .tab extension. Do I need globbing? > > example: 20090716.tab > > #!/usr/bin/env python > import paramiko > import glob > import os > import time > hostname = 'sftp.booboo.com' > port = 22 > username = 'booboo' > password = '07N4219?' > # glob_pattern='*.tab' > today = time.strftime("%Y%m%d") > remotepath = today.tab > localpath = '/home/data/text' > > if __name__ == "__main__": > t = paramiko.Transport((hostname, port)) > t.connect(username=username, password=password) > sftp = paramiko.SFTPClient.from_transport(t) > sftp.get(remotepath, localpath) > t.close() > -- You don't need glob if you know in advance what the filename is. Print example below. --- import time today = time.localtime() datestr = time.strftime("%Y%m%d",today) ext = ".tab" print datestr + ext --- Greets Sander ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
[Tutor] Very wierd namespace problem
I am not able to understand this; perhaps it is an aspect of Python I do not understand.

Usually I would think that for a class

>>> class a:
...     def __init__(self):
...         pass
...     def __something__(self):
...         pass
...
>>> b=a()
>>> dir(b)
['__doc__', '__init__', '__module__', '__something__']

This should be correct, so I can call one of these functions on an object of the class as obj.__something__().

Now I have created a class with some functions defined in it, but they are named in a completely different manner (or so I think):

<__main__.SmPriceWindow instance at 0x9dd448c>
['_SmPriceWindow__add_columns', '_SmPriceWindow__create_model',
'_SmPriceWindow__review_button_click', '_SmPriceWindow__show_all',
'_SmPriceWindow__url_button_click', '__doc__', '__init__',
'__module__', 'model', 'tree', 'window']
Traceback (most recent call last):
  File "smnotebook3.py", line 135, in price_button_pressed
    sw.__show_all()
AttributeError: SmPriceWindow instance has no attribute '__show_all'

Within the class itself I have referred to these functions (__add_columns, __create_model, etc.) as self.__add_columns, as would be expected.

This behaviour is unexpected to me. Since I cannot take up much space here explaining the code, I have pasted it here:
http://pastebin.com/mddf7a43

--
A-M-I-T S|S
Re: [Tutor] sftp get single file
On Fri, Jul 17, 2009 at 11:42 AM, Sander Sweers wrote: > import time > > today = time.localtime() > datestr = time.strftime("%Y%m%d",today) > ext = ".tab" > > print datestr + ext You can include literal characters in the format string: In [4]: time.strftime("%Y%m%d.tab",today) Out[4]: '20090717.tab' Kent ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
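Putting the two replies together, the remote file name can be built in a single strftime call. The sketch below is illustrative only: remote_name and its parameters are made-up names, and the "date plus .tab" naming scheme is taken from the original post rather than verified against any server.

```python
import time


def remote_name(when=None, ext=".tab"):
    """Return a name such as '20090716.tab' for the given struct_time."""
    if when is None:
        when = time.localtime()
    # Literal characters in the format string are copied through unchanged.
    return time.strftime("%Y%m%d" + ext, when)
```

The result could then be passed as the remotepath argument in the original script, for example remotepath = remote_name().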
[Tutor] != -1: versus == 1
Hi, I have been trying to understand a Python script and I keep coming across this kind of structure that says "If it is not equal to negative one":

for line in theLines:
    if line.find("Source Height") != -1:
        #etc...

Is there some special reason for this? Why not just write "If it is equal to one"?

for line in theLines:
    if line.find("Source Height") == 1:
        #etc...

Pete
Re: [Tutor] != -1: versus == 1
On Fri, Jul 17, 2009, pedro wrote:
> Hi I have been trying to understand a python script and I keep coming
> across this kind of structure
> that says "If it is not equal to negative one"
>
> for line in theLines:
>     if line.find("Source Height") != -1:
>         #etc...
>
> Is there some special reason for this. Why not just write "If it is
> equal to one"

The string find method returns -1 if the substring is not found; otherwise it returns the 0-based index in the string where the argument to find begins. The call above returns -1 if ``Source Height'' is not in line, and one generally wants the test to be True when there is something to do.

The alternative would be to say ``if not line.find('Source Height') == -1: ...''

Bill
--
INTERNET: b...@celestial.com    Bill Campbell; Celestial Software LLC
URL: http://www.celestial.com/  PO Box 820; 6641 E. Mercer Way
Voice: (206) 236-1676           Mercer Island, WA 98040-0820
Fax: (206) 232-9186             Skype: jwccsllc (206) 855-5792

Windows is a computer virus with a user interface!!
Re: [Tutor] != -1: versus == 1
2009/7/17 pedro :
> for line in theLines:
>     if line.find("Source Height") != -1:
>         #etc...
>
> Is there some special reason for this. Why not just write "If it is equal to
> one"

Yes, str.find() returns -1 on failure. See the documentation for str.find() below:

 | find(...)
 |     S.find(sub [,start [,end]]) -> int
 |
 |     Return the lowest index in S where substring sub is found,
 |     such that sub is contained within s[start:end]. Optional
 |     arguments start and end are interpreted as in slice notation.
 |
 |     Return -1 on failure.

Idle example:

>>> teststr = 'a'
>>> teststr.find('a')
0
>>> teststr.find('b')
-1

Greets
Sander
[Tutor] Form values
Hi, I am reading values from a form and writing them to a text file. I keep getting a syntax error for outfile=open("filename", "a")I cant see it, does any body else. fileName = "requests.txt" # Create instance of FieldStorage Form = cgi.FieldStorage() # Get data from fields if Form and Form['submit'].value == "Submit": the_name = Form.getvalue('name') the_email = Form.getvalue('email') the_address = Form.getvalue('address') the_telephone = Form.getvalue('telephone') IpAddress = cgi.escape(os.environ["REMOTE_ADDR"]); Time = "(time.localtime()):", time.asctime(time.localtime()) entry = name + '|' + email + '|' + address + '|' + telephone + '|' + IpAddress + '|' + Time + "\n" outfile=open("fileName", "a") outfile.write(entry) outfile.close() ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Very wierd namespace problem
On Fri, Jul 17, 2009 at 12:04 PM, Amit Sethi wrote: > now I have created a class with some functions defined in it but the > thing is that they are named in a completely different manner?(or > so I think) > > <__main__.SmPriceWindow instance at 0x9dd448c> > ['_SmPriceWindow__add_columns', '_SmPriceWindow__create_model', > '_SmPriceWindow__review_button_click', '_SmPriceWindow__show_all', > '_SmPriceWindow__url_button_click', '__doc__', '__init__', > '__module__', 'model', 'tree', 'window'] > Traceback (most recent call last): > File "smnotebook3.py", line 135, in price_button_pressed > sw.__show_all() > AttributeError: SmPriceWindow instance has no attribute '__show_all' > > I have refered to these functions(__add_columns ,__create_model etc > ... as self.__add_columns as would be expected in the class itself ) Names beginning with __ are 'mangled' to make them sort of secret. This is a feature: http://docs.python.org/reference/expressions.html#index-889 Kent ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
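A small demonstration of the mangling rule may help. This is a sketch only: PriceWindow is a stand-in class invented for illustration, not the SmPriceWindow code from the pastebin link.

```python
class PriceWindow(object):
    def __init__(self):
        # Inside the class body this is compiled as
        # self._PriceWindow__show_all(), so it works.
        self.__show_all()

    def __show_all(self):
        return "shown"

    def show_all(self):
        # A public wrapper is one way to expose the behaviour to callers.
        return self.__show_all()


w = PriceWindow()
public_result = w.show_all()
# The mangled name is reachable from outside, but using it defeats the purpose.
mangled_result = w._PriceWindow__show_all()

try:
    w.__show_all()   # outside a class body no mangling happens
except AttributeError:
    direct_call_failed = True
else:
    direct_call_failed = False
```

If the method is meant to be called from other code, a single leading underscore (or no underscore at all) avoids the mangling entirely.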
Re: [Tutor] Form values
On 7/17/2009 9:43 AM keith...@beyondbb.com said... Hi, I am reading values from a form and writing them to a text file. I keep getting a syntax error for outfile=open("filename", "a")I cant see it, does any body else. There isn't an error in the statement. Try to pare down your code to a short example that causes the error. Then post the actual error and code segment. It's more likely there's an unfinished something before this line... Emile fileName = "requests.txt" # Create instance of FieldStorage Form = cgi.FieldStorage() # Get data from fields if Form and Form['submit'].value == "Submit": the_name = Form.getvalue('name') the_email = Form.getvalue('email') the_address = Form.getvalue('address') the_telephone = Form.getvalue('telephone') IpAddress = cgi.escape(os.environ["REMOTE_ADDR"]); Time = "(time.localtime()):", time.asctime(time.localtime()) entry = name + '|' + email + '|' + address + '|' + telephone + '|' + IpAddress + '|' + Time + "\n" outfile=open("fileName", "a") outfile.write(entry) outfile.close() ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] weather scraping with Beautiful Soup
> Date: Fri, 17 Jul 2009 12:27:36 +0200
> From: mot...@xs4all.nl
> To: tutor@python.org
> CC: pine...@hotmail.com
> Subject: Re: [Tutor] weather scraping with Beautiful Soup
>
> Che M wrote:
>
> > The 60.3 is the value I want to extract.
>
> soup.find("div",id="curcondbox").findNext("span","b").renderContents()

Thanks, but that isn't working for me. Here's my code:

    import urllib2
    from BeautifulSoup import BeautifulSoup

    url = "http://www.wunderground.com/cgi-bin/findweather/getForecast?query=43085"
    page = urllib2.urlopen(url)
    soup = BeautifulSoup(page)

    daytemp = soup.find("div",id="curcondbox").findNext("span","b").renderContents()

    print "Today's temperature in Worthington is: ", daytemp

When I try this, I get this error:

HTMLParseError: malformed start tag, at line 1516, column 60

Could someone suggest how to proceed? Thanks.

CM
Re: [Tutor] Form values
On Fri, Jul 17, 2009 at 12:43 PM, wrote: > Hi, I am reading values from a form and writing them to a text file. I keep > getting > a syntax error for outfile=open("filename", "a")I cant see it, does any body > else. Indentation problems can cause mysterious syntax errors. Use a text editor that can show spaces and tabs and make sure your indentation is consistent. Kent ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] weather scraping with Beautiful Soup
Che M wrote: Thanks, but that isn't working for me. That's because BeautifulSoup isn't able to parse that webpage, not because the statement I posted doesn't work. I had BeautifulSoup parse the HTML fragment you posted earlier instead of the live webpage. This is actually the first time I see that BeautifulSoup is NOT able to parse a webpage... Greetings, ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] != -1: versus == 1
pedro wrote:
> Hi I have been trying to understand a python script and I keep coming
> across this kind of structure
> that says "If it is not equal to negative one"
>
> for line in theLines:
>     if line.find("Source Height") != -1:
>         #etc...
>
> Is there some special reason for this. Why not just write "If it is
> equal to one"
>
> for line in theLines:
>     if line.find("Source Height") == 1:
>         #etc...

Nothing special; they just have different meanings (and different results).

The former (!= -1) tests whether str.find() did not fail to find "Source Height" (i.e. str.find() succeeded in finding "Source Height"), while the latter tests whether "Source Height" starts at the second position in the string (index 1).

But if it were me, I'd write the former (!= -1) with the 'in' operator:

for line in theLines:
    if "Source Height" in line:
        ...
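The difference between the three tests can be seen side by side in a short sketch. The sample strings below are invented for illustration and do not come from the script being discussed.

```python
line = "  Source Height: 42"   # made-up sample with two leading spaces

index = line.find("Source Height")     # position where the match starts
missing = line.find("Source Width")    # -1 signals "not found"

found_with_find = line.find("Source Height") != -1
found_with_in = "Source Height" in line

# Only true when the match happens to start at index 1.
equals_one = line.find("Source Height") == 1

# A match at the very start gives index 0, which is falsy, so a bare
# "if line.find(...):" is not a safe membership test either.
starts_at_zero = "Source Height: 42".find("Source Height")
```

For a plain "is it in there?" check, the 'in' form says what is meant and avoids both pitfalls.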
Re: [Tutor] reading complex data types from text file
Chris Castillo wrote:
> how would i go about adding the names to a dictionary as a key and the
> scores as a value in this code?

# refactored for better use of Python, correct logic, and flow

scores = {} # empty dictionary
total = 0
for line in open("bowlingscores.txt", "r"):
    if line.strip().isdigit():
        score = int(line)
        scores[name] = score
        total += score
    else:
        name = line.strip()
averageScore = total / len(scores)
fileOut = open("bowlingaverages.txt", "w")
fileOut.write("Bowling Report\n" + ("-" * 50) + "\n")
for name, score in scores.items():
    if score == 300:
        score = "\tPerfect score!"
    elif score < averageScore:
        score = "\tBelow average"
    elif score > averageScore:
        score = "\tAbove average!"
    else:
        score = "\tAverage!"
    print name, score

--
Bob Gailer
Chapel Hill NC
919-636-4239
Re: [Tutor] weather scraping with Beautiful Soup
> Date: Fri, 17 Jul 2009 08:09:10 -0400 > Subject: Re: [Tutor] weather scraping with Beautiful Soup > From: ken...@tds.net > To: pine...@hotmail.com > CC: tutor@python.org > > On Thu, Jul 16, 2009 at 11:21 PM, Che M wrote: > > Hi, > > > > I am interested in gathering simple weather data using Beautiful Soup, but > > am having trouble understanding what I'm doing. I have searched the > > archives and so far haven't found enough to get me moving forward. > > > > Basically I am trying to start off this example: > > > > Grabbing Weather Underground Data with BeautifulSoup > > http://flowingdata.com/2007/07/09/grabbing-weather-underground-data-with-beautifulsoup/ > > > > But I get to the exact same problem that this other person got to in this > > post: > > http://groups.google.com/group/beautifulsoup/browse_thread/thread/13eb3dbf713b8a4a > > > > Unfortunately, that post never gives enough help for me to understand how to > > solve that person's or my problem. > > > > What I want to understand is how to find the bits of data you want--in this > > case, say, today's average temperature and whether it was clear or > > cloudy--within a web page, and then indicate that to Beautiful Soup. > > One thing that might help is to use the Lite page, if you are not > already. It has much less formatting and extraneous information to > wade through. I was not aware Weather Underground had a Lite page; thank you, that is good to know. It was easier to figure things out in that HTML. I am getting closer, but still a bit stuck. 
Here is my code for the Lite page:

    import urllib2
    from BeautifulSoup import BeautifulSoup

    url = "http://www.wund.com/cgi-bin/findweather/getForecast?query=Worthington%2C+OH"
    page = urllib2.urlopen(url)
    soup = BeautifulSoup(page)

    daytemp = soup.find("div",id="main").findNext("h3").renderContents()

    print "Today's temperature in Worthington is: ", daytemp

This works, but gives this output:

>>>
Today's temperature in Worthington is: 75 °F

Of course, I just want the 75, not the HTML tags, etc. around it. But I am not sure how to indicate that in Beautiful Soup. So, for example, if I change the soup.find line above to this (to incorporate the ):

    daytemp = soup.find("div",id="main").findNext("h3", "span").renderContents()

then I get the following error:

AttributeError: 'NoneType' object has no attribute 'renderContents'

(I also don't understand the point of having a tag with no style content in the page.)

Any help is appreciated. This still feels kind of arcane, but I want to understand the general approach to doing this, as later I want to try other weather facts or screen scraping generally.

Thanks.
CM

> You might also look for a site that has weather data
> formatted for computer. For example the NOAA has forecast data
> available as plain text:
> http://forecast.weather.gov/product.php?site=NWS&issuedby=BOX&product=CCF&format=txt&version=1&glossary=0
>
> Kent
Re: [Tutor] weather scraping with Beautiful Soup
Che M wrote: "http://www.wund.com/cgi-bin/findweather/getForecast?query=Worthington%2C+OH"; > Any help is appreciated. That would be: daytemp = soup.find("div",id="main").findNext("span").renderContents() -- "The ability of the OSS process to collect and harness the collective IQ of thousands of individuals across the Internet is simply amazing." - Vinod Valloppillil http://www.catb.org/~esr/halloween/halloween4.html ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] weather scraping with Beautiful Soup
2009/7/17 Michiel Overtoom : > This is actually the first time I see that BeautifulSoup is NOT able to > parse a webpage... Depends on which version is used. If 3.1 then it is much worse with malformed html than prior releases. See [1] for more info. Greets Sander [1] http://www.crummy.com/software/BeautifulSoup/3.1-problems.html ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] weather scraping with Beautiful Soup
> Date: Fri, 17 Jul 2009 20:02:22 +0200 > From: mot...@xs4all.nl > To: tutor@python.org > CC: pine...@hotmail.com > Subject: Re: [Tutor] weather scraping with Beautiful Soup > > Che M wrote: > > > "http://www.wund.com/cgi-bin/findweather/getForecast?query=Worthington%2C+OH"; > > > > Any help is appreciated. > > That would be: > >daytemp = soup.find("div",id="main").findNext("span").renderContents() > Thank you, that works! I'll go try some more things and read more of the documentation and if I bump into more confusions I may have more questions. Che _ Lauren found her dream laptop. Find the PC that’s right for you. http://www.microsoft.com/windows/choosepc/?ocid=ftp_val_wl_290___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
[Tutor] how to join two different files
Hi,

I have two large data files with different columns, and I want to join them into a single multi-column data file. I tried the command:

>>> file('ala', 'w').write(file('/home/amrita/alachems/chem2.txt', 'r').read()+file('/home/amrita/pdbfile/pdb2.txt', 'r').read())

but it is printing the second file after the first, whereas I want to join them column-wise, like:

FileA     FileB     FileC
12        14        12 14
15   +    16   =    15 16
18        17        18 17
20        19        20 19

What command should I use?

Thanks,
Amrita Kumari
Research Fellow
IISER Mohali
Chandigarh
INDIA
Re: [Tutor] != -1: versus == 1
Thanks for the replies. I think my mistake was assuming that "1" meant "true" when in fact it means "index 1". When I tested for "== 1", since "Source Height" was coincidentally at index 1, it returned something which looked like it worked. Thanks for the clarification. And thanks for the suggestion to use "in"; that is much more readable.

Pete
Re: [Tutor] weather scraping with Beautiful Soup
> 2009/7/17 Michiel Overtoom :
> > This is actually the first time I see that BeautifulSoup is NOT able to
> > parse a webpage...
>
> Depends on which version is used. If 3.1 then it is much worse with
> malformed html than prior releases. See [1] for more info.
>
> Greets
> Sander
>
> [1] http://www.crummy.com/software/BeautifulSoup/3.1-problems.html

Yes, I just read about the 3.1 problems and switched to 3.0.7a, re-ran the exact same code as before, and it worked perfectly this time. (For those novices who are unsure of how to install a previous version: I only had to clear out the BS files I already had under site-packages and put the 3.0.7a BeautifulSoup.py file from the website into that site-packages folder. Very easy.)

Lots of useful information today, thank you to everyone.

Che
[Tutor] sending to file
I have a form and script that works. I can enter values into the fields, however the data will not show up on the server in the text file. The properties for the file is 0777. Can the form cause this problem? http://web.nmsu.edu/~keithabt/cgi-bin/07_index.cgi";> Please feel out the entire form: Name: Email Address: Mailing Address: Telephone Number: """ fileName = "requests.txt" # Create instance of FieldStorage Form = cgi.FieldStorage() # Get data from fields if Form and Form['submit'].value == "Submit": the_name = Form.getvalue('name') the_email = Form.getvalue('email') the_address = Form.getvalue('address') the_telephone = Form.getvalue('telephone') IpAddress = cgi.escape(os.environ["REMOTE_ADDR"]); Time = "(time.localtime()):", time.asctime(time.localtime()) entry = name + '|' + email + '|' + address + '|' + telephone + '|' + IpAddress + '|' + Time + "\n" Outfile=open("fileName", "a") Outfile.write(entry) Outfile.close() ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
[Tutor] sending to a file
I have a form and script that works. I can enter values into the fields, however the data will not show up on the server in the text file. The properties for the file is 0777. Can the form cause this problem? 28359-form method.docx Description: Binary data ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] how to join two different files
wrote

> but it is priniting second file after first, whereas i want to join them
> columwise like:---
>
> FileA FileB FileC
> 12 14 12 14
> 15 + 16 = 15 16
> 18 17 18 17
> 20 19 20 19

I'm not sure what the plus sign in the second line signifies, but otherwise it looks like you will need to process each file line by line and concatenate each pair of strings.

for line1 in fileA:
    line2 = fileB.readline()
    fileC.write("%s\t%s" % (line1, line2))

You might need to strip the lines before writing them...

--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/
Re: [Tutor] how to join two different files
On 7/17/2009 11:37 AM amr...@iisermohali.ac.in said... Hi, I have two large different column datafiles now i want to join them as single multi-column datafile:-- I tried the command:-- file('ala', 'w').write(file('/home/amrita/alachems/chem2.txt', 'r').read()+file('/home/amrita/pdbfile/pdb2.txt', 'r').read()) but it is priniting second file after first, whereas i want to join them columwise like:--- FileA FileB FileC 12 14 12 14 15 + 16 = 15 16 18 17 18 17 20 19 20 19 What command I should use? Assuming it's this simple, otherwise flavor to taste... delim= '\t' file('ala', 'w').writelines( [ delim.join([ii,jj] for ii,jj in zip( [xx.strip() for xx in file('/home/amrita/alachems/chem2.txt','r').readlines() ], file('/home/amrita/pdbfile/pdb2.txt', 'r').readlines() ) ] ) Emile ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] how to join two different files
On Fri, Jul 17, 2009 at 3:14 PM, Emile van Sebille wrote: > delim= '\t' > > file('ala', 'w').writelines( > [ delim.join([ii,jj] for ii,jj in > zip( > [xx.strip() for xx in > file('/home/amrita/alachems/chem2.txt','r').readlines() > ], > file('/home/amrita/pdbfile/pdb2.txt', 'r').readlines() > ) > ] > ) Maybe you could break that up a bit? This is the tutor list, not a one-liner competition! Kent ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] how to join two different files
On 7/17/2009 1:27 PM Kent Johnson said... On Fri, Jul 17, 2009 at 3:14 PM, Emile van Sebille wrote: delim= '\t' file('ala', 'w').writelines( [ delim.join([ii,jj] for ii,jj in zip( [xx.strip() for xx in file('/home/amrita/alachems/chem2.txt','r').readlines() ], file('/home/amrita/pdbfile/pdb2.txt', 'r').readlines() ) ] ) Maybe you could break that up a bit? This is the tutor list, not a one-liner competition! Yeah, I knew that -- but as the OP submitted a one-liner to start with, I simply responded in kind. That's why I put the flavor-to-taste comment on... Emile ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
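Broken out into separate steps, the column-wise join discussed in this thread might look like the sketch below. It is an illustration under stated assumptions, not the code anyone posted: the function name join_columns and its parameters are invented, a tab is assumed as the delimiter, and both files are assumed to have one record per line.

```python
def join_columns(path_a, path_b, path_out, delim="\t"):
    """Write line i of path_a and line i of path_b side by side."""
    with open(path_a) as file_a:
        with open(path_b) as file_b:
            with open(path_out, "w") as file_out:
                # zip pairs the lines up and stops at the shorter file.
                for line_a, line_b in zip(file_a, file_b):
                    left = line_a.rstrip("\n")
                    right = line_b.rstrip("\n")
                    file_out.write(left + delim + right + "\n")
```

With the paths from the original question this would be called as join_columns('/home/amrita/alachems/chem2.txt', '/home/amrita/pdbfile/pdb2.txt', 'ala'). On Python 2, zip builds the whole list of pairs in memory first; itertools.izip is the lazy variant and may suit large files better.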
Re: [Tutor] weather scraping with Beautiful Soup
OK, got the very basic case to work when using Beautiful Soup 3.0.7a and scraping the Weather Underground Lite page. That gives me the current temperature, etc. Good.

But now I would like the average temperature from, say, yesterday. I am again having trouble finding elements in the soup. This is the code:

    import urllib2
    from BeautifulSoup import BeautifulSoup

    url = "http://www.wunderground.com/weatherstation/WXDailyHistory.asp?ID=KPAJAMES1&month=7&day=16&year=2009"
    page = urllib2.urlopen(url)
    soup = BeautifulSoup(page)

    table = soup.find("td",id="dataTable tm10")
    print table

When I look at the page source for that page, there is this section, which contains the "dataTable tm10" table I want to zoom in on:

              Current:   High:     Low:      Average:
Temperature:  73.6 °F    83.3 °F   64.2 °F   74.1 °F

And yet when I run the above code, what it prints is:

>>>
None

In other words, it is not finding that table. Why? Help appreciated again.

Che
[Tutor] large strings and garbage collection
This was discussed in a previous post but I didn't see a solution. Say, you have

for i in veryLongListOfStringValues:
    s += i

As per previous post (http://thread.gmane.org/gmane.comp.python.tutor/54029/focus=54139), (quoting verbatim) "... the following happens inside the python interpreter:

1. get a reference to the current value of s.
2. get a reference to the string value i.
3. compute the new value += i, store it in memory, and make a reference to it.
4. drop the old reference of s (thus free-ing "abc")
5. give s a reference to the newly computed value.

After step 3 and before step 4, the old value of s is still referenced by s, and the new value is referenced internally (so step 5 can be performed). In other words, both the old and the new value are in memory at the same time after step 3 and before step 4, and both are referenced (that is, they cannot be garbage collected). ... "

As s gets very large, how do you deal with this situation to avoid a memory error, or what I think will be a general slowing down of the system, if the for-loop is repeated a large number of times?

Dinesh
Re: [Tutor] sending to file
wrote
> I have a form and script that works. I can enter values into the fields, however
> the data will not show up on the server in the text file. The properties for the
> file is 0777. Can the form cause this problem?

Does the problem only occur when running it on the web server or does it occur if you run it as yourself? Do you know where the file is trying to write to? Is it the cgi-bin folder or the root folder? What are the permissions of the folder?

777 is probably a bad idea! But let's get the file created first and worry about security later! :-)

> action="http://web.nmsu.edu/~keithabt/cgi-bin/07_index.cgi">
> Please feel out the entire form:
> Name:
> Email Address:
> Mailing Address:
> Telephone Number:
> value="telephone">
> """
> fileName = "requests.txt"
> # Create instance of FieldStorage
> Form = cgi.FieldStorage()
> # Get data from fields
> if Form and Form['submit'].value == "Submit":
>     the_name = Form.getvalue('name')
>     the_email = Form.getvalue('email')
>     the_address = Form.getvalue('address')
>     the_telephone = Form.getvalue('telephone')
> IpAddress = cgi.escape(os.environ["REMOTE_ADDR"]);
> Time = "(time.localtime()):", time.asctime(time.localtime())
> entry = name + '|' + email + '|' + address + '|' + telephone + '|' + IpAddress + '|' + Time + "\n"

We'll have more success if you send us the real code. The variables here don't match those above...

> Outfile=open("fileName", "a")
> Outfile.write(entry)
> Outfile.close()

HTH,

-- 
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/
Re: [Tutor] large strings and garbage collection
"Dinesh B Vadhia" wrote
> This was discussed in a previous post but I didn't see a solution. Say, you have
>
> for i in veryLongListOfStringValues:
>     s += i

In general avoid string addition. (Although recent versions of Python have optimised it somewhat)

> As s gets very large, how do you deal with this situation to avoid a memory
> error or what I think will be a general slowing down of the system if the
> for-loop is repeated a large number of times.

Avoid string addition. join() is one way, a format string might also work - but I haven't benchmarked that for memory or speed

fmt = "%s" * len(biglist)
s = fmt % tuple(biglist)

Just some ideas...

-- 
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/
Re: [Tutor] Form values
2009/7/17 :
> Hi, I am reading values from a form and writing them to a text file. I keep getting
> a syntax error for outfile=open("filename", "a")I cant see it, does any body else.
>
> fileName = "requests.txt"
> # Create instance of FieldStorage
> Form = cgi.FieldStorage()
>
> # Get data from fields
> if Form and Form['submit'].value == "Submit":
>
>     the_name = Form.getvalue('name')
>     the_email = Form.getvalue('email')
>     the_address = Form.getvalue('address')
>     the_telephone = Form.getvalue('telephone')
>
> IpAddress = cgi.escape(os.environ["REMOTE_ADDR"]);
> Time = "(time.localtime()):", time.asctime(time.localtime())
>
> entry = name + '|' + email + '|' + address + '|' + telephone + '|' +
> IpAddress + '|' + Time + "\n"
>
> outfile=open("fileName", "a")
> outfile.write(entry)
> outfile.close()

There is an error before there: In the line

entry = name + '|' + email + '|' + address + '|' + telephone + '|' + IpAddress + '|' + Time + "\n"

None of the names are defined except IpAddress and Time. Everything else has "the_" prepended when it is defined earlier in the code, but, as you can see, not in the line where they are accessed.

-- 
Rich "Roadie Rich" Lovely

There are 10 types of people in the world: those who know binary, those who do not, and those who are off by one.
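Here is a rough sketch of how the pieces could fit together once the names agree. It is written in Python 3 syntax and deliberately leaves out the cgi module: the form values are assumed to arrive as a plain dict, and format_entry and append_entry are made-up helper names. Besides the "the_" mismatch, it also sidesteps two other things in the posted code: Time there is a tuple (a string, a comma, then another string), which cannot be added to a str, and open("fileName", "a") opens a file literally called fileName rather than requests.txt.

```python
import time


def format_entry(fields, ip_address, when=None):
    """Build one pipe-delimited line from already-parsed form values."""
    # Use the same names here as in the join below, so nothing is undefined.
    the_name = fields.get("name", "")
    the_email = fields.get("email", "")
    the_address = fields.get("address", "")
    the_telephone = fields.get("telephone", "")
    # time.asctime() returns a single string, not a tuple.
    if when is None:
        when = time.localtime()
    timestamp = time.asctime(when)
    parts = [the_name, the_email, the_address, the_telephone,
             ip_address, timestamp]
    return "|".join(parts) + "\n"


def append_entry(file_name, entry):
    # Pass the variable itself, not the quoted string "fileName".
    with open(file_name, "a") as outfile:
        outfile.write(entry)
```

In the CGI script the dict would be filled from the form, and the caller would do something like append_entry(fileName, format_entry(values, IpAddress)).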
Re: [Tutor] how to join two different files
> Maybe you could break that up a bit? This is the tutor list, not a
> one-liner competition!

rather than one-liners, we can try to create the most "Pythonic" solution. below's my entry. :-)

cheers,
-wesley

myMac$ cat parafiles.py
#!/usr/bin/env python

from itertools import izip
from os.path import exists

def parafiles(*files):
    vec = (open(f) for f in files if exists(f))
    data = izip(*vec)
    [f.close() for f in vec]
    return data

for data in parafiles('fileA.txt', 'fileB.txt'):
    print ' '.join(d.strip() for d in data)

myMac$ cat fileA.txt
FileA
12
15
18
20

myMac$ cat fileB.txt
FileB
14
16
18
20
22

myMac$ parafiles.py
FileA FileB
12 14
15 16
18 18
20 20

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"Core Python Programming", Prentice Hall, (c)2007,2001
"Python Fundamentals", Prentice Hall, (c)2009
http://corepython.com

wesley.j.chun :: wescpy-at-gmail.com
python training and technical consulting
cyberweb.consulting : silicon valley, ca
http://cyberwebconsulting.com
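One detail in the script above: vec is a generator, and izip(*vec) uses it up, so the f.close() line has nothing left to loop over and the files are left for the interpreter to close. Below is a hedged Python 3 variant that closes them explicitly. It is only a sketch: it swaps in contextlib.ExitStack and the built-in zip(), and it drops the exists() filter, so a missing file raises an error instead of being skipped.

```python
from contextlib import ExitStack


def parafiles(*paths):
    """Yield tuples holding line N of every file, with whitespace stripped."""
    with ExitStack() as stack:
        # enter_context() registers each file so that all of them are
        # closed when the with-block is left.
        handles = [stack.enter_context(open(path)) for path in paths]
        # zip() stops as soon as the shortest file runs out of lines.
        for lines in zip(*handles):
            yield tuple(line.strip() for line in lines)
```

Usage stays close to the original: for data in parafiles('fileA.txt', 'fileB.txt'): print(' '.join(data)).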
Re: [Tutor] weather scraping with Beautiful Soup
2009/7/17 Che M :
> table = soup.find("td",id="dataTable tm10")

Almost right. attrs should normally be a dict so {'class':'dataTable tm10'} but you can use a shortcut, read on.

> When I look at the page source for that page, there is this section, which
> contains the "dataTable tm10" table I want to zoom in on:
>
> Current:
> High:
> Low:
> Average:
>
> Temperature:
> 73.6 °F
> 83.3 °F
> 64.2 °F
> 74.1 °F
> --

The tag you are looking for is table not td. The tag td is inside the table tag. So with shortcut it looks like,

table = soup.find("table","dataTable tm10")

or without shortcut,

table = soup.find("table",{'class':'dataTable tm10'})

Greets
Sander
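The point about table versus td can be seen without Beautiful Soup at all. The sketch below uses only the standard library's html.parser (Python 3), so it is an illustration of the tag hierarchy and not of the Beautiful Soup API. The SAMPLE markup is made up, since the real page source did not survive in this thread, and CellCollector and table_cells are hypothetical names.

```python
from html.parser import HTMLParser

# Made-up markup standing in for the real page.
SAMPLE = """
<table class="dataTable tm10">
  <tbody>
    <tr>
      <td>Temperature:</td>
      <td><span class="b">73.6</span> F</td>
      <td><span class="b">83.3</span> F</td>
      <td><span class="b">64.2</span> F</td>
      <td><span class="b">74.1</span> F</td>
    </tr>
  </tbody>
</table>
"""


class CellCollector(HTMLParser):
    """Collect the text of every td inside the table with the wanted class."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.in_table = False
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # The class attribute sits on the table tag, not on td, and it is
        # a class, not an id.
        if tag == "table" and attrs.get("class") == self.wanted_class:
            self.in_table = True
        elif tag == "td" and self.in_table:
            self.in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "table":
            self.in_table = False
        elif tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        # Text inside nested tags such as span still belongs to the cell.
        if self.in_cell:
            self.cells[-1] += data


def table_cells(html, wanted_class):
    parser = CellCollector(wanted_class)
    parser.feed(html)
    return [cell.strip() for cell in parser.cells]


print(table_cells(SAMPLE, "dataTable tm10"))
```

Asking for a class that no table carries gives an empty list, which is the stdlib counterpart of find() returning None.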
Re: [Tutor] large strings and garbage collection
2009/7/17 Dinesh B Vadhia :
> This was discussed in a previous post but I didn't see a solution. Say, you have
>
> for i in veryLongListOfStringValues:
>     s += i
>
> As per previous post
> (http://thread.gmane.org/gmane.comp.python.tutor/54029/focus=54139),
> (quoting verbatim) "... the following happens inside the python interpreter:
>
> 1. get a reference to the current value of s.
> 2. get a reference to the string value i.
> 3. compute the new value += i, store it in memory, and make a reference to it.
> 4. drop the old reference of s (thus free-ing "abc")
> 5. give s a reference to the newly computed value.
>
> After step 3 and before step 4, the old value of s is still referenced by s,
> and the new value is referenced internally (so step 5 can be performed). In
> other words, both the old and the new value are in memory at the same time
> after step 3 and before step 4, and both are referenced (that is, they
> cannot be garbage collected). ... "
>
> As s gets very large, how do you deal with this situation to avoid a memory
> error or what I think will be a general slowing down of the system if the
> for-loop is repeated a large number of times.
>
> Dinesh

If all you are doing is concatenating a list of strings, use the str.join() method, which is designed for the job:

>>> listOfStrings
['And', 'now', 'for', 'something', 'completely', 'different.']
>>> print " ".join(listOfStrings)
And now for something completely different.
>>> print "_".join(listOfStrings)
And_now_for_something_completely_different.

If you need to perform other operations first, you can pass a generator expression as the argument, for example:

>>> " ".join((s.upper() if n%2 else s.lower()) for n, s in enumerate(listOfStrings))
'and NOW for SOMETHING completely DIFFERENT.'

Hope that helps you.

-- 
Rich "Roadie Rich" Lovely

There are 10 types of people in the world: those who know binary, those who do not, and those who are off by one.
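The generator-expression form carries over to Python 3 unchanged apart from print. A small sketch, with alternate_case as a made-up function name:

```python
def alternate_case(words):
    # join() builds the final string once, instead of creating a new,
    # longer string on every pass through a += loop.
    return " ".join(
        word.upper() if index % 2 else word.lower()
        for index, word in enumerate(words)
    )


listOfStrings = ['And', 'now', 'for', 'something', 'completely', 'different.']
print(alternate_case(listOfStrings))
```

Any per-item work (stripping, converting, filtering with an if clause) can go inside the generator expression in the same way.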
Re: [Tutor] sending to a file
On Fri, Jul 17, 2009 at 5:10 PM, wrote:
> On Fri Jul 17 17:51 , Wayne sent:
>> On Fri, Jul 17, 2009 at 1:55 PM, wrote:
>>> I have a form and script that works. I can enter values into the fields, however
>>> the data will not show up on the server in the text file. The properties for the
>>> file is 0777. Can the form cause this problem?
>>
>> Are the properties for the parent directories set to read/write?
>> -Wayne
>
> Yes, they are for the owner.

If the script is being executed by a user that's not the owner, you'll have problems. Try letting the script create the file and see if you still have problems.

HTH,
Wayne

-- 
To be considered stupid and to be told so is more painful than being called gluttonous, mendacious, violent, lascivious, lazy, cowardly: every weakness, every vice, has found its defenders, its rhetoric, its ennoblement and exaltation, but stupidity hasn’t. - Primo Levi
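One way to follow that suggestion is to have the script itself try the write and report what the operating system says, since a web server usually runs scripts under a different account than the file's owner. A minimal sketch in Python 3 syntax; try_append is a made-up helper name and the path is whatever the script uses.

```python
def try_append(path, text):
    """Try to append text to path; return (True, "") or (False, reason)."""
    try:
        # Mode "a" creates the file if it does not exist yet, which also
        # tests whether the script may write in the parent directory.
        with open(path, "a") as handle:
            handle.write(text)
    except OSError as error:
        # Permission problems and missing directories both end up here.
        return False, str(error)
    return True, ""
```

Printing the returned reason into the page, or into the server's error log, shows whether the failure is about permissions or about the script looking in a different folder than expected.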
Re: [Tutor] weather scraping with Beautiful Soup
> Date: Sat, 18 Jul 2009 01:09:32 +0200
> From: sander.swe...@gmail.com
> To: tutor@python.org
> Subject: Re: [Tutor] weather scraping with Beautiful Soup
>
> 2009/7/17 Che M :
> > table = soup.find("td",id="dataTable tm10")
>
> Almost right. attrs should normally be a dict so {'class':'dataTable
> tm10'} but you can use a shortcut, read on.
>
> > When I look at the page source for that page, there is this section, which
> > contains the "dataTable tm10" table I want to zoom in on:
> >
> > Current:
> > High:
> > Low:
> > Average:
> >
> > Temperature:
> > 73.6 °F
> > 83.3 °F
> > 64.2 °F
> > 74.1 °F
> > --
>
> The tag you are looking for is table not td. The tag td is inside the
> table tag. So with shortcut it looks like,
>
> table = soup.find("table","dataTable tm10")
>
> or without shortcut,
>
> table = soup.find("table",{'class':'dataTable tm10'})
>
> Greets
> Sander

Thank you. I was able to find the table in the soup this way. After a surprising amount of tinkering (for some reason this Soup is more like chowder than broth for me still), I was able to get my goal, that 74.1 above, using this:

---
import urllib2
from BeautifulSoup import BeautifulSoup

url = "http://www.wunderground.com/weatherstation/WXDailyHistory.asp?ID=KPAJAMES1&month=7&day=16&year=2009"
page = urllib2.urlopen(url)
soup = BeautifulSoup(page)

table = soup.find("table","dataTable tm10") #find the table
tbody = table.find("tbody") #find the table's body
alltd = tbody.findAll('td') #find all the td's
temp_full = alltd[4] #index 4 is the fifth td, the one I want.
print 'temp_full = ', temp_full
temp = temp_full.findNext('span','b').renderContents() #into the span and b and render
print 'temp = ', temp
--

Does this seem like the right (most efficient/readable) way to do this?

Thanks for your time.
CM
Re: [Tutor] large strings and garbage collection
join with generator expression is what was needed. terrific!

From: Rich Lovely
Sent: Friday, July 17, 2009 4:19 PM
To: Dinesh B Vadhia
Cc: tutor@python.org
Subject: Re: [Tutor] large strings and garbage collection
Re: [Tutor] how to join two different files
Thank you sir, it is working. But one more thing I want to ask: what if my files have entries like

fileA and fileB 12 10 13 12 14 15

that is, if their numbers of entries do not match, then how do I combine them? (Both input files have more than one column.)

Thanks,
Amrita

Amrita Kumari
Research Fellow
IISER Mohali
Chandigarh
INDIA
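One possible answer, as a hedged sketch: itertools has a variant of zip that keeps going until the longest input is used up and fills the gaps with a value of your choice. In Python 3 it is itertools.zip_longest (in Python 2.6 and later the same function is called izip_longest). The helper name join_uneven, the tab delimiter and the empty-string fill value are assumptions made for illustration; each line is kept whole, so it does not matter how many columns a file has.

```python
from itertools import zip_longest


def join_uneven(left_lines, right_lines, delim="\t", fill=""):
    """Pair up lines; when one side runs out, use fill for its column."""
    rows = []
    for left, right in zip_longest(left_lines, right_lines, fillvalue=fill):
        # strip() removes the trailing newline but leaves the columns
        # inside each line untouched.
        rows.append(delim.join([left.strip(), right.strip()]))
    return rows


for row in join_uneven(["12 a\n", "13 b\n", "14 c\n"], ["10 x\n", "12 y\n"]):
    print(row)
```

Open file objects can be passed in place of the lists, since zip_longest accepts any iterables.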