Re: [Tutor] BeautifulSoup - getting cells without new line characters

Kent Johnson Fri, 31 Mar 2006 09:05:45 -0800

[EMAIL PROTECTED] wrote:

> List of states:
> http://en.wikipedia.org/wiki/U.S._state 
> 
> : soup = BeautifulSoup(html)
> : # Get the second table (list of states).
> : table = soup.first('table').findNext('table')
> : print table 
> 
> ...
> <tr>
> <td>WY</td>
> <td>Wyo.</td>
> <td><a href="/wiki/Wyoming" title="Wyoming">Wyoming</a></td>
> <td><a href="/wiki/Cheyenne%2C_Wyoming" title="Cheyenne, 
> Wyoming">Cheyenne</a></td>
> <td><a href="/wiki/Cheyenne%2C_Wyoming" title="Cheyenne, 
> Wyoming">Cheyenne</a></td>
> <td><a href="/wiki/Image:Flag_of_Wyoming.svg" class="image" title=""><img 
> src="http://upload.wikimedia.org/wikipedia/commons/thumb/b/bc/Flag_of_Wyomin 
> g.svg/45px-Flag_of_Wyoming.svg.png" width="45" alt="" height="30" 
> longdesc="/wiki/Image:Flag_of_Wyoming.svg" /></a></td>
> </tr>
> </table> 
> 
> Of each row (tr), I want to get the cells (td): 1,3,4 
> (postal,state,capital). But cells 3 and 4 have anchors.


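So dig into the cells and get the data from the anchor. Calling a tag, as in
row('td'), returns a list of matches, so index into that list before asking
for .string.

cells = row('td')                    # all <td> cells in this row
postal = cells[0].string             # plain-text cell
state = cells[2]('a')[0].string      # first <a> inside the third cell
capital = cells[3]('a')[0].string    # first <a> inside the fourth cell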
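
For comparison, here is a rough sketch of the same extraction using only the
standard library's html.parser instead of BeautifulSoup. This is a swapped-in
technique, not the BeautifulSoup API. The names SAMPLE_HTML, CellCollector and
postal_state_capital are made up for illustration, and the sample row is
trimmed from the quoted table. It assumes every data row has at least four
cells.

```python
from html.parser import HTMLParser

# Sample row adapted from the quoted table (flag cell omitted for brevity).
SAMPLE_HTML = """
<table>
<tr>
<td>WY</td>
<td>Wyo.</td>
<td><a href="/wiki/Wyoming" title="Wyoming">Wyoming</a></td>
<td><a href="/wiki/Cheyenne%2C_Wyoming" title="Cheyenne,
Wyoming">Cheyenne</a></td>
</tr>
</table>
"""


class CellCollector(HTMLParser):
    """Hypothetical helper: collect the text of every <td>, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being parsed
        self._cell = None   # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text nested inside <a> still arrives here, so anchors need no special case.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            # Collapse runs of whitespace, including stray newlines.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def postal_state_capital(html):
    """Return (postal, state, capital) tuples from cells 1, 3 and 4 of each row."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return [(r[0], r[2], r[3]) for r in parser.rows if len(r) >= 4]


print(postal_state_capital(SAMPLE_HTML))
```

Because the text is gathered per cell regardless of nesting, cells with and
without anchors are handled the same way, and the whitespace collapse removes
any newline characters inside a cell.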

Kent

_______________________________________________
Tutor maillist  -  [email protected]
http://mail.python.org/mailman/listinfo/tutor
