Hi,
I have a bunch of HTML files from which I've stripped the presentation markup with
BeautifulSoup (I only kept a content div with the bare content).
I've received a PHP template for the new site from the company we work with, so
I went on to reuse the same part of my first script that iterates through
Hi,
I'm in the process of cleaning some HTML files with BeautifulSoup and
I want to remove all traces of the tables. Here is the bit of code
that deals with tables:
def remove(soup, tagname):
    for tag in soup.findAll(tagname):
        contents = tag.contents
        parent = tag.parent
t;
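
For anyone reading along, here is a minimal sketch of one way to drop table markup while keeping what is inside it. It is not the original remove() above (that code is cut off): it assumes the newer bs4 package, where find_all() and unwrap() are available, rather than the BeautifulSoup 3 findAll() used in the post. The name strip_tags and the sample HTML are made up for illustration.

```python
from bs4 import BeautifulSoup


def strip_tags(soup, tagnames):
    """Remove every tag named in tagnames, leaving its children in place."""
    # find_all() returns a list, so it is safe to modify the tree in the loop.
    for tag in soup.find_all(tagnames):
        tag.unwrap()  # replace the tag with its own contents
    return soup


# Hypothetical sample input, parsed with the stdlib-backed parser.
html = "<div><table><tr><td>one</td><td>two</td></tr></table></div>"
soup = BeautifulSoup(html, "html.parser")
strip_tags(soup, ["table", "tr", "td"])
```

If the tables should become divs instead of disappearing, setting tag.name = "div" on each table before unwrapping only the tr and td tags is probably the simplest variation.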
del (soup.body["onload"])
# This is what needs to be done:
## change tables to divs
## remove all td tags
## remove all tr tags
# Tidying
soup = soup.prettify()
erreurs = ""
tidy_options = {"tidy-mark": 0,
"wrap": 0,
"wrap-attributes"
but decided it
> was too long to be instructional, so I pared it back to what I've
> included.
>
> Hope this gets you started,
> e.
>
> Eric Brunson wrote:
>> Eric Brunson wrote:
>>
>>> Sebastien Noel wrote:
>>>
>>>> Hi,
>>>
Hi,
I'm writing a little script with the help of the BeautifulSoup HTML parser
and uTidyLib (an HTML Tidy wrapper for Python).
Essentially, what it does is fetch all the HTML files in a given
directory (and its subdirectories), clean the code with Tidy (removes
deprecated tags, changes the output to b
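
The "fetch all the HTML files in a given directory and its subdirectories" step could be sketched roughly like this. It uses only the standard library; the function name find_html_files and the choice of extensions are assumptions for illustration, not taken from the original script.

```python
import os


def find_html_files(root):
    """Yield the path of every .html/.htm file under root, recursively."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            # Assumption: files are recognised by extension only.
            if name.lower().endswith((".html", ".htm")):
                yield os.path.join(dirpath, name)
```

Each yielded path can then be opened and handed to the parser and to Tidy in turn.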
Hi,
I have this website (http://solutions-linux.org/) and I have a little news
section on the right side.
Presently the pages are just static HTML pages, but I would like to create a small
RSS file to hold the news and then write a little script that puts the items on
the pages with the right marku