Instead of copying and pasting and then doing a simple match, why not
use urllib2 to download the HTML and then run through it with HTMLParser?
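A rough sketch of that idea follows. It assumes Python 3, where the
urllib2 and HTMLParser modules live on as urllib.request and
html.parser; the names AppletFinder, find_applets and fetch, and the
sample markup, are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class AppletFinder(HTMLParser):
    """Collect the attributes of every <applet> start tag that is seen."""

    def __init__(self):
        super().__init__()
        self.applets = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands over tag and attribute names in lower case.
        if tag == "applet":
            self.applets.append(dict(attrs))


def find_applets(html_text):
    """Return one dict of attributes per <applet> tag in html_text."""
    finder = AppletFinder()
    finder.feed(html_text)
    finder.close()
    return finder.applets


def fetch(url):
    """Hypothetical helper: download a page and decode it as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


# Made-up sample markup, so the sketch runs without a network connection.
sample = (
    '<p>Intro</p><applet code="Menu.class" width="100">'
    '<param name="url" value="page2.html"></applet>'
)
print(find_applets(sample))
```

With a real page you would pass fetch(url) to find_applets instead of
the sample string. Downloading the page also sidesteps the copy and
paste step where the text appears to have been damaged.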
Liam Clarke wrote:
Hi all,
I have a large amount of HTML through which a previous person
liberally sprinkled a huge number of applets instead of HTML links,
and opening it kills my browser.
So, I want to go through and replace all the applets with nice simple
links, using Python to find each applet, extract a name and a URL,
and create the link.
My problem is that somewhere in my copying and pasting into the text
file the HTML currently resides in, it seems to have got all messed
up, and there's a bunch of strange '=' all through it. (Someone said
that the code had been generated in FrontPage. Is that a good thing or
a bad thing?)
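One possible explanation, offered as an assumption rather than a
diagnosis: an '=' at the very end of a line is how quoted-printable
encoding marks a soft line break, and in that encoding a literal '='
is written as =3D. If the file really is quoted-printable, the
standard quopri module can undo it. The damaged bytes below are an
invented sample, not taken from the actual file.

```python
import quopri

# Invented sample: a soft break ('=' + newline) splits the tag name,
# and the literal '=' after "code" appears as =3D.
damaged = b'<app=\nlet\ncode=3D"Menu.class">'

# decodestring() removes soft breaks and decodes =XX escapes;
# ordinary newlines are left alone.
repaired = quopri.decodestring(damaged)
print(repaired)
```

If the '=' signs in the file are not followed by a newline or two hex
digits, this guess is probably wrong and the file needs another look.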
So, I want to search for <applet code=, but it may be in the file as
<app=
let
code
or <applet
code
or <ap=
plet
etc. etc. (Full example of yuck here
http://www.rafb.net/paste/results/WcKPCy64.html)
So, I want to write a search that will match <applet code and
<app=\nlet code (etc. etc.) without having to strip the file of '='
and '\n'.
I was thinking the re module is for this sort of stuff? Truth is, I
wouldn't know where to begin with it, it seems somewhat powerful.
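A minimal sketch of how the re module could do this, assuming the
damage is always an '=' followed by a newline inside the tag, or a
bare newline where the space should be. The names SOFT_BREAK, tolerant
and APPLET_RE, and the sample text, are invented for this example.

```python
import re

# An optional '=' + newline that may sit between any two characters.
SOFT_BREAK = r"(?:=\r?\n)?"


def tolerant(literal):
    """Build a pattern that matches literal even if soft breaks split it."""
    return SOFT_BREAK.join(re.escape(ch) for ch in literal)


# "<applet", then at least one separator (a soft break or any
# whitespace, newlines included), then "code".
APPLET_RE = re.compile(
    tolerant("<applet") + r"(?:=\r?\n|\s)+" + tolerant("code"),
    re.IGNORECASE,
)

# Invented sample covering the variants described above, plus one
# ordinary link that should not match.
sample = (
    "<applet code=a>\n"
    "<app=\nlet\ncode=b>\n"
    "<applet\ncode=c>\n"
    "<ap=\nplet code=d>\n"
    "<a href=x>"
)
print(len(APPLET_RE.findall(sample)))
```

This only locates the start of each applet tag. The attribute values
that follow can be split the same way, so they would need the same
tolerant treatment or a cleanup pass before the name and URL are
pulled out.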
Or, there's a much easier way, which I'm missing totally. If there is,
I'd be very grateful for pointers.
Thanks for any help you can offer.
Liam Clarke
_______________________________________________
Tutor maillist - [EMAIL PROTECTED]
http://mail.python.org/mailman/listinfo/tutor