[PHP] strip tags but preserve title attributes

Ashley Sheridan Mon, 14 Dec 2009 15:48:00 -0800

I'm looking for a way to strip HTML tags out of some text content
(sourced from a web page) to leave just the text which I'll be running
some basic analysis on. The thing is, I want to preserve text that is in
alt and title attributes. I can't use any DOM functions, as I can't
guarantee that the content will be valid XHTML, although it should be
valid HTML.


I'm happy doing this with string functions and regular expressions, but
I was wondering if something for this already existed? The server I plan
on putting this on does not have access to the shell (although it is a
Linux server) so I won't be able to have Lynx or Elinks parse the
content for me either :(

Thanks,
Ash
http://www.ashleysheridan.co.uk

[PHP] strip tags but preserve title attributes

Reply via email to