Lazy me. After a short break (a break always helps), I found out what it has
to be:
/\<(?!\?xml|\!DOCTYPE|\!ENTITY|image|item|\/item)/
The (?! ... ) is a negative lookahead: it rejects any < that is followed by
one of the listed alternatives. I thought I had to put ?! in front of every
value, like this: (?!\?xml|?!\!ENTITY ... but no, putting it once at the
start applies it to all of them (k.i.s.s., Francis).
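
A minimal sketch of how the pattern above could be used to do the actual
replacement, assuming preg_replace() is acceptable here and that the
rejected < should become &lt;. The sample string in $data is a shortened,
hypothetical stand-in for the test document in the quoted message, not the
real input.

```php
<?php
// Pattern from this message: a "<" that is NOT followed by one of the
// accepted openers (?xml, !DOCTYPE, !ENTITY, image, item, /item).
$pattern = '/\<(?!\?xml|\!DOCTYPE|\!ENTITY|image|item|\/item)/';

// Hypothetical sample input, shortened from the quoted test document.
$data = "<item>\ntext\n<bad stuff>\ntext<text-1\n<image title=\"t\"/>\n</item>\n";

// The lookahead is zero-width, so only the "<" itself is matched and
// replaced; the text after it is left untouched.
$clean = preg_replace($pattern, '&lt;', $data);

print $clean;
```

Only the opening < is rewritten; a leftover > such as the one in
"bad stuff>" stays as it is, which XML parsers generally accept in text
content.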
Cu
Francis Fillion wrote:
>
> I'm having a problem with regular expressions. It's not a good week; it
> seems like I always hit a wall of problems. I know this has surely been
> asked a thousand times; I looked around and didn't find anything. If you
> find something, please point me to it.
>
> So here is what I want to do: I need to parse an XML document, but before
> parsing it I need to get rid of the bad HTML that I don't want. The
> document does require some markup that I need, though, so I don't want
> to get rid of all the HTML.
>
> I already wrote a little bit of code that pulls out my good elements and
> checks for bad stuff. The only problem is that "text<text-1" is good
> content, but I need to change its < to &lt; or it will do bad things to
> my XML parser.
>
> Here is what I tried:
>
> $simple = <<<XMLDATA
> <?xml version='1.0'?>
> <!DOCTYPE chapter SYSTEM "/just/a/test.dtd" [
> <!ENTITY plainEntity "FOO entity">
> <!ENTITY systemEntity SYSTEM "xmltest2.xml">
> ]>
> <item>
> text
> <bad stuff>
> text<text-1
> text
> <image title="Ceci est mon titre2" description="Ceci est ma
> description"
> link="http://www.windplanet.com/"
> url="http://www.windplanet.com/images/news/988991159.gif"
> align="left" width="235" height="131" size="13310"/>
> text
> text
> <image title="Ceci est mon titre" description="Ceci est ma description"
> link="http://www.windplanet.com/"
> url="http://www.windplanet.com/images/news/988991159.gif" align="left"
> width="235" height="131" size="13310"/>
> </item>
>
> XMLDATA;
> //$simple = str_replace("\n\n"," <br/> <br/> ",$simple);
>
> /* find me all the < except those followed by this ... */
> $data = $simple;
> print $data;
>
> if(preg_match_all("/\<(?:(?:\!|\/|\?|)(?:<!xml|<!DOCTYPE|<!ENTITY|<!image|<!item|))/",$data,$cbadhtml)){
>     foreach( $cbadhtml as $key => $myarray){
>         foreach( $myarray as $key2 => $myarray2){
>             print "<p><font color='red'>You can't use HTML here so ".
>                 htmlentities($myarray2) ." is not allowed</font></p>\n";
>         }
>     }
>     // what html? we exit
>     //exit;
>
> }
>
> It finds all the < but doesn't exclude the ones that I accept, so how can
> I find the bad < and transform them to &lt;?
>
> Thank you and have a nice day.
>
> --
> Francis Fillion, BAA SI
> Broadcasting live from his linux box.
> And the maintainer of http://www.windplanet.com
>
> --
> PHP General Mailing List (http://www.php.net/)
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> To contact the list administrators, e-mail: [EMAIL PROTECTED]