Hi,
Saturday, February 22, 2003, 12:35:15 PM, you wrote:
AC> My apologies in advance if this is too basic or there's a solution easily
AC> found out there, but after lots of searching, I'm still lost.
AC> I'm trying to build a regexp that would parse user-supplied text and
AC> identify cases where HTML tags are left open or are not properly
AC> matched, e.g., <b> tags without closing </b> tags. This is for a sort of
AC> message board type of application, and I'd like to allow users to use
AC> some HTML, but just would like to check to ensure that no stray tags are
AC> input that would screw up the rest of the page's display. I'm new to
AC> regular expressions, and the one below is as far as I've gotten. If
AC> anyone has any suggestions, they'd be very much appreciated.
AC> Thanks,
AC> Andy
AC> $suspect_tags = "b|i|u|strong|em|font|a|ol|ul|blockquote ";
AC> $pattern = '/<(' . $suspect_tags . '[^>]*>)(.*)(?!<\/\1)/Ui';
AC> if (preg_match($pattern,$_POST['entry'],$matches)) {
AC> //do something to report the unclosed tags
AC> } else {
AC> echo 'Input looks fine. No unmatched tags.';
AC> }
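Here is a function that will fix up simple tags like <b> and <i>. It adds the
missing end tag (e.g. </b>) just before the next start or end tag, or at the
end of the document. It only looks for a matching end tag somewhere later in
the text, so it does not verify that tags are nested correctly.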
function fix_mismatch($str){
    $match = array();
    $split = preg_split('!\<(.*?)\>!s', $str); //text between the tags
    $c = count($split);
    $r = ($c == 1)? $str : ''; //no tags at all: return input unchanged
    if($c > 1){
        $fix = '';
        preg_match_all('!\<(.*?)\>!s', $str, $match); //the tags themselves
        for($x = 0; $x < $c; $x++){
            $out = $split[$x].$fix; //add in text + any fixup end tag
            $fix = '';
            if(isset($match[0][$x])){
                $list = explode(' ', $match[1][$x]); //split up compound tag like <img src="">
                $t = trim(strtolower($list[0])); //get the tag name
                switch ($t){
                    //add tags to check/fix here
                    case 'b':
                    case 'div':
                    case 'i':
                    case 'textarea':
                        $st = '/'.$t; //make an end tag to search for
                        $found = false;
                        //search the remaining tags for the end tag
                        foreach(array_slice($match[1], $x + 1) as $v){
                            $et = explode(' ', $v);
                            if($st == trim(strtolower($et[0]))){
                                $found = true; //have we found it ?
                                break;
                            }
                        }
                        if(!$found){
                            $fix = '<'.$st.'>'; //create an html end tag
                        }
                        break;
                }
                $out .= $match[0][$x]; //add in tag
            }
            $r .= $out; //build return string
        }
    }
    return $r;
}
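//usage
$test1 = '<div>This is a <B >bold word <img src="hello.jpg"></b> and another <b>bold '
       . 'word </div>end of <b>test';
$test2 = '<b><b><i>frog';
echo fix_mismatch($test1);
//<div>This is a <B >bold word <img src="hello.jpg"></b> and another <b>bold word </b></div>end of <b>test</b>
echo '<br>';
echo fix_mismatch($test2);
//<b></b><b></b><i>frog</i>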
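
If you would rather just detect the stray tags than repair them, one option is
to walk the tags with a stack. This is only a rough sketch:
find_unmatched_tags() and its $check parameter are names I made up for
illustration, and it assumes reasonably simple tags (no '>' inside attribute
values).

```php
<?php
// Hypothetical helper (not part of fix_mismatch above): returns closing tags
// that did not match the innermost open tag (prefixed with '/'), followed by
// the names of tags that were opened but never closed.
function find_unmatched_tags($str, array $check = array('b', 'i', 'div', 'textarea')) {
    $open = array();     // stack of currently open tag names
    $problems = array(); // stray or badly nested closing tags
    preg_match_all('!<\s*(/?)\s*([a-z][a-z0-9]*)[^>]*>!i', $str, $tags, PREG_SET_ORDER);
    foreach ($tags as $tag) {
        $name = strtolower($tag[2]);
        if (!in_array($name, $check, true)) {
            continue;                  // not a tag we care about
        }
        if ($tag[1] === '') {
            $open[] = $name;           // opening tag: push onto the stack
        } elseif (end($open) === $name) {
            array_pop($open);          // closes the innermost open tag
        } else {
            $problems[] = '/' . $name; // closer with no matching opener on top
        }
    }
    // anything still on the stack was never closed (innermost first)
    return array_merge($problems, array_reverse($open));
}

echo implode(', ', find_unmatched_tags('<b><b><i>frog')); // prints: i, b, b
```

An empty array means every checked tag was closed in the right order, so the
message board can accept the entry as it is.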
--
regards,
Tom
--
PHP General Mailing List (http://www.php.net/)