[PHP-BUG] Bug #64430 [NEW]: strip_tags() clobbers non-tag entities

2013-03-15 Thread gdataonline at gmail dot com
From:             gdataonline at gmail dot com
Operating system: Windows 7 x64
PHP version:      5.4.13
Package:          Unknown/Other Function
Bug Type:         Bug
Bug description:  strip_tags() clobbers non-tag entities

Description:

strip_tags() discards all remaining input once it encounters text that
looks like a tag but is not HTML, even when that text sits inside an
element whose tag is listed in allowable_tags.

When strip_tags() encounters a "<" followed by *any non-whitespace*, it
assumes it's an HTML tag, even if it couldn't legally be one.  In the
example below, even something as simple as a "<=" appearing in JavaScript
is enough to trigger the behavior.


For a full list of characters that trigger this behavior when they follow
a "<", see this example: http://ideone.com/BEPINI
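
The linked example is not reproduced here. As a rough sketch of how such a
list could be generated (this is not the code behind that URL, and the
variable names are hypothetical), one could place each printable ASCII
character directly after a "<" and check whether the text that follows
survives strip_tags():

```php
<?php
// Sketch only: probe which characters, when placed right after "<",
// cause strip_tags() to drop the rest of the input.
$clobbering = array();

for ($i = 32; $i < 127; $i++) {
    $ch = chr($i);

    // "after" is a marker; if it is missing from the output,
    // everything following the "<" was discarded.
    $input  = "before <" . $ch . " after";
    $output = strip_tags($input);

    if (strpos($output, 'after') === false) {
        $clobbering[] = $ch;
    }
}

echo implode(' ', $clobbering), "\n";
```

The marker approach only detects the case where the remainder of the string
is lost; it says nothing about why a given character is treated as the
start of a tag.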

Test script:
---
<?php

// Example 1: With >=
$code = "<script>if (foo >= bar){ alert(0); }</script>";
var_dump(strip_tags($code));
var_dump(strip_tags($code, "<script>"));

// Example 2: With <=
$code = "<script>if (foo <= bar){ alert(0); }</script>";
var_dump(strip_tags($code));
var_dump(strip_tags($code, "<script>"));

?>

Expected result:
---
string(28) "if (foo >= bar){ alert(0); }"
string(45) "<script>if (foo >= bar){ alert(0); }</script>"

string(28) "if (foo <= bar){ alert(0); }"
string(45) "<script>if (foo <= bar){ alert(0); }</script>"

Actual result:
---
string(28) "if (foo >= bar){ alert(0); }"
string(45) "<script>if (foo >= bar){ alert(0); }</script>"

string(8) "if (foo "
string(16) "<script>if (foo "

Bug #64430 [Com]: strip_tags() clobbers non-tag entities

2013-03-15 Thread gdataonline at gmail dot com
Edit report at https://bugs.php.net/bug.php?id=64430&edit=1

 ID:                 64430
 Comment by:         gdataonline at gmail dot com
 Reported by:        gdataonline at gmail dot com
 Summary:            strip_tags() clobbers non-tag entities
 Status:             Open
 Type:               Bug
 Package:            Unknown/Other Function
 Operating System:   Windows 7 x64
 PHP Version:        5.4.13
 Block user comment: N
 Private report:     N

 New Comment:

I'm starting to suspect that strip_tags() treats anything that matches
<[^>\s]*> as an HTML tag, regardless of where it appears or whether it is
valid HTML.
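
One way to check this hypothesis would be to strip the same inputs with
the pattern above and with strip_tags(), then compare the results. The
sketch below does that; regex_strip() and the sample strings are
hypothetical and only illustrate the comparison, not how strip_tags() is
implemented:

```php
<?php
// Hypothetical helper: removes whatever the pattern from the comment matches.
function regex_strip($input)
{
    return preg_replace('/<[^>\s]*>/', '', $input);
}

// Sample inputs chosen for illustration only.
$samples = array(
    '<b>bold</b>',
    'a < b',
    'if (foo <= bar){ alert(0); }',
    '<a href="x">link</a>',
);

foreach ($samples as $sample) {
    $byRegex     = regex_strip($sample);
    $byStripTags = strip_tags($sample);

    // Print both results side by side and flag any disagreement.
    printf(
        "%-32s regex: %-32s strip_tags: %-20s %s\n",
        var_export($sample, true),
        var_export($byRegex, true),
        var_export($byStripTags, true),
        $byRegex === $byStripTags ? 'same' : 'DIFFERENT'
    );
}
```

Any row flagged DIFFERENT is an input where the pattern alone does not
describe what strip_tags() does.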


Previous Comments:

[2013-03-15 15:15:58] gdataonline at gmail dot com
