Edit report at https://bugs.php.net/bug.php?id=43896&edit=1

 ID:                 43896
 Comment by:         hardin at boulder dot nist dot gov
 Reported by:        arnaud dot lb at gmail dot com
 Summary:            htmlspecialchars() returns empty string on invalid
                     unicode sequence
 Status:             Closed
 Type:               Bug
 Package:            Strings related
 Operating System:   *
 PHP Version:        5CVS-2008-07-15
 Assigned To:        cataphract
 Block user comment: N
 Private report:     N

 New Comment:

cataphract, thanks for looking at this.
When I try your tests, neither line generates any output.  I am using PHP 
Version 5.3.5-1ubuntu7.2.  Any ideas where I should look for a problem?  Could 
this be a server configuration issue?  Thanks.


Previous Comments:
------------------------------------------------------------------------
[2011-08-11 09:38:26] cataphr...@php.net

I can't reproduce that:

<?php
echo htmlentities("some\x80 text&gt;", ENT_QUOTES | ENT_IGNORE, 'UTF-8', false), "\n";
echo htmlentities("some\x80 text&lt;", ENT_QUOTES | ENT_IGNORE, 'UTF-8');

gives the expected

some text&gt;
some text&amp;lt;
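
For comparison, here is a small sketch that puts the three behaviours discussed in this report side by side: no ENT_IGNORE, ENT_IGNORE with the default double encoding, and ENT_IGNORE with double_encode set to false. The variable names are illustrative only, and it assumes a PHP version that defines ENT_IGNORE (5.3 or later).

```php
<?php
// "\x80" on its own is not a valid UTF-8 sequence.
$input = "some\x80 text&gt;";

// Without ENT_IGNORE the invalid sequence makes the call return an empty string.
$strict = htmlentities($input, ENT_QUOTES, 'UTF-8');

// With ENT_IGNORE the invalid byte is dropped; the existing entity is encoded again.
$lenient = htmlentities($input, ENT_QUOTES | ENT_IGNORE, 'UTF-8');

// With double_encode = false the existing &gt; entity is left alone.
$single = htmlentities($input, ENT_QUOTES | ENT_IGNORE, 'UTF-8', false);

var_dump($strict, $lenient, $single);
```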

------------------------------------------------------------------------
[2011-08-11 04:59:39] hardin at boulder dot nist dot gov

echo "test = " . htmlentities("some text", ENT_QUOTES | ENT_IGNORE, 'UTF-8', false);
returns: test = 

echo "test = " . htmlentities("some text", ENT_QUOTES | ENT_IGNORE, 'UTF-8');
returns: test = some text

The latter is the expected result, but why does adding the fourth parameter, to 
prevent double-encoding, cause this function (and also htmlspecialchars) to 
return the empty string?  How can this be prevented?

I have a form that I want to redisplay to users until all their input has been 
corrected, preserving their responses in the fields so they can start from what 
worked.  The users are international, with names containing lots of accent 
marks and utf-8 characters, and some of the input is mathematical, with Greek 
characters and such, so I want to assume the input is utf-8 to preserve all of 
this, without messing it up on multiple passes.  Thanks for your help.
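
For the form-redisplay case described above, one possible approach (a sketch, not a recommendation from the PHP developers) is to keep the raw submitted value and escape it once each time the form is printed. Under that assumption nothing is ever escaped twice, so the fourth parameter is not needed. `field_value()` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: escape a raw, user-submitted UTF-8 string for use
// inside an HTML attribute such as value="...".
function field_value($raw)
{
    // ENT_QUOTES escapes both quote styles; ENT_IGNORE drops invalid
    // UTF-8 bytes instead of returning an empty string.
    return htmlspecialchars($raw, ENT_QUOTES | ENT_IGNORE, 'UTF-8');
}

// "\xCE\xB1" is the UTF-8 encoding of the Greek letter alpha.
$name = "O'Neil \xCE\xB1 <b>";
echo '<input type="text" name="name" value="' . field_value($name) . '">', "\n";
```

Because htmlspecialchars() only touches the HTML special characters, accented and Greek characters pass through unchanged as long as the page itself is served as UTF-8.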

------------------------------------------------------------------------
[2011-02-06 21:02:51] shaun dot bruno at gmail dot com

Ah... I realized I need 5.3

------------------------------------------------------------------------
[2011-02-06 20:58:07] shaun dot bruno at gmail dot com

I'm still having this problem, running PHP 5.2.15.

------------------------------------------------------------------------
[2010-10-11 03:16:23] cataphr...@php.net

Noted addition of ENT_IGNORE in the manual entries for htmlspecialchars and 
htmlentities.

------------------------------------------------------------------------


The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at

    https://bugs.php.net/bug.php?id=43896

