Hi,
Thursday, January 15, 2004, 3:07:02 AM, you wrote:
RS> Hello,
RS> This question may border on OT...
RS> I have a web form where visitors must enter large amounts of text at one
RS> time (text area). Once submitted, the large amount of text is stored as
RS> a CLOB in an Oracle database.
RS> Some of my visitors create their text in MS-Word and then cut and paste
RS> it into the text area and then submit the form.
RS> When I retrieve it from the database, I do a stripslashes, htmlentities
RS> and nl2br in that order to preserve the format of the submitted text.
RS> When I view this text, single or double quotes show up as little white
RS> square blocks. I've tested this out with MS-Word on a Windows machine
RS> and a Mac. Same thing happens with either OS. This only
RS> happens when they cut and paste from MS-Word into the text area. If
RS> they type text into the text area directly, everything is fine...
RS> I know I can search through their submitted text and swap out the
RS> unrecognized character and insert the proper one. I just don't know
RS> what to look for as being the unrecognized character.
RS> I've googled all over looking at ASCII charts and keyboard maps.
RS> Nothing mentions MS-Word specific information though.
RS> Anyone out there dealt with this before?
RS> Thanks,
RS> R
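The quotes are not single bytes. Each one arrives as a sequence of
three bytes (the UTF-8 encoding of Word's curly quotes), with values
226 128 156 for the opening double quote and
226 128 157 for the closing double quote.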
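If you want to check the byte values yourself, a rough sketch like this
prints them. $sample is only a stand-in I made up for the pasted text;
in practice you would feed it the submitted form value instead.

```php
<?php
// Hypothetical sample: an opening and a closing curly double quote,
// written as raw bytes so the script does not depend on file encoding.
$sample = "\xE2\x80\x9C" . "\xE2\x80\x9D";

// unpack('C*', ...) returns every byte as an unsigned integer.
$bytes = array_values(unpack('C*', $sample));

echo implode(' ', $bytes), "\n";   // 226 128 156 226 128 157
```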
Here is a bit of code to fix them and a few others. I would be
interested if anyone knows the complete set of these weirdos :)
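$crap = array(
    chr(226).chr(128).chr(147),  // en dash
    chr(226).chr(128).chr(156),  // left double quote
    chr(226).chr(128).chr(157),  // right double quote
    chr(226).chr(128).chr(153),  // right single quote (apostrophe)
);
$clean = array('-', '"', '"', "'");
$content = str_replace($crap, $clean, $text);  // $text is the submitted text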
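A slightly fuller version might look like the sketch below. It is not
the complete set, only the Word substitutions I am sure of. The
function name plain_punctuation and the sample string are made up for
illustration, and the ASCII replacements ('--' for the long dash, '...'
for the ellipsis) are just one possible choice.

```php
<?php
// Hypothetical helper: maps the UTF-8 byte sequences of common Word
// "smart" punctuation to plain ASCII. The list is not exhaustive.
function plain_punctuation($text)
{
    $map = array(
        "\xE2\x80\x98" => "'",    // U+2018 left single quote
        "\xE2\x80\x99" => "'",    // U+2019 right single quote / apostrophe
        "\xE2\x80\x9C" => '"',    // U+201C left double quote
        "\xE2\x80\x9D" => '"',    // U+201D right double quote
        "\xE2\x80\x93" => '-',    // U+2013 en dash
        "\xE2\x80\x94" => '--',   // U+2014 em dash
        "\xE2\x80\xA6" => '...',  // U+2026 horizontal ellipsis
    );
    // strtr() with an array replaces every key with its value in one pass.
    return strtr($text, $map);
}

// Made-up sample of what a paste from Word could contain.
$text = "\xE2\x80\x9CIt\xE2\x80\x99s fine\xE2\x80\x9D \xE2\x80\x93 really\xE2\x80\xA6";
echo plain_punctuation($text), "\n";   // "It's fine" - really...
```

Using one array of pairs with strtr() keeps each byte sequence next to
its replacement, so the two lists cannot drift out of step when more
characters are added.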
--
regards,
Tom
--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php