ID:                 45356
Comment by:         gtisza at gmail dot com
Reported by:        al at txtlocal dot com
Summary:            fgetcsv() £ symbol stripped if first char in cell
Status:             No Feedback
Type:               Bug
Package:            Filesystem function related
Operating System:   Linux
PHP Version:        5.2.6
Block user comment: N
Private report:     N

New Comment:

fgetcsv() seems to throw away the first character of a cell if that character is invalid in the current locale, but it leaves invalid characters alone when they are not at the beginning of a cell. This code reproduces the problem in PHP 5.3.6:

<?php
setlocale(LC_ALL, 'C');
$utfchar = chr(0xC3).chr(0x89); // U+00C9 (É) in UTF-8
$csv = $utfchar."x".$utfchar."x\n";
file_put_contents('test.csv', $csv);
$file = fopen('test.csv', 'r');
$data = fgetcsv($file);
// dump the first cell byte by byte, in hex
for ($i = 0; $i < strlen($data[0]); $i++) {
    echo dechex(ord($data[0][$i])).' ';
}
echo "\n";
fclose($file);
unlink('test.csv');
// expected: c3 89 78 c3 89 78 - "ÉxÉx"
// actual:   78 c3 89 78 - "xÉx"
?>

I agree with the commenter in bug 12127 that a CSV function should not mess with encodings in the first place; it should just copy the content byte-by-byte.

Previous Comments:
------------------------------------------------------------------------
[2008-09-08 22:06:42] sfschiller at gmail dot com

Following up on the comment from mk at kurznet dot com: changing the locale helps.

setlocale(LC_ALL,'de_DE.8859-1');

Setting the locale to a Unicode or UTF locale name loses the first letters.

------------------------------------------------------------------------
[2008-09-08 19:04:43] mk at kurznet dot com

I have the same problem with PHP 5.2.6.

The CSV file looks like this:

äüö123äüö;auo123äüö

$handle = fopen($path."Mappe3.csv","r");
while ($data = fgetcsv ($handle, 4096, ";")) {
    print_r($data);
}
fclose ($handle);

Output:

Array
(
    [0] => 123äüö
    [1] => auo123äüö
)

With PHP 5.2.5 and 4.4.8 everything is OK.

Is this a bug or a feature?
------------------------------------------------------------------------
[2008-07-27 01:00:01] php-bugs at lists dot php dot net

No feedback was provided for this bug for over a week, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".

------------------------------------------------------------------------
[2008-07-19 17:50:18] m...@php.net

Thank you for this bug report. To properly diagnose the problem, we need a short but complete example script to be able to reproduce this bug ourselves.

A proper reproducing script starts with <?php and ends with ?>, is max. 10-20 lines long and does not require any external resources such as databases, etc. If the script requires a database to demonstrate the issue, please make sure it creates all necessary tables, stored procedures etc.

Please avoid embedding huge scripts into the report.

I'm unable to reproduce it with a simple script, with either 5.2.6 or 5.3.0-dev.

------------------------------------------------------------------------
[2008-06-25 18:08:31] al at txtlocal dot com

If you have this CSV file:

name,price
James,£150

fgetcsv() will remove the £. All other chars seem to be fine. I have searched forums for an answer to this and there are a few people reporting the same, but no definitive answer.

In addition, this happens only if the £ character is the first char in a "cell". This would work fine:

name,price
James,1£50

------------------------------------------------------------------------

The remainder of the comments for this report are too long. To view the rest of the comments, please view the bug report online at

    https://bugs.php.net/bug.php?id=45356
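
Regarding the suggestion in the new comment that a CSV function should just copy content byte-by-byte: the following is a minimal userland sketch of that idea, not the fix applied to fgetcsv() itself. split_csv_line_bytes() is a hypothetical helper name introduced here for illustration. It assumes a single-byte delimiter and enclosure, one record per call, and no line breaks inside quoted fields. Because it compares raw bytes only, the locale set with setlocale() should have no influence on the result.

```php
<?php
/**
 * Split one CSV record into fields, treating the input as raw bytes.
 * Hypothetical helper for illustration only; not part of PHP.
 */
function split_csv_line_bytes(string $line, string $delimiter = ',', string $enclosure = '"'): array
{
    $line = rtrim($line, "\r\n");   // drop only the record terminator
    $fields = [];
    $field = '';
    $inQuotes = false;
    $len = strlen($line);           // byte length, not character count

    for ($i = 0; $i < $len; $i++) {
        $byte = $line[$i];
        if ($inQuotes) {
            if ($byte === $enclosure) {
                if ($i + 1 < $len && $line[$i + 1] === $enclosure) {
                    $field .= $enclosure;   // doubled quote is a literal quote
                    $i++;
                } else {
                    $inQuotes = false;      // closing quote
                }
            } else {
                $field .= $byte;            // delimiters are literal inside quotes
            }
        } elseif ($byte === $enclosure && $field === '') {
            $inQuotes = true;               // opening quote at the start of a field
        } elseif ($byte === $delimiter) {
            $fields[] = $field;             // field boundary
            $field = '';
        } else {
            $field .= $byte;                // copy every other byte unchanged
        }
    }
    $fields[] = $field;                     // last field of the record
    return $fields;
}

// Same input as the reproducing script: "ÉxÉx" encoded as UTF-8.
$fields = split_csv_line_bytes("\xC3\x89x\xC3\x89x\n");
echo bin2hex($fields[0]), "\n";             // prints c38978c38978
```

With this approach the leading bytes c3 89 survive, matching the "expected" line of the reproducing script. Reading lines with fgets() and passing each one to such a helper would sidestep the locale-dependent code path, at the cost of not supporting multi-line quoted fields.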