Edit report at https://bugs.php.net/bug.php?id=25972&edit=1
ID: 25972 Comment by: j dot faithw at yahoo dot com Reported by: phpbug at chipple dot net Summary: ODBC truncates multi-byte text (w/ MSSQL) Status: Analyzed Type: Feature/Change Request Package: ODBC related Operating System: Win2K 5.00.2195 SP4 PHP Version: 4.3, 5 Block user comment: N Private report: N New Comment: I have the same problem with php 5.3.8 using PostgreSQL with char columns. If the database is created with e.g the EUC_CN character encoding(createdb -E EUC_CN). This encoding uses between 1 and 3 bytes per character. So a char(10) could need up to 30 bytes. The problem is in the odbc_bindcols function in ext/odbc/php_odbc.c SQLColAttributes is called with SQL_COLUMN_DISPLAY_SIZE but this indicates the maximum number of characters required not the number of bytes. This means the buffer allocated for the value may not be big enough result->values[i].value=(char)emalloc(displaysize+1); Later on in e.g. odbc_fetch_into Z_STRLEN_P(tmp) = result->values[i].vallen; Z_STRVAL_P(tmp) = estrndup(result->values[i].value,Z_STRLEN_P(tmp)); This can result in a vallen bigger that displaysize. But the ODBC driver will only fill in at most displaysize+1 bytes(including null terminator). This means character data is missed and junk bytes are returned instead. The same problem may exist in ext/pdo_odbc/odbc_stmt.c. Where rc = SQLColAttribute(S->stmt, colno+1, SQL_DESC_DISPLAY_SIZE, NULL, 0, NULL, &displaysize); is called. But I have not tested this. The following fixes odbc_bindcols for the char(x) datatype. I believe 4 bytes is the maximum required for any charater encoding. php_odbc.c:line 988 if (result->values[i].coltype == SQL_CHAR) { //If using a multibyte character encoding //number of bytes could be 4*SQL_COLUMN_DISPLAY_SIZE. //Without this workaround various functions //e.g. odbc_fetch_into will return data with a null after //diplaysize bytes and extra junk data at the end as //vallen can be bigger than displaysize. Tested using //PostgreSQL with EUC_CN encoding. displaysize*=4; } The fix may be needed for other data types as well as SQL_CHAR. Previous Comments: ------------------------------------------------------------------------ [2003-11-05 02:45:46] phpbug at chipple dot net In case this helps... One thing I just noticed from my test above (in the results before the change) is that strlen() on the tTitle field value gives 86 [bytes], the string's correct length. I verified that the 6 last bytes are all null (ASCII 0), only the first 80 bytes are correctly being returned. ------------------------------------------------------------------------ [2003-11-05 01:18:08] phpbug at chipple dot net Thank you very much for the attention to my bug report. I gave the fix a try in my environment but then all field values received are empty (details below). Perhaps the SQL_COLUMN_LENGTH attribute always contains the value 0? // Test code $oOdbcConn = odbc_connect(C_Gen_sDbDSN,C_Gen_sDbUser,C_Gen_sDbPassword); $oOdbcRs = odbc_exec($oOdbcConn,$sSql); $aOdbcRow = odbc_fetch_array($oOdbcRs); for ($i = 1; $i <= odbc_num_fields($oOdbcRs); $i++) echo odbc_field_name($oOdbcRs,$i).": ". odbc_field_len($oOdbcRs,$i).": ". strlen($aOdbcRow[odbc_field_name($oOdbcRs,$i)]).": ". gettype($aOdbcRow[odbc_field_name($oOdbcRs,$i)])."<br>"; // Result with php4-STABLE-200311050430 before change // (SQL_COLUMN_DISPLAY_SIZE) aCourseID: 10: 1: string tTitle: 80: 86: string // Result with php4-STABLE-200311050430 after change // (SQL_COLUMN_LENGTH) aCourseID: 10: 0: NULL tTitle: 80: 0: NULL ------------------------------------------------------------------------ [2003-11-04 18:33:10] kalow...@php.net I do not like the idea of introducing ODBCv3 based code/ options to a system predominately defined by ODBCv2 specifications. So I am against the inclusion of Moriyoshi's initial suggested fix for this. That being said, I did a bit more research on this today and think the following change should allow the double wide characters to work much better. I haven't tested it out yet myself, but if someone else has the time, it would be beneficial to all. Sorry about the bug system mangling. Essentially the SQL_COLUMNS_DISPLAY_SIZE lists the number of characters needed to display everything. This works fine but in the case of a double wide character array it doesn't (as explained my Moriyoshi). The SQL_COLUMN_LENGTH should return the number of bytes necessary for retrival of the column. WARNING: this change may fundamentally alter the functionality of the longreadlen variable comparisons as well. Use at your own risk right now. Index: php_odbc.c ======================================================== =========== RCS file: /repository/php-src/ext/odbc/php_odbc.c,v retrieving revision 1.176 diff -r1.176 php_odbc.c 671,672c671,672 < rc = SQLColAttributes(result->stmt, (UWORD)(i+1), SQL_COLUMN_DISPLAY_SIZE, < NULL, 0, NULL, &displaysize); --- > rc = SQLColAttributes(result->stmt, (UWORD)(i+1), > SQL_COLUMN_LENGTH, NULL, 0, NULL, &displaysize); ------------------------------------------------------------------------ [2003-11-04 09:31:56] moriyo...@php.net Well, then how did you conclude NVARCHAR support will break some kinds of compatibilities? I think I have already pointed out that we'd still be able to handle it on ODBCv2. (Sorry if this sounds offending. I don't mean so :) Basically we don't have to check whether the column type is NVARCHAR or not, but just allocate enough space for that type of characters. That way we also got to take a slight loss of memory into account though. ------------------------------------------------------------------------ [2003-11-04 07:56:51] kalow...@php.net moriyoshi, It's not as simple as you show it to be. First you must realize that PHP's ODBC layer is written as an ODBC v2 compliant system, to just randomly add in support for NVARCHAR (and friends) will break support for other database systems. The point of my post wasn't to say this isn't a bug (hence why I marked it as verified), but rather to say it's a known bug and the issue is the extension is in need of updating. ------------------------------------------------------------------------ The remainder of the comments for this report are too long. To view the rest of the comments, please view the bug report online at https://bugs.php.net/bug.php?id=25972 -- Edit this bug report at https://bugs.php.net/bug.php?id=25972&edit=1