Edit report at https://bugs.php.net/bug.php?id=25972&edit=1

 ID:                 25972
 Comment by:         j dot faithw at yahoo dot com
 Reported by:        phpbug at chipple dot net
 Summary:            ODBC truncates multi-byte text (w/ MSSQL)
 Status:             Analyzed
 Type:               Feature/Change Request
 Package:            ODBC related
 Operating System:   Win2K 5.00.2195 SP4
 PHP Version:        4.3, 5
 Block user comment: N
 Private report:     N

 New Comment:

I have the same problem with PHP 5.3.8 using PostgreSQL with char columns when 
the database is created with a multibyte character encoding such as EUC_CN 
(createdb -E EUC_CN).

This encoding uses between 1 and 3 bytes per character. So a char(10) could 
need up to 30 bytes. 

The problem is in the odbc_bindcols function in ext/odbc/php_odbc.c.
SQLColAttributes is called with SQL_COLUMN_DISPLAY_SIZE, but this indicates 
the maximum number of characters required, not the number of bytes.
This means the buffer allocated for the value may not be big enough:
  result->values[i].value=(char *)emalloc(displaysize+1);

Later on, in e.g. odbc_fetch_into, the reported length is used for the copy:
  Z_STRLEN_P(tmp) = result->values[i].vallen;
  Z_STRVAL_P(tmp) = estrndup(result->values[i].value,Z_STRLEN_P(tmp));

vallen can be bigger than displaysize, but the ODBC driver will only fill in 
at most displaysize+1 bytes (including the null terminator). This means 
character data is missed and junk bytes are returned instead.
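
To make the mechanism concrete, here is a small simulation in plain C with no 
ODBC involved. fake_driver_fill and bytes_not_written are hypothetical names 
invented for this illustration; the only behaviour assumed from ODBC is that 
on truncation the driver reports the full, untruncated byte length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a driver filling a bound column buffer.
 * It copies at most buf_size-1 bytes, null-terminates, and returns the
 * full byte length of the value, i.e. the length before truncation.
 */
size_t fake_driver_fill(char *buf, size_t buf_size,
                        const char *data, size_t data_len)
{
    size_t n = 0;

    assert(buf != NULL && data != NULL);
    if (buf_size > 0) {
        n = data_len < buf_size - 1 ? data_len : buf_size - 1;
        memcpy(buf, data, n);
        buf[n] = '\0';
    }
    return data_len; /* reported length, not the number of bytes copied */
}

/*
 * Number of bytes in the range [0, vallen) that the driver never wrote.
 * A caller that copies vallen bytes out of the buffer gets this many
 * bytes of junk (and reads past the end of the allocation).
 */
size_t bytes_not_written(size_t buf_size, size_t vallen)
{
    size_t written = buf_size > 0 ? buf_size - 1 : 0;
    return vallen > written ? vallen - written : 0;
}
```

With a buffer of displaysize+1 = 81 bytes and an 86-byte value, the reported 
length is 86 while only 80 data bytes are valid, so a copy of vallen bytes 
picks up 6 bytes that were never filled in.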

The same problem may exist in ext/pdo_odbc/odbc_stmt.c, where 
  rc = SQLColAttribute(S->stmt, colno+1, SQL_DESC_DISPLAY_SIZE,
            NULL, 0, NULL, &displaysize);
is called, but I have not tested this.

The following fixes odbc_bindcols for the char(x) datatype. I believe 4 bytes 
is the maximum required per character in any character encoding.
php_odbc.c, line 988:
      if (result->values[i].coltype == SQL_CHAR) {
        //If using a multibyte character encoding
        //number of bytes could be 4*SQL_COLUMN_DISPLAY_SIZE.
        //Without this workaround various functions
        //e.g. odbc_fetch_into will return data with a null after
        //displaysize bytes and extra junk data at the end as
        //vallen can be bigger than displaysize. Tested using
        //PostgreSQL with EUC_CN encoding.
        displaysize*=4;
      }

The fix may be needed for other data types as well as SQL_CHAR.
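
If the multiplication were applied more widely, it might be worth guarding 
against overflow for very large display sizes. The following is only a sketch 
of that idea, not code from php_odbc.c: column_buffer_size and 
MAX_BYTES_PER_CHAR are hypothetical names, and the factor of 4 is the same 
assumption as in the workaround above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed worst-case bytes per character (same guess as the workaround). */
#define MAX_BYTES_PER_CHAR 4

/*
 * Hypothetical helper: buffer size in bytes for a column that displays
 * display_chars characters, including the null terminator. Returns 0 if
 * the input is negative or the result would not fit in size_t.
 */
size_t column_buffer_size(long display_chars)
{
    if (display_chars < 0) {
        return 0;
    }
    if ((unsigned long)display_chars > (SIZE_MAX - 1) / MAX_BYTES_PER_CHAR) {
        return 0;
    }
    return (size_t)display_chars * MAX_BYTES_PER_CHAR + 1;
}
```

For a char(10) column this gives 41 bytes instead of 11. A caller would treat 
a return value of 0 as "do not bind this column with a fixed buffer".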


Previous Comments:
------------------------------------------------------------------------
[2003-11-05 02:45:46] phpbug at chipple dot net

In case this helps...

One thing I just noticed from my test above (in the results before the change) 
is that strlen() on the tTitle field value gives 86 [bytes], the string's 
correct length. I verified that the last 6 bytes are all null (ASCII 0); only 
the first 80 bytes are being returned correctly.

------------------------------------------------------------------------
[2003-11-05 01:18:08] phpbug at chipple dot net

Thank you very much for the attention to my bug report.

I gave the fix a try in my environment but then all field values received are 
empty (details below).

Perhaps the SQL_COLUMN_LENGTH attribute always contains the value 0?

// Test code

$oOdbcConn = odbc_connect(C_Gen_sDbDSN,C_Gen_sDbUser,C_Gen_sDbPassword);
$oOdbcRs = odbc_exec($oOdbcConn,$sSql);
$aOdbcRow = odbc_fetch_array($oOdbcRs);
for ($i = 1; $i <= odbc_num_fields($oOdbcRs); $i++)
  echo odbc_field_name($oOdbcRs,$i).": ".
       odbc_field_len($oOdbcRs,$i).": ".
       strlen($aOdbcRow[odbc_field_name($oOdbcRs,$i)]).": ".
       gettype($aOdbcRow[odbc_field_name($oOdbcRs,$i)])."<br>";

// Result with php4-STABLE-200311050430 before change
// (SQL_COLUMN_DISPLAY_SIZE)

aCourseID: 10: 1: string
tTitle: 80: 86: string

// Result with php4-STABLE-200311050430 after change
// (SQL_COLUMN_LENGTH)

aCourseID: 10: 0: NULL
tTitle: 80: 0: NULL

------------------------------------------------------------------------
[2003-11-04 18:33:10] kalow...@php.net

I do not like the idea of introducing ODBCv3 based code/
options to a system predominately defined by ODBCv2 
specifications.  So I am against the inclusion of 
Moriyoshi's initial suggested fix for this.

That being said, I did a bit more research on this today 
and think the following change should allow the double 
wide characters to work much better.  I haven't tested 
it out yet myself, but if someone else has the time, it 
would be beneficial to all.  Sorry about the bug system 
mangling.

Essentially the SQL_COLUMN_DISPLAY_SIZE lists the 
number of characters needed to display everything.  This 
works fine, but in the case of a double wide character 
array it doesn't (as explained by Moriyoshi).  The 
SQL_COLUMN_LENGTH should return the number of bytes 
necessary for retrieval of the column.  

WARNING: this change may fundamentally alter the 
functionality of the longreadlen variable comparisons as 
well. Use at your own risk right now.



Index: php_odbc.c
===================================================================
RCS file: /repository/php-src/ext/odbc/php_odbc.c,v
retrieving revision 1.176
diff -r1.176 php_odbc.c
671,672c671,672
<     rc = SQLColAttributes(result->stmt, (UWORD)(i+1), SQL_COLUMN_DISPLAY_SIZE,
<                           NULL, 0, NULL, &displaysize);
---
>     rc = SQLColAttributes(result->stmt, (UWORD)(i+1),
>                           SQL_COLUMN_LENGTH, NULL, 0, NULL, &displaysize);

------------------------------------------------------------------------
[2003-11-04 09:31:56] moriyo...@php.net

Well, then how did you conclude NVARCHAR support will break some kinds of 
compatibilities? I think I have already pointed out that we'd still be able to 
handle it on ODBCv2. 
(Sorry if this sounds offensive. I don't mean it to :)

Basically we don't have to check whether the column type is NVARCHAR or not, 
but just allocate enough space for that type of character. That way we also 
have to accept a slight waste of memory, though.


------------------------------------------------------------------------
[2003-11-04 07:56:51] kalow...@php.net

moriyoshi,

It's not as simple as you show it to be.  First you must 
realize that PHP's ODBC layer is written as an ODBC v2 
compliant system; just randomly adding support for 
NVARCHAR (and friends) will break support for other 
database systems.  

The point of my post wasn't to say this isn't a bug 
(hence why I marked it as verified), but rather to say 
it's a known bug and the issue is the extension is in 
need of updating.  

------------------------------------------------------------------------


The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at

    https://bugs.php.net/bug.php?id=25972

