ID:          19507
 Updated by:  [EMAIL PROTECTED]
 Reported By: [EMAIL PROTECTED]
 Status:      Open
 Bug Type:    mbstring related
 PHP Version: 4.2.3
 New Comment:

Could not verify this with CVS-HEAD on Linux 2.4.18


Previous Comments:
------------------------------------------------------------------------

[2002-09-19 16:45:33] [EMAIL PROTECTED]

When compiling php with the mbstring extension, parsing of POST and GET
variables which contain url-encoded keys is not done correctly. this
especially effects array input variables in GET/POST request. for
example:

let's say we have a page only containing:
<?php phpinfo(); ?>

invoking this page like info.php?tld[]=.com&tld[]=.name&tld[]=.us
shows the correct result:
_GET["tld"] = Array
(
    [0] => .com
    [1] => .name
    [2] => .us
)

but the browser usually encodes [ with %5B and ] with %5D in the url,
so let's try that:
info.php?tld%5B%5D=.com&tld%5B%5D=.name&tld%5B%5D=.us
shows an incorrect result:
_GET["tld"] = Array
(
    [0] => 
    [1] => e
    [2] => .us
)

To be frankly :) I already tracked down the problem by reviewing the
source code of PHP and noticed that it was introduced specifically in
4.2.3. Around line 1030 of mbstring.c, the code which parses the URL
variables was, it looks like, cleaned up a bit. it was in 4.2.2:

        while (var && n < num) {
                val = strchr(var, '=');
                if (val) { /* have a value */
                        *val++ = '\0';
                        val_list[n] = var;
                        len_list[n] = php_url_decode(var, strlen(var));
                        n++;
                        val_list[n] = val;
                        len_list[n] = php_url_decode(val, strlen(val));
                } else {
                        val_list[n] = var;
                        len_list[n] = php_url_decode(var, strlen(var));
                        n++;
                        val_list[n] = NULL;
                        len_list[n] = 0;
                }
                n++;
                var = php_strtok_r(NULL, separator, &strtok_buf);
        }
        num = n;

and in 4.2.3 changed to:

        /* split and decode the query */
        n = 0;
        strtok_buf = NULL;
        var = php_strtok_r(res, separator, &strtok_buf);
        while (var)  {
                val = strchr(var, '=');
                val_list[n] = var;
                len_list[n] = php_url_decode(var, strlen(var));
                n++;
                if (val) { /* have a value */
                        *val++ = '\0';
                        val_list[n] = val;
                        len_list[n] = php_url_decode(val, strlen(val));
                } else {
                        val_list[n] = "";
                        len_list[n] = 0;
                }
                n++;
                var = php_strtok_r(NULL, separator, &strtok_buf);
        }
        num = n; /* make sure to process initilized vars only */

the problem with the code change is that the equal sign is not
overwritten with the string terminator \0 before php_url_decode is
called on the GET/POST variable name, so the GET/POST variable value is
urldecoded in the same step. but the code following the first
php_url_decode assumes that the equal sign is still on the same place,
which is not the case if the key contained characters that were
url-encoded (had characters in the %## syntax), because the string gets
smaller and the equal sign moves in this case.
one url-encoded character causes the string to get two characters
smaller, so we can explain our example from the top:
[ and ] are url-encoded, two characters, for each of them the equal
sign (and with it the GET/POST variable value) moves two characters.
so, it moves four characters at total:

_GET["tld"] = Array
(
    [0] =>                // .com are four characters, after four
disappear to the left, none remains
    [1] => e              // .name are five characters, after four
disappear to the left, only the last remains
    [2] => .us            // php_url_decode does place the end-of line
character of the decoded string just before the dot, so .us does not
get shortened
)

Maybe it's better to rewind this code part to what it was in 4.2.2.
just a suggestion, i don't want tell anyone how to do their work :)
You can also use the patch below. It is against PHP 4.2.3.
I already verified that the patch works, it fixes the problem.

--- php-4.2.3/ext/mbstring/mbstring.c.org       2002-09-19 22:32:22.000000000
+0200
+++ php-4.2.3/ext/mbstring/mbstring.c   2002-09-19 22:29:32.000000000
+0200
@@ -1031,15 +1031,18 @@
        var = php_strtok_r(res, separator, &strtok_buf);
        while (var)  {
                val = strchr(var, '=');
-               val_list[n] = var;
-               len_list[n] = php_url_decode(var, strlen(var));
-               n++;
                if (val) { /* have a value */
                        *val++ = '\0';
+                       val_list[n] = var;
+                       len_list[n] = php_url_decode(var, strlen(var));
+                       n++;
                        val_list[n] = val;
                        len_list[n] = php_url_decode(val, strlen(val));
                } else {
-                       val_list[n] = "";
+                       val_list[n] = var;
+                       len_list[n] = php_url_decode(var, strlen(var));
+                       n++;
+                       val_list[n] = "";
                        len_list[n] = 0;
                }
                n++;

btw, my configure line is (if you still want to know it):
'./configure' '--with-apxs=/usr/local/apache/bin/apxs'
'--prefix=/usr/local/php' '--disable-debug' '--with-gettext'
'--with-pear' '--enable-safe-mode' '--enable-mbstring'
'--enable-mbstr-enc-trans' '--with-zlib' '--enable-bcmath' '--with-bz2'
'--enable-calendar' '--enable-ctype' '--with-curl' '--enable-exif'
'--enable-ftp' '--with-gd' '--enable-gd-native-ttf' '--with-ttf'
'--with-t1lib' '--with-gmp' '--with-hyperwave' '--enable-mailparse'
'--with-mcrypt' '--with-mhash' '--with-mysql' '--with-pcre-regex'
'--enable-posix' '--enable-trans-sid' '--enable-sockets'
'--with-regex=php'


------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=19507&edit=1

Reply via email to