ID:               25778
Comment by:       dlitz at dlitz dot net
Reported By:      gmirchev at usa dot net
Status:           Wont fix
Bug Type:         URL related
Operating System: Linux
PHP Version:      4.3.3
New Comment:

Why is this bug marked "Wont fix"? Usually, "Wont fix" is reserved for
situations where fixing a bug would cause a more undesirable problem than
leaving things alone. In this case, I don't understand why making parse_url
conform to RFC 2396 (which has Draft Standard status) would be a problem,
especially when Python's urlparse function already does this.


Previous Comments:
------------------------------------------------------------------------

[2003-10-08 08:01:19] [EMAIL PROTECTED]

parse_url() is only intended to be used with fully qualified URLs.

------------------------------------------------------------------------

[2003-10-08 07:32:44] gmirchev at usa dot net

No! Those are URLs. Please read the RFC.

RFC 2396, section 4:

    URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]

Also RFC 2396:

1.2. URI, URL, and URN

A URI can be further classified as a locator, a name, or both. The term
"Uniform Resource Locator" (URL) refers to the subset of URI that identify
resources via a representation of their primary access mechanism (e.g.,
their network "location"), rather than identifying the resource by name or
by some other attribute(s) of that resource. The term "Uniform Resource
Name" (URN) refers to the subset of URI that are required to remain
globally unique and persistent even when the resource ceases to exist or
becomes unavailable.

------------------------------------------------------------------------

[2003-10-07 18:43:33] [EMAIL PROTECTED]

Those are not URLs.

------------------------------------------------------------------------

[2003-10-07 10:02:09] gmirchev at usa dot net

Description:
------------
parse_url does not correctly handle these relative URLs:

    a.cgi?keywords=6:54+
    a.cgi?keywords=6:54

The reproduce code below contains a correct PHP implementation.

Reproduce code:
---------------
function url_parse($url) {
    $parts = array(
        'scheme'   => '',
        'host'     => '',
        'port'     => '',
        'user'     => '',
        'pass'     => '',
        'path'     => '',
        'query'    => '',
        'fragment' => ''
    );

    # Default to an empty authority so the second preg_match never
    # receives an undefined variable.
    $authority = '';

    # Regular Expression from RFC 2396 (appendix B)
    preg_match('"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"', $url, $matches);
    if (array_key_exists(2, $matches)) $parts['scheme'] = $matches[2];
    if (array_key_exists(4, $matches)) $authority = $matches[4];
    if (array_key_exists(5, $matches)) $parts['path'] = $matches[5];
    if (array_key_exists(7, $matches)) $parts['query'] = $matches[7];
    if (array_key_exists(9, $matches)) $parts['fragment'] = $matches[9];

    # Extract username, password, host and port from authority
    preg_match('"(([^:@]*)(:([^:@]*))?@)?([^:]*)(:(.*))?"', $authority, $matches);
    if (array_key_exists(2, $matches)) $parts['user'] = $matches[2];
    if (array_key_exists(4, $matches)) $parts['pass'] = $matches[4];
    if (array_key_exists(5, $matches)) $parts['host'] = $matches[5];
    if (array_key_exists(7, $matches)) $parts['port'] = $matches[7];

    return $parts;
}

------------------------------------------------------------------------
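
For illustration, here is a minimal sketch of how the RFC 2396 Appendix B regular expression splits the two relative URLs from the description. The helper name split_uri_reference is hypothetical and not part of the report, and the sketch assumes PHP 7 or later for the ?? operator, which the PHP 4.3.3 in the report does not have. It covers only the first splitting step and makes no claim about what parse_url() itself returns.

```php
<?php
// Hypothetical helper (not from the bug report): splits a URI reference
// into its five top-level components using the RFC 2396 Appendix B regex.
// Assumes PHP 7+ because of the ?? (null coalescing) operator.
function split_uri_reference(string $url): array
{
    preg_match('~^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?~', $url, $m);

    // preg_match drops trailing groups that did not participate in the
    // match, so every index falls back to an empty string.
    return [
        'scheme'    => $m[2] ?? '',
        'authority' => $m[4] ?? '',
        'path'      => $m[5] ?? '',
        'query'     => $m[7] ?? '',
        'fragment'  => $m[9] ?? '',
    ];
}

// The two relative URLs from the report.
foreach (['a.cgi?keywords=6:54+', 'a.cgi?keywords=6:54'] as $url) {
    $p = split_uri_reference($url);
    printf("%s => path=%s query=%s\n", $url, $p['path'], $p['query']);
}
```

With this regex the colon inside the query string is not taken as a scheme delimiter, because the scheme group [^:/?#]+ cannot extend past the "?". Both inputs therefore yield an empty scheme, the path a.cgi, and the full keywords=... string as the query.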