Edit report at https://bugs.php.net/bug.php?id=63769&edit=1

 ID:                 63769
 Comment by:         hanskrentel at yahoo dot de
 Reported by:        hanskrentel at yahoo dot de
 Summary:            file:// protocol does not support percent-encoded
                     characters
 Status:             Not a bug
 Type:               Bug
 Package:            Streams related
 Operating System:   Windows
 PHP Version:        5.4.9
 Block user comment: N
 Private report:     N

 New Comment:

Pierre, not helpful. Should I say "as usual"?

Let me explain briefly, so you can see how easily you fool yourself:

You point to a standard reference (here, MSDN) to argue that % can be part of
a file name. I never denied that. So what do you want to say with that link?
Probably that file names are standardized within an OS?

Well that's fine.

My point is that the URI standard is not properly implemented in PHP. It is a
very common standard, btw: although the file:// URI scheme itself is not really
standardized, URI syntax and percent-encoding *are*.

Now you bring in another standard. You probably wanted to create the
impression that PHP itself actually follows that standard, but what can I tell
you: just as PHP does not follow the URI standard (as pointed out in this
issue), it does not properly implement the Windows rules for valid file names
either (!!!).

But this is not what my bug-report is about. 

Or did you just want to give an example showing that PHP does not even need
the file-system naming rules because it makes its own? That it does not have
to follow them, because it's superior?


Previous Comments:
------------------------------------------------------------------------
[2013-01-16 17:40:40] paj...@php.net

It is your job to decode it; file:// does not have, and does not follow, the %
encoding used in other areas.

btw, paths on Windows can contain the %, see
http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx
for a list of characters that are not allowed.

------------------------------------------------------------------------
[2013-01-16 16:18:41] hanskrentel at yahoo dot de

@ab:

Suppose you have a file with a space in its name, and you *need to* specify
the filename as a file:// URI, in which a space is a special character that
needs proper URL-encoding.

That file:// URI, btw, is set in an environment variable that requires it
(XML_CATALOG_FILES).

DOMDocument in PHP then uses that file:// URI internally and can't process it
properly, because the wrapper is not able to decode the path information
correctly.

You actually pretty well demonstrate the problem in your example:

php -r "echo file_get_contents('file://C:/my/path/catalog%202.xml');"
percent filename

is obviously wrong. %20 in a file:// URI is an encoded space, so the content

space filename

needs to be output instead. The filename you meant is properly written as:

php -r "echo file_get_contents('file://C:/my/path/catalog%25202.xml');"
percent filename

compare: http://tools.ietf.org/html/rfc3986#section-2.1

Please add that example to yours, because only with the two opposite cases
(encoded *and* decoded) can you actually work out concrete results. As it
stands, you have the same example twice, and I think both show the same kind
of error: missing encoding in the URIs.
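
For illustration, here is a minimal sketch of the decoding the reporter
expects, done in userland as the maintainers suggest. The function name
file_uri_to_path() is hypothetical and not part of PHP; it only strips the
scheme prefix in the form used in this thread and percent-decodes the rest
once, following RFC 3986 section 2.1. It does not handle an authority part
or the three-slash form.

```php
<?php
// Hypothetical helper (not part of PHP): turn a file:// URI, in the form
// used in this thread, into a local path by percent-decoding it once.
function file_uri_to_path(string $uri): string
{
    $prefix = 'file://';
    if (strncasecmp($uri, $prefix, strlen($prefix)) !== 0) {
        throw new InvalidArgumentException("Not a file:// URI: $uri");
    }
    // rawurldecode() decodes %XX sequences and leaves '+' untouched.
    return rawurldecode(substr($uri, strlen($prefix)));
}

// %20 is an encoded space.
echo file_uri_to_path('file://C:/my/path/catalog%202.xml'), "\n";
// %25 is an encoded percent sign, so this names the file 'catalog%202.xml'.
echo file_uri_to_path('file://C:/my/path/catalog%25202.xml'), "\n";
```

The first call yields the path of 'catalog 2.xml' and the second the path of
'catalog%202.xml', which are the two opposite cases described above.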

Which brings me to the point: is there actually any interest in fixing this? I
mean, there is not much standing in the way, if you ask me. Normally users do
not use file:// URIs at all.

Those who did most likely used a literal space (or they would have complained
here earlier, but I could find no bug report). The only edge case I can see is
files whose names contain percent signs, but how likely is that?

Let me know what the chances of getting this fixed would be if I sponsored a
well-written patch.

------------------------------------------------------------------------
[2013-01-08 17:12:54] a...@php.net

@hanskrentel

That's my test:

- create file 'catalog%202.xml' with content "percent filename"
- create file 'catalog 2.xml' with content "space filename"
- then run
php -r "echo file_get_contents('file://C:/my/path/catalog%202.xml');"
percent filename

- then run
php -r "echo file_get_contents('file://C:/my/path/catalog 2.xml');"
space filename

That's pretty straightforward. That's what I mean - no decoding, both are
valid filenames. The decoding should be done in your app depending on what it 
needs. In your example - you create 'catalog 2.xml' and are trying to stat 
'catalog%202.xml', literally. But 'catalog%202.xml' doesn't exist.
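
The test above can be reproduced with a short script. This is only a sketch:
the scratch directory name is an assumption made for illustration, and the
URI is built in the same two-slash form used in this thread.

```php
<?php
// Scratch directory for the two test files; the name is arbitrary.
$dir = rtrim(sys_get_temp_dir(), '/\\') . '/bug63769_' . uniqid();
mkdir($dir);

file_put_contents("$dir/catalog%202.xml", 'percent filename');
file_put_contents("$dir/catalog 2.xml", 'space filename');

// Forward slashes keep the URI form the same on Windows and Unix-like systems.
$base = 'file://' . str_replace('\\', '/', $dir);

// Read both files through the file:// wrapper, exactly as named.
$percentResult = file_get_contents("$base/catalog%202.xml");
$spaceResult   = file_get_contents("$base/catalog 2.xml");

echo $percentResult, "\n";
echo $spaceResult, "\n";

// Clean up the scratch files.
unlink("$dir/catalog%202.xml");
unlink("$dir/catalog 2.xml");
rmdir($dir);
```

With the behavior described in this report, the first read returns
"percent filename" and the second "space filename", because the wrapper uses
both names literally.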

------------------------------------------------------------------------
[2013-01-06 07:03:56] anon at anon dot anon

Actually, hold on a sec, plus signs are *not* supposed to be decoded here. That 
means that file names containing plus signs would not be broken by a fix, and 
only file names containing a '%xx' (where x is a hexit) sequence would be 
affected, which is probably uncommon. Perhaps you have a chance.
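
The difference between the two decoding functions PHP already ships can be
seen in a few lines. The file name below is made up for illustration:
rawurldecode() decodes only %XX sequences, while urldecode() also turns plus
signs into spaces.

```php
<?php
// Hypothetical file name containing both a plus sign and an encoded space.
$name = 'a+b%20c.xml';

$raw  = rawurldecode($name); // decodes %20 only, '+' is kept
$form = urldecode($name);    // decodes %20 and also maps '+' to a space

echo $raw, "\n";
echo $form, "\n";
```

So a fix based on RFC 3986 style decoding would leave names with plus signs
alone, as noted above.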

------------------------------------------------------------------------
[2013-01-06 06:38:45] anon at anon dot anon

>You would have wanted to access it via 'file:///C:/temp/catalog%%25202.xml'

Actually, 'catalog%25202.xml', but I know, I'm agreeing with you. I'm just 
pointing out that this erroneous behavior may be depended on somewhere in some 
PHP script, where the author, in good faith, did whatever made things work. I 
assume you're going to pass your path through urldecode (or not encode it in 
the first place), and then you'll be one of them.

In any case, you're unlikely to get any support here. The reviewers here don't 
do much except dismiss things as 'Not a bug' and once they've successfully done 
that they lose interest. C'est le PHP.

------------------------------------------------------------------------


The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at

    https://bugs.php.net/bug.php?id=63769

