[issue13207] os.path.expanduser brakes when using unicode character in the username

2011-10-18 Thread Manuel de la Pena

New submission from Manuel de la Pena :

During our development we have experienced the following:

If you have a user on your Windows machine with a name that uses Japanese 
characters like “雄鳥お人好し” you will see the following on your system:

* The Windows Shell will show the path correctly, that is: “C:\Users\雄鳥お人好し”
* cmd.exe will show: “C:\Users\??”
* All the environment variables will be wrong: they will contain the same 
mangled value that cmd.exe shows

The above is a problem because the implementation of expanduser in ntpath.py 
uses the environment variables to expand the path, which means that in this 
case the returned path will be wrong. 

I have attached a small example of how to get the user profile path (~) on 
Windows using SHGetFolderPathW or SHGetKnownFolderPath to fix the issue. 
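
A minimal sketch of that approach (not the attached expanduser.py): it asks the 
shell for the profile folder through ctypes instead of reading environment 
variables. The helper names get_user_profile and expanduser_shell are 
hypothetical, only the current user's "~" is handled, and on non-Windows 
platforms the sketch simply falls back to os.path.expanduser.

```python
import ctypes
import os
import sys

CSIDL_PROFILE = 0x0028     # CSIDL value for the user's profile folder
SHGFP_TYPE_CURRENT = 0     # ask for the current path, not the default one
MAX_PATH = 260


def get_user_profile():
    """Return the profile folder of the current user as text."""
    if sys.platform != 'win32':
        # Assumption for this sketch: defer to the stdlib elsewhere.
        return os.path.expanduser('~')
    buf = ctypes.create_unicode_buffer(MAX_PATH)
    # The W (wide) variant fills the buffer with UTF-16 text, so the
    # user name does not go through the ANSI code page.
    hresult = ctypes.windll.shell32.SHGetFolderPathW(
        None, CSIDL_PROFILE, None, SHGFP_TYPE_CURRENT, buf)
    if hresult != 0:
        raise OSError('SHGetFolderPathW failed: 0x%08x'
                      % (hresult & 0xFFFFFFFF))
    return buf.value


def expanduser_shell(path):
    """Expand a leading '~' (current user only) using the shell folder."""
    if path == '~' or path.startswith(('~/', '~\\')):
        return get_user_profile() + path[1:]
    # '~otheruser' and paths without '~' are returned unchanged.
    return path
```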

PS: I don't know if this issue also occurs on Python 3.

--
components: Windows
files: expanduser.py
messages: 145798
nosy: mandel
priority: normal
severity: normal
status: open
title: os.path.expanduser brakes when using unicode character in the username
type: behavior
versions: Python 2.7
Added file: http://bugs.python.org/file23442/expanduser.py

___
Python tracker 
<http://bugs.python.org/issue13207>
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13234] os.listdir breaks with literal paths

2011-10-20 Thread Manuel de la Pena

New submission from Manuel de la Pena :

During the development of an application that needed to write paths longer than 
260 chars we opted to use the \\?\ prefix as per 
http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#maxpath.

When working with literal paths, the os.listdir function fails with the 
following traceback:

>>> import os
>>> test = r'\\?\C:\Python27'
>>> os.listdir(test)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
WindowsError: [Error 123] The filename, directory name, or volume label syntax 
is incorrect: '?\\C:\\Python27/*.*'

The reason for this is that the implementation of listdir appends '/' to the 
end of the path if os.path.sep is not already present there, which 
FindFirstFile does not like. This is an inconsistency in the OS but it can be 
easily fixed (see attached patch).
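
A rough illustration of the idea, written in Python rather than in the C of the 
real listdir implementation, and not taken from the attached patch. The function 
name build_find_pattern is hypothetical; it only shows which separator would be 
placed in front of the wildcard handed to FindFirstFile.

```python
def build_find_pattern(path):
    """Return the wildcard pattern to hand to FindFirstFile for `path`."""
    if not path:
        return '*.*'
    if path[-1] in '\\/:':
        # The path already ends with a separator (or a bare drive).
        return path + '*.*'
    # Append a backslash rather than '/', since '/' is what produced
    # the error shown in the traceback above.
    return path + '\\*.*'
```

With this, r'\\?\C:\Python27' becomes r'\\?\C:\Python27\*.*' instead of the 
'...Python27/*.*' form reported in the error.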

--
components: Library (Lib)
files: listdir.patch
keywords: patch
messages: 146031
nosy: mandel
priority: normal
severity: normal
status: open
title: os.listdir breaks with literal paths
versions: Python 2.7
Added file: http://bugs.python.org/file23482/listdir.patch

___
Python tracker 
<http://bugs.python.org/issue13234>
___



[issue13234] os.listdir breaks with literal paths

2011-10-25 Thread Manuel de la Pena

Manuel de la Pena  added the comment:

Indeed, in our code we had to write a number of wrappers around the os calls to 
be able to work with long paths on Windows. At the moment, working with long 
paths on Windows in Python is broken in a number of places and is a PITA to 
work with.
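
To give an idea of what such a wrapper can look like, here is a small sketch 
(not our actual code). The helper name to_literal_path is hypothetical, and the 
sketch assumes local drive paths only: UNC shares would need the \\?\UNC\ form, 
which is not handled here.

```python
import ntpath

LONG_PATH_PREFIX = '\\\\?\\'


def to_literal_path(path):
    """Return `path` in \\\\?\\ form so it may exceed MAX_PATH."""
    if path.startswith(LONG_PATH_PREFIX):
        return path
    if not ntpath.isabs(path):
        # Literal paths are not resolved against the current directory.
        raise ValueError('literal paths must be absolute: %r' % (path,))
    # Normalize first: the OS does not interpret '.', '..' or '/' once
    # the prefix is present.
    return LONG_PATH_PREFIX + ntpath.normpath(path)
```

A wrapper around an os call would then pass to_literal_path(path) instead of 
path, for example os.listdir(to_literal_path(path)).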

--

___
Python tracker 
<http://bugs.python.org/issue13234>
___



[issue13234] os.listdir breaks with literal paths

2011-10-25 Thread Manuel de la Pena

Manuel de la Pena  added the comment:

In the case of my patch (I don't know about santa4nt's case) I did not use 
shutil.remove because it was not used in the other tests and I wanted to be 
consistent and not add a new import. Certainly, if there is no issue with 
that, we should use it.

--

___
Python tracker 
<http://bugs.python.org/issue13234>
___



[issue15275] isinstance is called a more times that needed in ntpath

2012-07-07 Thread Manuel de la Pena

New submission from Manuel de la Pena :

The problem is simple: the code that allows the use of both byte strings and 
unicode strings makes more calls than needed to isinstance(path, bytes) because 
the result of the check is not shared. For example, the following helpers are 
present in the module:

def _get_empty(path):
    if isinstance(path, bytes):
        return b''
    else:
        return ''

def _get_sep(path):
    if isinstance(path, bytes):
        return b'\\'
    else:
        return '\\'

def _get_altsep(path):
    if isinstance(path, bytes):
        return b'/'
    else:
        return '/'

def _get_bothseps(path):
    if isinstance(path, bytes):
        return b'\\/'
    else:
        return '\\/'

def _get_dot(path):
    if isinstance(path, bytes):
        return b'.'
    else:
        return '.'

...

And then something similar to the following is found in the code:

def normpath(path):
    """Normalize path, eliminating double slashes, etc."""
    sep = _get_sep(path)
    dotdot = _get_dot(path) * 2
    special_prefixes = _get_special(path)
    if path.startswith(special_prefixes):
        # in the case of paths with these prefixes:
        # \\.\ -> device names
        # \\?\ -> literal paths
        # do not do any normalization, but return the path unchanged
        return path
    path = path.replace(_get_altsep(path), sep)
    prefix, path = splitdrive(path)

As you can see, the isinstance call is performed more often than needed, which 
certainly affects the performance of the path operations. 

The attached patch reduces the number of calls to isinstance(obj, bytes) and 
also ensures that the function that returns the correct literal is as fast as 
possible by using a dict.

--
components: Windows
files: less_isinstance.patch
hgrepos: 140
keywords: patch
messages: 164842
nosy: mandel
priority: normal
severity: normal
status: open
title: isinstance is called a more times that needed in ntpath
versions: Python 3.3
Added file: http://bugs.python.org/file26294/less_isinstance.patch

___
Python tracker 
<http://bugs.python.org/issue15275>
___



[issue15275] isinstance is called a more times that needed in ntpath

2012-07-07 Thread Manuel de la Pena

Changes by Manuel de la Pena :


Added file: http://bugs.python.org/file26295/f5c57ba1124b.diff

___
Python tracker 
<http://bugs.python.org/issue15275>
___



[issue15275] isinstance is called a more times that needed in ntpath

2012-07-07 Thread Manuel de la Pena

Changes by Manuel de la Pena :


Removed file: http://bugs.python.org/file26295/f5c57ba1124b.diff

___
Python tracker 
<http://bugs.python.org/issue15275>
___



[issue15286] normpath does not work with local literal paths

2012-07-07 Thread Manuel de la Pena

New submission from Manuel de la Pena :

Local literal paths are those paths that use the \\?\ prefix, which allows 
paths longer than the MAX_PATH limit set by Windows 
(http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#short_vs._long_names).

While UNC (http://en.wikipedia.org/wiki/Path_%28computing%29) paths should not 
be normalized, local paths that use the \\?\ prefix should be, so that 
developers who use this trick to allow longer paths on Windows do not have 
to wrap the call in the following way:

import os

LONG_PATH_PREFIX = '\\\\?\\'  # the four backslashes spell the leading \\
path = path.replace(LONG_PATH_PREFIX, '')
result = LONG_PATH_PREFIX + os.path.normpath(path)

A possible solution would be for normpath to normalize such paths and return 
the normalized path with the prefix preserved.
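
A sketch of the proposed behaviour, written as a standalone function for 
illustration (it is not the content of literal-normpath.patch). The name 
normpath_literal is hypothetical, and leaving \\?\UNC\ paths untouched is an 
assumption based on the remark about UNC paths above.

```python
import ntpath

LONG_PATH_PREFIX = '\\\\?\\'
UNC_LITERAL_PREFIX = LONG_PATH_PREFIX + 'UNC\\'


def normpath_literal(path):
    """Normalize `path`, keeping a local \\\\?\\ prefix in place."""
    if path.startswith(UNC_LITERAL_PREFIX):
        # Assumption: UNC literal paths are returned unchanged.
        return path
    if path.startswith(LONG_PATH_PREFIX):
        # Normalize the part after the prefix, then put the prefix back.
        rest = path[len(LONG_PATH_PREFIX):]
        return LONG_PATH_PREFIX + ntpath.normpath(rest)
    return ntpath.normpath(path)
```

For example, '\\?\C:\Users\mandel\..\Desktop\test' would come back as 
'\\?\C:\Users\Desktop\test'.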

--
components: Windows
files: literal-normpath.patch
hgrepos: 141
keywords: patch
messages: 164909
nosy: mandel
priority: normal
severity: normal
status: open
title: normpath does not work with local literal paths
versions: Python 3.3
Added file: http://bugs.python.org/file26307/literal-normpath.patch

___
Python tracker 
<http://bugs.python.org/issue15286>
___



[issue15286] normpath does not work with local literal paths

2012-07-08 Thread Manuel de la Pena

Manuel de la Pena  added the comment:

Antoine,

What MSDN states is that the Windows functions from COM will not 
normalize the path if it is prefixed with \\?\. That is, if a user wanted to do:

path = r'\\?\C:\Users\mandel\..\Desktop\test'
with open(path, 'w') as fd:
    fd.write('hello!')

he will get the following:

[Errno 22] Invalid argument. r'\\?\C:\Users\mandel\..\Desktop\test'

The same thing would happen if a C function were used; that is, open is doing 
the right thing. On the other hand, the same code without the \\?\ prefix works.

This makes it even more important to allow normpath users to normalize such 
paths: a developer who knows that the path has more than 260 chars wants to 
make sure that the path can be written to the system.

May I ask you why you mention the symbolic links? I know that if one of the 
segments of the path is a symbolic link there are problems but this is not 
related to \\?\ or am I confused? Just curious :)

Brian,

The ntpath module is a bit of a mess (look at my other patch 
http://bugs.python.org/issue15275) and I think there are more performance 
problems hidden there somewhere...

I imported string within the function because the same is done in expandvars 
(around line 430) and I wanted to follow the style that was already in use in 
the file. I do agree that imports at the top are the way to go :)

--

___
Python tracker 
<http://bugs.python.org/issue15286>
___



[issue15275] isinstance is called a more times that needed in ntpath

2012-07-08 Thread Manuel de la Pena

Changes by Manuel de la Pena :


--
nosy: +brian.curtin

___
Python tracker 
<http://bugs.python.org/issue15275>
___



[issue15275] isinstance is called a more times that needed in ntpath

2012-07-24 Thread Manuel de la Pena

Manuel de la Pena  added the comment:

The tests do indeed cover the changes made. I don't know of a decent way of 
doing benchmarks for the changes. Any recommendation?
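
One possible starting point, sketched with the standard timeit module and 
offered only as an assumption about what a benchmark could look like: time a 
path function on a str and a bytes sample, before and after applying the patch, 
and compare the numbers. The helper name best_time and the sample path are 
hypothetical.

```python
import ntpath
import timeit


def best_time(func, arg, number=10000, repeat=5):
    """Return the best observed time in seconds for one call of func(arg)."""
    timer = timeit.Timer(lambda: func(arg))
    # Take the minimum of several runs to reduce noise from the machine.
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == '__main__':
    for sample in ('C:/a/./b/../c', b'C:/a/./b/../c'):
        seconds = best_time(ntpath.normpath, sample)
        print(type(sample).__name__, seconds)
```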

> If this patch is applied I think it would be good to change posixpath too.

I agree and I'd love to do it, but in a different bug to keep things 
self-contained. What do you think?

--

___
Python tracker 
<http://bugs.python.org/issue15275>
___