[issue13207] os.path.expanduser brakes when using unicode character in the username
New submission from Manuel de la Pena:

During our development we have experienced the following. If a user on your Windows machine has a name that uses Japanese characters, such as “雄鳥お人好し”, you will see the following on your system:

* The Windows Shell will show the path correctly, that is: “C:\Users\雄鳥お人好し”
* cmd.exe will show: “C:\Users\??”
* All the env variables will be wrong, which means they will be similar to the info shown in cmd.exe

The above is a problem because the implementation of expanduser in ntpath.py uses the env variables to expand the path, which means that in this case the returned path will be wrong.

I have attached a small example of how to get the user profile path (~) on Windows using SHGetFolderPathW or SHGetKnownFolderPath to fix the issue.

PS: I don't know if this issue also occurs on Python 3.

--
components: Windows
files: expanduser.py
messages: 145798
nosy: mandel
priority: normal
severity: normal
status: open
title: os.path.expanduser brakes when using unicode character in the username
type: behavior
versions: Python 2.7
Added file: http://bugs.python.org/file23442/expanduser.py

___
Python tracker <http://bugs.python.org/issue13207>
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
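For illustration only, here is a minimal sketch of asking the Windows shell for the profile directory through ctypes instead of reading environment variables. It is not the attached expanduser.py: the function name get_user_profile_dir is hypothetical, the code targets Python 3, and the non-Windows fallback exists only so the snippet can be imported anywhere.

```python
import os
import sys


def get_user_profile_dir():
    """Return the current user's profile directory as a Unicode string.

    Hypothetical helper, not the code attached to the issue.
    """
    if sys.platform != 'win32':
        # No shell API to query outside Windows; defer to the standard library.
        return os.path.expanduser('~')

    import ctypes

    CSIDL_PROFILE = 0x0028   # identifies the user's profile folder
    SHGFP_TYPE_CURRENT = 0   # request the current path, not the default one
    MAX_PATH = 260           # buffer size expected by SHGetFolderPathW

    buf = ctypes.create_unicode_buffer(MAX_PATH)
    # The W (wide) entry point fills the buffer with UTF-16 text, so the
    # user name is never converted through the ANSI code page.
    hresult = ctypes.windll.shell32.SHGetFolderPathW(
        None, CSIDL_PROFILE, None, SHGFP_TYPE_CURRENT, buf)
    if hresult != 0:  # anything other than S_OK
        raise OSError('SHGetFolderPathW failed: 0x%08X'
                      % (hresult & 0xFFFFFFFF))
    return buf.value
```

A caller would then join the remainder of a "~/..." path onto the returned directory rather than relying on the environment variables.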
[issue13234] os.listdir breaks with literal paths
New submission from Manuel de la Pena:

During the development of an application that needed to write paths longer than 260 chars we opted to use \\?\ as per http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#maxpath.

When working with literal paths, the os.listdir function raised the following traceback:

    >>> import os
    >>> test = r'\\?\C:\Python27'
    >>> os.listdir(test)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    WindowsError: [Error 123] The filename, directory name, or volume label syntax is incorrect: '?\\C:\\Python27/*.*'

The reason for this is that the implementation of listdir appends '/' to the path if os.path.sep is not present at the end of it, which FindFirstFile does not accept for literal paths. This is an inconsistency in the OS, but it can be easily fixed (see attached patch).

--
components: Library (Lib)
files: listdir.patch
keywords: patch
messages: 146031
nosy: mandel
priority: normal
severity: normal
status: open
title: os.listdir breaks with literal paths
versions: Python 2.7
Added file: http://bugs.python.org/file23482/listdir.patch

___
Python tracker <http://bugs.python.org/issue13234>
___
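The separator problem described above can be sketched in pure Python. This is only an illustration of the idea, not listdir.patch and not the C code behind os.listdir; build_search_pattern is a hypothetical helper that appends a backslash instead of '/' before the wildcard.

```python
def build_search_pattern(path):
    """Build the wildcard pattern used to enumerate a directory.

    Hypothetical helper: it appends a backslash rather than '/', because
    paths prefixed with \\\\?\\ are passed to the file system without
    separator translation.
    """
    if path and path[-1] not in ('\\', '/', ':'):
        path += '\\'
    return path + '*.*'


print(build_search_pattern(r'\\?\C:\Python27'))  # \\?\C:\Python27\*.*
```

With a backslash the resulting pattern stays valid for both ordinary and literal paths, whereas the '/' variant is only accepted when Windows normalizes the path first.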
[issue13234] os.listdir breaks with literal paths
Manuel de la Pena added the comment:

Indeed, in our code we had to write a number of wrappers around the os calls to be able to work with long paths on Windows. At the moment, working with long paths on Windows in Python is broken in a number of places and is a PITA to work with.
[issue13234] os.listdir breaks with literal paths
Manuel de la Pena added the comment:

In the case of my patch (I don't know about santa4nt's case), I did not use shutil.remove because it was not used in the other tests and I wanted to be consistent and not add a new import. Certainly, if there is no issue with that, we should use it.
[issue15275] isinstance is called a more times that needed in ntpath
New submission from Manuel de la Pena:

The problem is simple: the code that allows the use of both binary strings and unicode makes more calls than needed to isinstance(path, bytes), since the result of the check is not shared. For example, the following functions are present in the module:

    def _get_empty(path):
        if isinstance(path, bytes):
            return b''
        else:
            return ''

    def _get_sep(path):
        if isinstance(path, bytes):
            return b'\\'
        else:
            return '\\'

    def _get_altsep(path):
        if isinstance(path, bytes):
            return b'/'
        else:
            return '/'

    def _get_bothseps(path):
        if isinstance(path, bytes):
            return b'\\/'
        else:
            return '\\/'

    def _get_dot(path):
        if isinstance(path, bytes):
            return b'.'
        else:
            return '.'

    ...

And then something similar to the following is found in the code:

    def normpath(path):
        """Normalize path, eliminating double slashes, etc."""
        sep = _get_sep(path)
        dotdot = _get_dot(path) * 2
        special_prefixes = _get_special(path)
        if path.startswith(special_prefixes):
            # in the case of paths with these prefixes:
            # \\.\ -> device names
            # \\?\ -> literal paths
            # do not do any normalization, but return the path unchanged
            return path
        path = path.replace(_get_altsep(path), sep)
        prefix, path = splitdrive(path)

As you can see, the isinstance call is performed more often than needed, which certainly affects the performance of the path operations. The attached patch reduces the number of calls to isinstance(obj, bytes) and also ensures that the function that returns the correct literal is as fast as possible by using a dict.

--
components: Windows
files: less_isinstance.patch
hgrepos: 140
keywords: patch
messages: 164842
nosy: mandel
priority: normal
severity: normal
status: open
title: isinstance is called a more times that needed in ntpath
versions: Python 3.3
Added file: http://bugs.python.org/file26294/less_isinstance.patch

___
Python tracker <http://bugs.python.org/issue15275>
___
[issue15275] isinstance is called a more times that needed in ntpath
Changes by Manuel de la Pena:

Added file: http://bugs.python.org/file26295/f5c57ba1124b.diff
[issue15275] isinstance is called a more times that needed in ntpath
Changes by Manuel de la Pena:

Removed file: http://bugs.python.org/file26295/f5c57ba1124b.diff
[issue15286] normpath does not work with local literal paths
New submission from Manuel de la Pena:

Local literal paths are those paths that use the \\?\ prefix, which allows paths longer than the MAX_PATH limit set by Windows (http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#short_vs._long_names). While UNC (http://en.wikipedia.org/wiki/Path_%28computing%29) paths should not be normalized, local paths that use the \\?\ prefix should be, so that developers who use this trick to allow longer paths on Windows do not have to wrap the call in the following way:

    LONG_PATH_PREFIX = '\\\\?\\'
    path = path.replace(LONG_PATH_PREFIX, '')
    result = LONG_PATH_PREFIX + os.path.normpath(path)

A possible solution would be for the normalization to work and return the normalized path with the prefix added.

--
components: Windows
files: literal-normpath.patch
hgrepos: 141
keywords: patch
messages: 164909
nosy: mandel
priority: normal
severity: normal
status: open
title: normpath does not work with local literal paths
versions: Python 3.3
Added file: http://bugs.python.org/file26307/literal-normpath.patch

___
Python tracker <http://bugs.python.org/issue15286>
___
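The behaviour requested above can be sketched as a small standalone function. This is not literal-normpath.patch; normpath_literal is a hypothetical name, ntpath is used directly so the snippet behaves the same on any platform, and \\?\UNC\ paths are deliberately not treated specially.

```python
import ntpath

LONG_PATH_PREFIX = '\\\\?\\'  # the four characters \\?\


def normpath_literal(path):
    """Normalize a Windows path, keeping a leading \\\\?\\ prefix.

    Hypothetical helper for illustration; str paths only.
    """
    if path.startswith(LONG_PATH_PREFIX):
        # Normalize the part after the prefix, then put the prefix back.
        rest = path[len(LONG_PATH_PREFIX):]
        return LONG_PATH_PREFIX + ntpath.normpath(rest)
    return ntpath.normpath(path)


print(normpath_literal(r'\\?\C:\Users\mandel\..\Desktop\test'))
# \\?\C:\Users\Desktop\test
```

Unlike the wrapper quoted in the report, this only strips the prefix when it appears at the start of the path, so a \\?\ sequence elsewhere in the string is left alone.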
[issue15286] normpath does not work with local literal paths
Manuel de la Pena added the comment:

Antoine,

What the MSDN is stating is that the Windows functions from COM will not normalize the path if it is prefixed by \\?\. That is, if a user wanted to do:

    path = r'\\?\C:\Users\mandel\..\Desktop\test'
    with open(path, 'w') as fd:
        fd.write('hello!')

he will get the following:

    [Errno 22] Invalid argument. r'\\?\C:\Users\mandel\..\Desktop\test'

The same thing would happen if a C function were used; that is, open is doing the right thing. On the other hand, the same code without the \\?\ works. This makes it even more important to allow normpath users to normalize such paths: a developer knows that the path has more than 260 chars and wants to make sure that the path can be written to the system.

May I ask why you mention symbolic links? I know that if one of the segments of the path is a symbolic link there are problems, but this is not related to \\?\, or am I confused? Just curious :)

Brian,

The ntpath module is a bit of a mess (look at my other patch, http://bugs.python.org/issue15275) and I think there are more performance problems hidden there somewhere... I imported string within the function because the same is done in expandvars (around line 430) and I wanted to follow the style that was already in use in the file. I do agree that imports at the top are the way to go :)
[issue15275] isinstance is called a more times that needed in ntpath
Changes by Manuel de la Pena:

--
nosy: +brian.curtin
[issue15275] isinstance is called a more times that needed in ntpath
Manuel de la Pena added the comment:

Tests indeed cover the changes made. I don't know of a decent way of benchmarking the changes. Any recommendation?

> If this patch is applied I think it would be good to change posixpath too.

I agree and I'd love to do it, but in a different bug to keep things self-contained. What do you think?
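One possible way to measure such a change is the standard timeit module. The sketch below is an assumption-laden illustration, not part of the thread or the patch: replace_repeated and replace_shared are hypothetical stand-ins for the "before" and "after" code paths, and bench simply reports the best of several runs.

```python
import timeit


def _get_sep(path):
    return b'\\' if isinstance(path, bytes) else '\\'


def _get_altsep(path):
    return b'/' if isinstance(path, bytes) else '/'


def replace_repeated(path):
    """Stand-in for the current style: one isinstance check per helper."""
    return path.replace(_get_altsep(path), _get_sep(path))


def replace_shared(path):
    """Stand-in for the patched style: a single isinstance check."""
    if isinstance(path, bytes):
        sep, altsep = b'\\', b'/'
    else:
        sep, altsep = '\\', '/'
    return path.replace(altsep, sep)


def bench(func, path='C:/Users/mandel/Desktop', number=100000, repeat=5):
    """Return the best total time in seconds over `repeat` runs."""
    return min(timeit.repeat(lambda: func(path), number=number, repeat=repeat))


if __name__ == '__main__':
    for func in (replace_repeated, replace_shared):
        print(func.__name__, bench(func))
```

Taking the minimum of several repeats reduces noise from other processes; the absolute numbers depend on the machine and interpreter, so only the relative difference is meaningful.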