[issue9425] Rewrite import machinery to work with unicode paths

2010-10-18 Thread STINNER Victor
STINNER Victor added the comment: Starting at r85691, the full test suite of Python 3.2 passes with ASCII, ISO-8859-1 and UTF-8 locale encodings in a non-ASCII directory. The work on this issue is done. -- resolution: -> fixed; status: open -> closed
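A rough way to check this behaviour from Python is to write a one-line module into a directory with a non-ASCII name and import it by path. The directory name `répertoire`, the module name `demo_mod` and the helper `import_from_non_ascii_dir` are made up for illustration. This sketch is not part of the test suite mentioned in the comment, and it assumes the filesystem encoding can represent the directory name.

```python
import importlib.util
import os
import tempfile


def import_from_non_ascii_dir(dirname="répertoire", module_name="demo_mod"):
    """Write a tiny module into a non-ASCII directory and import it by path."""
    with tempfile.TemporaryDirectory() as base:
        pkg_dir = os.path.join(base, dirname)
        os.mkdir(pkg_dir)
        path = os.path.join(pkg_dir, module_name + ".py")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("VALUE = 42\n")
        # Load the module straight from its file path (hypothetical module name).
        spec = importlib.util.spec_from_file_location(module_name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        return module.VALUE
```

Calling `import_from_non_ascii_dir()` should return the module's `VALUE` when the import machinery handles the path correctly.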

[issue9425] Rewrite import machinery to work with unicode paths

2010-09-29 Thread STINNER Victor
STINNER Victor added the comment: r85115 closes #9630: an important patch for #9425, redecode all filenames when setting the filesystem encoding. Next tasks (maybe not in this order): - merge getpath.c - redecode argv[0] used by PySys_SetArgvEx() to feed sys.path (encode argv[0] with the lo
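The idea of "redecoding" a filename can be illustrated at the Python level, assuming the early decoding used the surrogateescape error handler (PEP 383) so that the original bytes are recoverable. The helper `redecode` below is hypothetical and only mirrors the idea; it is not the C code from r85115.

```python
def redecode(name, old_encoding, new_encoding):
    """Recover the raw bytes behind `name`, then decode them with `new_encoding`."""
    raw = name.encode(old_encoding, "surrogateescape")
    return raw.decode(new_encoding, "surrogateescape")


raw = "café".encode("utf-8")                    # the bytes stored on disk
early = raw.decode("ascii", "surrogateescape")  # decoded before the real encoding is known
late = redecode(early, "ascii", "utf-8")        # decoded again once the encoding is set
```

Here `early` holds lone surrogates for the two non-ASCII bytes, and `late` is the intended text.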

[issue9425] Rewrite import machinery to work with unicode paths

2010-09-01 Thread STINNER Victor
Changes by STINNER Victor: Removed file: http://bugs.python.org/file18671/Py_UNICODE_strcat.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-09-01 Thread STINNER Victor
Changes by STINNER Victor: Removed file: http://bugs.python.org/file18672/Py_UNICODE_strdup.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-09-01 Thread STINNER Victor
STINNER Victor added the comment: r84429 creates Py_UNICODE_strcat() (changed from the patch: it returns the right value). r84430 creates PyUnicode_strdup() (changed from the patch: the function is renamed from Py_UNICODE_strdup() to PyUnicode_strdup() and the function name is mangled).

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-29 Thread STINNER Victor
STINNER Victor added the comment: Py_UNICODE_strcat.patch: create Py_UNICODE_strcat() function. Py_UNICODE_strdup.patch: create Py_UNICODE_strdup() function. -- Added file: http://bugs.python.org/file18672/Py_UNICODE_strdup.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-29 Thread STINNER Victor
Changes by STINNER Victor: Added file: http://bugs.python.org/file18671/Py_UNICODE_strcat.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-25 Thread STINNER Victor
STINNER Victor added the comment: > r84012 patches zipimporter_init() to use the new PyUnicode_FSDecoder() > and use Py_UNICODE* (unicode) strings instead of char* (byte) strings. Oops, it's r84013 (not r84012).

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-24 Thread Kristján Valur Jónsson
Kristján Valur Jónsson added the comment: Yes, in #1552880 I tried to make as minimal a change as possible. This particular patch is still in use in EVE Online, which is installed in various strange and exotic paths in the Orient. The trick I employed there was to encode everything to utf-

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-24 Thread STINNER Victor
STINNER Victor added the comment: See also #1552880.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-22 Thread Romme
Changes by Romme: -- nosy: +Romme

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-17 Thread STINNER Victor
STINNER Victor added the comment: r84168 creates PyModule_GetFilenameObject(). I created a separate issue for the patch reencoding all filenames when setting the filesystem encoding: #9630.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-16 Thread STINNER Victor
STINNER Victor added the comment: r84122 saves/restores the exception around "filename = _PyUnicode_AsString(co->co_filename);" because it raises a unicode error on an unencodable filename.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-16 Thread STINNER Victor
STINNER Victor added the comment: r84121: the repr() method of the zipimporter object uses unicode.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-16 Thread STINNER Victor
STINNER Victor added the comment: r84120: the get_data() function of zipimport uses a unicode path.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-16 Thread STINNER Victor
Changes by STINNER Victor: Removed file: http://bugs.python.org/file18548/Py_UNICODE_strncmp-2.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-16 Thread STINNER Victor
STINNER Victor added the comment: Py_UNICODE_strncmp-2.patch committed as r84111.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-16 Thread STINNER Victor
Changes by STINNER Victor: Removed file: http://bugs.python.org/file18547/Py_UNICODE_strncmp.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-16 Thread STINNER Victor
STINNER Victor added the comment: Py_UNICODE_strncmp.patch was wrong for n=0. New version based on libiberty/strncmp.c source code. -- Added file: http://bugs.python.org/file18548/Py_UNICODE_strncmp-2.patch
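The n=0 edge case is easy to get wrong: with a count of zero, strncmp() compares nothing and must report equality whatever the strings contain. The sketch below models the usual C strncmp() semantics in Python. `strncmp_sketch` is a hypothetical helper for illustration, not the code from either patch, and it treats the end of a Python string as the terminating NUL.

```python
def strncmp_sketch(s1, s2, n):
    """Compare at most n characters; return <0, 0 or >0 like C strncmp()."""
    for i in range(n):
        # Past the end of a string, behave as if a NUL terminator were read.
        c1 = s1[i] if i < len(s1) else "\0"
        c2 = s2[i] if i < len(s2) else "\0"
        if c1 != c2:
            return -1 if c1 < c2 else 1
        if c1 == "\0":
            break  # both strings ended at the same position
    return 0
```

With `n == 0` the loop body never runs, so the function falls through to `return 0`.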

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-16 Thread STINNER Victor
STINNER Victor added the comment: Py_UNICODE_strncmp.patch: create Py_UNICODE_strncmp() function. -- Added file: http://bugs.python.org/file18547/Py_UNICODE_strncmp.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-16 Thread STINNER Victor
Changes by STINNER Victor: Removed file: http://bugs.python.org/file18527/zipimport_read_directory.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-16 Thread STINNER Victor
STINNER Victor added the comment: zipimport_read_directory.patch committed as r84095.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-15 Thread STINNER Victor
STINNER Victor added the comment: I tried to fix Mac OS X (TESTFN_UNENCODABLE) with r84035, but I don't have access to Mac OS X to test and my patch was not correct. It should now be ok with r84080.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-15 Thread Florent Xicluna
Florent Xicluna added the comment: It breaks test_unicode_file on OS X, too: File "/Users/db3l/buildarea/3.x.bolen-tiger/build/Lib/test/test_unicode_file.py", line 8, in <module> from test.support import (run_unittest, rmtree, ImportError: cannot import name TESTFN_UNENCODABLE

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-15 Thread Florent Xicluna
Florent Xicluna added the comment: r83972 breaks OS X buildbots: support.TESTFN_UNENCODABLE is not defined if sys.platform == 'darwin'. File "/Users/db3l/buildarea/3.x.bolen-tiger/build/Lib/test/test_imp.py", line 309, in <module> class NullImporterTests(unittest.TestCase): File "/Users/db3l/

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-14 Thread STINNER Victor
STINNER Victor added the comment: zipimport_read_directory.patch: patch for read_directory() function of the zipimport module to support unencodable filenames. This patch requires #9599 (PySys_FormatStderr). The patch changes the encoding of the name: decode name byte string using the file sy

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-14 Thread STINNER Victor
STINNER Victor added the comment: r84030 creates _Py_fopen() for PyUnicodeObject path.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-14 Thread STINNER Victor
STINNER Victor added the comment: r84012 patches zipimporter_init() to use the new PyUnicode_FSDecoder() and use Py_UNICODE* (unicode) strings instead of char* (byte) strings.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-14 Thread STINNER Victor
Changes by STINNER Victor: Removed file: http://bugs.python.org/file18448/_Py_stat.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-14 Thread STINNER Victor
STINNER Victor added the comment: r84012 creates _Py_stat(). It differs slightly from the attached patch (_Py_stat.patch): it doesn't clear the Python exception on a unicode conversion error.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
STINNER Victor added the comment: I created #9599: Add PySys_FormatStdout and PySys_FormatStderr functions.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
STINNER Victor added the comment: r83976 adds PyErr_WarnFormat() (pyerr_warnformat-2.patch).

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
STINNER Victor added the comment: r83990 closes #9542 by creating the PyUnicode_FSDecoder() PyArg_ParseTuple parser.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
Changes by STINNER Victor: Removed file: http://bugs.python.org/file18514/_Py_wchar2char-2.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
STINNER Victor added the comment: r83989 creates _Py_wchar2char() function (_Py_wchar2char-2.patch).

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
STINNER Victor added the comment: > I know this is not introduced by your patch, just moved, but couldn’t > the typo in UNDECODEABLE be fixed? (extraneous e) I wasn't sure that it was a typo, so I kept it unchanged. It's now fixed by r83987.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread Éric Araujo
Éric Araujo added the comment: I know this is not introduced by your patch, just moved, but couldn’t the typo in UNDECODEABLE be fixed? (extraneous e)

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
Changes by STINNER Victor: Removed file: http://bugs.python.org/file18431/_Py_wchar2char.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
STINNER Victor added the comment: New version of the patch _Py_wchar2char-2.patch: - _Py_wchar2char() only escapes characters in range U+DC80..U+DCFF (instead of U+DC00..U+DCFF) - add a comment to _Py_char2wchar() > I don't understand why you decrement `size` in the second pass. Because I w
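Python's own surrogateescape error handler applies the same restriction when encoding: only lone surrogates in U+DC80..U+DCFF are turned back into bytes, and the rest of the low-surrogate range is rejected. The sketch below shows that Python-level behaviour as an analogy. `surrogate_to_byte` and `is_escapable` are hypothetical helper names, and this is not the C code of _Py_wchar2char().

```python
def surrogate_to_byte(ch):
    """Encode one character, letting surrogateescape undo a PEP 383 escape."""
    return ch.encode("ascii", "surrogateescape")


def is_escapable(ch):
    """Return True if `ch` can be encoded with the surrogateescape handler."""
    try:
        surrogate_to_byte(ch)
    except UnicodeEncodeError:
        return False
    return True


low = surrogate_to_byte("\udc80")   # first escapable code point
high = surrogate_to_byte("\udcff")  # last escapable code point
```

Code points such as U+DC00 or U+DC7F are surrogates too, but they never result from escaping a byte >= 128, so the handler refuses them.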

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread Antoine Pitrou
Antoine Pitrou added the comment: About wchar2char: - PEP 383 says “With this PEP, non-decodable bytes >= 128 will be represented as lone surrogate codes U+DC80..U+DCFF. Bytes below 128 will produce exceptions”. Your patch accepts bytes below 128. - I don't understand why you decrement `size`
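The PEP 383 mapping quoted here can be observed from Python: each undecodable byte b >= 128 becomes the lone surrogate U+DC00 + b. The helper name `decode_with_escapes` is made up for this sketch, and the ASCII codec is used only because every byte >= 128 is undecodable with it.

```python
def decode_with_escapes(raw, encoding="ascii"):
    """Decode bytes, mapping undecodable bytes >= 128 to lone surrogates."""
    return raw.decode(encoding, "surrogateescape")


decoded = decode_with_escapes(b"abc\x80\xff")
# Code points of the characters that replaced the two undecodable bytes.
escaped_code_points = [ord(ch) for ch in decoded[3:]]
```

Both escaped code points fall inside U+DC80..U+DCFF, while the ASCII prefix is decoded normally.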

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
STINNER Victor added the comment: r83981 closes #9560: avoid the filename in _syscmd_file() to fix a bug with non-encodable filenames in platform.architecture().

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
STINNER Victor added the comment: Note about _Py_wchar2char(): it is possible to convert character by character (instead of working on substrings) because the input string doesn't contain surrogate pairs. _Py_char2wchar() ensures that the output string doesn't contain surrogate pairs: if a byt

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
Changes by STINNER Victor: Removed file: http://bugs.python.org/file18444/pyerr_warnformat-2.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
STINNER Victor added the comment: r83973 ignores the name argument of PyFile_FromFd() because it was already ignored (it always produced an error) and it avoids my complex _PyFile_FromFdUnicode.patch. Thanks to Antoine for noticing that name was ignored.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
Changes by STINNER Victor: Removed file: http://bugs.python.org/file18469/_PyFile_FromFdUnicode.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
Changes by STINNER Victor: Removed file: http://bugs.python.org/file18434/nullimporter_unicode.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
STINNER Victor added the comment: I committed nullimporter_unicode.patch with a unit test as r83972.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
STINNER Victor added the comment: r83971 enables test.support.TESTFN_UNDECODEABLE on non-Windows OSes.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-12 Thread STINNER Victor
STINNER Victor added the comment: (About PyFile_FromFd) pitrou> Actually, I'm not sure there's much point since the "name" pitrou> attribute is currently read-only: (...) Oh, it reminds me of #4762. I closed this issue with the message "The last problem occurs with imp.find_module(). But imp.f

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-11 Thread Antoine Pitrou
Antoine Pitrou added the comment: Actually, I'm not sure there's much point since the "name" attribute is currently read-only: >>> f = open(1, "wb") >>> f.name = "foo" Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: attribute 'name' of '_io.BufferedWriter' objects is

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-10 Thread STINNER Victor
STINNER Victor added the comment: _PyFile_FromFdUnicode.patch: create _PyFile_FromFdUnicode() function. It will be used in import.c to open a file using a unicode filename. For _PyFile_FromFd(), I kept the previous behaviour: clear the exception on PyUnicode_DecodeFSDefault() error. For fil

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-10 Thread STINNER Victor
Changes by STINNER Victor: Removed file: http://bugs.python.org/file18446/Py_UNICODE_strrchr.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-10 Thread STINNER Victor
STINNER Victor added the comment: I committed Py_UNICODE_strrchr.patch as r83933 after removing the useless start variable.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-08 Thread STINNER Victor
STINNER Victor added the comment: r83870 creates load_builtin() subfunction in import.c to prepare and simplify the big patch.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-08 Thread STINNER Victor
STINNER Victor added the comment: _Py_stat.patch: create _Py_stat() function. It will be used in import.c and zipimport.c. I added the function to import.c because, initially, I only used it there. But it's maybe not the best place for such a function. posixmodule.c doesn't fit because it is n

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-08 Thread STINNER Victor
STINNER Victor added the comment: I created a separate issue, #9542, to add the new function PyUnicode_FSDecoder().

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-08 Thread STINNER Victor
STINNER Victor added the comment: Py_UNICODE_strrchr.patch: Create Py_UNICODE_strrchr() function. It will be used for zipimport to work on unicode paths instead of bytes paths. Antoine noticed that the input string is const whereas the output string is not const, which is unusual. I copy/past

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-08 Thread STINNER Victor
STINNER Victor added the comment: gutworth's comment about r83860: "Test?"

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-08 Thread STINNER Victor
Changes by STINNER Victor: Removed file: http://bugs.python.org/file18432/pyerr_warnformat.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-08 Thread STINNER Victor
STINNER Victor added the comment: pitrou> It looks like you are fixing a bug in setup_context() pitrou> at the same time as you introduce PyErr_WarnFormat(). pitrou> Both changes should probably go in separately. Right. r83860 fixes the bug, and I attached a new version of the patch (with

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-08 Thread Antoine Pitrou
Antoine Pitrou added the comment: It looks like you are fixing a bug in setup_context() at the same time as you introduce PyErr_WarnFormat(). Both changes should probably go in separately. The PyErr_WarnFormat() doc needs a "versionadded" tag.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-08 Thread STINNER Victor
STINNER Victor added the comment: nullimporter_unicode.patch: patch NullImporter_init(): - use GetFileAttributesW() instead of GetFileAttributesA() for the Windows version to be fully Unicode compliant - use "O&" format with PyUnicode_FSConverter instead of "es" with Py_FileSystemDefaultEnco

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-08 Thread STINNER Victor
STINNER Victor added the comment: pyerr_warnformat.patch: create PyErr_WarnFormat() function, and use it in PyType_Ready() and PyUnicode_AsEncodedString(). The patch also fixes setup_context(): work on the unicode filename, not the encoded (bytes) filename. It does fix a bug because len is a

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-08 Thread STINNER Victor
STINNER Victor added the comment: r83783 creates run_file() subfunction.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-08 Thread STINNER Victor
STINNER Victor added the comment: _Py_wchar2char.patch: create _Py_wchar2char() private function, and _wstat() and _wfopen() use it. _Py_wchar2char() function has been improved since the previous version posted to Rietveld: it now computes the exact length of the output buffer, instead of usi

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-07 Thread STINNER Victor
STINNER Victor added the comment: r83779 creates run_command(); it's just a refactoring.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-07 Thread STINNER Victor
STINNER Victor added the comment: The patch is too large to be committed at once. I will split it again into smaller parts. First related commit: r83778 fixes tests for non-encodable filenames.

[issue9425] Rewrite import machinery to work with unicode paths

2010-07-31 Thread STINNER Victor
STINNER Victor added the comment: After some tests on Windows, I realized that my patch is not enough to be fully unicode compliant (on Windows). Some functions are still using PyUnicode_DecodeFSDefault() or PyUnicode_EncodeFSDefault(). Until all functions are patched to use unicode strings,

[issue9425] Rewrite import machinery to work with unicode paths

2010-07-29 Thread STINNER Victor
STINNER Victor added the comment: Another important TODO: use weak references for the code objects list. -- I tested my patch on Windows. It fixes #8988 because non-ASCII characters are now correctly decoded with mbcs and not UTF-8. But it doesn't work with characters not encodable to mbcs. I

[issue9425] Rewrite import machinery to work with unicode paths

2010-07-29 Thread STINNER Victor
STINNER Victor added the comment: > The patch should also include more tests. Which kind of test? Running the test suite in a non-ASCII directory with an encoding other than UTF-8 is enough. If the patch is accepted, the solution is maybe a specific buildbot.

[issue9425] Rewrite import machinery to work with unicode paths

2010-07-29 Thread Arfrever Frehtes Taifersar Arahesis
Changes by Arfrever Frehtes Taifersar Arahesis: -- nosy: +Arfrever

[issue9425] Rewrite import machinery to work with unicode paths

2010-07-29 Thread Ezio Melotti
Ezio Melotti added the comment: I wrote a few minor comments on codereview. The patch should also include more tests. -- nosy: +ezio.melotti

[issue9425] Rewrite import machinery to work with unicode paths

2010-07-29 Thread STINNER Victor
STINNER Victor added the comment: Oh, I forgot to say that I created an svn branch including my work: import_unicode. http://svn.python.org/view/python/branches/import_unicode/ You can try it if you prefer svn to a huge patch. I created a branch so you can follow my work commit by commit usi

[issue9425] Rewrite import machinery to work with unicode paths

2010-07-29 Thread STINNER Victor
New submission from STINNER Victor: Python (2 and 3) is unable to load a module installed in a directory containing characters not encodable to the locale encoding. And Python doesn't work if it's installed in a non-ASCII directory on Windows or with a locale encoding different than UTF-8. On W