Re: [Python-Dev] Emit a BytesWarning on bytes filenames on Windows

2011-11-08 Thread Victor Stinner
On Saturday 29 October 2011 07:47:01, you wrote: > Therefore, as you imply, I think the solution to this issue is to start > the process of deprecating the bytes version of the api in py3k with a > view to removing it completely - possibly with a less aggressive > timeline than normal. In Pyt

Re: [Python-Dev] [Python-checkins] cpython: Change decoders to use Unicode API instead of Py_UNICODE.

2011-11-09 Thread Victor Stinner
First of all, thanks for having upgraded this huge part (codecs) to the new Unicode API! > +static int > +unicode_widen(PyObject **p_unicode, int maxchar) > +{ > +PyObject *result; > +assert(PyUnicode_IS_READY(*p_unicode)); > +if (maxchar <= PyUnicode_MAX_CHAR_VALUE(*p_unicode)) > +

[Python-Dev] unicode_internal codec and the PEP 393

2011-11-09 Thread Victor Stinner
Hi, The unicode_internal decoder doesn't decode surrogate pairs and so test_unicode.UnicodeTest.test_codecs() is failing on Windows (16-bit wchar_t). I don't know if this codec is still relevant with PEP 393 because the internal representation now depends on the maximum character (Py_U

Re: [Python-Dev] PEP 405 (proposed): Python 2.8 Release Schedule

2011-11-09 Thread Victor Stinner
On Wednesday 9 November 2011 17:18:45, Amaury Forgeot d'Arc wrote: > Hi, > > 2011/11/9 Barry Warsaw > > > I think we should have an official pronouncement about Python 2.8, and > > PEPs are as official as it gets 'round here. > > Do we need to designate a release manager? random.choice() shou

Re: [Python-Dev] unicode_internal codec and the PEP 393

2011-11-09 Thread Victor Stinner
On Wednesday 9 November 2011 22:03:52, you wrote: > > > Should we: > > * Drop this codec (public and documented, but I don't know if it is > > used) * Use wchar_t* (Py_UNICODE*) to provide a result similar to > > Python 3.2, and > > > > so fix the decoder to handle surrogate pairs > >

Re: [Python-Dev] unicode_internal codec and the PEP 393

2011-11-11 Thread Victor Stinner
On 09/11/2011 23:45, "Martin v. Löwis" wrote: After a quick search on Google codesearch (before it disappears!), I don't think that "encoding" a Unicode string to its internal PEP-393 representation would satisfy any program. It looks like wchar_t* is a better candidate. Ok. Making it Py_UNI

Re: [Python-Dev] peps: And now for something completely different.

2011-11-14 Thread Victor Stinner
If the PEP 404 lists important changes between Python 2 and Python 3, the removal of old-style classes should also be mentioned because it is a change in the core language. Victor ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org

Re: [Python-Dev] Is Python insider blog dead?

2011-11-16 Thread Victor Stinner
On Wednesday 16 November 2011 07:23:03, Brian Curtin wrote: > Not dead, there was just a period where I got a little too busy with real > life, plus development seemed to slow down for a while. I have a few drafts > working (like a post on all of the recent PEP activity) and a few more in > my hea

Re: [Python-Dev] Committing PEP 3155

2011-11-18 Thread Victor Stinner
I haven't seen any strong objections, so I would like to go ahead and commit PEP 3155 (*) soon. Is anyone against it? I'm not against it, but I have some questions. Do you have a working implementation? Do you have a patch for issue #9276 using __qualname__? Maybe not a fully working patch, but a

[Python-Dev] Chose a name for a "get unicode as wide character, borrowed reference" function

2011-11-21 Thread Victor Stinner
Hi, With PEP 393, Py_UNICODE is now deprecated and scheduled for removal in Python 4. The PyUnicode_AsUnicode() and PyUnicode_AsUnicodeAndSize() functions are still commonly used on Windows to get the string as wchar_t* without having to care about freeing the memory: it's a borrowed reference

Re: [Python-Dev] Chose a name for a "get unicode as wide character, borrowed reference" function

2011-11-21 Thread Victor Stinner
On Monday 21 November 2011 16:04:06, Antoine Pitrou wrote: > On Mon, 21 Nov 2011 12:53:17 +0100 > > Victor Stinner wrote: > > I would like to add a new PyUnicode_AsWideChar() function which would > > return the borrowed reference, exactly as PyUnicode_AsUnicode(). Th

Re: [Python-Dev] Chose a name for a "get unicode as wide character, borrowed reference" function

2011-11-21 Thread Victor Stinner
On Monday 21 November 2011 16:55:05, Antoine Pitrou wrote: > > I want to rename PyUnicode_AsUnicode() and change its result type > > (Py_UNICODE* => wchar_t*). The result will be a "borrowed reference", > > ie. you don't have to free the memory, it will be done when the Unicode > > string will be

[Python-Dev] PyUnicode_EncodeDecimal

2011-11-21 Thread Victor Stinner
Hi, I'm trying to rewrite PyUnicode_EncodeDecimal() to upgrade it to the new Unicode API. The problem is that the function is not accessible in Python nor tested. Should we document and test it, leave it unchanged and deprecate it, or simply remove it? -- Python has a PyUnicode_EncodeDecimal(

Re: [Python-Dev] PyUnicode_EncodeDecimal

2011-11-21 Thread Victor Stinner
On Monday 21 November 2011 21:39:53, Victor Stinner wrote: > I'm trying to rewrite PyUnicode_EncodeDecimal() to upgrade it to the new > Unicode API. The problem is that the function is not accessible in Python > nor tested. I added tests for this function in Python 2.

[Python-Dev] PyUnicode_Resize

2011-11-21 Thread Victor Stinner
Hi, In Python 3.2, PyUnicode_Resize() expects a number of Py_UNICODE units, whereas Python 3.3 expects a number of characters. It is tricky to convert a number of Py_UNICODE units to a number of characters, so it is difficult to provide a backward-compatible PyUnicode_Resize() function takin

Re: [Python-Dev] PyUnicode_EncodeDecimal

2011-11-22 Thread Victor Stinner
On Tuesday 22 November 2011 02:02:05, Victor Stinner wrote: > This function is broken by design if an error handler is specified: the > caller cannot know the size of the output buffer, whereas the caller has > to allocate this buffer. > > I propose to raise an error if a

Re: [Python-Dev] cpython: fix compiler warning by implementing this more cleverly

2011-11-23 Thread Victor Stinner
On Wednesday 23 November 2011 01:49:28, Terry Reedy wrote: > The one-liner could be followed by >assert(kind==1 || kind==2 || kind==4) > which would also serve to remind the reader of the possibilities. For a ready string, kind must be 1, 2 or 4. We might rename "kind" to "charsize" because

[Python-Dev] Reject characters bigger than U+10FFFF and Solaris issues

2011-12-07 Thread Victor Stinner
Hi, I would like to deny the creation of a Unicode string containing characters outside the range [U+0000; U+10FFFF]. The check is already present in some places (e.g. the builtin chr() function), but not everywhere. The last important function is PyUnicode_FromWideChar, function used to decod

Re: [Python-Dev] Reject characters bigger than U+10FFFF and Solaris issues

2011-12-08 Thread Victor Stinner
On 08/12/2011 10:17, Stefan Krah wrote: I think that b'\xA0' is a valid thousands separator. I agree, but it's not the point: the problem is that b'\xA0' is decoded to a strange U+3020 character by mbstowcs(). Currently I have this horrible function to deal with the problem: ...

Re: [Python-Dev] cpython: Document PyUnicode_Copy() and PyUnicode_EncodeCodePage()

2011-12-09 Thread Victor Stinner
On 09/12/2011 01:35, Antoine Pitrou wrote: On Fri, 09 Dec 2011 00:16:02 +0100 victor.stinner wrote: +.. c:function:: PyObject* PyUnicode_Copy(PyObject *unicode) + + Get a new copy of a Unicode object. + + .. versionadded:: 3.3 I'm not sure I understand. Why would you make a copy of an im

Re: [Python-Dev] cpython: Document PyUnicode_Copy() and PyUnicode_EncodeCodePage()

2011-12-11 Thread Victor Stinner
On Friday 9 December 2011 20:32:16, Antoine Pitrou wrote: > ... it's a bit obscure why the function exists. Yeah ok, I marked the function as private: I renamed it to _PyUnicode_Copy() and removed its documentation. Victor

Re: [Python-Dev] IEEE/ISO draft on Python vulnerabilities

2011-12-12 Thread Victor Stinner
IEEE/ISO are working on a draft document about Python vulnerabilities: http://grouper.ieee.org/groups/plv/DocLog/300-399/360-thru-379/22-WG23-N-0372/n0372.pdf (in the context of a larger effort to classify vulnerabilities in all languages: ISO/IEC TR 24772:2010, available from ISO at no cost a

Re: [Python-Dev] PyUnicodeObject / PyASCIIObject questions

2011-12-14 Thread Victor Stinner
On Tuesday 13 December 2011 02:09:02, Jim Jewett wrote: > (3) I would feel much less nervous if the remaining 4 values of > PyUnicode_Kind were explicitly reserved, and the macros raised an > error when they showed up. (Better still would be to allow other > values, and to have the macros delegat

Re: [Python-Dev] Compiling the source without stat

2011-12-15 Thread Victor Stinner
On Thursday 15 December 2011 15:29:23, you wrote: > If faking a stat struct and a function to fill it > solves the problem, and checking for existing files and folders is the > only thing that python needs to be compiled (i'm talking about 2.7) then > it's possible to fail-check it by just tryin

[Python-Dev] French sprint this week-end

2011-12-15 Thread Victor Stinner
Hi, I am organizing an online sprint on CPython this weekend with French developers. At least six developers will participate; some of them don't know C, most know Python. Do you know of a simple task to start contributing to Python? Something useful and not boring if possible :-) There is the "easy" t

Re: [Python-Dev] [Python-checkins] cpython: Move PyUnicode_WCHAR_KIND outside PyUnicode_Kind enum

2011-12-18 Thread Victor Stinner
On 18/12/2011 20:34, "Martin v. Löwis" wrote: Move PyUnicode_WCHAR_KIND outside PyUnicode_Kind enum What's the rationale for that change? It's a valid kind value, after all, and the C convention is that an enumeration lists all valid values (else there wouldn't be a need for an enumeration i

Re: [Python-Dev] [Python-checkins] cpython: Move PyUnicode_WCHAR_KIND outside PyUnicode_Kind enum

2011-12-18 Thread Victor Stinner
On 18/12/2011 21:04, "Martin v. Löwis" wrote: PyUnicode_KIND() only returns PyUnicode_1BYTE_KIND, PyUnicode_2BYTE_KIND or PyUnicode_4BYTE_KIND. Outside unicodeobject.c, you are not supposed to see PyUnicode_WCHAR_KIND. Why do you say that? It can very well happen, assuming you call PyUnicode_KI

Re: [Python-Dev] Difference between PyUnicode_IS_ASCII and PyUnicode_IS_COMPACT_ASCII ?

2011-12-20 Thread Victor Stinner
On 20/12/2011 09:54, Antoine Pitrou wrote: Hello, The include file (unicodeobject.h) seems to imply that some pure ASCII strings can be non-compact, but I don't understand how that can happen. If you create a string from Py_UNICODE* or wchar_t* (using the legacy API), PyUnicode_READY() may c

Re: [Python-Dev] Fwd: Anyone still using Python 2.5?

2011-12-21 Thread Victor Stinner
What's the general consensus on supporting Python 2.5 nowadays? There is no such consensus :-) Do people still have to use this in commercial environments or is everyone on 2.6+ nowadays? At work, we are still using Python 2.5. Six months ago, we started a project to upgrade to 2.7, but we

Re: [Python-Dev] Fwd: Anyone still using Python 2.5?

2011-12-21 Thread Victor Stinner
On 21/12/2011 15:26, anatoly techtonik wrote: I believe most AppEngine applications in Python are still using 2.5 run-time. So are development boxes for these applications. It may take another year or two for the transition. App engine 1.6 improved support of Python 2.7, so I hope that -slowly-

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-30 Thread Victor Stinner
On 29/12/2011 02:28, Michael Foord wrote: A paper (well, presentation) has been published highlighting security problems with the hashing algorithm (exploiting collisions) in many programming languages Python included: http://events.ccc.de/congress/2011/Fahrplan/attachments/2007_2

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-30 Thread Victor Stinner
On 29/12/2011 14:19, Christian Heimes wrote: Perhaps the dict code is a better place for randomization. The problem is the creation of a dict with keys all having the same hash value. The current implementation of dict uses a linked-list. Adding a new item requires comparing the new key t
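The cost of inserting keys that all share one hash value can be observed from pure Python. The following is a minimal sketch, not code from the thread: `Collider` and `count_comparisons` are hypothetical names, and the sketch assumes CPython's behaviour of comparing a new key against already stored keys whose hash is equal.

```python
class Collider:
    """Key whose hash is always the same; counts equality comparisons."""

    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        # Every instance collides with every other instance.
        return 42

    def __eq__(self, other):
        Collider.comparisons += 1
        return self.value == other.value


def count_comparisons(n):
    """Insert n colliding keys into a dict and return the number of __eq__ calls."""
    Collider.comparisons = 0
    d = {}
    for i in range(n):
        d[Collider(i)] = None
    return Collider.comparisons
```

Calling `count_comparisons(100)` and then `count_comparisons(1000)` shows the comparison count growing roughly with the square of the number of keys, which is the behaviour the attack relies on.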

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-30 Thread Victor Stinner
In case the watchdog is not a viable solution as I had assumed it was, I think it's more reasonable to indeed consider adding a flag to Python that allows randomization of hashes optionally before startup. A flag will only be needed if the overhead of the fix is too high. However as it was sai

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Victor Stinner
On 01/01/2012 04:29, Paul McMillan wrote: This is incorrect. Once an attacker has guessed the random seed, any operation which reveals the ordering of hashed objects can be used to verify the answer. JSON responses would be ideal. In fact, an attacker can do a brute-force attack of the random

Re: [Python-Dev] RNG in the core

2012-01-03 Thread Victor Stinner
A randomized hash doesn't need cryptographic RNG (which are slow and need a lot of new code), and the new hash function should maybe not be cryptographic. We need to make the DoS more expensive for the attacker, but we don't need to add "too much security" for that. Mersenne Twister is useless her

Re: [Python-Dev] RNG in the core

2012-01-04 Thread Victor Stinner
> (or is /dev/urandom still available in a chroot?) Last time that I played with chroot, I "binded" /dev and /proc. Many programs rely on specific devices like /dev/null. Python should not refuse to start if /dev/urandom (or CryptoGen) is missing or cannot be used, but should use a weak fallback.

Re: [Python-Dev] cpython: Add a new PyUnicode_Fill() function

2012-01-04 Thread Victor Stinner
Oops, it's a typo in the doc (copy/paste failure). It's now fixed, thanks. Victor 2012/1/4 Antoine Pitrou : > >> +.. c:function:: int PyUnicode_Fill(PyObject *unicode, Py_ssize_t start, \ >> +                        Py_ssize_t length, Py_UCS4 fill_char) >> + >> +   Fill a string with a character:

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Victor Stinner
2012/1/6 Barry Warsaw : >>Settings for PYRANDOMHASH: >> >> PYRANDOMHASH=1 >>   enable randomized hashing function >> >> PYRANDOMHASH=/path/to/seed >>   enable randomized hashing function and read seed from 'seed' >> >> PYRANDOMHASH=0 >>   disable randomed hashing function > > Why not PYTHONHASHSEED
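For illustration only: the variable name suggested at the end of this quote, PYTHONHASHSEED, is the one that current CPython releases document. The sketch below is not taken from the thread; `hash_in_subprocess` is a hypothetical helper that starts a child interpreter with a fixed seed so that `hash()` of a string becomes reproducible across runs.

```python
import os
import subprocess
import sys


def hash_in_subprocess(text, seed):
    """Return hash(text) as computed by a child interpreter run with PYTHONHASHSEED=seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env,
    )
    return int(out.decode().strip())
```

Two calls with the same seed return the same value, while leaving the variable unset lets each interpreter pick its own random seed.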

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-06 Thread Victor Stinner
Using my patch (random-2.patch), the overhead is 0%. I cannot see a difference with and without my patch. Numbers: --- unpatched: == 3 characters == 1 loops, best of 3: 459 usec per loop == 10 characters == 1 loops, best of 3: 575 usec per loop == 500 characters == 1 loops, best of 3: 1.36 msec pe

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-09 Thread Victor Stinner
> That said, I don't think smallest-format is actually enforced with > anything stronger than comments (such as in unicodeobject.h struct > PyASCIIObject) and asserts (mostly calling > _PyUnicode_CheckConsistency).  I don't have any insight on how > prevalent non-conforming strings will be in pract

Re: [Python-Dev] Compiling 2.7.2 on OS/2

2012-01-09 Thread Victor Stinner
> -        if os.name in ('nt', 'os2'): > +        if os.name in ('nt'): This change is wrong: it should be os.name == 'nt'. Victor
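The reason the change is wrong can be shown in a few lines: `('nt')` is not a one-element tuple but simply the string `'nt'`, so `in` performs a substring test. The function names below are hypothetical and exist only for this sketch.

```python
def is_nt_buggy(name):
    # ('nt') is just the string 'nt', so this is a substring check.
    return name in ('nt')


def is_nt_fixed(name):
    # Plain equality, as suggested in the reply.
    return name == 'nt'
```

With the buggy form, `is_nt_buggy('n')` and `is_nt_buggy('')` are both true, whereas `is_nt_fixed` only accepts `'nt'`. Writing `name in ('nt',)` with a trailing comma would also be correct.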

Re: [Python-Dev] [Python-checkins] cpython (2.7): Fix stock symbol for Microsoft

2012-01-10 Thread Victor Stinner
You may port the fix to 3.2 and 3.3. Victor 2012/1/10 raymond.hettinger : > http://hg.python.org/cpython/rev/068ce5d7f7e7 > changeset:   74320:068ce5d7f7e7 > branch:      2.7 > user:        Raymond Hettinger > date:        Tue Jan 10 09:51:51 2012 + > summary: >  Fix stock symbol for Microso

[Python-Dev] Status of the fix for the hash collision vulnerability

2012-01-12 Thread Victor Stinner
Many people proposed their own idea to fix the vulnerability, but only 3 wrote a patch: - Glenn Linderman proposes to fix the vulnerability by adding a new "safe" dict type (only accepting string keys). His proof-of-concept (SafeDict.py) uses a secret of 64 random bits and uses it to compute the h

Re: [Python-Dev] Status of the fix for the hash collision vulnerability

2012-01-13 Thread Victor Stinner
> Unfortunately it requires only a few seconds to compute enough 32bit > collisions on one core with no precomputed data. Are you running the hash function "backward" to generate strings with the same value, or you are more trying something like brute forcing? And how do you get the hash secret?

Re: [Python-Dev] Status of the fix for the hash collision vulnerability

2012-01-13 Thread Victor Stinner
> - Glenn Linderman proposes to fix the vulnerability by adding a new > "safe" dict type (only accepting string keys). His proof-of-concept > (SafeDict.py) uses a secret of 64 random bits and uses it to compute > the hash of a key. We could mix Marc's collision counter with SafeDict idea (being ab

Re: [Python-Dev] Status of the fix for the hash collision ulnerability

2012-01-15 Thread Victor Stinner
I don't think that it would be hard to patch this library to use another hash function. It can implement its own hash function, use MD5, SHA1, or anything else. hash() is not stable across Python versions and 32/64 bit systems. Victor 2012/1/15 Hynek Schlawack : > Am Sonntag, 15. Januar 2012 um

Re: [Python-Dev] Status of the fix for the hash collision vulnerability

2012-01-16 Thread Victor Stinner
2012/1/17 Tim Delaney : > What if in a pathological collision (e.g. > 1000 collisions), we increased > the size of a dict by a small but random amount? It doesn't change anything, you will still get collisions. Victor

Re: [Python-Dev] Status of the fix for the hash collision vulnerability

2012-01-17 Thread Victor Stinner
> I thought that the original problem was that with N insertions in the > dictionary, by repeatedly inserting different keys generating the same > hash value an attacker could arrange that the cost of finding an open > slot is O(N), and thus the cost of N insertions is O(N^2). > > If so, frequent r

Re: [Python-Dev] Status of the fix for the hash collision vulnerability

2012-01-17 Thread Victor Stinner
I finished my patch transforming hash(str) to a randomized hash function, see random-8.patch attached to the issue: http://bugs.python.org/issue13703 The remaining question is which random number generator should be used on Windows to initialize the hash secret (CryptoGen adds an overhead of 10%,

Re: [Python-Dev] Hashing proposal: change only string-only dicts

2012-01-17 Thread Victor Stinner
2012/1/17 "Martin v. Löwis" : > I'd like to propose a different approach to seeding the string hashes: > only do so for dictionaries involving only strings, and leave the > tp_hash slot of strings unchanged. The real problem is in dict (or any structure using a hash table), so if it is possible,

Re: [Python-Dev] Hashing proposal: change only string-only dicts

2012-01-17 Thread Victor Stinner
> There is a simpler solution: > > bucket_index = (hash(str) ^ secret) & DICT_MASK. Oops, hash^secret doesn't add any security. Victor
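Why XOR with a secret does not help can be checked directly: `(h ^ s) & m` equals `(h & m) ^ (s & m)`, so two hashes that land in the same bucket without the secret land in the same bucket with any secret. A small sketch, where `DICT_MASK` is given an arbitrary value and the function names are made up for the example:

```python
DICT_MASK = 0xFF  # arbitrary table mask chosen for the example


def bucket_index(hash_value, secret):
    """Bucket computation proposed in the quoted message."""
    return (hash_value ^ secret) & DICT_MASK


def buckets_collide(hash1, hash2, secret):
    """True if both hashes map to the same bucket under the given secret."""
    return bucket_index(hash1, secret) == bucket_index(hash2, secret)
```

For example, `0x1234` and `0xAB34` share their low byte, and `buckets_collide(0x1234, 0xAB34, secret)` is true for every value of `secret`: the attacker does not need to know it.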

Re: [Python-Dev] Hashing proposal: change only string-only dicts

2012-01-17 Thread Victor Stinner
> Each string would get two hashes: the "public" hash, which is constant > across runs and bugfix releases, and the dict-hash, which is only used > by the dictionary implementation, and only if all keys to the dict are > strings. The distinction between secret (private, secure) and "public" hash (

Re: [Python-Dev] Status of the fix for the hash collision vulnerability

2012-01-17 Thread Victor Stinner
>> I plan to commit my fix to Python 3.3 if it is accepted. Then write a >> simplified version to Python 3.2 and backport it to 3.1. > > I'm opposed to any change to the hash values of strings in maintenance > releases, so I guess I'm opposed to your patch in principle. If randomized hash cannot b

Re: [Python-Dev] Status of the fix for the hash collision vulnerability

2012-01-18 Thread Victor Stinner
2012/1/18 "Martin v. Löwis" : > For 3.3 onwards, I'm skeptical whether all this configuration support is > really necessary. I think a much smaller patch which leaves no choice > would be more appropriate. The configuration helps unit testing: see changes on Lib/test/*.py in my last patch. I hesit

Re: [Python-Dev] Writable __doc__

2012-01-19 Thread Victor Stinner
> http://bugs.python.org/issue12773  :) The bug is marked as closed, whereas the bug exists in Python 3.2 and has not been fixed there. The fix must be backported. Victor

[Python-Dev] Counting collisions for the win

2012-01-19 Thread Victor Stinner
Hi, I have been working on the hash collision issue for 2 or 3 weeks. I evaluated all solutions and I think that I now have a good knowledge of the problem and how it should be solved. The major issue is to have a minor or no impact on applications (don't break backward compatibility). I saw three major

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Victor Stinner
2012/1/20 Ivan Kozik : > I'd like to point out that an attacker is not limited to sending just > one dict full of colliding keys.  Given a 22ms stall ... The presented attack produces a stall of at least 30 seconds (5 minutes or more if there is no time limit in the application), not 0.022 second.

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Victor Stinner
> I'm surprised we haven't seen bug reports about it from users > of 64-bit Pythons long ago A Python dictionary only uses the lower bits of a hash value. If your dictionary has fewer than 2**32 items, the dictionary order is exactly the same on 32-bit and 64-bit systems: hash32(str) & mask == hash64(s
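The masking argument can be sketched as follows. This is an illustration under stated assumptions, not CPython's code: the table size is assumed to be a power of two, the 32-bit hash is assumed to equal the low 32 bits of the 64-bit hash, and only the first slot probed is considered (later probes after a collision are not modelled).

```python
def initial_slot(hash_value, table_size):
    """First slot probed for a key; table_size is assumed to be a power of two."""
    mask = table_size - 1
    return hash_value & mask


hash64 = 0x123456789ABCDEF0   # made-up 64-bit hash value
hash32 = hash64 & 0xFFFFFFFF  # assumed 32-bit counterpart (low 32 bits)
```

As long as the mask fits in 32 bits, `initial_slot(hash32, size)` and `initial_slot(hash64, size)` are equal, because the bits that differ are masked away.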

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Victor Stinner
2012/1/20 Frank Sievertsen : > No, that's not true. > Whenever a collision happens, other bits are mixed in very fast. Oh, I didn't know that. So the dict order is only the same if there is no collision. Victor

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Victor Stinner
> The main issue with that approach is that it allows a new kind of attack. > > An attacker now needs to find 1000 colliding keys, and submit them > one-by-one into a database. The limit will not trigger, as those are > just database insertions. > > Now, if the applications also has a need to read t

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Victor Stinner
> (I'm thinking that the original > attack is trivial once the set of 65000 colliding keys is public knowledge, > which must be only a matter of time.) I have a program able to generate collisions: it takes 1 second to compute 60,000 colliding strings on a desktop computer. So the security of the

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Victor Stinner
> So I still think we should ditch the paranoia about dictionary order changing, > and fix this without counting. The randomized hash has other issues: - its security is based on its secret, whereas it looks to be easy to compute it (see more details in the issue) - my patch only changes hash(s

Re: [Python-Dev] Counting collisions for the win

2012-01-22 Thread Victor Stinner
> This seed is chosen randomly at runtime, but cannot > change once chosen. The hash is used to compare objects: if hash(obj1) != hash(obj2), objects are considered different. So two strings must have the same hash if their value is the same. > Salt could also be an appropriate term here, but sin

Re: [Python-Dev] threading.Semaphore()'s counter can become negative for non-ints

2012-01-29 Thread Victor Stinner
>> import threading >> s = threading.Semaphore(0.5) > > But why would you want to pass a float? It seems like API abuse to me. If something should be changed, Semaphore(arg) should raise a TypeError if arg is not an integer. Victor
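The suggested behaviour can be sketched outside the standard library. `checked_semaphore` below is a hypothetical wrapper written for this example; it is not the behaviour of `threading.Semaphore` itself.

```python
import threading


def checked_semaphore(value=1):
    """Hypothetical wrapper: reject non-integer initial values up front."""
    if not isinstance(value, int):
        raise TypeError(
            "initial value must be an integer, got %s" % type(value).__name__
        )
    return threading.Semaphore(value)
```

With this wrapper, `checked_semaphore(0.5)` fails immediately with a TypeError instead of producing a semaphore whose counter can later become negative.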

[Python-Dev] Store timestamps as decimal.Decimal objects

2012-01-30 Thread Victor Stinner
Hi, In issues #13882 and #11457, I propose to add an argument to functions returning timestamps to choose the timestamp format. Python uses float in most cases whereas float is not enough to store a timestamp with a resolution of 1 nanosecond. I recently added time.clock_gettime() to Python 3.3 wh
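The precision problem is easy to reproduce. In the sketch below the timestamp values are made up for the example; a C double has a 53-bit mantissa, so for a value around 1.3e9 seconds only about 22 bits remain for the fractional part, which is coarser than one nanosecond.

```python
from decimal import Decimal

seconds, nanoseconds = 1327960800, 123456789  # made-up timestamp parts

# Binary floating point: the nanosecond digits cannot all be kept.
as_float = seconds + nanoseconds * 1e-9

# Decimal: the sum is exact within the default 28-digit context.
as_decimal = Decimal(seconds) + Decimal(nanoseconds) * Decimal("1e-9")
```

Here `str(as_decimal)` is `'1327960800.123456789'`, while `Decimal(as_float)` shows the value the float actually stores, which differs from it.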

Re: [Python-Dev] Store timestamps as decimal.Decimal objects

2012-01-31 Thread Victor Stinner
> I think this is definitely worth elaborating in a PEP (to recap the > long discussion in #11457 if nothing else). The discussion in issues #13882 and #11457 already lists many alternatives with their costs and benefits, but I can produce a PEP if you need a summary. > In particular, I'd want to

Re: [Python-Dev] Store timestamps as decimal.Decimal objects

2012-01-31 Thread Victor Stinner
Hi, 2012/1/31 Matt Joiner : > Sounds good, but I also prefer Alexander's method. The type information is > already encoded in the class object. Ok, I posted a patch version 6 to use types instead of strings. I also prefer types because it solves the "hidden import" issue. > This way you don't ne

Re: [Python-Dev] Store timestamps as decimal.Decimal objects

2012-01-31 Thread Victor Stinner
> - use datetime (bad idea for the reasons Martin mentioned) It is only a bad idea if it is the only available choice. > - use timedelta (not mentioned on the tracker, but a *much* better fit > for a timestamp than datetime, since timestamps are relative to the > epoch while datetime objects try

Re: [Python-Dev] Store timestamps as decimal.Decimal objects

2012-01-31 Thread Victor Stinner
> (I removed the timespec format, I consider that we don't need it.) > > Rather, I guess you removed it because it didn't fit the "types as flags" > pattern. I removed it because I don't like tuples: you cannot do arithmetic on tuples, like t2-t1. Printing a tuple doesn't give you a nice output. It is

Re: [Python-Dev] Store timestamps as decimal.Decimal objects

2012-02-01 Thread Victor Stinner
2012/2/1 Nick Coghlan : > The secret to future-proofing such an API while only using integers > lies in making the decimal exponent part of the conversion function > signature: > >    def from_components(integer, fraction=0, exponent=-9): >        return Decimal(integer) + Decimal(fraction) * Decim

Re: [Python-Dev] Store timestamps as decimal.Decimal objects

2012-02-01 Thread Victor Stinner
> If a callback protocol is used at all, there's no reason those details > need to be exposed to the callbacks. Just choose an appropriate > exponent based on the precision of the underlying API call. If the clock divisor cannot be written as a power of 10, you lose precision, just because your f
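The loss of precision mentioned here can be made concrete with exact rational arithmetic. This is a sketch under assumptions: `ticks_to_seconds` is a hypothetical helper, the frequencies used are arbitrary, and `exponent` is assumed to be zero or negative.

```python
from decimal import Decimal
from fractions import Fraction


def ticks_to_seconds(ticks, frequency, exponent=-9):
    """Return (exact, rounded): ticks/frequency as a Fraction and as a
    Decimal rounded to a resolution of 10**exponent seconds."""
    exact = Fraction(ticks, frequency)
    scaled = round(exact * 10 ** -exponent)  # integer count of 10**exponent units
    rounded = Decimal(scaled).scaleb(exponent)
    return exact, rounded
```

With a frequency of 1000 ticks per second the rounded value equals the exact one, but with a frequency of 3 the value 1/3 has no finite decimal expansion, so any fixed power-of-10 exponent has to round.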

[Python-Dev] PEP: New timestamp formats

2012-02-01 Thread Victor Stinner
Version: $Revision$ Last-Modified: $Date$ Author: Victor Stinner Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 01-February-2012 Python-Version: 3.3 Abstract Python 3.3 introduced functions supporting nanosecond resolutions. Python 3.3 only supports int or float to

Re: [Python-Dev] PEP: New timestamp formats

2012-02-02 Thread Victor Stinner
> I'd add datetime.timedelta to this list. It's exactly what timestamps > are, after all - the difference between the current time and the > relevant epoch value. Ah yes, I forgot to mention it, whereas it is listed in the "final timestamp formats list" :-) >>  * a) (sec, nsec): C timespec struct

Re: [Python-Dev] PEP: New timestamp formats

2012-02-02 Thread Victor Stinner
> Even if I like the idea, I don't think that we need all this machinery > to support nanosecond resolution. I should maybe forget my idea of > using datetime.datetime or datetime.timedelta, or only support > int, float and decimal.Decimal. I updated my patch (issue #13882) to only support in

Re: [Python-Dev] PEP: New timestamp formats

2012-02-02 Thread Victor Stinner
> Why int? That doesn't seem to bring anything. It helps to deprecate/replace os.stat_float_times(), which may be used for backward compatibility (with Python 2.2 ? :-)).

Re: [Python-Dev] PEP: New timestamp formats

2012-02-02 Thread Victor Stinner
> That said, I don't understand why we couldn't simply deprecate > stat_float_times() right now. Having an option for integer timestamps > is pointless, you can just call int() on the result if you want. So which API do you propose for time.time() to get a Decimal object? time.time(timestamp=deci

Re: [Python-Dev] PEP: New timestamp formats

2012-02-02 Thread Victor Stinner
I updated and completed my PEP and published the last draft. It will be available at: http://www.python.org/dev/peps/pep-0410/ ( or read the source: http://hg.python.org/peps/file/tip/pep-0410.txt ) I tried to list all alternatives. Victor

Re: [Python-Dev] PEP: New timestamp formats

2012-02-03 Thread Victor Stinner
> datetime.datetime > > - as noted earlier in the thread, total_seconds() actually gives you a > decent timestamp value and always returning UTC avoids timezone issues os.stat() and time.time() use the local time. Using UTC would be completely wrong. It is possible to get the current timezone for t

Re: [Python-Dev] PEP: New timestamp formats

2012-02-03 Thread Victor Stinner
> consider changing the default on any of these that return a time > value. these for example: >  * time.clock_gettime() >  * time.wallclock() (reuse time.clock_gettime(time.CLOCK_MONOTONIC)) Ah. Nanosecond resolution is overkill in common cases, float is enough and is faster. I prefer to use the

Re: [Python-Dev] PEP: New timestamp formats

2012-02-03 Thread Victor Stinner
> I don't see any real issue of adding datetime as another accepted > type, if Decimal is also accepted. Each type has limitations, and the > user can choose the best type for his/her use case. > > I dropped datetime because I prefer incremental changes (and a simpler > PEP is also more easily acce

Re: [Python-Dev] PEP: New timestamp formats

2012-02-03 Thread Victor Stinner
> Keep in mind timedelta has a microsecond resolution. The use cases > meant for the PEP imply nanosecond resolution (POSIX' clock_gettime(), > for example). datetime.datetime and datetime.timedelta can be patched to support nanosecond. >> A plain number of seconds is superficially simpler, but i

Re: [Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings

2012-02-06 Thread Victor Stinner
2012/2/6 Jim Jewett : > I realize that _Py_Identifier is a private name, and that PEP 3131 > requires anything (except test cases) in the standard library to stick > with ASCII ... but somehow, that feels like too long of a chain. > > I would prefer to see _Py_Identifier renamed to _Py_ASCII_Identi

Re: [Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings

2012-02-07 Thread Victor Stinner
2012/2/7 "Martin v. Löwis" : >> _Py_IDENTIFIER(xxx) defines a variable called PyId_xxx, so xxx can >> only be ASCII: the C language doesn't accept non-ASCII identifiers. > > That's not exactly true. In C89, source code is in the "source character > set", which is implementation-defined, except that

Re: [Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings

2012-02-07 Thread Victor Stinner
> I'd rather restore support for allowing UTF-8 source here (I don't think > that requiring ASCII really improves much), than rename the macro. Done, I reverted my change. Victor

[Python-Dev] Add a new "locale" codec?

2012-02-07 Thread Victor Stinner
Hi, I added PyUnicode_DecodeLocale(), PyUnicode_DecodeLocaleAndSize() and PyUnicode_EncodeLocale() to Python 3.3 to fix bugs. I hesitate to expose this codec in Python: it can be useful in some cases, especially if you need to interact with C functions. The glib library has functions using the *c

Re: [Python-Dev] Add a new "locale" codec?

2012-02-08 Thread Victor Stinner
2012/2/8 Simon Cross : > Is the idea to have: > >  b"foo".decode("locale") > > be roughly equivalent to > >  encoding = locale.getpreferredencoding(False) >  b"foo".decode(encoding) > > ? Yes. Whereas: b"foo".decode(sys.getfilesystemencoding()) is equivalent to encoding = locale.getpreferredenc
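The explicit spelling that Simon Cross describes can be written out as below. This is a sketch only: it does not assume that a codec named "locale" exists (the thread is discussing a proposal), and the variable names are illustrative.

```python
import locale
import sys

data = b"foo"

# Explicit form: ask for the current locale encoding, then decode with it.
# Passing False avoids calling setlocale() as a side effect.
locale_encoding = locale.getpreferredencoding(False)
text_locale = data.decode(locale_encoding)

# The filesystem encoding is a separate setting, looked up separately.
fs_encoding = sys.getfilesystemencoding()
text_fs = data.decode(fs_encoding)

print(locale_encoding, fs_encoding, text_locale, text_fs)
```

The two encodings often have the same name, but they are obtained through different calls and need not agree.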

Re: [Python-Dev] Add a new "locale" codec?

2012-02-08 Thread Victor Stinner
2012/2/8 Simon Cross : > I think I'm -1 on a "locale" encoding because it refers to different > actual encodings depending on where and when it's run, which seems > surprising, and there's already a more explicit way to achieve the > same effect. The following code is just an example to explain ho

Re: [Python-Dev] Add a new "locale" codec?

2012-02-08 Thread Victor Stinner
>> The current locale is process-wide: if a thread changes the locale, >> all threads are affected. Some functions have to use the current >> locale encoding, and not the locale encoding read at startup. Examples >> with C functions: strerror(), strftime(), tzname, etc. > > Could a core part of Pyt
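The process-wide nature of the locale can be observed from Python with a small sketch like the one below. It uses the "C" locale because that name is always available. Since LC_TIME is usually already "C" at interpreter startup, substituting another installed locale name (an assumption about the system) would make the change more visible.

```python
import locale
import threading

def switch_locale():
    # Runs in a worker thread, but setlocale() changes process-wide state.
    locale.setlocale(locale.LC_TIME, "C")

# Calling setlocale() with only a category queries the current setting.
before = locale.setlocale(locale.LC_TIME)

worker = threading.Thread(target=switch_locale)
worker.start()
worker.join()

# The main thread sees whatever the worker thread set.
after = locale.setlocale(locale.LC_TIME)
print(before, "->", after)
```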

Re: [Python-Dev] Add a new "locale" codec?

2012-02-09 Thread Victor Stinner
> I think there's a general expectation that if you encode something > with one codec you will be able to decode it with the same codec. > That's not necessarily true for the locale encoding. There is the same problem with the filesystem encoding (sys.getfilesystemencoding()), which is the user lo

Re: [Python-Dev] [Python-checkins] cpython: PEP 410

2012-02-09 Thread Victor Stinner
>>> changeset:   74832:f8409b3d6449 >>> user:        Victor Stinner >>> date:        Wed Feb 08 14:31:50 2012 +0100 >>> summary: >>>  PEP 410 >> >> Ah, even when written by a core dev, a PEP should still be at Accepted >> before we

Re: [Python-Dev] Add a new "locale" codec?

2012-02-09 Thread Victor Stinner
> With the difference that mbcs cannot change during execution. It is possible to change the "thread ANSI code page" (CP_THREAD_ACP) at runtime, but setting the system ANSI code page (CP_ACP) requires restarting Windows. > I don't even know if it is possible to change it at all, except by > reins

Re: [Python-Dev] Add a new "locale" codec?

2012-02-09 Thread Victor Stinner
> As And pointed out, this is already the behaviour of the "mbcs" codec > under Windows. "locale" would be the moral (*) equivalent of that under > Unix. On Windows, the ANSI code page codec will be accessible using 3 different names: "locale", "mbcs" and the real encoding name (sys.getfilesysteme

[Python-Dev] patch

2012-02-09 Thread Victor Stinner
patch Description: Binary data ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Add a new "locale" codec?

2012-02-09 Thread Victor Stinner
> If this is needed, it should be spelled "os.getlocaleencoding()" (or > "sys.getlocaleencoding()"?) There is already a locale.getpreferredencoding(False) function which gives you the current locale encoding. The problem is that the current locale encoding may change and so you have to get the new

Re: [Python-Dev] Add a new "locale" codec?

2012-02-10 Thread Victor Stinner
2012/2/10 "Martin v. Löwis" : >> As And pointed out, this is already the behaviour of the "mbcs" codec >> under Windows. "locale" would be the moral (*) equivalent of that under >> Unix. > > Indeed, and that precedent should be enough reason *not* to include a > "locale" encoding. The "mbcs" encodi

[Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review

2012-02-12 Thread Victor Stinner
Hi, I finished the implementation of the PEP 410 ("Use decimal.Decimal type for timestamps"). The PEP: http://www.python.org/dev/peps/pep-0410/ The implementation: http://bugs.python.org/issue13882 Rietveld code review tool for this issue: http://bugs.python.org/review/13882/show The patch is h

[Python-Dev] How to round timestamps and durations?

2012-02-13 Thread Victor Stinner
Hi, My work on the PEP 410 tries to unify the code to manipulate timestamps. The problem is that I'm unable to decide how to round these numbers. Functions using a resolution of 1 second (e.g. time.mktime) expect rounding towards zero (ROUND_DOWN), as does int(float). Example: >>> time.mkt
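To make the rounding question concrete, the sketch below compares int(float) with a few rounding modes from the decimal module. In that module, rounding towards zero is spelled ROUND_DOWN, while ROUND_HALF_DOWN rounds to the nearest value with ties going towards zero. The helper to_seconds and the sample values are illustrative assumptions, not code from the PEP 410 patch.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_DOWN, ROUND_HALF_EVEN

def to_seconds(text, rounding):
    """Round a decimal string to whole seconds using the given mode."""
    return int(Decimal(text).quantize(Decimal(1), rounding=rounding))

print("value  int(float)  ROUND_DOWN  ROUND_HALF_DOWN  ROUND_HALF_EVEN")
for text in ("1.5", "1.9", "-1.9", "2.5"):
    print(
        f"{text:>5}  {int(float(text)):>10}  "
        f"{to_seconds(text, ROUND_DOWN):>10}  "
        f"{to_seconds(text, ROUND_HALF_DOWN):>15}  "
        f"{to_seconds(text, ROUND_HALF_EVEN):>15}"
    )
```

In this comparison ROUND_DOWN agrees with int(float) on every sample value, including the negative one, while the two round-to-nearest modes differ from it as soon as the fractional part exceeds one half.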

Re: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review

2012-02-13 Thread Victor Stinner
> However, I am still -1 on the solution proposed by the PEP. I still think > that migrating to datetime use is a better way to go, rather than a > proliferation of the data types used to represent timestamps, along with an > API to specify the type of data returned. > > Let's look at each item in

Re: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review

2012-02-13 Thread Victor Stinner
Antoine Pitrou convinced me to simply drop the int type: float and Decimal are just enough. Use an explicit cast using int() to get int. os.stat_float_times() is still deprecated by the PEP. Victor

Re: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review

2012-02-14 Thread Victor Stinner
> A datetime module based approach would need to either use a mix of > datetime.datetime() (when returning an absolute time) and > datetime.timedelta() (when returning a time relative to an unknown > starting point), Returning a different type depending on the function would be surprising and conf
