Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-11 Thread Mark Shannon

On 11/12/12 03:45, Raymond Hettinger wrote:


On Dec 10, 2012, at 7:04 PM, Mark Shannon <m...@hotpy.org> wrote:


Another approach is to pre-allocate the two-thirds maximum
(This is simple and fast but gives the smallest space savings.)


What do you mean by maximum?


A dict with an index table size of 8 gets resized when it is two-thirds
full, so the maximum number of entries is 5.  If you pre-allocate five
entries for the initial dict, you've spent 5 * 24 bytes + 8 bytes for the
indices for a total of 128 bytes.  This compares to the current table of
8 * 24 bytes totaling 192 bytes.
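
The arithmetic above can be checked with a few lines of Python. This is only a sketch of the numbers quoted in the message; the constant names are made up for illustration, and the 24-byte entry and one-byte index slot are the figures used in the thread for a 64-bit build.

```python
ENTRY_BYTES = 24   # one (hash, key, value) entry, as quoted in the thread
INDEX_SLOTS = 8    # size of the smallest index table
INDEX_BYTES = 1    # assumed: one byte per index slot while the table is this small


def max_entries(index_slots):
    """Entries that fit before the two-thirds load limit forces a resize."""
    return index_slots * 2 // 3


# Current layout: every slot of the sparse table holds a full entry.
current = INDEX_SLOTS * ENTRY_BYTES
# Proposed layout: a dense entry array plus a small index table.
compact = max_entries(INDEX_SLOTS) * ENTRY_BYTES + INDEX_SLOTS * INDEX_BYTES
saving = 1 - compact / current

print(current, compact, round(saving, 2))
```

With these inputs the pre-allocated layout uses 128 bytes against 192, a saving of about one third for the smallest table.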

Many other strategies are possible.  The proof-of-concept code
uses the one employed by regular python lists.
Their growth pattern is: 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
This produces nice memory savings for entry lists.
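
For reference, that sequence can be reproduced with a short simulation. The sketch below assumes the over-allocation formula used by CPython's list resizing at the time of this thread (later releases changed it); the function name is hypothetical.

```python
def growth_pattern(limit):
    """Simulate appending `limit` items and record each new allocation size."""
    allocated = 0
    pattern = [0]
    for newsize in range(1, limit + 1):
        if newsize > allocated:
            # Assumed formula: grow by about one eighth plus a small constant.
            allocated = (newsize >> 3) + (3 if newsize < 9 else 6) + newsize
            pattern.append(allocated)
    return pattern


print(growth_pattern(88))
```

Each resize leaves roughly 12% headroom, which is why the entry list wastes much less space than a table that doubles.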

If you have a suggested allocation pattern or other
constructive suggestion, it would be welcome.

It seems like a reasonable starting point.
Trying to avoid resizing the index array and the entries array at the 
same time is probably a good idea.



Further sniping and unsubstantiated FUD would not.


Is asking that you back up your claims with some analysis
that unreasonable?

When you make claims such as
"""
The memory savings are significant (from 30% to 95%
compression depending on how full the table is).
Small dicts (size 0, 1, or 2) get the most benefit.
"""
is it a surprise that I am sceptical?

I like your idea. I just don't want everyone getting their
hopes up for what may turn out to be a fairly minor improvement.

Don't forget Unladen Swallow :)

Cheers,
Mark.


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] where is the python "import" implemented

2012-12-11 Thread martin

> In this situation, I cannot find the source code showing how Python
> implements it. I tested a wrongly formatted pyc and got the error
> "ImportError: bad magic number". I searched for "bad magic number" in
> the source code and found it in importlib/_bootstrap.py (line 815),
> but when I modify this error message (eg: test bad magic) and run
> again, nothing changes. It seems that the file is not the correct
> position.


This is the right position. When you change _bootstrap.py, you need to
run "make" again, to freeze the modified _bootstrap.py.

Regards,
Martin




Re: [Python-Dev] where is the python "import" implemented

2012-12-11 Thread Antoine Pitrou

Hello,

On Tue, 11 Dec 2012 14:08:27 +0800,
"Isml" <76069...@qq.com> wrote:
> Hi, everyone,
> I am testing modifying the pyc file when it is imported. As far as I
> know, there are three situations:
>
> 1. Running it with python.exe
>    eg: python.exe test.pyc
>    In this situation, I find the source on line 1983 in file pythonrun.c
> 2. Importing the pyc from a zip file
>    I find the source on line 1132 in zipimport.c
> 3. Doing a normal import
>    eg: two files: main.py and testmodule.py
>    and in main.py:
>    import testmodule
>
> In this situation, I cannot find the source code showing how Python
> implements it. I tested a wrongly formatted pyc and got the error
> "ImportError: bad magic number". I searched for "bad magic number" in
> the source code and found it in importlib/_bootstrap.py (line 815),
> but when I modify this error message (eg: test bad magic) and run
> again, nothing changes. It seems that the file is not the correct
> position.

importlib/_bootstrap.py is indeed the place, but you need to run "make"
once you have modified that file. _bootstrap.py is frozen into the
executable at compile time, because otherwise the bootstrap issues are
intractable.
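
To see the check that produces "bad magic number" from the outside, here is a small sketch that compares the first bytes of a pyc file with the running interpreter's magic number. It uses `importlib.util.MAGIC_NUMBER`, which only exists in Python versions newer than the one discussed in this thread; the helper name and the file names are made up for illustration.

```python
import importlib.util
import os
import py_compile
import tempfile


def has_current_magic(pyc_path):
    """Return True if the file starts with this interpreter's pyc magic number."""
    magic = importlib.util.MAGIC_NUMBER
    with open(pyc_path, "rb") as f:
        return f.read(len(magic)) == magic


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "testmodule.py")
    with open(source, "w") as f:
        f.write("x = 1\n")

    # A freshly compiled file carries the right magic number.
    good = py_compile.compile(source, cfile=os.path.join(tmp, "good.pyc"))

    # Overwrite the first four bytes to imitate a wrongly formatted pyc.
    bad = os.path.join(tmp, "bad.pyc")
    with open(good, "rb") as src, open(bad, "wb") as dst:
        dst.write(b"\x00\x00\x00\x00" + src.read()[4:])

    good_ok = has_current_magic(good)
    bad_ok = has_current_magic(bad)

print(good_ok, bad_ok)
```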

Regards

Antoine.




Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-11 Thread Antoine Pitrou
On Mon, 10 Dec 2012 18:17:57 -0500,
Raymond Hettinger wrote:
> 
> On Dec 10, 2012, at 2:48 AM, Christian Heimes 
> wrote:
> 
> > On the other hand every lookup and collision check needs an
> > additional multiplication, addition and pointer dereference:
> > 
> >  entry = entries_baseaddr + sizeof(PyDictKeyEntry) * idx
> 
> 
> Currently, the dict implementation allows alternative lookup
> functions based on whether the keys are all strings.
> The choice of lookup function is stored in a function pointer.
> That lets each lookup use the currently active lookup
> function without having to make any computations or branches.

An indirect function call is technically a branch, as seen from the CPU
(and not necessarily a very predictable one, although recent Intel
CPUs are said to be quite good at that).
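
For readers following the thread, the extra indirection being discussed can be sketched in pure Python. This is a deliberately simplified model, not the proof-of-concept code: it uses linear probing instead of CPython's perturbed probing, it never resizes, and the class and attribute names are hypothetical.

```python
FREE = -1  # marks an unused slot in the index table


class CompactDictSketch:
    """Sparse table of small integers pointing into a dense entry list."""

    def __init__(self, size=8):
        self.indices = [FREE] * size   # sparse, one small integer per slot
        self.entries = []              # dense, (hash, key, value) in insertion order

    def _probe(self, key, hashvalue):
        mask = len(self.indices) - 1
        slot = hashvalue & mask
        while True:
            idx = self.indices[slot]
            if idx == FREE:
                return slot, None
            # The extra step: follow the index into the entry list.
            entry_hash, entry_key, _ = self.entries[idx]
            if entry_hash == hashvalue and entry_key == key:
                return slot, idx
            slot = (slot + 1) & mask   # linear probing, for simplicity only

    def __setitem__(self, key, value):
        hashvalue = hash(key)
        slot, idx = self._probe(key, hashvalue)
        if idx is None:
            # Refuse to exceed the two-thirds load limit instead of resizing.
            if (len(self.entries) + 1) * 3 > len(self.indices) * 2:
                raise RuntimeError("sketch does not implement resizing")
            self.indices[slot] = len(self.entries)
            self.entries.append((hashvalue, key, value))
        else:
            self.entries[idx] = (hashvalue, key, value)

    def __getitem__(self, key):
        _, idx = self._probe(key, hash(key))
        if idx is None:
            raise KeyError(key)
        return self.entries[idx][2]

    def __iter__(self):
        # Iteration walks the dense list and never touches empty slots.
        return (key for _, key, _ in self.entries)


d = CompactDictSketch()
d["spam"] = 1
d["eggs"] = 2
d["spam"] = 3
print(d["spam"], list(d))
```

Every lookup reads the index table first and the entry list second, which is the additional memory access the messages above are weighing against the space savings.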

Regards

Antoine.




Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-11 Thread Antoine Pitrou
On Tue, 11 Dec 2012 08:41:32 +,
Mark Shannon wrote:
> >
> > If you have a suggested allocation pattern or other
> > constructive suggestion, it would be welcome.
> It seems like a reasonable starting point.
> Trying to avoid resizing the index array and the entries array at the 
> same time is probably a good idea.

Why would you want to avoid that?
If we want to allocate the dict's data as a single memory block (which
saves a bit in memory consumption and also makes dict allocations
faster), we need to resize both arrays at the same time.

Regards

Antoine.




[Python-Dev] Re: where is the python "import" implemented

2012-12-11 Thread Isml
Following your advice, I can now see my modified log. It's great! Thanks to Antoine and Martin!
 





Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-11 Thread Serhiy Storchaka

Some more comments about your Python implementation.

1. Don't use "is FREE" and "is DUMMY", as array doesn't preserve identity.

2. Wrong limits are used in _make_index(): 128 overflows 'b', 65536
overflows 'h', and 'l' may not be large enough for ssize_t.

3. round_upto_powtwo() can be implemented as 1 << n.bit_length().

4. i * 5 is faster than (i << 2) + i in Python.

5. You can get rid of the "size" attribute and use len(self.keylist) instead.
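
Points 1 to 3 can be illustrated with a short sketch. It is not the code under review: the function names are hypothetical, the typecode is chosen from the array's actual item size rather than hard-coded limits, and the power-of-two helper uses `(n - 1).bit_length()` because `1 << n.bit_length()` is always strictly greater than `n`.

```python
from array import array

FREE = -1  # arrays store raw C values, so compare sentinels with ==, never "is"


def make_index(n):
    """Return n FREE slots in the smallest signed array type that can hold n."""
    for typecode in "bhilq":
        largest = (1 << (8 * array(typecode).itemsize - 1)) - 1
        if n <= largest:
            return array(typecode, [FREE]) * n
    return [FREE] * n   # fall back to a plain list of Python integers


def round_up_to_power_of_two(n):
    """Smallest power of two that is >= n, for n >= 1."""
    return 1 << (n - 1).bit_length()


index = make_index(8)
print(index.typecode, index[0] == FREE)
print(make_index(128).typecode)
print(round_up_to_power_of_two(5), round_up_to_power_of_two(8), 1 << (8).bit_length())
```

Deriving the limit from `itemsize` avoids the off-by-one at 128 and 32768, and it also sidesteps the question of how wide 'l' is on a given platform.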




[Python-Dev] Draft PEP for time zone support.

2012-12-11 Thread Lennart Regebro
This PEP is also available on github:

https://github.com/regebro/tz-pep/blob/master/pep-04tz.txt

Text:

PEP: 4??
Title: Time zone support improvements
Version: $Revision$
Last-Modified: $Date$
Author: Lennart Regebro 
BDFL-Delegate: Barry Warsaw
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 11-Dec-2012
Post-History: 11-Dec-2012

Abstract
========

This PEP proposes the implementation of concrete time zone support in the
Python standard library, and also improvements to the time zone API to deal
with ambiguous time specifications during DST changes.


Proposal
========

Concrete time zone support
--------------------------

The time zone support in Python has no concrete implementation in the
standard library, only a tzinfo base class, and since Python 3.2, one concrete
time zone: UTC. To properly support time zones you need to include a database
of all time zones, both current and historical, including daylight saving
changes. But such information changes frequently, so even if we include the
latest information in a Python release, that information would be outdated
just a few months later.

Timezone support has therefore only been available through two third-party
modules, ``pytz`` and ``dateutil``, both of which include and wrap the
"zoneinfo" database. This database, also called "tz" or "The Olson database",
is the de-facto standard time zone database, and it is included in
most variants of Unix operating systems, including OS X.

This gives us the opportunity to include only the code that supports the
zoneinfo data in the standard library, but by default use the operating
system's copy of the data, which typically will be kept updated by the
updating mechanism of the operating system or distribution.

For those who have an operating system that does not include the tz database,
for example Windows, a distribution containing the latest tz database should
also be available at the Python Package Index, so it can be easily installed
with the Python packaging tools such as ``easy_install`` or ``pip``. This
could also be done on Unices that are no longer receiving updates and
therefore have an outdated database.

With such a mechanism Python would have full time zone support in the
standard library on most platforms, and a simple package installation would
provide time zone support on those platforms where the tz database isn't
included, such as Windows.

The time zone support will be implemented by a new module called ``timezone``,
based on Stuart Bishop's ``pytz`` module.


Getting the local time zone
---------------------------

On Unix there is no standard way of finding the name of the time zone that is
being used. All the information that is available is the time zone
abbreviations, such as ``EST`` and ``PDT``, but many of those abbreviations
are ambiguous and therefore you can't rely on them to figure out which time
zone you are located in.

There is however a standard for finding the compiled time zone information
since it's located in ``/etc/localtime``. Therefore it is possible to create
a local time zone object with the correct time zone information even though
you don't know the name of the time zone. A function in ``datetime`` should
be provided to return the local time zone.

The support for this will be made by integrating Lennart Regebro's
``tzlocal`` module into the new ``timezone`` module.


Ambiguous times
---------------

When changing over from daylight savings time the clock is turned back one
hour. This means that the times during that hour happen twice, once with
DST and then once without DST. Similarly, when changing to daylight savings
time, one hour goes missing.

The current time zone API cannot differentiate between the two ambiguous
times during a change from DST. For example, in Stockholm the time of
2012-10-28 02:00:00 happens twice, both at UTC 2012-10-28 00:00:00 and also
at 2012-10-28 01:00:00.

The current time zone API can not disambiguate this and therefore it's
unclear which time should be returned::

    # This could be either 00:00 or 01:00 UTC:
    >>> dt = datetime(2012, 10, 28, 2, 0, tzinfo=timezone('Europe/Stockholm'))
    # But we can not specify which:
    >>> dt.astimezone(timezone('UTC'))
    datetime.datetime(2012, 10, 28, 1, 0, tzinfo=<UTC>)

``pytz`` solved this problem by adding ``is_dst`` parameters to several
methods of the tzinfo objects to make it possible to disambiguate times when
this is desired.
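
The ambiguity can be demonstrated with nothing but the fixed-offset ``datetime.timezone`` class already in the standard library. The sketch below assumes the changeover of 28 October 2012 (the last Sunday of October, when European summer time ended that year) and hard-codes the two Stockholm offsets; it does not use the API proposed in this PEP.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for Stockholm before and after the changeover (assumed values).
CEST = timezone(timedelta(hours=2), "CEST")   # summer time, DST in effect
CET = timezone(timedelta(hours=1), "CET")     # standard time

# Two different instants in UTC ...
first = datetime(2012, 10, 28, 0, 0, tzinfo=timezone.utc).astimezone(CEST)
second = datetime(2012, 10, 28, 1, 0, tzinfo=timezone.utc).astimezone(CET)

# ... show the same wall clock time in Stockholm.
print(first.replace(tzinfo=None), second.replace(tzinfo=None))
print(second - first)
```

A naive local time of 02:00 therefore cannot be converted to UTC without an extra piece of information such as ``is_dst``.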

This PEP proposes to add these ``is_dst`` parameters to the relevant methods
of the ``datetime`` API, and therefore add this functionality directly to
``datetime``. This is likely the hardest part of this PEP as this
involves updating
the


Implementation API
==================

The new ``timezone``-module
---------------------------

The public API of the new ``timezone``-module contains one new class, one new
function and one new exception.

* New class: ``DstTzInfo``

  This class provides a concrete implementation of th

Re: [Python-Dev] Draft PEP for time zone support.

2012-12-11 Thread Dirkjan Ochtman
On Tue, Dec 11, 2012 at 4:23 PM, Lennart Regebro  wrote:
> Proposal
> 
>
> The time zone support will be implemented by a new module called ``timezone``,
> based on Stuart Bishop's ``pytz`` module.

I wonder if there needs to be something here about how to port from
pytz to the new timezone library.

> * New function: ``get_timezone(name=None, db=None)``
>
>   This function takes a name string that must be a string specifying a
>   valid zoneinfo timezone, ie "US/Eastern", "Europe/Warsaw" or "Etc/GMT+11".
>   If not given, the local timezone will be looked up. If an invalid zone name
>   is given, or the local timezone can not be retrieved, the function raises
>   `UnknownTimeZoneError`.
>
>   The function also takes an optional path to the location of the zoneinfo
>   database which should be used. If not specified, the function will check if
>   the `timezonedata` module is installed, and then use that location or
>   otherwise use the database in ``/usr/share/zoneinfo``.
>
>   If no database is found an ``UnknownTimeZoneError`` or subclass thereof will
>   be raised with a message explaining that no zoneinfo database can be found,
>   but that you can install one with the ``timezonedata`` package.

It seems like calling get_timezone() with an unknown timezone should
just throw ValueError, not necessarily some custom Exception? It would
probably be a good idea to have a different exception for the case of
no database available.
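
One way to read this suggestion is sketched below. Everything in it is hypothetical: the exception names, the `locate_zone_file()` helper and its parameters are made up to show how a bad zone name and a missing database could be told apart, and none of it is part of the draft PEP.

```python
import os
import tempfile


class UnknownTimeZoneError(ValueError):
    """Hypothetical: the zone name is not in the database."""


class TimeZoneDatabaseNotFound(LookupError):
    """Hypothetical: no zoneinfo database could be located at all."""


def locate_zone_file(name, db=None, default_db="/usr/share/zoneinfo"):
    """Return the path of the compiled zone file for `name`."""
    root = db if db is not None else default_db
    if not os.path.isdir(root):
        raise TimeZoneDatabaseNotFound(root)
    path = os.path.join(root, *name.split("/"))
    if not os.path.isfile(path):
        raise UnknownTimeZoneError(name)
    return path


# Build a tiny fake database so the example does not depend on the host system.
with tempfile.TemporaryDirectory() as db:
    os.makedirs(os.path.join(db, "Europe"))
    open(os.path.join(db, "Europe", "Warsaw"), "wb").close()

    found = locate_zone_file("Europe/Warsaw", db=db)
    try:
        locate_zone_file("Mars/Olympus", db=db)
    except ValueError as exc:          # plain ValueError handlers still work
        bad_name_error = exc

# The temporary directory is gone now, which imitates a missing database.
try:
    locate_zone_file("Europe/Warsaw", db=db)
except LookupError as exc:
    no_db_error = exc
```

Deriving the name error from ``ValueError`` keeps existing ``except ValueError`` code working while still allowing a more specific handler.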

> Differences from the ``pytz`` API
> =================================
>
> * ``pytz`` has the functions ``localize()`` and ``normalize()`` to work
>   around that ``tzinfo`` doesn't have is_dst. When ``is_dst`` is
>   implemented directly in ``datetime.tzinfo`` they are no longer needed.
>
> * The ``pytz`` method ``timezone()`` is instead called ``get_timezone()``
>   for clarity.
>
> * ``get_timezone()`` will return the local time zone if called without
>   parameters.
>
> * The class ``pytz.StaticTzInfo`` is there to provide the ``is_dst`` support
>   for static timezones. When ``is_dst`` support is included in
>   ``datetime.tzinfo`` it is no longer needed.

This feels a bit superfluous. Why not keep a bit more of the pytz API
to make porting easy? The pytz API has proven itself in the wild, so I
don't see much point in renaming "for clarity". It also seems
relatively painless to keep localize() and normalize() functions
around for easy porting.

> Discussion
> ==
>
> Should the windows installer include the data package?
> --
>
> It has been suggested that the Windows installer should include the data
> package. This would mean that an explicit installation no longer would be
> needed on Windows. On the other hand, that would mean that many using Windows
> would not be aware that the database quickly becomes outdated and would not
> keep it updated.

I still submit that it's pretty much just as easy to forget to update
the database whether it's been installed by hand zero or one times, so
I don't find your argument convincing. I don't mind the result much,
though.

Looking forward to having timezone support in the stdlib!

Cheers,

Dirkjan


Re: [Python-Dev] Draft PEP for time zone support.

2012-12-11 Thread Paul Moore
On 11 December 2012 15:39, Dirkjan Ochtman  wrote:
>> Should the windows installer include the data package?
>> --
>>
>> It has been suggested that the Windows installer should include the data
>> package. This would mean that an explicit installation no longer would be
>> needed on Windows. On the other hand, that would mean that many using Windows
>> would not be aware that the database quickly becomes outdated and would not
>> keep it updated.
>
> I still submit that it's pretty much just as easy to forget to update
> the database whether it's been installed by hand zero or one times, so
> I don't find your argument convincing. I don't mind the result much,
> though.

I agree. Also, in corporate or similar environments where each
individual package installation must be approved, having at least some
timezone data in the base install ensures that all Python code can
assume the *existence* of timezone support (if not necessarily the
accuracy of that data).

If the base Windows installer does not include timezone data, then the
documentation should note this and offer advice on how to write code
that degrades gracefully without timezones.

If the base installer *does* include timezone data, of course, there
should be a documented mechanism for updating it (we don't want magic
like the old xml package used, I assume).

Paul.


Re: [Python-Dev] Draft PEP for time zone support.

2012-12-11 Thread Brian Curtin
On Tue, Dec 11, 2012 at 9:48 AM, Paul Moore  wrote:
> On 11 December 2012 15:39, Dirkjan Ochtman  wrote:
>>> Should the windows installer include the data package?
>>> --
>>>
>>> It has been suggested that the Windows installer should include the data
>>> package. This would mean that an explicit installation no longer would be
>>> needed on Windows. On the other hand, that would mean that many using Windows
>>> would not be aware that the database quickly becomes outdated and would not
>>> keep it updated.
>>
>> I still submit that it's pretty much just as easy to forget to update
>> the database whether it's been installed by hand zero or one times, so
>> I don't find your argument convincing. I don't mind the result much,
>> though.
>
> I agree. Also, in corporate or similar environments where each
> individual package installation must be approved, having at least some
> timezone data in the base install ensures that all Python code can
> assume the *existence* of timezone support (if not necessarily the
> accuracy of that data).
>
> If the base Windows installer does not include timezone data, then the
> documentation should note this and offer advice on how to write code
> that degrades gracefully without timezones.
>
> If the base installer *does* include timezone data, of course, there
> should be a documented mechanism for updating it (we don't want magic
> like the old xml package used, I assume).

I think we should try to get the data into the base installer and then
include a small updater, perhaps putting it in a Windows scheduled
task and checking PyPI periodically for newer versions. If a new one
comes up, prompt if the user wants it.


Re: [Python-Dev] Draft PEP for time zone support.

2012-12-11 Thread Antoine Pitrou
On Tue, 11 Dec 2012 16:23:37 +0100,
Lennart Regebro wrote:
> 
> Changes in the ``datetime``-module
> --
> 
> A new ``is_dst`` parameter is added to several of the `tzinfo`
> methods to handle time ambiguity during DST changeovers.
> 
> * ``tzinfo.utcoffset(self, dt, is_dst=True)``
> 
> * ``tzinfo.dst(self, dt, is_dst=True)``
> 
> * ``tzinfo.tzname(self, dt, is_dst=True)``
> 
> The ``is_dst`` parameter can be ``True`` (default), ``False``, or
> ``None``.
> 
> ``True`` will specify that the given datetime should be interpreted
> as happening during daylight savings time, ie that the time specified
> is before the change from DST.

Why is it True by default? Do we have statistics showing that Python
gets more use in summer?

Regards

Antoine.




Re: [Python-Dev] Do more at compile time; less at runtime

2012-12-11 Thread Ned Batchelder

On 12/9/2012 5:22 PM, Mark Shannon wrote:
> The current CPython bytecode interpreter is rather more complex than
> it needs to be. A number of bytecodes could be eliminated and a few
> more simplified by moving the work involved in handling compound
> statements (loops, try-blocks, etc) from the interpreter to the compiler.


As with all suggestions to optimize the bytecode generation, I'd like to 
re-iterate the need for a way to disable all optimization, for the sake 
of reasoning about the program.  For example, debugging, coverage 
measurement, etc.  This idea was misunderstood and defeated in 
http://bugs.python.org/issue2506, but I strongly believe it is important.
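
As a small illustration of why tools that reason about a program care, the sketch below shows one optimization that is easy to observe from Python itself: constant folding. It is only an example of compiled code differing from the source text on CPython, and is not necessarily the case discussed in the issue.

```python
import dis

# The source spells out a multiplication, but CPython's compiler is expected
# to fold it into a single constant before any bytecode runs.
code = compile("seconds = 60 * 60 * 24", "<example>", "exec")
folded = 86400 in code.co_consts

# The disassembly shows a constant load rather than two multiplications.
dis.dis(code)
print(folded)
```

A switch that disables such transformations would make the bytecode correspond more directly to the source, which is the property debuggers and coverage tools rely on.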


--Ned.


Re: [Python-Dev] Do more at compile time; less at runtime

2012-12-11 Thread Guido van Rossum
+1
On Dec 11, 2012 8:47 AM, "Ned Batchelder"  wrote:

> On 12/9/2012 5:22 PM, Mark Shannon wrote:
>
>> The current CPython bytecode interpreter is rather more complex than it
>> needs to be. A number of bytecodes could be eliminated and a few more
>> simplified by moving the work involved in handling compound statements
>> (loops, try-blocks, etc) from the interpreter to the compiler.
>>
>
> As with all suggestions to optimize the bytecode generation, I'd like to
> re-iterate the need for a way to disable all optimization, for the sake of
> reasoning about the program.  For example, debugging, coverage measurement,
> etc.  This idea was misunderstood and defeated in
> http://bugs.python.org/issue2506, but I strongly believe it is important.
>
> --Ned.


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-11 Thread Dino Viehland
PJ wrote:
> Actually, IronPython may already have ordered dictionaries by default; see:
> 
>   http://mail.python.org/pipermail/ironpython-users/2006-May/002319.html
> 
> It's described as an implementation detail that may change, perhaps that
> could be changed to being unchanging.  ;-)
> 

I think this has changed since 2006.  IronPython was originally using the .NET
dictionary class and just locking while using it, but it now has a custom
dictionary which is thread safe for multiple readers and allows 1 writer.  But
it doesn't do anything to preserve order of insertions.

OTOH changing certain dictionaries in IronPython (such as keyword args) to be
ordered would certainly be possible.  Personally I just wouldn't want to see it
be the default as that seems like unnecessary overhead when the specialized
class exists.





Re: [Python-Dev] Draft PEP for time zone support.

2012-12-11 Thread Barry Warsaw
On Dec 11, 2012, at 04:23 PM, Lennart Regebro wrote:

>This PEP is also available on github:
>
>https://github.com/regebro/tz-pep/blob/master/pep-04tz.txt

wget returns some html gobbledygook.  Why-oh-why github?!

>PEP: 4??

I've assigned this PEP 431, reformatted a few extra wide paragraphs, committed
and pushed.

Thanks Lennart!
-Barry




Re: [Python-Dev] Draft PEP for time zone support.

2012-12-11 Thread Donald Stufft
On Tuesday, December 11, 2012 at 3:31 PM, Barry Warsaw wrote:
> On Dec 11, 2012, at 04:23 PM, Lennart Regebro wrote:
> 
> > This PEP is also available on github:
> > 
> > https://github.com/regebro/tz-pep/blob/master/pep-04tz.txt
> 
> wget returns some html gobbledygook. Why-oh-why github?!

wget https://raw.github.com/regebro/tz-pep/master/pep-04tz.txt
> 
> > PEP: 4??
> 
> I've assigned this PEP 431, reformatted a few extra wide paragraphs, committed
> and pushed.
> 
> Thanks Lennart!
> -Barry
> 




Re: [Python-Dev] Draft PEP for time zone support.

2012-12-11 Thread Brandon W Maister
Barry, you want the GitHub raw URL:
https://raw.github.com/regebro/tz-pep/master/pep-04tz.txt


On Tue, Dec 11, 2012 at 3:31 PM, Barry Warsaw  wrote:

> On Dec 11, 2012, at 04:23 PM, Lennart Regebro wrote:
>
> >This PEP is also available on github:
> >
> >https://github.com/regebro/tz-pep/blob/master/pep-04tz.txt
>
> wget returns some html gobbledygook.  Why-oh-why github?!
>
> >PEP: 4??
>
> I've assigned this PEP 431, reformatted a few extra wide paragraphs,
> committed and pushed.
>
> Thanks Lennart!
> -Barry
>


Re: [Python-Dev] Draft PEP for time zone support.

2012-12-11 Thread Guido van Rossum
On Tue, Dec 11, 2012 at 8:07 AM, Antoine Pitrou  wrote:
> On Tue, 11 Dec 2012 16:23:37 +0100,
> Lennart Regebro wrote:
>>
>> Changes in the ``datetime``-module
>> --
>>
>> A new ``is_dst`` parameter is added to several of the `tzinfo`
>> methods to handle time ambiguity during DST changeovers.
>>
>> * ``tzinfo.utcoffset(self, dt, is_dst=True)``
>>
>> * ``tzinfo.dst(self, dt, is_dst=True)``
>>
>> * ``tzinfo.tzname(self, dt, is_dst=True)``
>>
>> The ``is_dst`` parameter can be ``True`` (default), ``False``, or
>> ``None``.
>>
>> ``True`` will specify that the given datetime should be interpreted
>> as happening during daylight savings time, ie that the time specified
>> is before the change from DST.
>
> Why is it True by default? Do we have statistics showing that Python
> gets more use in summer?

My question exactly.

The rest sounds good -- definitely use the system tz database on Unixy
systems, pre-install on Windows and make updating easy. Some
bikeshedding about static I don't really understand, so I'll leave to
others.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-11 Thread Nick Coghlan
On Wed, Dec 12, 2012 at 5:37 AM, Dino Viehland  wrote:

> OTOH changing certain dictionaries in IronPython (such as keyword args) to
> be ordered would certainly be possible.  Personally I just wouldn't want to
> see it be the default as that seems like unnecessary overhead when the
> specialized class exists.
>

Which reminds me, I was going to note that one of the main gains with
ordered keyword arguments is their use in the construction of string-keyed
objects where you want to be able to control the order of iteration (e.g.
for serialisation or display purposes). Currently you have to go the path
of something like namedtuple where you define the order of iteration in one
operation, and set the values in another.

Initialising an ordered dict itself is one obvious use case, but anything
else where you want to control the iteration order *and* set field names
and values in a single call will potentially benefit.
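
For readers on a current interpreter, the use case looks like the sketch below. It relies on keyword arguments preserving their order, which Python guarantees from version 3.6 onwards (PEP 468) and which was not available when this message was written; `make_record()` is a made-up example function.

```python
from collections import OrderedDict


def make_record(**fields):
    """Return the fields as (name, value) pairs in the order they were passed."""
    return list(fields.items())


# Field names, values and iteration order are all set in a single call.
record = make_record(name="spam", price=3, aisle="B")
ordered = OrderedDict(zebra=1, apple=2)

print(record)
print(list(ordered))
```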

Independently of that, I'll note that this change would make it possible to
add a .sort() method to dictionaries. Any subsequent mutation of the
dictionary would require re-sorting, though (which isn't really all that
different from maintaining a sorted list). The performance impact
definitely needs to be benchmarked though, as the need to read two memory
locations rather than one for a dictionary read could have weird caching
effects. (Fortunately, many of the benchmarks run on Python 3.3 now, so it
should be possible to get that data fairly easily)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Draft PEP for time zone support.

2012-12-11 Thread Nick Coghlan
On Wed, Dec 12, 2012 at 1:23 AM, Lennart Regebro  wrote:

> Abstract
> 
>
> This PEP proposes the implementation of concrete time zone support in the
> Python standard library, and also improvements to the time zone API to deal
> with ambiguous time specifications during DST changes.
>

Thanks for tackling this one, Lennart.


> Proposal
> 
>
> Concrete time zone support
> --
>
> The time zone support in Python has no concrete implementation in the
> standard library, only a tzinfo baseclass, and since Python 3.2, one
> concrete time zone: UTC.


This isn't quite right - the current concrete timezones support any fixed
offset from UTC, not just UTC itself.
http://docs.python.org/3/library/datetime#timezone-objects

(Although there are a couple of bugs in those docs at the moment:
http://bugs.python.org/issue16667)


> The time zone support will be implemented by a new module called
> ``timezone``, based on Stuart Bishop's ``pytz`` module.
>

Ick, why a new module? Why not just add this directly to datetime? (It
doesn't need to be provided by the C accelerator, it can go straight in the
pure Python part).


> This PEP proposes to add these ``is_dst`` parameters to the relevant
> methods
> of the ``datetime`` API, and therefore add this functionality directly to
> ``datetime``. This is likely the hardest part of this PEP as this
> involves updating
> the
>

Missing the end of this sentence...


> The ``timezonedata``-package
> ----------------------------
>
> The zoneinfo database will be packaged for easy installation with
> ``easy_install``/``pip``/``buildout``. This package will not install any
> Python code, and will not contain any Python code except that which is
> needed
> for installation.
>

I'd prefer a more aggressive name for this like "tzdata_override". My
rationale is that *nix users need to be thoroughly aware that if they install
this package, they will stop benefiting from the automatic tz database
updates provided by their OS (especially if they install it into the system
site packages on a distro that has migrated to Python 3 for system tools).

Such a name would also make it possible to provide *two* packaged
databases, one checked before the OS data (tzdata_override), and one
shipped with Python itself that is used only if the OS doesn't provide the
timezone database (tzdata_fallback). tzdata_fallback would then be updated
to the latest Olson database for each maintenance release. Cross-platform
applications that wanted more reliably up to date timezone data could then
conditionally depend on tzdata_override for Windows deployments (using the
environment marker support in metadata 1.2+).
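
The lookup order described above could be sketched roughly as follows. This is
only an illustration: the function name and the two "hypothetical" directory
paths are invented for the example, and nothing here is part of the PEP or of
any existing package.

```python
from pathlib import Path
from typing import Iterable, Optional


def pick_tz_database(candidates: Iterable[Path]) -> Optional[Path]:
    """Return the first candidate directory that exists, or None."""
    for path in candidates:
        if path.is_dir():
            return path
    return None


# Hypothetical search order: explicit override first, then the OS database,
# then the copy shipped with Python itself.
SEARCH_ORDER = [
    Path("/hypothetical/site-packages/tzdata_override"),
    Path("/usr/share/zoneinfo"),
    Path("/hypothetical/lib/python/tzdata_fallback"),
]

print(pick_tz_database(SEARCH_ORDER))
```

With this ordering, installing the override package always wins, and the
fallback is only consulted on platforms with no system database.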

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Draft PEP for time zone support.

2012-12-11 Thread Robert Brewer
Guido van Rossum wrote:
> Sent: Tuesday, December 11, 2012 4:11 PM
> To: Antoine Pitrou
> Cc: python-dev@python.org
> Subject: Re: [Python-Dev] Draft PEP for time zone support.
> 
> On Tue, Dec 11, 2012 at 8:07 AM, Antoine Pitrou 
> wrote:
> > Le Tue, 11 Dec 2012 16:23:37 +0100,
> > Lennart Regebro  a écrit :
> >>
> >> Changes in the ``datetime``-module
> >> ----------------------------------
> >>
> >> A new ``is_dst`` parameter is added to several of the `tzinfo`
> >> methods to handle time ambiguity during DST changeovers.
> >>
> >> * ``tzinfo.utcoffset(self, dt, is_dst=True)``
> >>
> >> * ``tzinfo.dst(self, dt, is_dst=True)``
> >>
> >> * ``tzinfo.tzname(self, dt, is_dst=True)``
> >>
> >> The ``is_dst`` parameter can be ``True`` (default), ``False``, or
> >> ``None``.
> >>
> >> ``True`` will specify that the given datetime should be interpreted
> >> as happening during daylight savings time, ie that the time
> specified
> >> is before the change from DST.
> >
> > Why is it True by default? Do we have statistics showing that Python
> > gets more use in summer?
> 
> My question exactly.

"Summer" in the USA, at least, is 238 days in 2012, while "Winter" into 2013 is 
only 126 days:

>>> import datetime
>>> datetime.date(2012, 11, 4) - datetime.date(2012, 3, 11)
datetime.timedelta(238)
>>> datetime.date(2013, 3, 10) - datetime.date(2012, 11, 4)
datetime.timedelta(126)


Robert Brewer
fuman...@aminus.org


Re: [Python-Dev] Draft PEP for time zone support.

2012-12-11 Thread Barry Warsaw
Great work, Lennart.  I really like this PEP.  Feedback follows (I haven't yet
read the rest of the messages in this thread ;).

On Dec 11, 2012, at 04:23 PM, Lennart Regebro wrote:

>This PEP proposes to add these ``is_dst`` parameters to the relevant methods
>of the ``datetime`` API, and therefore add this functionality directly to
>``datetime``. This is likely the hardest part of this PEP as this
>involves updating
>the

Oops, something got cut off there.

>The new ``timezone``-module
>---------------------------
>
>The public API of the new ``timezone``-module contains one new class, one new
>function and one new exception.

Why add a new module instead of putting all this into the existing datetime
module, either directly or as a submodule?  Seems like the obvious place to
put it instead of claiming another top-level module name.

>* New class: ``DstTzInfo``
>
>  This class provides a concrete implementation of the ``zoneinfo`` base
>  class that implements DST support.

Is this a subclass of datetime.tzinfo?

>* New function :``get_timezone(name=None, db=None)``
>
>  This function takes a name string that must be a string specifying a
>  valid zoneinfo timezone, ie "US/Eastern", "Europe/Warsaw" or "Etc/GMT+11".
>  If not given, the local timezone will be looked up. If an invalid zone name
>  are given, or the local timezone can not be retrieved, the function raises
>  `UnknownTimeZoneError`.
>
>  The function also takes an optional path to the location of the zoneinfo
>  database which should be used. If not specified, the function will check if
>  the `timezonedata` module is installed, and then use that location or
>  otherwise use the database in ``/usr/share/zoneinfo``.

I'm bikeshedding, but can we find a better name than `db` for the second
argument?  Something that makes it obvious we're looking for a file system path?

>* New Exception: ``UnknownTimeZoneError``

I'd really like to see a TimeZoneError base class from which all these new
exceptions inherit.

>A new ``is_dst`` parameter is added to several of the `tzinfo` methods to
>handle time ambiguity during DST changeovers.
>
>* ``tzinfo.utcoffset(self, dt, is_dst=True)``

I lied a little bit - I did skim the other messages, so I'll reserve comment
on the default value of is_dst for follow ups.

>* ``AmbiguousTimeError``
>
>* ``NonExistentTimeError``

I'm not positive we need separate exceptions here, but I guess it can't hurt,
and with the base class idea above, we can catch both either explicitly, or by
catching the base class.
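
A minimal sketch of the hierarchy suggested above. The ``TimeZoneError`` base
class is the suggestion from this message, not something in the draft PEP, and
the ``describe`` helper is purely illustrative.

```python
class TimeZoneError(Exception):
    """Hypothetical common base class for the new time zone exceptions."""


class UnknownTimeZoneError(TimeZoneError):
    """Raised for an unknown zone name (name taken from the draft PEP)."""


class AmbiguousTimeError(TimeZoneError):
    """Raised for a local time that occurs twice at the end of DST."""


class NonExistentTimeError(TimeZoneError):
    """Raised for a local time that is skipped at the start of DST."""


def describe(exc):
    """Illustrative helper: show the two ways of catching the errors."""
    try:
        raise exc
    except (AmbiguousTimeError, NonExistentTimeError) as err:
        # Catch the two DST-related errors explicitly...
        return "DST problem: " + type(err).__name__
    except TimeZoneError as err:
        # ...or catch everything else through the shared base class.
        return "other time zone problem: " + type(err).__name__


print(describe(AmbiguousTimeError()))
print(describe(UnknownTimeZoneError()))
```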
>
>The ``timezonedata``-package
>----------------------------

Just to be clear, this doesn't expose any new modules, right?

Cheers,
-Barry




Re: [Python-Dev] Draft PEP for time zone support.

2012-12-11 Thread Barry Warsaw
On Dec 11, 2012, at 03:48 PM, Paul Moore wrote:

>I agree. Also, in corporate or similar environments where each
>individual package installation must be approved, having at least some
>timezone data in the base install ensures that all Python code can
>assume the *existence* of timezone support (if not necessarily the
>accuracy of that data).

One other thing that the PEP should describe is what happens on a distro that
has timezone data, but on which you also pip install the PyPI tzdata package.
Which one wins?  Is there a way to control it, other than providing an
explicit path?  Is there a way to uninstall the PyPI package?  Does the API
need to provide a method which tells you where the database it is using by
default lives?

Cheers,
-Barry




Re: [Python-Dev] Draft PEP for time zone support.

2012-12-11 Thread Barry Warsaw
On Dec 11, 2012, at 03:37 PM, Brandon W Maister wrote:

>Barry you want github raw:
>https://raw.github.com/regebro/tz-pep/master/pep-04tz.txt

I found that out.  I was mostly just complaining. ;)

-Barry




Re: [Python-Dev] Draft PEP for time zone support.

2012-12-11 Thread Barry Warsaw
On Dec 11, 2012, at 03:31 PM, Barry Warsaw wrote:

>I've assigned this PEP 431, reformatted a few extra wide paragraphs, committed
>and pushed.

Unfortunately, it looks like the online PEP updater isn't working.

-Barry




Re: [Python-Dev] Draft PEP for time zone support.

2012-12-11 Thread Barry Warsaw
On Dec 11, 2012, at 04:23 PM, Lennart Regebro wrote:

>A new ``is_dst`` parameter is added to several of the `tzinfo` methods to
>handle time ambiguity during DST changeovers.

>``None`` will raise an ``AmbiguousTimeError`` exception if the time specified
>was during a DST change over. It will also raise a ``NonExistentTimeError``
>if a time is specified during the "missing time" in a change to DST.

I think None should be the default.

-Barry




Re: [Python-Dev] Draft PEP for time zone support.

2012-12-11 Thread Nick Coghlan
On Wed, Dec 12, 2012 at 12:58 PM, Barry Warsaw  wrote:

> On Dec 11, 2012, at 04:23 PM, Lennart Regebro wrote:
>
> >A new ``is_dst`` parameter is added to several of the `tzinfo` methods to
> >handle time ambiguity during DST changeovers.
>
> >``None`` will raise an ``AmbiguousTimeError`` exception if the time
> specified
> >was during a DST change over. It will also raise a
> ``NonExistentTimeError``
> >if a time is specified during the "missing time" in a change to DST.
>
> I think None should be the default.
>

That's a backwards compatibility risk, though - many applications are
likely coping just fine with the slightly corrupted time values, but would
fall over if an exception was raised instead. The default should probably
be chosen so that the single argument form of these calls continues to
behave the same in 3.4 as it does in 3.3, emitting a DeprecationWarning to
say that the default behaviour is going to change in 3.5 (so the *actual*
default would be a sentinel value, in order to tell the difference between an
explicit True being passed and relying on the default behaviour).
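
The sentinel idea could look something like the sketch below. The helper name
``resolve_is_dst`` and the sentinel ``_NOT_GIVEN`` are made up for
illustration, and returning ``True`` for the legacy case is only a
placeholder: the thread does not establish what the 3.3 behaviour actually is.

```python
import warnings

# Hypothetical sentinel: distinguishes "argument omitted" from an explicit
# True, False, or None.
_NOT_GIVEN = object()


def resolve_is_dst(is_dst=_NOT_GIVEN):
    """Return the effective is_dst value, warning if the default was used."""
    if is_dst is _NOT_GIVEN:
        warnings.warn(
            "the default for is_dst is going to change; pass it explicitly",
            DeprecationWarning,
            stacklevel=2,
        )
        # Placeholder for "behave as 3.3 does today" (an assumption).
        return True
    return is_dst


# DeprecationWarning is often hidden by default, so capture it to show it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    resolve_is_dst()        # relies on the default: warns
    resolve_is_dst(True)    # explicit value: no warning
print(len(caught))
```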


Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Draft PEP for time zone support.

2012-12-11 Thread Barry Warsaw
On Dec 12, 2012, at 01:14 PM, Nick Coghlan wrote:

>That's a backwards compatibility risk, though - many applications are
>likely coping just fine with the slightly corrupted time values, but would
>fall over if an exception was raised instead. The default should probably
>be chosen so that the single argument form of these calls continues to
>behave the same in 3.4 as it does in 3.3, emitting a DeprecationWarning to
>say that the default behaviour is going to change in 3.5 (so the *actual*
>default would be sentinel value, in order to tell the difference between an
>explicit True being passed and relying on the default behaviour).

+1

Cheers,
-Barry




Re: [Python-Dev] Draft PEP for time zone support.

2012-12-11 Thread Guido van Rossum
On Tue, Dec 11, 2012 at 7:19 PM, Barry Warsaw  wrote:
> On Dec 12, 2012, at 01:14 PM, Nick Coghlan wrote:
>
>>That's a backwards compatibility risk, though - many applications are
>>likely coping just fine with the slightly corrupted time values, but would
>>fall over if an exception was raised instead.

Right.

>>The default should probably
>>be chosen so that the single argument form of these calls continues to
>>behave the same in 3.4 as it does in 3.3, emitting a DeprecationWarning to
>>say that the default behaviour is going to change in 3.5 (so the *actual*
>>default would be sentinel value, in order to tell the difference between an
>>explicit True being passed and relying on the default behaviour).
>
> +1

I don't think it's worth deprecating the old behavior.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Draft PEP for time zone support.

2012-12-11 Thread Guido van Rossum
On Tue, Dec 11, 2012 at 5:11 PM, Robert Brewer  wrote:
> Guido van Rossum wrote:
>> Sent: Tuesday, December 11, 2012 4:11 PM
>> To: Antoine Pitrou
>> Cc: python-dev@python.org
>> Subject: Re: [Python-Dev] Draft PEP for time zone support.
>>
>> On Tue, Dec 11, 2012 at 8:07 AM, Antoine Pitrou 
>> wrote:
>> > Le Tue, 11 Dec 2012 16:23:37 +0100,
>> > Lennart Regebro  a écrit :
>> >>
>> >> Changes in the ``datetime``-module
>> >> ----------------------------------
>> >>
>> >> A new ``is_dst`` parameter is added to several of the `tzinfo`
>> >> methods to handle time ambiguity during DST changeovers.
>> >>
>> >> * ``tzinfo.utcoffset(self, dt, is_dst=True)``
>> >>
>> >> * ``tzinfo.dst(self, dt, is_dst=True)``
>> >>
>> >> * ``tzinfo.tzname(self, dt, is_dst=True)``
>> >>
>> >> The ``is_dst`` parameter can be ``True`` (default), ``False``, or
>> >> ``None``.
>> >>
>> >> ``True`` will specify that the given datetime should be interpreted
>> >> as happening during daylight savings time, ie that the time
>> specified
>> >> is before the change from DST.
>> >
>> > Why is it True by default? Do we have statistics showing that Python
>> > gets more use in summer?
>>
>> My question exactly.
>
> "Summer" in the USA, at least, is 238 days in 2012, while "Winter" into 2013 
> is only 126 days:
>
> >>> import datetime
> >>> datetime.date(2012, 11, 4) - datetime.date(2012, 3, 11)
> datetime.timedelta(238)
> >>> datetime.date(2013, 3, 10) - datetime.date(2012, 11, 4)
> datetime.timedelta(126)

Very funny, but that can't be the real reason. *Most* datetime values
aren't ambiguous, so in those cases the parameter should be ignored,
right? There's only one hour per year where you need to specify it
(two, if we want to artificially assign a meaning to values falling in
the impossible hour). And during those times it's equally likely that
you meant either of the possibilities. I think the meaning of the
parameter must be clarified, perhaps as follows:

- ignored except during the ambiguous hour and during the impossible hour
- during the ambiguous or impossible hour:
  - if True, prefer/pretend DST
  - if False, prefer/pretend non-DST
  - if None, raise an error
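
The rules in the list above can be sketched as a small function. Everything
here is an assumption made for illustration: the function name, the locally
defined exception classes, and the hard-coded US Eastern transitions for 2012
(taken from the dates quoted earlier in the thread). A real implementation
would read the transitions from the zoneinfo database.

```python
from datetime import datetime, timedelta


class AmbiguousTimeError(Exception):
    """Local time occurs twice (end of DST)."""


class NonExistentTimeError(Exception):
    """Local time is skipped (start of DST)."""


# Hypothetical hard-coded rules: US Eastern, 2012 only.
STD = timedelta(hours=-5)
DST = timedelta(hours=-4)
HOUR = timedelta(hours=1)
IMPOSSIBLE_START = datetime(2012, 3, 11, 2, 0)   # 02:00-03:00 does not exist
AMBIGUOUS_START = datetime(2012, 11, 4, 1, 0)    # 01:00-02:00 happens twice


def utcoffset_2012(naive, is_dst):
    """Return the UTC offset for a naive local time, following the rules above."""
    if IMPOSSIBLE_START <= naive < IMPOSSIBLE_START + HOUR:
        if is_dst is None:
            raise NonExistentTimeError(naive)
        return DST if is_dst else STD            # pretend DST / non-DST
    if AMBIGUOUS_START <= naive < AMBIGUOUS_START + HOUR:
        if is_dst is None:
            raise AmbiguousTimeError(naive)
        return DST if is_dst else STD            # prefer DST / non-DST
    # Every other time is unambiguous, so is_dst is ignored.
    in_dst = IMPOSSIBLE_START + HOUR <= naive < AMBIGUOUS_START
    return DST if in_dst else STD


print(utcoffset_2012(datetime(2012, 7, 1, 12, 0), None))
print(utcoffset_2012(datetime(2012, 11, 4, 1, 30), False))
```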

Here I'd prefer the default to be None if I had to do it over again,
but given that the current behavior is one of the first two (which
one?) we probably can't do that. Still, it's slightly confusing that
passing None is not the same as omitting the parameter altogether --
there aren't many APIs that explicitly support passing None but don't
use it as the default (though there probably are some precedents).
Maybe requesting an error should be done through some other special
value, and None should be the same as omitted and the same as the old
behavior? But where would the special value come from? It should be
made as easy as possible to "do the right thing" (i.e. raise an
error). Or maybe have a separate Boolean flag to request an error?

-- 
--Guido van Rossum (python.org/~guido)