Hello,

This week, I've been working on a related topic that I had missed entirely in 
my initial proposal: serialization.

Developers will obtain aware datetimes from Django when USE_TZ = True. We must 
ensure that they serialize correctly.

Currently, the serialization code isn't very consistent with datetimes:
        - JSON: the serializer uses the '%Y-%m-%d %H:%M:%S' format, losing 
microseconds and timezone information. This dates back to the initial commit at 
r3237. See also #10201.
        - XML: the serializer delegates to DateTimeField.value_to_string, which 
also uses the '%Y-%m-%d %H:%M:%S' format.
        - YAML: the serializer handles datetimes natively, and it includes 
microseconds and UTC offset in the output.
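
To make the loss concrete, here is a minimal sketch using only the standard library. It doesn't touch Django's serializers; the +02:00 offset and the sample value are arbitrary assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# The format currently used by the JSON and XML serializers.
LEGACY_FORMAT = '%Y-%m-%d %H:%M:%S'

# An aware datetime with microseconds and a +02:00 offset (arbitrary sample).
aware = datetime(2011, 9, 23, 8, 34, 12, 345678,
                 tzinfo=timezone(timedelta(hours=2)))

# The legacy format silently drops the microseconds and the UTC offset.
legacy = aware.strftime(LEGACY_FORMAT)
print(legacy)  # 2011-09-23 08:34:12

# isoformat() keeps both.
iso = aware.isoformat()
print(iso)     # 2011-09-23T08:34:12.345678+02:00
```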

I've hesitated between converting datetimes to UTC and rendering them as-is with 
a UTC offset. The former would be more consistent with the database and it's 
recommended in YAML. But the latter avoids modifying the data: not only is it 
faster, but it's also more predictable. Serialization isn't just about storing 
the data for further retrieval, it can be used to print arbitrary data in a 
different format. Finally, when the data comes straight from the database (the 
common case), it will be in UTC anyway.
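
As a rough illustration of the two options (standard library only, with an arbitrary +02:00 offset as an assumption), both renderings describe the same instant; only the wall-clock fields and the offset differ:

```python
from datetime import datetime, timedelta, timezone

# Arbitrary sample: 08:34:12 at UTC+02:00.
value = datetime(2011, 9, 23, 8, 34, 12,
                 tzinfo=timezone(timedelta(hours=2)))

# Option 1: convert to UTC before rendering.
in_utc = value.astimezone(timezone.utc).isoformat()
print(in_utc)  # 2011-09-23T06:34:12+00:00

# Option 2: render as-is, keeping the original UTC offset.
as_is = value.isoformat()
print(as_is)   # 2011-09-23T08:34:12+02:00

# Aware datetimes compare by instant, so no information is lost either way.
same_instant = value == value.astimezone(timezone.utc)
print(same_instant)  # True
```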

In the end, I've decided to serialize aware datetimes without conversion. The 
implementation is here:
https://bitbucket.org/aaugustin/django/compare/..django/django

Here are the new serialization formats for datetimes:
        - JSON: as described in the specification at 
http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf > 
15.9.1.15 Date Time String Format.
        - XML: as produced by datetime.isoformat(), ISO8601.
        - YAML: unchanged, compatible with http://yaml.org/type/timestamp.html 
— the canonical representation uses 'T' as separator and is in UTC, but it's 
also acceptable to use a space and include an offset like pyyaml does.
These formats follow the best practices described in 
http://www.w3.org/TR/NOTE-datetime.
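
For readers who want to see the difference between the JSON and XML formats, here is a sketch under stated assumptions. The ECMA-262 format allows only millisecond precision and writes UTC as "Z", so the output of isoformat() has to be adjusted. The helper name to_ecma262 is hypothetical and is not necessarily how the branch implements it.

```python
from datetime import datetime, timedelta, timezone

def to_ecma262(value):
    """Hypothetical helper: render a datetime in the ECMA-262 string format."""
    text = value.isoformat()
    if value.microsecond:
        # Keep 'YYYY-MM-DDTHH:MM:SS.mmm' (23 characters) and skip the last
        # three microsecond digits; whatever follows is the offset, if any.
        text = text[:23] + text[26:]
    if text.endswith('+00:00'):
        # ECMA-262 writes UTC as 'Z'.
        text = text[:-6] + 'Z'
    return text

utc_value = datetime(2011, 9, 23, 8, 34, 12, 345678, tzinfo=timezone.utc)
offset_value = datetime(2011, 9, 23, 8, 34, 12,
                        tzinfo=timezone(timedelta(hours=2)))

print(utc_value.isoformat())     # 2011-09-23T08:34:12.345678+00:00 (XML style)
print(to_ecma262(utc_value))     # 2011-09-23T08:34:12.345Z (JSON style)
print(to_ecma262(offset_value))  # 2011-09-23T08:34:12+02:00
```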

This fix is backwards-incompatible for the JSON and XML serializers: it 
includes fractional seconds and timezone information, and it uses the 
normalized separator, 'T', between the date and time parts. However, I've made 
sure that existing fixtures will load properly with the new code. I'll mention 
all this in the release notes.
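
The idea of loading both old and new fixtures can be sketched as follows. This is not the parsing code from the branch: parse_fixture_datetime is a hypothetical name, and the sketch relies on datetime.fromisoformat() from Python 3.7+, which accepts either a space or 'T' as the separator.

```python
from datetime import datetime, timedelta

def parse_fixture_datetime(text):
    """Hypothetical helper: accept both the legacy and the new formats."""
    if text.endswith('Z'):
        # fromisoformat() only accepts 'Z' from Python 3.11 on.
        text = text[:-1] + '+00:00'
    return datetime.fromisoformat(text)

# Legacy fixture value: naive, space separator, no fractional seconds.
old = parse_fixture_datetime('2011-09-23 08:34:12')

# New-style values: 'T' separator, fractional seconds, UTC offset or 'Z'.
new = parse_fixture_datetime('2011-09-23T08:34:12.345+02:00')
utc = parse_fixture_datetime('2011-09-23T06:34:12Z')

print(old.tzinfo)       # None
print(new.utcoffset())  # 2:00:00
print(utc.utcoffset())  # 0:00:00
```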

Unrelatedly, I have switched the SQLite backend to supports_timezones = False, 
because it really doesn't make sense to write the UTC offset but ignore it when 
reading back the data.

Best regards,

-- 
Aymeric Augustin.

On 17 sept. 2011, at 09:59, Aymeric Augustin wrote:

> Hello,
> 
> This week, I've gathered all the information I need about how the database 
> engines and adapters supported by Django handle datetime objects. I'm 
> attaching my findings.
> 
> The good news is that the database representations currently used by Django 
> are already optimal for my proposal. I'll store data in UTC:
> - with an explicit timezone on PostgreSQL,
> - without timezone on SQLite and MySQL because the database engine doesn't 
> support it,
> - without timezone on Oracle because the database adapter doesn't support it.
> 
> 
> Currently, Django sets the "supports_timezones" feature to True for SQLite. 
> I'm skeptical about this choice. Indeed, the time zone is stored: SQLite just 
> saves the output of "<datetime>.isoformat()", which includes the UTC offset 
> for aware datetime objects. However, the timezone information is ignored when 
> reading the data back from the database, thus yielding incorrect data when 
> it's different from the local time defined by settings.TIME_ZONE.
> 
> As far as I can tell, the "supports_timezones" and the 
> "needs_datetime_string_cast" database features are incompatible, at least 
> with the current implementation of "typecast_timestamp". There's a comment 
> about this problem that dates back to the merge of magic-removal, possibly 
> before:
> https://code.djangoproject.com/browser/django/trunk/django/db/backends/util.py?annotate=blame#L79
> 
> SQLite is the only engine that has these two flags set to True. I think 
> "supports_timezones" should be False. Does anyone know why it's True? Is it 
> just a historical artifact?
> 
> 
> Finally, I have read the document that describes "to_python", 
> "value_to_string", and r"get_(db_)?prep_(value|save|lookup)". The next step 
> is to adjust these functions in DateTimeField, depending on the value of 
> settings.USE_TZ.
> 
> Best regards,
> 
> -- 
> Aymeric Augustin.
> 
> <DATABASE-NOTES.html>
> 
> On 11 sept. 2011, at 23:18, Aymeric Augustin wrote:
> 
>> Hello,
>> 
>> Given the positive feedback received here and on IRC, I've started the 
>> implementation.
>> 
>> Being most familiar with Mercurial, I've forked the Bitbucket mirror. This 
>> page compares my branch to trunk:
>> https://bitbucket.org/aaugustin/django/compare/..django/django
>> 
>> I've read a lot of code in django.db, and also the documentation of 
>> PostgreSQL, MySQL and SQLite regarding date/time types.
>> 
>> I've written some tests that validate the current behavior of Django. Their 
>> goal is to guarantee backwards-compatibility when USE_TZ = False.
>> 
>> At first they failed because runtests.py doesn't set os.environ['TZ'] and 
>> doesn't call time.tzset(), so the tests ran with my system local time. I 
>> fixed that in setUp and tearDown. Maybe we should call them in runtests.py 
>> too for consistency?
>> 
>> By the way, since everything is supposed to be in UTC internally when USE_TZ 
>> is True, it is theoretically possible to get rid of os.environ['TZ'] and 
>> time.tzset(). They are only useful to make timezone-dependent functions 
>> respect the TIME_ZONE setting. However, for backwards compatibility (in 
>> particular with third-party apps), it's better to keep them and interpret 
>> naive datetimes in the timezone defined by settings.TIME_ZONE (instead of 
>> rejecting them outright). For this reason, I've decided to keep 
>> os.environ['TZ'] and time.tzset() even when USE_TZ is True.
>> 
>> Best regards,
>> 
>> -- 
>> Aymeric Augustin.
>> 
>> 
>> On 3 sept. 2011, at 17:40, Aymeric Augustin wrote:
>> 
>>> Hello,
>>> 
>>> The GSoC proposal "Multiple timezone support for datetime representation" 
>>> wasn't picked up in 2010 or 2011. Although I'm not a student and the 
>>> summer is over, I'd like to tackle this problem, and I would appreciate it 
>>> very much if a core developer agreed to mentor me during this work, 
>>> GSoC-style.
>>> 
>>> Here is my proposal, following the GSoC guidelines. I apologize for the 
>>> wall of text; this has been discussed many times in the past 4 years and 
>>> I've tried to address as many concerns and objections as possible.
>>> 
>>> Definition of success
>>> ---------------------
>>> 
>>> The goal is to resolve ticket #2626 in Django 1.4 or 1.5 (depending on when 
>>> 1.4 is released).
>>> 
>>> Design specification
>>> --------------------
>>> 
>>> Some background on timezones in Django and Python
>>> .................................................
>>> 
>>> Currently, Django stores datetime objects in local time in the database, 
>>> local time being defined by the TIME_ZONE setting. It retrieves them as 
>>> naive datetime objects. As a consequence, developers work with naive 
>>> datetime objects in local time.
>>> 
>>> This approach sort of works when all the users are in the same timezone and 
>>> don't care about data loss (inconsistencies) when DST kicks in or out. 
>>> Unfortunately, these assumptions aren't true for many Django projects: for 
>>> instance, one may want to log sessions (login/logout) for security 
>>> purposes: that's a 24/7 flow of important data. Read tickets #2626 and 
>>> #10587 for more details.
>>> 
>>> Python's standard library provides limited support for timezones, but this 
>>> gap is filled by pytz <http://pytz.sourceforge.net/>. If you aren't 
>>> familiar with the topic, I strongly recommend reading this page before my 
>>> proposal. It explains the problems of working in local time and the 
>>> limitations of Python's APIs. It has a lot of examples, too.
>>> 
>>> Django should use timezone-aware UTC datetimes internally
>>> .........................................................
>>> 
>>> Example: datetime.datetime(2011, 9, 23, 8, 34, 12, tzinfo=pytz.utc)
>>> 
>>> In my opinion, the problem of local time is strikingly similar to the 
>>> problem of character encodings. Django uses only unicode internally and 
>>> converts at the borders (HTTP requests/responses and database). I propose a 
>>> similar solution: Django should always use UTC internally, and conversion 
>>> should happen at the borders, i.e. when rendering the templates and 
>>> processing POST data (in form fields/widgets). I'll discuss the database in 
>>> the next section.
>>> 
>>> Quoting pytz' docs: "The preferred way of dealing with times is to always 
>>> work in UTC, converting to localtime only when generating output to be read 
>>> by humans." I think we can trust pytz' developers on this topic.
>>> 
>>> Note that a timezone-aware UTC datetime is different from a naive datetime. 
>>> If we were using naive datetimes, and assuming we're using pytz, a 
>>> developer could write:
>>> 
>>> mytimezone.localize(datetime_django_gave_me)
>>> 
>>> which is incorrect, because it will interpret the naive datetime as local 
>>> time in "mytimezone". With timezone-aware UTC datetimes, this kind of error 
>>> can't happen, and the equivalent code is:
>>> 
>>> datetime_django_gave_me.astimezone(mytimezone)
>>> 
>>> Django should store datetimes in UTC in the database
>>> ....................................................
>>> 
>>> This horse has been beaten to death on this mailing-list so many times that 
>>> I'll keep the argumentation short. If Django handles everything as UTC 
>>> internally, it isn't useful to convert to anything else for storage, and 
>>> re-convert to UTC at retrieval.
>>> 
>>> In order to make the database portable and interoperable:
>>>  - in databases that support timezones (at least PostgreSQL), the timezone 
>>> should be set to UTC, so that the data is unambiguous;
>>>  - in databases that don't (at least SQLite), storing data in UTC is the 
>>> most reasonable choice: if there's a "default timezone", that's UTC.
>>> 
>>> I don't intend to change the storage format of datetimes. It has been 
>>> proposed on this mailing-list to store datetimes with original timezone 
>>> information. However, I suspect that in many cases, datetimes don't have a 
>>> significant "original timezone" by themselves. Furthermore, there are many 
>>> different ways to implement this outside of Django's core. One is to 
>>> store a local date + a local time + a place or timezone + is_dst flag and 
>>> skip datetime entirely. Another is to store a UTC datetime + a place or 
>>> timezone. In the end, since there's no obvious and consensual way to 
>>> implement this idea, I've chosen to exclude it from my proposal. See the 
>>> "Timezone-aware storage of DateTime" thread on this mailing list for a long 
>>> and non-conclusive discussion of this idea.
>>> 
>>> I'm expecting to take some flak because of this choice :) Indeed, if you're 
>>> writing a multi-timezone calendaring application, my work isn't going to 
>>> resolve all your problems — but it won't hurt either. It may even provide a 
>>> saner foundation to build upon. Once again, there's more than one way to 
>>> solve this problem, and I'm afraid that choosing one would offend some 
>>> people sufficiently to get the entire proposal rejected.
>>> 
>>> Django should convert between UTC and local time in the templates and forms
>>> ...........................................................................
>>> 
>>> I regard the problem of local time (in which time zone is my user?) as very 
>>> similar to internationalization (which language does my user read?), and 
>>> even more to localization (in which country does my user live?), because 
>>> localization happens both on output and on input.
>>> 
>>> I want controllable conversion to local time when rendering a datetime in a 
>>> template. I will introduce:
>>>  - a template tag, {% localtime on|off %}, that works exactly like {% 
>>> localize on|off %}; it will be available with {% load tz %};
>>>  - two template filters, {{ datetime|localtime }} and {{ datetime|utctime 
>>> }}, that work exactly like {{ value|localize }} and {{ value|unlocalize }}.
>>> 
>>> I will convert datetimes to local time when rendering a DateTimeInput 
>>> widget, and also handle SplitDateTimeWidget and SplitHiddenDateTimeWidget 
>>> which are more complicated.
>>> 
>>> Finally, I will convert datetimes entered by end-users in forms to UTC. I 
>>> can't think of cases where you'd want an interface in local time but user 
>>> input in UTC. As a consequence, I don't plan to introduce the equivalent of 
>>> the `localize` keyword argument in form fields, unless someone brings up a 
>>> sufficiently general use case.
>>> 
>>> How to set each user's timezone
>>> ...............................
>>> 
>>> Internationalization and localization are based on the LANGUAGES setting. 
>>> There's a widely accepted standard to select automatically the proper 
>>> language and country, the Accept-Language header.
>>> 
>>> Unfortunately, some countries like the USA have more than one timezone, so 
>>> country information isn't enough to select a timezone. To the best of my 
>>> knowledge, there isn't a widely accepted way to determine the timezones of 
>>> the end users on the web.
>>> 
>>> I intend to use the TIME_ZONE setting by default and to provide an 
>>> equivalent of `translation.activate()` for setting the timezone. With this 
>>> feature, developers can implement their own middleware to set the timezone 
>>> for each user, for instance they may want to use 
>>> <http://pytz.sourceforge.net/#country-information>.
>>> 
>>> This means I'll have to introduce another thread local. I know this is 
>>> frowned upon. I'd be very interested if someone has a better idea.
>>> 
>>> It might be no longer necessary to set os.environ['TZ'] and run 
>>> time.tzset() at all. That would avoid a number of problems and make Windows 
>>> as well supported as Unix-based OSes — there's a bunch of tickets in Trac 
>>> about this.
>>> 
>>> I'm less familiar with this part of the project and I'm interested in 
>>> advice about how to implement it properly.
>>> 
>>> Backwards compatibility
>>> .......................
>>> 
>>> Most previous attempts to resolve this issue have stumbled upon this problem.
>>> 
>>> I propose to introduce a USE_TZ setting (yes, I know, yet another setting) 
>>> that works exactly like USE_L10N. If set to False, the default, you will 
>>> get the legacy (current) behavior. Thus, existing websites won't be 
>>> affected. If set to True, you will get the new behavior described above.
>>> 
>>> I will also explain in the release notes how to migrate a database — which 
>>> means shifting all datetimes to UTC. I will attempt to develop a script to 
>>> automate this task.
>>> 
>>> Dependency on pytz
>>> ..................
>>> 
>>> I plan to make pytz a mandatory dependency when USE_TZ is True. This would 
>>> be similar to the dependency on gettext when USE_I18N is True.
>>> 
>>> pytz gets a new release every time the Olson database is updated. For this 
>>> reason, it's better not to copy it in Django, unlike simplejson and 
>>> unittest2.
>>> 
>>> It was split from Zope some time ago. It's a small amount of clean code and 
>>> it could be maintained within Django if it was abandoned (however unlikely 
>>> that sounds).
>>> 
>>> Miscellaneous
>>> .............
>>> 
>>> The following items have caused bugs in the past and should be checked 
>>> carefully:
>>> 
>>>  - caching: add timezone to cache key? See #5691.
>>>  - functions that use LocalTimezone: naturaltime, timesince, timeuntil, 
>>> dateformat.
>>>  - os.environ['TZ']. See #14264.
>>>  - time.tzset() isn't supported on Windows. See #7062.
>>> 
>>> Finally, my proposal shares some ideas with 
>>> https://github.com/brosner/django-timezones; I didn't find any 
>>> documentation, but I intend to review the code.
>>> 
>>> About me
>>> --------
>>> 
>>> I've been working with Django since 2008. I'm doing a lot of triage in 
>>> Trac, I've written some patches (notably r16349, r16539, r16548, also some 
>>> documentation improvements and bug fixes), and I've helped to set up 
>>> continuous integration (especially for Oracle). In my day job, I'm 
>>> producing enterprise software based on Django with a team of ten developers.
>>> 
>>> Work plan
>>> ---------
>>> 
>>> Besides the research that's about 50% done, and discussion that's going to 
>>> take place now, I expect the implementation and tests to take me around 
>>> 80h. Given how much free time I can devote to Django, this means three to 
>>> six months.
>>> 
>>> Here's an overview of my work plan:
>>> 
>>> - Implement the USE_TZ flag and database support — this requires checking 
>>> the capabilities of each supported database in terms of datetime types and 
>>> time zone support. Write tests, especially to ensure backwards 
>>> compatibility. Write docs. (20h)
>>> 
>>> - Implement timezone localization in templates. Write tests. Write docs. 
>>> (10h)
>>> 
>>> - Implement timezone localization in widgets and forms. Check the admin 
>>> thoroughly. Write tests. Write docs. (15h)
>>> 
>>> - Implement the utilities to set the user's timezone. Write tests. Write 
>>> docs. (15h)
>>> 
>>> - Reviews, etc. (20h)
>>> 
>>> What's next?
>>> ------------
>>> 
>>> Constructive criticism, obviously :) Remember that the main problems here 
>>> are backwards-compatibility and keeping things simple.
>>> 
>>> Best regards,
>>> 
>>> -- 
>>> Aymeric.
>>> 
>>> 
>>> 
>>> 
>>> 
>>> Annex: Research notes
>>> ---------------------
>>> 
>>> Wiki
>>> ....
>>> 
>>> [GSOC] 
>>> https://code.djangoproject.com/wiki/SummerOfCode2011#Multipletimezonesupportfordatetimerepresentation
>>> 
>>> Relevant tickets
>>> ................
>>> 
>>> #2626: canonical ticket for this issue
>>> 
>>> #2447: dupe, an alternative solution
>>> #8953: dupe, not much info
>>> #10587: dupe, a fairly complete proposal, but doesn't address backwards 
>>> compatibility for existing data
>>> 
>>> Relevant related tickets
>>> ........................
>>> 
>>> #14253: how should "now" behave in the admin when "client time" != "server 
>>> time"?
>>> 
>>> Irrelevant related tickets
>>> ..........................
>>> 
>>> #11385: make it possible to enter data in a different timezone in 
>>> DateTimeField
>>> #12666: timezone in the 'Date:' headers of outgoing emails - independent 
>>> resolution
>>> 
>>> Relevant threads
>>> ................
>>> 
>>> 2011-05-31  Timezone-aware storage of DateTime
>>> http://groups.google.com/group/django-developers/browse_thread/thread/76e2b486d561ab79
>>> 
>>> 2010-08-16  Datetimes with timezones for mysql
>>> https://groups.google.com/group/django-developers/browse_thread/thread/5e220687b7af26f5
>>> 
>>> 2009-03-23  Django internal datetime handling
>>> https://groups.google.com/group/django-developers/browse_thread/thread/ca023360ab457b91
>>> 
>>> 2008-06-25  Proposal: PostgreSQL backends should *stop* using 
>>> settings.TIME_ZONE
>>> http://groups.google.com/group/django-developers/browse_thread/thread/b8c885389374c040
>>> 
>>> 2007-12-02  Timezone aware datetimes and MySQL (ticket #5304)
>>> https://groups.google.com/group/django-developers/browse_thread/thread/a9d765f83f552fa4
>>> 
>>> Relevant related threads
>>> ........................
>>> 
>>> 2009-11-24  Why not datetime.utcnow() in auto_now/auto_now_add
>>> http://groups.google.com/group/django-developers/browse_thread/thread/4ca560ef33c88bf3
>>> 
>>> Irrelevant related threads
>>> ..........................
>>> 
>>> 2011-07-25  "c" date formating and Internet usage
>>> https://groups.google.com/group/django-developers/browse_thread/thread/61296125a4774291
>>> 
>>> 2011-02-10  GSoC 2011 student contribution
>>> https://groups.google.com/group/django-developers/browse_thread/thread/0596b562cdaeac97/585ce1b04632198a?#585ce1b04632198a
>>> 
>>> 2010-11-04  Changing settings per test
>>> https://groups.google.com/group/django-developers/browse_thread/thread/65aabb45687e572e
>>> 
>>> 2009-09-15  What is the status of auto_now and auto_now_add?
>>> https://groups.google.com/group/django-developers/browse_thread/thread/cd1a76bca6055179
>>> 
>>> 2009-03-09  TimeField broken in Oracle
>>> https://groups.google.com/group/django-developers/browse_thread/thread/bba2f80a2ca9b068
>>> 
>>> 2009-01-12  Rolling back tests -- status and open issues
>>> https://groups.google.com/group/django-developers/browse_thread/thread/1e4f4c840b180895
>>> 
>>> 2008-08-05  Transactional testsuite
>>> https://groups.google.com/group/django-developers/browse_thread/thread/49aa551ad41fb919
>>> 
>> 
> 
