#35011: Queryset union can fail depending on the field type and/or order
-------------------------------------+-------------------------------------
     Reporter:  Thierry Bastian      |                    Owner:  Simon
                                     |  Charette
         Type:  Bug                  |                   Status:  assigned
    Component:  Database layer       |                  Version:  5.0
  (models, ORM)                      |
     Severity:  Normal               |               Resolution:
     Keywords:                       |             Triage Stage:  Accepted
    Has patch:  0                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

 * cc: Simon Charette (added)
 * owner:  nobody => Simon Charette
 * status:  new => assigned
 * stage:  Unreviewed => Accepted


Comment:

 It seems that your provide example has been failing since at least 3.2 so
 it's not a recent regression in Django 5.

 Running your test on Django 3.2 and 4.2 I get

 {{{
 ValueError: invalid literal for int() with base 10: '2023-12-05
 02:46:28.751461'
 }}}

 The fundamental issue here is that the order of specified fields in
 `values` has never been a 1:1 match to the order of the `SELECT` columns
 but the latter is a necessity to do proper `UNION` between querysets.

 The order is, and has always been, `SELECT *extra, *fields, *annotations`
 and within each groups the order of `values` is respected since #28553
 which landed in d6b6e5d0fd4e6b6d0183b4cf6e4bd4f9afc7bf67 so I guess this
 is what broke things on your side during the attempted upgrade but that
 your tests failed to capture. In this sense I guess this could be
 considered a release blocker as it changed the ordering from an ambiguous
 ordering to the other.

 I tried [https://github.com/django/django/pull/16703 solving the
 fundamental issue] but I got in a rabbit hole with regards to the
 deprecation of `extra(fields)` which
 [https://github.com/django/django/pull/16681 I tried to focus on first].
 I'd like to find time to get this across the finish line during the
 holidays so I'll assign to me in the mean time.

 This is really what #28553 should have been about in my opinion; making
 sure the order of `values/values_list` is always respected even with it
 contains members that cross groups (`extra`, `fields`, `annotations`).

 I'm torn on how to proceed here wrt/ to
 d6b6e5d0fd4e6b6d0183b4cf6e4bd4f9afc7bf67 as while it fixed some queries it
 broke others as described here.

 Until we figure out our plan here Thierry your best bet is to stick
 entirely to annotations. That means doing something like

 {{{#!diff
 diff --git a/bugreport/test.py b/bugreport/test.py
 index 4f66fa2..f6fc98b 100644
 --- a/bugreport/test.py
 +++ b/bugreport/test.py
 @@ -1,12 +1,12 @@
  from django.test import TestCase
 -from django.db.models import Value, IntegerField, DateTimeField
 +from django.db.models import Value, IntegerField, DateTimeField, F
  from bugreport.models import Leaf, Branch


  class TreeTest(TestCase):

      def test_repro(self):
 -        column_names = ['size', 'created_date']
 +        column_names = ['size', 'created_date_ann']
          #column_names = ['created_date', 'size']  # this one works...

          leaf = Leaf.objects.create()
 @@ -16,13 +16,13 @@ def test_repro(self):
              branch_qs
              .annotate(
                  size=Value(1, output_field=IntegerField()),
 -                created_date=Value(None, output_field=DateTimeField()),
 +                created_date_ann=Value(None,
 output_field=DateTimeField()),
              ).values_list(*column_names))

          leaf_qs = Leaf.objects.all()
          leaf_qs = (
              leaf_qs
 -            .annotate(size=Value(2, output_field=IntegerField()))
 +            .annotate(size=Value(2, output_field=IntegerField()),
 created_date_ann=F("created_date"))
              .values_list(*column_names)
          )
          print("this works", list(leaf_qs))
 }}}

 Which will have all the `values` reference in the same `SELECT` group and
 due to d6b6e5d0fd4e6b6d0183b4cf6e4bd4f9afc7bf67 the order will be
 respected.

-- 
Ticket URL: <https://code.djangoproject.com/ticket/35011#comment:1>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/0107018c37f57b4f-293d121a-a090-4a1b-92af-2cba7616a893-000000%40eu-central-1.amazonses.com.

Reply via email to