Re: "Unify by values" setting in Oracle's base.py

2021-04-12 Thread NPB
Hi Simon,

Thanks for tracking that down.  I see - an interesting problem that lacks 
an obvious solution. Nevertheless, there is a generic way to deal with this 
and it will optimize OK in Oracle (at least in the vast majority of cases). 

select x, sum(sal) from (select :arg0 * deptno as x, sal from emp) group by 
x;

There would be a need to generate suitably obscure column aliases. So - 
not "x"! Also, more than one alias per SQL statement might be needed too of 
course.

I don't think this should be used for all cases of aggregation. IMHO it 
would be best to target it based on how the Django model is used. I wonder 
if it would be possible to catch cases where a bind variable would normally 
be included in both the SELECT list and the GROUP BY.

Regards,
Nigel

On Monday, 5 April 2021 at 15:45:31 UTC+1 charettes wrote:

> Hello Nigel,
>
> Through git blame for unify_by_values I figured it was introduced[0] to 
> deal with an issue during aggregation[1].
>
> Mariusz might be able to provide more context here as I don't have much 
> knowledge around Oracle cursor cache but it seems you'll have to find 
> another way to address the aggregation issue[1] if you'd like to change 
> this behaviour.
>
> Cheers,
> Simon
>
> [0] 
> https://github.com/django/django/commit/6dbe56ed7855f34585884a2381fb1cec22ddc824
> [1] https://code.djangoproject.com/ticket/27632
>
> Le samedi 3 avril 2021 à 18:30:28 UTC-4, NPB a écrit :
>
>> Hi,
>>
>> Can you tell me why *execute* in .../backends/oracle/base.py sets 
>> *unify_by_values=True* when it calls *_fix_for_params*? It has an 
>> interesting effect on the Oracle cursor cache.
>>
>> For example, if I use a Django model called Logger like this:
>>
>> from . import models
>> ...
>>a = models.Logger(t1="1", t2="2", t3="3", i1=1, i2=2, i3=3)
>>a.save()
>>b = models.Logger(t3="3", t2="2", t1="1", i3=3, i2=2, i1=1)
>>b.save()
>>c = models.Logger(t1="1", t2="2", t3="3", i1=1,i2=1,i3=3)
>>c.save()
>>d = models.Logger(t1="1", t2="2", t3="3", i1=3,i2=1,i3=3)
>>d.save()
>>e = models.Logger(t1="1", t2="1", t3="1", i1=1,i2=1,i3=1)
>>e.save()
>>
>> It results in the following SQL statements in the Oracle cursor cache:
>>
>> INSERT INTO "POLLS_LOGGER" ("T1", "T2", "T3", "I1", "I2", "I3") VALUES 
>> (:arg0, :arg1, :arg2, :arg3, :arg4, :arg3) RETURNING "POLLS_LOGGER"."ID" 
>> INTO :arg5
>> INSERT INTO "POLLS_LOGGER" ("T1", "T2", "T3", "I1", "I2", "I3") VALUES 
>> (:arg0, :arg0, :arg0, :arg1, :arg1, :arg1) RETURNING "POLLS_LOGGER"."ID" 
>> INTO :arg2
>> INSERT INTO "POLLS_LOGGER" ("T1", "T2", "T3", "I1", "I2", "I3") VALUES 
>> (:arg0, :arg1, :arg2, :arg3, :arg4, :arg5) RETURNING "POLLS_LOGGER"."ID" 
>> INTO :arg6
>> INSERT INTO "POLLS_LOGGER" ("T1", "T2", "T3", "I1", "I2", "I3") VALUES 
>> (:arg0, :arg1, :arg2, :arg3, :arg3, :arg4) RETURNING "POLLS_LOGGER"."ID" 
>> INTO :arg5
>>
>> Bind variable names are re-used and shuffled around if any share the same 
>> value. This results in multiple SQL IDs.
>>
>> If, instead, execute used  *unify_by_values=False*, then the Oracle 
>> buffer cache would have a single version of the statement irrespective of 
>> bind values:
>>
>> INSERT INTO "POLLS_LOGGER" ("T1", "T2", "T3", "I1", "I2", "I3") VALUES 
>> (:arg0, :arg1, :arg2, :arg3, :arg4, :arg5) RETURNING "POLLS_LOGGER"."ID" 
>> INTO :arg6
>>
>> On the face of it, the use of unify_by_values=False is more efficient (at 
>> least from the database's perspective) because there is only one hard 
>> parse. I guess it doesn't look like a big deal in this example, but I have 
>> seen cases where the cursor cache has (literally) tens of thousands of 
>> repeated cursor cache entries where one would have been used if bind names 
>> were kept consistent and independent of bind value.
>>
>> Do you think there is a valid case for this default behavior to be 
>> changed?
>>
>> Regards,
>> Nigel
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/3bf99bc3-d08b-4a16-9b44-7cf2fc749370n%40googlegroups.com.


Re: "Unify by values" setting in Oracle's base.py

2021-04-12 Thread charettes
Hello Nigel,

> I wonder if it would be possible to catch cases where a bind variable 
would normally be included in both the SELECT list and the GROUP BY.

It should be possible at the compiler level which is overriddable by 
backend (e.g. here's how it's done for a different purpose on MySQL backend 
[0]).

Thinking more about it while writing this response I do wonder if changing 
BaseExpression.get_group_by_cols to return Ref(alias) when an alias is 
specified could make things easier[1].

diff --git a/django/db/models/expressions.py 
b/django/db/models/expressions.py
index bdd23617ac..8dbfaa13fe 100644
--- a/django/db/models/expressions.py
+++ b/django/db/models/expressions.py
@@ -348,6 +348,8 @@ def copy(self):

 def get_group_by_cols(self, alias=None):
 if not self.contains_aggregate:
+if alias:
+return [Ref(alias, self)]
 return [self]
 cols = []
 for source in self.get_source_expressions():

Taking the ticket's query as an example

Book.objects.annotate(
discount_price=F('price') * 0.75
).values(
'discount_price'
).annotate(sum_discount=Sum('price'))

It should change its generated SQL from

SELECT (book.price * :arg0) discount_price, SUM(book.price) sum_discount
FROM book
GROUP BY (book.price * :arg1)

to

SELECT (book.price * :arg0) discount_price, SUM(book.price) sum_discount
FROM book
GROUP BY discount_price

Which circumvents the reported issue.

I'm not sure Oracle supports or require a subquery pushdown as you 
demonstrated in your response. If it does require adjustment though they'll 
be way easier to make if a proper list of references need to be pushed down 
is already provided by the expression layer.

I cannot commit to writing a full patch myself but hopefully this gives you 
enough pointers to give it a stab yourself.

FWIW the queries, aggregation, and aggregation_regress parts of the suite 
all pass with the proposed patch on both SQLite and PostgreSQL. I don't 
have a local Oracle setup anymore but I spun up a PR to give our CI a run 
against it[2].

Cheers,
Simon

[0] 
https://github.com/django/django/blob/e4430f22c8e3d29ce5d9d0263fba57121938d06d/django/db/backends/mysql/compiler.py#L5
[1] 
https://github.com/django/django/blob/e4430f22c8e3d29ce5d9d0263fba57121938d06d/django/db/models/expressions.py#L349-L351
[2] https://github.com/django/django/pull/14251
Le lundi 12 avril 2021 à 07:15:25 UTC-4, NPB a écrit :

> Hi Simon,
>
> Thanks for tracking that down.  I see - an interesting problem that lacks 
> an obvious solution. Nevertheless, there is a generic way to deal with this 
> and it will optimize OK in Oracle (at least in the vast majority of cases). 
>
> select x, sum(sal) from (select :arg0 * deptno as x, sal from emp) group 
> by x;
>
> There would be a need to generate suitably obscure column aliases. So - 
> not "x"! Also, more than one alias per SQL statement might be needed too of 
> course.
>
> I don't think this should be used for all cases of aggregation. IMHO it 
> would be best to target it based on how the Django model is used. I wonder 
> if it would be possible to catch cases where a bind variable would normally 
> be included in both the SELECT list and the GROUP BY.
>
> Regards,
> Nigel
>
> On Monday, 5 April 2021 at 15:45:31 UTC+1 charettes wrote:
>
>> Hello Nigel,
>>
>> Through git blame for unify_by_values I figured it was introduced[0] to 
>> deal with an issue during aggregation[1].
>>
>> Mariusz might be able to provide more context here as I don't have much 
>> knowledge around Oracle cursor cache but it seems you'll have to find 
>> another way to address the aggregation issue[1] if you'd like to change 
>> this behaviour.
>>
>> Cheers,
>> Simon
>>
>> [0] 
>> https://github.com/django/django/commit/6dbe56ed7855f34585884a2381fb1cec22ddc824
>> [1] https://code.djangoproject.com/ticket/27632
>>
>> Le samedi 3 avril 2021 à 18:30:28 UTC-4, NPB a écrit :
>>
>>> Hi,
>>>
>>> Can you tell me why *execute* in .../backends/oracle/base.py sets 
>>> *unify_by_values=True* when it calls *_fix_for_params*? It has an 
>>> interesting effect on the Oracle cursor cache.
>>>
>>> For example, if I use a Django model called Logger like this:
>>>
>>> from . import models
>>> ...
>>>a = models.Logger(t1="1", t2="2", t3="3", i1=1, i2=2, i3=3)
>>>a.save()
>>>b = models.Logger(t3="3", t2="2", t1="1", i3=3, i2=2, i1=1)
>>>b.save()
>>>c = models.Logger(t1="1", t2="2", t3="3", i1=1,i2=1,i3=3)
>>>c.save()
>>>d = models.Logger(t1="1", t2="2", t3="3", i1=3,i2=1,i3=3)
>>>d.save()
>>>e = models.Logger(t1="1", t2="1", t3="1", i1=1,i2=1,i3=1)
>>>e.save()
>>>
>>> It results in the following SQL statements in the Oracle cursor cache:
>>>
>>> INSERT INTO "POLLS_LOGGER" ("T1", "T2", "T3", "I1", "I2", "I3") VALUES 
>>> (:arg0, :arg1, :arg2, :arg3, :arg4, :arg3) RETURNING "POLLS_LOGGER"."ID" 
>>> INTO :arg5
>>> INSERT INTO "POLLS_LOGGER" ("T1", "T2", "T3", "I1", "I2", "I3") VAL

Allow turning off fuzzy matching for makemessages (Reopen #10852)

2021-04-12 Thread gw...@fusionbox.com
Fuzzy matching is not useful for me and confuses my translators. I would 
like to be able to pass an option to makemessages that passes the 
--no-fuzzy-matching option to msgmerge to turn it off. I understand it is 
useful for some people, so fuzzy matching should remain the default. I 
propose allowing makemessages to take the same option name as msgmerge and 
pass it through to the msgmerge invocation.

This was discussed in https://code.djangoproject.com/ticket/10852. I do not 
believe the usefulness of fuzzy matching in some situations should preclude 
allowing it to be turned off in others.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/20ec8f88-8a89-449a-bf01-a29e19e5cd01n%40googlegroups.com.


Re: "Unify by values" setting in Oracle's base.py

2021-04-12 Thread Mariusz Felisiak
Hi Nigel,

Creating a subquery only looks like a good solution for a limited 
number of cases. As far as I'm aware it will also change an execution plan 
and can cause a performance regression in complicated queries. We also 
cannot group by column aliases on Oracle as proposed in PR14251 
. You can try to tune cursor 
sharing 

 
but I don't have any advice here. I would be happy to review a patch in 
this area, so feel-free to prepare an optimization.

Best,
Mariusz

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/ad5e58c5-9b36-4e74-92ce-ee3148e72386n%40googlegroups.com.


Re: Allow turning off fuzzy matching for makemessages (Reopen #10852)

2021-04-12 Thread Claude Paroz
As you can see in the ticket you mention (see last comment), it is already 
possible to do so by adding a custom version of the makemessages command in 
your project.

Hope this helps,

Claude

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/57856ae7-3c31-4da1-9915-5a3dd35ed016n%40googlegroups.com.