Re: About the ORM icontains operator's disadvantage on PostgreSQL performance and query results.

2021-03-01 Thread Hannes Ljungberg
I kind of agree that using `UPPER` instead of `ILIKE` for `__icontains` on 
PostgreSQL isn’t optimal. But it is quite easy to create a functional 
trigram GIN index which uses `UPPER` to allow these lookups to use an index. 
This will be even easier in Django 3.2, where you can create functional 
indexes in your model definitions.
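
As a rough illustration of the kind of index meant here, the sketch below 
builds the DDL for a trigram GIN index over `UPPER(column)`. It assumes the 
`pg_trgm` extension is already installed; the helper, the index naming 
scheme, and the table and column names are hypothetical and not part of 
Django or PostgreSQL.

```python
def upper_trigram_index_sql(table: str, column: str) -> str:
    """Build DDL for a GIN trigram index on UPPER(column).

    Assumes `CREATE EXTENSION pg_trgm` has already been run.
    Identifiers are quoted naively; this is for illustration only.
    """
    # Hypothetical naming scheme, chosen only for this example.
    index_name = f"{table}_{column}_upper_trgm"
    return (
        f'CREATE INDEX "{index_name}" ON "{table}" '
        f'USING gin (UPPER("{column}") gin_trgm_ops)'
    )


# Hypothetical table and column names.
print(upper_trigram_index_sql("city", "name"))
```

In Django 3.2 the same index should be expressible in `Meta.indexes`, along 
the lines of `GinIndex(OpClass(Upper("name"), name="gin_trgm_ops"), name=...)`, 
but check the release notes for the exact form.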

The recommended way to handle case-insensitive searches in PostgreSQL is by 
using functional indexes 
(https://www.postgresql.org/docs/current/indexes-expressional.html). But 
with trigram indexes this isn’t needed, and they are required to get index 
scans for leading-wildcard searches. Note that trigram GIN indexes do have 
drawbacks: for example, they don’t support the `=` operator and will not 
support search strings with fewer than 3 characters.
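
To see why short search strings are a problem, here is a simplified model of 
trigram extraction for a `%term%` pattern. It is only a sketch: the real 
`pg_trgm` extraction also splits on non-alphanumeric characters and pads 
word boundaries, and the function name is made up for this example.

```python
def pattern_trigrams(term: str) -> set:
    """Return the 3-character substrings of a search term.

    Simplified model of what a trigram index can use for '%term%':
    no padding is added because both ends touch a wildcard.
    """
    folded = term.lower()  # pg_trgm is case-insensitive by default
    return {folded[i:i + 3] for i in range(len(folded) - 2)}


print(sorted(pattern_trigrams("stan")))  # two trigrams the index can look up
print(pattern_trigrams("st"))            # empty set: nothing to look up
```

With no trigrams to look up, the index cannot narrow down the candidate rows 
for a one- or two-character search.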

I’ve been in a similar position to yours, and to solve it I created a custom 
lookup which used `ILIKE` and used that instead of `__icontains`.
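
The lookup itself is not shown in this message. As a hedged sketch of the SQL 
such a lookup might emit, the helper below builds an `ILIKE` clause with a 
`%s` placeholder and escapes the LIKE wildcards in the search term. The 
function name and the column name are assumptions made for this example.

```python
def ilike_contains(column: str, term: str) -> tuple:
    """Return (sql, params) for a case-insensitive substring match.

    Backslash, % and _ are escaped so they match literally; backslash is
    PostgreSQL's default LIKE escape character.
    """
    escaped = (
        term.replace("\\", "\\\\")
            .replace("%", "\\%")
            .replace("_", "\\_")
    )
    # The column is quoted naively; the term is passed as a parameter.
    return f'"{column}" ILIKE %s', [f"%{escaped}%"]


sql, params = ilike_contains("name", "50%_off")
print(sql, params)
```

In Django this logic would normally live in the `as_sql()` method of a 
`Lookup` subclass registered on the field with `register_lookup()`.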

It might make sense to change the `__icontains` and `__iendswith` lookups 
to use `ILIKE` instead of `UPPER`, but I’m not really sure that it’s 
justified to break the indexed queries people already have in place, that 
is, where people have the GIN trigram `UPPER` index as opposed to the regular 
GIN trigram index. The performance issue you’re describing doesn’t really 
change if we use `ILIKE` over `UPPER`. You need to install an index anyway.

Regarding your issue with Turkish characters, I think it works because 
`ILIKE` internally uses some form of `LOWER`, and `LOWER('İstanbul') = 
LOWER('istanbul')` would’ve worked in your case. As James wrote, this 
behaviour depends on your configured locale. I think one way to do 
this kind of search without changing the locale is to use `tsvector` and 
`tsquery` with a Turkish configuration; you can even make them unaccented 
to allow matching both “Istanbul” and “İstanbul” with the search string 
“istanbul”.
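
The effect can be reproduced outside the database. The sketch below uses 
Python's locale-independent string methods, so it only illustrates the 
Unicode default case mappings and says nothing certain about how a given 
PostgreSQL locale behaves. The `unaccent_fold` helper is hypothetical and 
only approximates an unaccented comparison; PostgreSQL's `unaccent` 
dictionary works from its own rules file.

```python
import unicodedata


def unaccent_fold(text: str) -> str:
    """Strip combining marks after NFD decomposition, then lowercase."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


dotted = "\u0130stanbul"  # U+0130 is the capital I with dot above

# Default mappings keep the dotted capital, so the upper-cased forms differ.
print(dotted.upper() == "istanbul".upper())

# After unaccenting, both spellings fold to the plain search string.
print(unaccent_fold(dotted) == unaccent_fold("Istanbul") == "istanbul")
```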


On Monday, 1 March 2021 at 07:06:59 UTC+1, mesuto...@gmail.com wrote:

> Hi James,
> Thanks for your explanations. However, I wanted to explain the 
> disadvantage of using "UPPER like"  instead of "ilike" in icontains, 
> istartswith and iendswith. The performance problem should not be ignored.
>
> On Mon, 1 Mar 2021 at 04:05, James Bennett wrote:
>
>> On Sun, Feb 28, 2021 at 2:39 AM Tom Forbes  wrote:
>> >
>> > Thank you for the clarification! I think the biggest argument for this 
>> > change is the fact that uppercasing Unicode can cause incorrect results to 
>> > be returned.
>> >
>> > Given that we now have much better support for custom index types, 
>> > perhaps we should change this? We need a custom expression index anyway, so 
>> > it might not be a huge ask to say “now you should use a gin index”?
>>
>> It's worth pointing out that case mapping and transformation in
>> Unicode is difficult and complex. I wrote up an intro to the problem a
>> while back:
>>
>> https://www.b-list.org/weblog/2018/nov/26/case/
>>
>> One thing that's important to note is that there is no generic
>> one-size-fits-all-languages option that Django can just do by default
>> and get the right results. For example, a case mapping that does the
>> right thing for Turkish will do the wrong thing for (to pick a random
>> example) French, and vice-versa. Unicode itself provides a basic "hope
>> for the best" set of default case mappings that do the right thing for
>> many cased scripts, but also is clear in saying that you may need to
>> use a locale-specific mapping to get what you really want.
>>
>> Postgres has the ability to configure locale, and when configured it
>> does the "right thing" -- for example, when the locale is tr_TR or
>> another Turkish locale variant, the UPPER() function should correctly
>> handle dotted versus dotless 'i' as required for Turkish. But Postgres
>> also warns that this will have performance impact, which I think is
>> what's being noted in the ticket.
>>
>> I'm not sure there will be an easy or obvious solution here.
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Django developers  (Contributions to Django itself)" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to django-develop...@googlegroups.com.
>>
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/django-developers/CAL13Cg9nYMJZwm2XcsCcWG5Fqn8gqqE93FM11Xcfs4TXsmTbZQ%40mail.gmail.com.
>>
>
>
> -- 
> Good luck with your work. Best regards.
>
> *Mesut Öncel*
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/124180ad-dc7e-439d-ac4a-166e40e5d921n%40googlegroups.com.


Re: About the ORM icontains operator's disadvantage on PostgreSQL performance and query results.

2021-03-01 Thread Mesut Öncel
Thank you for your detailed explanation. You are right that people have
been shaping their databases around this behaviour for a long time, but
removing the expression indexes will not cause a crisis. People and
products using a database created by Django already have to create a
standard index and then an expression index for "UPPER". I presented
the GIN index as an example, but indexes for full-text search can be built
in other ways too. Regardless, an expression index must be
added for the "UPPER" function. Creating an expression GIN index in
addition to a standard GIN index can be costly and can even cause performance
problems in large databases. Of course, reindexing also has to be
considered. In short, my main expectation is that the existing behaviour of
PostgreSQL is not ignored. Otherwise, we should tell people: if you are
using "icontains", be sure to add an expression index, because we prefer
not to use "ilike". :)


Re: Fellow Reports - February 2021

2021-03-01 Thread Mariusz Felisiak
Week ending February 28, 2021

*Triaged: *
   https://code.djangoproject.com/ticket/32471 - Document the return value 
of EmailMessage.send() (accepted) 
   https://code.djangoproject.com/ticket/32472 - runserver prematurely 
closes connection for large response body (accepted) 
   https://code.djangoproject.com/ticket/32468 - Admin never_cache 
decorators needs method_decorator (accepted) 
   https://code.djangoproject.com/ticket/32470 - django.test.Client ignores 
request.urlconf when setting response.resolver_match (accepted) 
   https://code.djangoproject.com/ticket/32473 - no additional info is 
printed even runserver with --verbosity 3 (duplicate) 
   https://code.djangoproject.com/ticket/32474 - JSONField and 
register_default_jsonb all json column return as str (duplicate) 
   https://code.djangoproject.com/ticket/32475 - datadump and dataload 
subcommands documentation error (invalid) 
   https://code.djangoproject.com/ticket/32476 - Django Query Generating 
Style Error (invalid) 
   https://code.djangoproject.com/ticket/32479 - LocaleMiddleware not 
recognising properly zh-Hant-HK from the accept-language header (accepted) 
   https://code.djangoproject.com/ticket/32480 - Outdated docstring in 
permission_denied() and redundant comments in default error pages. 
(accepted) 
   https://code.djangoproject.com/ticket/32482 - Overriding model and/or 
application order in Admin application is complex and inconsistent 
(duplicate) 
   https://code.djangoproject.com/ticket/25671 - Arrange models and apps 
order in Admin. (accepted) 
   https://code.djangoproject.com/ticket/32467 - django admin widget.attrs 
not work with ForeignKey (invalid) 
   https://code.djangoproject.com/ticket/32486 - Migration which removes 
UniqueTogether constraint containing a ForeignKey fails with MariaDB 
(MySQL) (duplicate) 
   https://code.djangoproject.com/ticket/32485 - Django ORM icontains 
Operator Performance and Response Problem (wontfix) 

*Reviewed/committed: *
   https://github.com/django/django/pull/14027 - Fixed #32469 -- Made 
assertQuerysetEqual() respect maxDiff when ordered=False. 
   https://github.com/django/django/pull/14021 - Refs #16117 -- Made 
@action and @display decorators importable from django.contrib.gis.admin. 
   https://github.com/django/django/pull/14023 - Refs #4027 -- Added 
Model._state.adding to docs about copying model instances. 
   https://github.com/django/django/pull/14012 - Fixed #32445 -- Fixed 
LiveServerThreadTest.test_closes_connections() for non-in-memory database 
on SQLite. 
   https://github.com/django/django/pull/14032 - Fixed #32471 -- Doc'd the 
return value of EmailMessage.send(). 
   https://github.com/django/django/pull/14028 - Fixed #32470 -- Fixed 
ResolverMatch instance on test clients when request.urlconf is set. 
   https://github.com/django/django/pull/13983 - Fixed #30916 -- Added 
support for functional unique constraints. 
   https://github.com/django/django/pull/14039 - Fixed #32478 -- Included 
nested columns referenced by subqueries in GROUP BY on aggregations. 
   https://github.com/django/django/pull/14030 - Fixed #32468 -- Corrected 
usage of never_cache in contrib.admin. 
   https://github.com/django/django/pull/14010 - Fixed #32446 -- Deprecated 
SERIALIZE test database setting. 
   https://github.com/django/django/pull/14024 - Fixed #32480 -- Corrected 
docstring and removed redundant comments in django/views/defaults.py. 
   https://github.com/django/django/pull/13966 - Refs #24121 -- Added 
__repr__() to FilterExpression, Lexer, Parser, and Token. 
   https://github.com/django/django/pull/14053 - Fixed #28607 -- Prevented 
duplicates in HashedFilesMixin post-processing results. 
   https://github.com/django/django/pull/13990 - Fixed #20423 -- Doc'd that 
DTL variable names may not be a number. 

*Authored: *
   https://github.com/django/django/pull/14052 - Used GitHub actions for 
docs tests. 
   https://github.com/django/django/pull/14054 - Refs #32292 -- Made 
dbshell do not use 'postgres' database when service name is set.

Best,
Mariusz



Need Help

2021-03-01 Thread Mhd Ali
Hello, this must be a stupid question to you guys.
I found a ticket I'd like to work on, but I can't find any option to add a 
comment to tell everyone I'd like to work on it. Please help.



Re: Need Help

2021-03-01 Thread 'Adam Johnson' via Django developers (Contributions to Django itself)
Click the "GitHub Login" button at the top of the ticket tracker pages to
create an account.

Do read the "first patch" tutorial:
https://docs.djangoproject.com/en/dev/intro/contributing/




-- 
Adam



Invitation to participate in a survey about Django

2021-03-01 Thread Tan, J.
Dear Django contributor,

We are doing research on understanding how developers manage a special kind
of Technical Debt in *Python.*

We kindly ask 15-20 minutes of your time to fill out our survey. To help
you decide whether to fill it in, we clarify two points.

“Why should I answer this survey?”

Your participation is essential for us to correctly understand how
developers manage Technical Debt.

“What is in it for me?”

Your valuable contributions to *Django* are part of the information we
analyzed for this study. Thus, if you help us further by answering
this survey, there are two immediate benefits:

   - you help to improve the efficiency of maintaining the quality of
   *Django*.
   - the results will be used to propose recommendations to manage
   technical debt and create tool support.

Here is the link to the survey

.

Thank you for your time and attention.

Kind regards,
Jie Tan, Daniel Feitosa and Paris Avgeriou

Software Engineering and Architecture group 
Faculty of Science and Engineering
University of Groningen, the Netherlands



Re: Invitation to participate in a survey about Django

2021-03-01 Thread Tom Forbes
I receive a few of these kinds of emails privately (I’m assuming they
scrape my email from git history). On one hand I think it might be
appropriate to post something like this to the developers mailing list if
it were specifically targeted at Django, and it’s definitely good to help
with research.

On the other hand, like many of the other emails I receive this looks to be
an impersonal and auto-generated one (the bold really doesn’t help) with a
completely vague call to action: your requirement to “understand how
technical debt is managed” means nothing to me and doesn’t entice me to
spend time answering 18+ questions on a Google form for you.

The “what’s in it for me” section is also completely meaningless - how will
answering this help me maintain the quality of Django? So to me this is
basically spam and I don’t think it’s appropriate to post here. I’m
replying publicly in case anyone else has any opinions on whether this
constitutes (academic) spam or not.

In the hopes of being constructive I’d suggest perhaps a paragraph
explaining who you are, what the actual concrete work is that you are
doing, how our contributions can help further that and the outcomes you
want to achieve. Because right now you could replace startlingly few words
to turn this into an email about people’s thoughts on the new Pepsi flavor.

Tom

