I've had a deeper look at this now and think I have an API proposal. First,
the state of supported vendors:

1. All vendors support adding a collation to text/varchar fields.
2. The syntax is more or less the same.
3. However, the collation names themselves are different.
4. PostgreSQL is the only vendor that allows creating custom collations at
runtime.

So I'm thinking we add a new `collation` parameter to `CharField` and
`TextField` that simply takes the collation ID as a string. I'm not quite
sure about the implementation as I don't know the ORM that well, but my naive
approach would be to add a new format string to the `data_types` dict
that is calculated during the field's `__init__()`: either an empty string to
use the default collation, or e.g. ' collate <collation_id>'. There may
well be a better approach.
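
As a rough sketch of that naive approach: the snippet below is plain Python
standing in for a backend's `data_types` dict, so it runs without Django. The
`%(collation)s` slot and the helper names `collation_fragment` and
`column_type` are made up for illustration and are not existing Django APIs.
Identifier quoting is left out on purpose.

```python
# Stand-in for a backend's data_types mapping. The %(collation)s slot is the
# proposed addition; it is an assumption, not something Django defines today.
DATA_TYPES = {
    "CharField": "varchar(%(max_length)s)%(collation)s",
    "TextField": "text%(collation)s",
}


def collation_fragment(collation=None):
    """Return '' for the default collation, else ' COLLATE <collation_id>'."""
    return f" COLLATE {collation}" if collation else ""


def column_type(field_type, collation=None, **params):
    """Render the column type for a field, filling in the collation slot."""
    params["collation"] = collation_fragment(collation)
    return DATA_TYPES[field_type] % params


print(column_type("CharField", max_length=100))
print(column_type("CharField", collation="de_DE", max_length=50))
```

With no collation the slot collapses to an empty string, so existing columns
would render exactly as before.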

By using this - because the collation names are not the same across vendors
- the user is saying "I'm okay with this only working on one database
vendor", so there should be a warning in the docs. There is perhaps some
scope in the future to make this take a callable that can figure out the
collation per-database. This would be useful for getting case-insensitive
lookups working across all backends, for example. But I want to keep that
out of scope for now because it's extra work and I'm not sure about the
implementation.
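
Just to illustrate what such a callable might look like (again out of scope
for the proposal itself): the function names here are hypothetical, the vendor
strings follow Django's `connection.vendor` values, and the PostgreSQL entry
assumes a custom collation called `case_insensitive` has already been created.

```python
def case_insensitive_collation(vendor):
    """Pick a case-insensitive collation name for the given database vendor."""
    names = {
        "mysql": "utf8mb4_unicode_ci",
        "sqlite": "NOCASE",
        # Assumed to be a custom collation the user created beforehand.
        "postgresql": "case_insensitive",
    }
    if vendor not in names:
        raise NotImplementedError(f"No case-insensitive collation known for {vendor!r}")
    return names[vendor]


def resolve_collation(collation, vendor):
    """Accept either a plain collation name or a callable taking the vendor."""
    return collation(vendor) if callable(collation) else collation


print(resolve_collation(case_insensitive_collation, "sqlite"))
print(resolve_collation("de_DE", "postgresql"))
```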

Another downside is that people like to use CharField as a base class for
other column types that might not support collations, but I think it
should be up to the user to make sure they aren't doing that.

We should also add a `CreateCollation` operation for Postgres, similar to
the `CreateExtension` operation that currently exists. If the user wants to
use a custom collation they must create it first, similar to using
extensions currently.
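
A minimal sketch of the SQL such an operation could emit, assuming
PostgreSQL 12+ `CREATE COLLATION` syntax. The class below is a plain-Python
stand-in rather than a real migration operation, and its constructor
arguments and method names are my own guesses at a shape, not a settled API.

```python
class CreateCollation:
    """Hypothetical stand-in for a migration operation; not Django's API."""

    def __init__(self, name, locale, provider="icu", deterministic=True):
        self.name = name
        self.locale = locale
        self.provider = provider
        self.deterministic = deterministic

    def forwards_sql(self):
        """SQL that creates the collation."""
        options = [f"provider = {self.provider}", f"locale = '{self.locale}'"]
        if not self.deterministic:
            # Non-deterministic collations are what allow case-insensitive
            # comparisons in PostgreSQL 12 and later.
            options.append("deterministic = false")
        return f'CREATE COLLATION "{self.name}" ({", ".join(options)})'

    def backwards_sql(self):
        """SQL that reverses the operation."""
        return f'DROP COLLATION "{self.name}"'


op = CreateCollation("case_insensitive", locale="und-u-ks-level2", deterministic=False)
print(op.forwards_sql())
print(op.backwards_sql())
```

The name passed here would then be what the user gives to the field's
`collation` parameter.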

The advantage of this is that users can use collations without having to
write SQL migrations, which I think would be nice. The really nice thing is
the ability to have case-insensitive lookups that work across all database
vendors, rather than only on Postgres as is currently the case. And as I
mentioned in the previous message, Postgres is discouraging our current
method of using the citext extension in favour of this approach.

Cheers,
Tom

On Wed, 27 May 2020 at 14:17, Tom Carrick <t...@carrick.eu> wrote:

> I think it would be useful to be able to create collations and use them
> with model fields. The motivation here is mostly that citext is somewhat
> discouraged <https://www.postgresql.org/docs/12/citext.html> in favour of
> creating a collation. I'm not sure how this would work on other
> databases, or what the API would look like. I didn't see a ticket for it or
> another discussion, but maybe there is one.
>
> Perhaps a Collation class would make sense, and it could be added to a
> Field with a new parameter. I'm not sure how easy it would be to only
> create a collation once, so perhaps it would need a CreateCollation
> migration as well, but I know very little about the internals of migrations.
>
> Cheers,
> Tom
>

