Re: general interest in faster bulk_update implementation

2022-04-26 Thread Jörg Breitbart

@Florian

Thanks for your response.

Looking through the release notes and the listed databases I find these 
version requirements:

- PostgreSQL 10+
- MariaDB 10.2+
- MySQL 5.7+
- Oracle 19c+
- SQLite 3.9.0+

Compared to the UPDATE FROM VALUES pattern requirements:
- MariaDB 10.3.3+
- MySQL 8.0.19+
- Oracle: currently no implementation at all
- SQLite 3.33+

Thus only PostgreSQL would work out of the box. The question then is
whether to raise the version requirements for Django. IMHO a good
indicator for that might be the age of a DB release, its EOL state, and
whether it is still part of LTS distros:

- MariaDB 10.2: EOL 04/2022
- MariaDB 10.3: released in 04/2018
  (Ubuntu 20.04 LTS is on 10.3 line, 18.04 LTS on 10.1)
  --> probably safe to raise to the 10.3 line?

- MySQL 5.7: EOL 10/2023
- MySQL 8.0: released in 04/2018
  (Ubuntu 20.04 LTS contains 8.0 line, 18.04 LTS on 5.7)
  --> 5.7 is still within its lifetime for another 1.5 years
  --> Drop the old version early here?

- SQLite: 3.22 on ubuntu 18.04, 3.31 on ubuntu 20.04
  --> IMHO 3.33+ cannot be required here, as upgrading sqlite3
  packages is much more of a hassle for people

- others not supported (incl. Oracle):
  Should there be a default fallback within Django? Or should DB vendors
  be asked to implement an abstract interface for UPDATE FROM VALUES?

Especially the last 2 points (SQLite and general DB vendor compat) are
tricky. Here I think a general fallback within the Django ORM might be
the only way to keep DB version issues from surfacing to the user. Not
sure yet how practical/maintainable that would be in the end, as it
would have to provide 2 internal code paths for the same bulk_update API
endpoint:

- fast one, if UPDATE FROM VALUES pattern is supported
- fallback for backends not supporting the fast update pattern
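
To make the fast path a bit more concrete, here is a rough sketch of the
statement shape it would emit, in PostgreSQL flavour. This is illustration
only: build_update_from_values and the table/column names are made up,
nothing like this exists in Django, and it ignores identifier quoting and
type casts that a real implementation would need.

```python
def build_update_from_values(table, pk, fields, row_count):
    """Return a PostgreSQL-style UPDATE ... FROM (VALUES ...) statement.

    Hypothetical helper: identifiers are assumed to be trusted and already
    quoted as needed; the row values are passed separately as parameters.
    """
    columns = [pk, *fields]
    # One "(%s, %s, ...)" placeholder group per row to update.
    row = "(" + ", ".join(["%s"] * len(columns)) + ")"
    values = ", ".join([row] * row_count)
    # Each updated field is taken from the matching column of the VALUES table.
    assignments = ", ".join(f"{f} = v.{f}" for f in fields)
    return (
        f"UPDATE {table} SET {assignments} "
        f"FROM (VALUES {values}) AS v ({', '.join(columns)}) "
        f"WHERE {table}.{pk} = v.{pk}"
    )


# Statement for updating two fields on two rows in one round trip.
sql = build_update_from_values("app_item", "id", ["name", "price"], 2)
print(sql)
```

The point is that all rows travel in a single statement, joined against the
target table by primary key, instead of one big CASE expression per field.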

Thinking about the problem from a DB vendor perspective, it could look
like this in a DB package (just brainstorming atm):

- a flag indicating support for the UPDATE FROM VALUES pattern
- the flag result might be active code, if the DB driver has
  to test support on the server side (that's currently the case for MySQL)
- to provide an easy upgrade path for DB vendors, the flag might be
  missing on the DB backend at first (hasattr is your friend)
- if supported: implementation of an abstract ORM interface

While writing this down it kinda became clear to me that, for an easy
transition of the DB backends, a fallback impl in the ORM would always
be needed. Furthermore, with that flag scheme in the DB backends a strict
version match is not needed at all, as the DB backend could always say
"nope, cannot do that" and the fallback would kick in. This fallback
could be the current bulk_update impl.


The downside of such an approach is clearly the added code complexity.
Furthermore, the ORM would leave its mostly(?) ISO/ANSI grounds and have
to delegate the real SQL creation to vendor-specific implementations in
the backends.



@f-expressions
Yes, the current .bulk_update implementation inherits the expression
support from .update (it kinda passes things along). I am currently not
sure whether that can be mimicked 100% around an optimized implementation
by pre-/post-executing updates for those fields, as col/row ordering
might have weird side effects. I'd first need to run some tests on how
the current implementation deals with field refs while the ref'ed field
itself gets updated before/after the ref usage. (Not even sure that's
guaranteed to always do the same thing across DB engines.)
Pre-/post-handling of F expressions will slow down the code, as those
most likely have to go into the SET clause of a second update statement.
There might be several faster UNION tricks possible here, but I have not
tested those.


Cheers,
jerch

--
You received this message because you are subscribed to the Google Groups "Django 
developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/1defa341-9b2a-e716-0319-ff29582ca77e%40netzkolchose.de.


Re: general interest in faster bulk_update implementation

2022-04-26 Thread Mariusz Felisiak
Support for MariaDB 10.2 is already dropped in Django 4.1. We will drop 
support for MySQL 5.7 in Django 4.2 or 5.0 (probably 4.2).

See https://code.djangoproject.com/wiki/SupportedDatabaseVersions for more 
details.

Best,
Mariusz



Re: Provide a way to pass kwargs when initializing the storage class

2022-04-26 Thread Carlton Gibson
Hi Jarosław. Thanks for picking this up. 

There seems to be enough support for the general idea here, so worth 
pressing on. 

Let's think about any required deprecations on the PR. (It's easier there 
😅). 

Kind Regards,

Carlton

On Sunday, 24 April 2022 at 01:25:28 UTC+2 jaro...@wygoda.me wrote:

>  I'd like to introduce a file storage registry similar to 
> BaseConnectionHandler (django/utils/connection.py) and EngineHandler 
> (django/template/utils.py).
>
> Example settings.py snippet:
>
> STORAGES = {  # rename to FILE_STORAGES to make it more explicit?
>     'example': {
>         'BACKEND': 'django.core.files.storage.FileSystemStorage',
>         'OPTIONS': {
>             'location': '/example',
>             'base_url': '/example/',
>         },
>     },
> }
>
> Changes introduced by this PR are backward compatible. Users can still use
> existing settings to configure static and media storages.
>
> Currently storages can be retrieved from the following objects:
>
> django/core/files/storage.py:
>
> get_storage_class
> DefaultStorage
> default_storage 
>
> django/contrib/staticfiles/storage.py:
>
> ConfiguredStorage
> staticfiles_storage 
>
> What do you think about deprecating them?
>
> https://github.com/django/django/pull/15610
>
> FileField can be tackled in a separate PR.
>
> On Thursday, 12 November 2015 at 11:25:57 UTC+1 ja...@tartarus.org
> wrote:
>
>> On 8 Nov 2015, at 08:31, Marc Tamlyn  wrote: 
>>
>> > I'm definitely in favour of a format allowing multiple storage back 
>> ends referred to by name. For larger sites it is not unusual to manage 
>> files across multiple locations (eg several S3 buckets). The storage param 
>> to FileField would be allowed to be the key, and there would be a 
>> get_storage function like get_cache. 
>>
>> It would remove the asymmetry between the default backends and per-field
>> ones, which does feel a little odd. However I don’t think that’s a strong
>> enough reason to go for something more complicated. Ballooning dictionaries can feel
>> overwhelming when looking at modern Django settings (for instance, the new 
>> templates configuration is more daunting than it used to be), and as 
>> pointed out, overriding is more fiddly. 
>>
>> For testing, you need to be explicit per-field no matter what, so it’s a 
>> change from an instance to a symbolic reference. The instance is probably a 
>> variable anyway by declaration of the test model, which I suspect is 
>> slightly easier to chase. 
>>
>> So I’d be slightly more in favour of the terse, tuple-based syntax. 
>>
>> J 
>>
>> -- 
>> James Aylett 
>> I make: devfort.com, spacelog.org 
>> Films: talktorex.co.uk 
>> Everything else: tartarus.org/james/ 
>>
>>
