Re: Question regarding a possible bug

2018-11-23 Thread Tim Graham
Generally, "is it a bug?" questions should be directed toward our support 
channels (linked from 
https://code.djangoproject.com/wiki/TicketClosingReasons/UseSupportChannels). 
Otherwise, Trac becomes a second level support channel which adds a lot of 
noise.

On Thursday, November 22, 2018 at 7:10:39 PM UTC-5, bernie2004 wrote:
>
> I am pretty but not 100% sure i found a bug in 2.1.2.
>
> Would it be appropriate to post a description of what i think i found to 
> this mailing-list before starting a ticket or is there a better place for 
> posting things like that?
>
> --
> Bernie
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/02010665-d2e1-43d3-ab66-8a69a5cc2672%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: QuerySet.iterator together with prefetch_related because of chunk_size

2018-11-23 Thread Tim Graham
https://code.djangoproject.com/ticket/29984 suggests to support 
prefetch_related() with QuerySet.iterator(). I accepted the ticket to do 
something and linked back to this discussion.

On Friday, October 26, 2018 at 8:12:02 PM UTC-4, charettes wrote:
>
> Josh, I agree that silently not working is problematic but it has been 
> this way since prefetch_related() was introduced.
>
> Something to keep in mind as well is that silently turning it on would 
> also perform P * C extra queries where P is the number of prefetches 
> requested through prefetch_related() and C the number of chunks the results 
> contains. This is non negligible IMO.
>
> What I'd be in favor off is raising an exception on 
> prefetch_related(*prefetches).iterator() in the next release at least to 
> allow users using this pattern to adjust their code and then ship the 
> optimization with all the warnings related to the interactions between 
> prefetch_related(*prefetches) and iterator(chunk_size) in the next one.
>
> I'm not completely completely against skipping the exception release phase 
> entirely given there might be users out there accessing attributes expected 
> to be prefetched on iterated instances and inadvertently performing tons of 
> queries but the exception phase just feels safer given iterator() is 
> usually used in memory constrained contexts.
>
> Simon
>
> Le vendredi 26 octobre 2018 19:27:55 UTC-4, Josh Smeaton a écrit :
>>
>> I tend to agree with Tobi. Prefetching silently not working on iterator 
>> can be quite confusing, unless you have a good understanding of both APIs. 
>> It might be possible to do what you're asking, but it'd mean that django is 
>> now actually caching the result when it explicitly says it isn't - even if 
>> the result is a much smaller moving cache. Prefetching chunk_size results 
>> per chunk is unlikely to make a material difference to memory usage. Users 
>> are usually concerned about the entire result set of the primary table.
>>
>> I don't know if you can change the API to make these suggested changes 
>> without also impacting how we iterate over result sets - but I'd be 
>> interested in seeing a proof of concept at the very least.
>>
>>
>>
>> On Monday, 15 October 2018 20:41:13 UTC+11, tobias@truffls.com wrote:
>>>
>>> Thank you for your feedback. I would like to answer some statements to 
>>> either convince you or make it more clear, where my idea stems from:
>>>
>>> The fundamental reason why iterator() cannot be used with 
 prefetch_related() is that the latter requires a set of model instance to 
 be materialized to work appropriately which chunk_size doesn't control at 
 all.
 In other words chunk_size only controls how many rows should be fetched 
 from the database cursor and kept into memory at a time. Even when this 
 parameter is used, iterator() will only materialize a single model 
 instance 
 per yield.

>>>  
>>> It should be easily possible to change the involved code of 
>>> ModelIterable to materialize the retrieved rows in batches. After 
>>> materializing the batch / chunk, it could do the prefetching.
>>>  
>>>
 Given that iterator() always ignored prefetch related lookups instead 
 of erroring out when they were specified make me feel like turning such a 
 feature on by default could be problematic as it could balloon the memory 
 usage which is the main reason why iterator is useful anyway.

>>>
>>> I would argue, that users who thoughtlessly applied prefetching together 
>>> with iterator now actually get, what they thought of: less DB query round 
>>> trips traded against a little more memory usage.
>>>
>>> Best,
>>> Tobi
>>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/32bbd4fa-40be-45fb-9008-ac25e9033bae%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


TestCase.setUpTestData in-memory data isolation.

2018-11-23 Thread charettes
Dear developers,

Django 1.8 introduced the `TestCase.setUpTestData()` class method as a mean 
to
speed up test fixtures initialization as compared to using `setUp()`[0].

As I've come to use this feature and review changes from peers using it in
different projects the fact that test data assigned during its execution
couldn't be safely altered by test methods without compromising test 
isolation
has often be the source of confusion and frustration.

While the `setUpTestData` documentation mentions this limitation[1] and 
ways to
work around it by using `refresh_from_db()` in `setUp()` I believe it 
defeats
the whole purpose of the feature; avoiding unnecessary roundtrips to the
database to speed up execution. Given `TestCase` goes through great lengths 
to
ensure database level data isolation I believe it should do the same with 
class
level in-memory data assigned during `setUpTestData`.

In order to get rid of this caveat of the feature I'd like to propose an
adjustment to ensure such in-memory test data isolation.

What I suggest doing is wrapping all attributes assigned during 
`setUpTestData`
in descriptors that lazily return `copy.deepcopy()`ed values on instance
attribute accesses. By attaching the `deepcopy()`'s memo on test instances 
we
can ensure that the reference graph between objects is preserved and thus
backward compatible.

In other words, the following test would pass even if `self.book` is a deep
copy of `cls.book`.

class BookTests(TestCase):
@classmethod
def setUpTestData(cls):
cls.author = Author.objects.create()
cls.book = cls.author.books.create()

def test_relationship_preserved(self):
self.assertIs(self.book.author, self.author)

Lazily returning `deepcopy'ies and caching returned values in `__dict__` à 
la
`cached_property` should also make sure the slight performance overhead this
incurs is minimized.

>From a check against a few projects and Django's test suite[2] I have only
identified a single issue which is that attributes assigned during
`setUpTestData` would now have to be `deepcopy()`able but it shouldn't be
a blocker given `Model` instance are.

In order to allow other possible issues from being identified against 
existing
projects I packaged the proposed feature[3] and made it available on 
pypi[4]. It
requires decorating `setUpTestData` methods but it shouldn't be too hard to
apply to your projects if you want to give it a try.

Given this reaches consensus that this could be a great addition I'd file
a ticket and finalize what I have so far[2].

Thank your for your time,
Simon

[0] https://docs.djangoproject.com/en/1.8/releases/1.8/#testcase-data-setup
[1] 
https://docs.djangoproject.com/en/2.1/topics/testing/tools/#django.test.TestCase.setUpTestData
[2] 
https://github.com/charettes/django/compare/setuptestdata...charettes:testdata
[3] https://github.com/charettes/django-testdata
[4] https://pypi.org/project/django-testdata/

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/7563b4b6-2708-4614-a74a-2b63ecad2f67%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.