Hi Vijay,

My apologies if the scenario was not clear. Here are the details:

Let's say there is a model A (with fields, foreign keys, and many-to-many 
relationships with other models), and there are 10k records of A. The 
following is pseudocode for the report:

As = A.objects.all()

for a in As:
    # retrieve other related data from associated models
    # write the data to a CSV report file
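In case it helps to see the shape of it, the loop above can be written to stream rows straight to the CSV writer instead of accumulating anything in memory. This is only an illustrative sketch: the column names are hypothetical, and in Django the `rows` iterable would typically come from something like `A.objects.values_list(...).iterator()`.

```python
import csv
import io


def write_report(rows, out):
    """Stream rows to a CSV file object without building a big list.

    `rows` can be any iterable -- e.g. the result of a Django
    values_list(...).iterator() call (names here are hypothetical).
    """
    writer = csv.writer(out)
    writer.writerow(["id", "name"])  # hypothetical header columns
    for row in rows:
        writer.writerow(row)


# Usage with a generator standing in for a queryset iterator:
buf = io.StringIO()
write_report(((i, "item%d" % i) for i in range(3)), buf)
```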

There's no CPU-intensive work above; it's just fetching data. I had used 
select_related and prefetch_related to reduce the number of DB queries, 
since otherwise MySQL CPU usage was going up.

The above was run as a separate Celery task. Python CPU usage was hitting 
almost 100% and the report was taking more than 300 seconds to generate.

To debug the issue, I stripped the code down to something minimal to narrow 
down the cause.

So, I just did the following:

As = A.objects.all()

if As:
    print "hello"

Even with just this, the CPU was hitting almost 100% and it took a second 
or more before "hello" was printed. I also tried select_related and 
prefetch_related to check further.

Hence the conclusion that the query itself was causing the CPU spike.
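One thing worth noting about that test: in Django, `if As:` calls the queryset's `__bool__`, which evaluates and caches the *entire* result set, whereas `As.exists()` issues a cheap `SELECT ... LIMIT 1`. The toy stand-in below (a hypothetical `FakeQuerySet` class, not real Django code) illustrates the difference in how many rows each approach pulls:

```python
class FakeQuerySet:
    """Toy stand-in for a Django QuerySet that counts how many rows a
    given operation pulls from the 'database' (illustrative only)."""

    def __init__(self, rows):
        self._rows = rows
        self.fetched = 0  # rows pulled so far

    def __bool__(self):
        # What `if qs:` triggers: Django's QuerySet.__bool__ evaluates
        # and caches the ENTIRE result set.
        cache = list(self._rows)
        self.fetched = len(cache)
        return len(cache) > 0

    def exists(self):
        # Roughly what QuerySet.exists() does: fetch at most one row
        # (SELECT ... LIMIT 1).
        for _ in self._rows:
            self.fetched = 1
            return True
        return False


cheap = FakeQuerySet(range(10000))
cheap.exists()
print(cheap.fetched)    # 1

costly = FakeQuerySet(range(10000))
bool(costly)
print(costly.fetched)   # 10000
```

So even if the full fetch turns out to be unavoidable for the report itself, an emptiness check should use `.exists()` rather than truth-testing the queryset.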

Hope I am clear.

Thanks,


On Sunday, March 12, 2017 at 6:00:00 AM UTC+5:30, Vijay Khemlani wrote:
>
> "But the CPU usage and time taken are high" <- I'm assuming high 
> enough to be problematic for OP. 
>
> I'm seriously not following. Why are people suggesting reporting and 
> export software when OP hasn't even described the problem in detail. 
> It's not even clear whether the high cpu and time taken are due to the 
> basic query ("Model.objects.all()") or the further processing of the 
> report. 
>
> It could easily be a missing "select_related" which causes thousands 
> of joins inside a for loop. 
>
> On 3/11/17, James Schneider <[email protected]> wrote: 
> > On Mar 11, 2017 12:01 PM, "Vijay Khemlani" <[email protected]> wrote: 
> > 
> > Am I the only one who thinks that generating a report over a set of 
> > just 10.000 records could be done in 10 - 20 secs unless there are 
> > some serious computations going on with that data? 
> > 
> > For a report I have to query around 200.000 records, with 
> > aggregations, and it takes less than a minute using the ORM. 
> > 
> > 
> > The OP never mentioned a time interval that I can find in this thread, 
> only 
> > CPU utilization. I can only imagine that the query is taking long enough 
> to 
> > notice the CPU utilization, which would be at least a few seconds. 
> > 
> > Querying and aggregating 200K records within the DB is not comparable to 
> > pulling 10K individual records and performing processing on each one. An 
> > ORM call with aggregation will perform a large majority of the work in 
> the 
> > DB, and the ORM simply wraps the response accordingly. 
> > 
> > -James 
> > 

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/django-users.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-users/4f3b29f5-b800-4bbc-bdb4-9b3e41ad2656%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
