Hi Vijay,
My apologies if the scenario was not clear. Here are the details:
Let's say there is a model A (with fields, foreign keys and ManyToMany
relationships with other models). There are 10k records for A. Let's say
the following is the pseudocode for the report:
    As = A.objects.all()
    for a in As:
        retrieve other related data from associated models
        write the data to a CSV report file
There's no CPU-intensive work above - it's just fetching data. I had used
select_related and prefetch_related to reduce DB queries - otherwise MySQL
CPU usage was going up.
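For illustration only (this is not the actual report code), the loop has
roughly the shape below. The write_report helper and the column names are
made up for this sketch, and plain tuples stand in for the model data. In
Django the rows could come from something like
A.objects.values_list("id", "name", "category__name").iterator(), which
yields tuples instead of full model instances; those field names are
assumptions too.

```python
import csv
import io


def write_report(rows, out):
    """Stream rows to `out` as CSV, one row at a time.

    `rows` is any iterable of tuples; nothing is accumulated in memory.
    Returns the number of data rows written.
    """
    writer = csv.writer(out)
    writer.writerow(["id", "name", "category"])  # hypothetical columns
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count


# Stand-in data (a generator, so rows are produced lazily).
sample_rows = ((i, "item-%d" % i, "cat-%d" % (i % 3)) for i in range(5))
buf = io.StringIO()
written = write_report(sample_rows, buf)
```

Writing row by row like this keeps memory flat; whether it also lowers the
Python CPU cost depends on how much of the time goes into building model
instances, which I have not measured separately.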
The above was run in Celery as a separate task. The Python process was
hitting almost 100% CPU and the report took more than 300s to generate.
To debug the issue, I stripped the code down to something simpler to narrow
down the cause.
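One way to narrow this down further would be to run the suspect call under
the standard library profiler. This is a minimal sketch, not something I
ran against the real models: profile_call and build_rows are hypothetical
names, and build_rows merely stands in for the real work (for example
list(A.objects.all())).

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, report_text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Show the ten entries with the highest cumulative time.
    stats.sort_stats("cumulative").print_stats(10)
    return result, out.getvalue()


def build_rows(n):
    # Stand-in for the real work being measured.
    return [(i, str(i)) for i in range(n)]


rows, report = profile_call(build_rows, 10000)
```

Printing report shows which functions account for the cumulative time,
which should tell apart time spent waiting on the database from time spent
in Python building objects.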
So, I just did the following:
    As = A.objects.all()
    if As:
        print("hello")
With the above, the CPU was still hitting almost 100% and it took a second
or more before "hello" was printed. I also tried adding select_related and
prefetch_related to check further.
Hence my conclusion that the query itself was causing the CPU spike.
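As far as I understand, testing a queryset in a boolean context ("if As:")
makes Django execute the query and load every row, whereas
A.objects.exists() only asks the database whether at least one row exists.
The sketch below shows the same difference with plain sqlite3 rather than
Django; the table "a" and its columns are made up, and the SQL is only
approximately what the ORM would send.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO a (name) VALUES (?)",
    (("row-%d" % i,) for i in range(10000)),
)

# Roughly what `if As:` does: fetch every row, then test for emptiness.
all_rows = conn.execute("SELECT id, name FROM a").fetchall()
has_rows_full = bool(all_rows)

# Roughly what `A.objects.exists()` does: ask for at most one row.
has_rows_cheap = conn.execute("SELECT 1 FROM a LIMIT 1").fetchone() is not None
```

Both checks give the same answer, but the first one pulls all 10k rows into
Python to get it.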
Hope I am clear.
Thanks,
On Sunday, March 12, 2017 at 6:00:00 AM UTC+5:30, Vijay Khemlani wrote:
>
> "But the CPU usage and time taken are high" <- I'm assuming high
> enough to be problematic for OP.
>
> I'm seriously not following. Why are people suggesting reporting and
> export software when OP hasn't even described the problem in detail.
> It's not even clear whether the high cpu and time taken are due to the
> basic query ("Model.objects.all()") or the further processing of the
> report.
>
> It could easily be a missing "select_related" which causes thousands
> of joins inside a for loop.
>
> On 3/11/17, James Schneider <[email protected]> wrote:
> > On Mar 11, 2017 12:01 PM, "Vijay Khemlani" <[email protected]> wrote:
> >
> > Am I the only one who thinks that generating a report over a set of
> > just 10.000 records could be done in 10 - 20 secs unless there are
> > some serious computations going on with that data?
> >
> > For a report I have to query around 200.000 records, with
> > aggregations, and it takes less than a minute using the ORM.
> >
> >
> > The OP never mentioned a time interval that I can find in this thread,
> > only CPU utilization. I can only imagine that the query is taking long
> > enough to notice the CPU utilization, which would be at least a few
> > seconds.
> >
> > Querying and aggregating 200K records within the DB is not comparable to
> > pulling 10K individual records and performing processing on each one. An
> > ORM call with aggregation will perform a large majority of the work in
> > the DB, and the ORM simply wraps the response accordingly.
> >
> > -James
> >
--
You received this message because you are subscribed to the Google Groups
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/django-users.
To view this discussion on the web visit
https://groups.google.com/d/msgid/django-users/4f3b29f5-b800-4bbc-bdb4-9b3e41ad2656%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.