Hello everyone. I started using Django about a week ago. I have a particular app which more or less accepts CSV files, Excel files, DBase files, and whatever else, takes the uploaded file, converts it to a tab-delimited format, and then does statistical work over it.
Originally this application was written in PHP by someone else. I took it over and rewrote it in Rails. Rails didn't have good libraries (I had to call out to Python programs anyway), and it was slow, so I rewrote it in Python, in the Pylons web framework. As I said, about a week ago I started using Django, and I more or less directly converted this app from Pylons to Django.

Pretty much all the web app itself does is display a form to upload the file, accept the upload, and then pass the file off to an outside Python library which does the conversion to tab-delimited format. Once that is done, the framework (in this case a Django view function) takes control again and does the statistical work over the tab-delimited file. The code for the Pylons and Django versions is more or less identical. There are obviously small changes here and there to fit each framework, but the controller or view code has changed very little.

Monday I got the entire app converted over to Django. I uploaded my first file, a DBase file, and immediately noticed that it was taking *forever*. After 24 seconds, I finally got my stats page. This same process takes 3 seconds in Pylons. Something definitely wasn't right here - my Django app was actually slower than both my Rails app and the PHP app. So I started doing some tests to isolate the problem.

I brought the issue up in #django on freenode IRC, and someone immediately suggested that the problem might be the dev server. Knowing that the dev server very well could be at fault, I took a little script and got my Django app running on the exact same Paste WSGI server that my Pylons app was running on. Again, it took 24 seconds to process this 6 MB file.
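For readers who want to isolate this kind of slowdown, one approach is to time each stage of the request (upload, conversion, statistics) separately. The following is only a minimal sketch and not the code from either version of the app: `timed`, `handle_upload`, and the three callables passed in are hypothetical names, and the stand-in stages in the demo do no real work.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(stage, results):
    """Add the wall-clock duration of the enclosed block to results[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        results[stage] = results.get(stage, 0.0) + elapsed


def handle_upload(read_upload, convert_to_tab, compute_stats):
    """Run the three stages in order and time each one separately.

    The three arguments are hypothetical stand-ins for: reading the uploaded
    file from the request, the outside conversion library, and the
    statistical work done in the view.
    """
    results = {}
    with timed("upload", results):
        raw = read_upload()
    with timed("convert", results):
        tab = convert_to_tab(raw)
    with timed("stats", results):
        stats = compute_stats(tab)
    return stats, results


if __name__ == "__main__":
    # Trivial stand-in stages, only to show how the helper is called.
    stats, results = handle_upload(
        lambda: b"a\tb\nc\td\n",
        lambda raw: raw.decode("ascii"),
        lambda tab: {"rows": tab.count("\n")},
    )
    for stage, seconds in results.items():
        print("%-8s %.6f s" % (stage, seconds))
```

If one stage accounts for nearly all of the elapsed time under both frameworks' servers, that stage is the place to look first.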
Continuing this testing this morning, I realized that a very good way to find exactly where the bottleneck was would be to cut out the uploading process altogether - if the process finished very quickly on a file that was already on the local filesystem, then the problem lay within Django's actual upload process. Sure enough, when I ran the process on an already-uploaded file, it took 3 seconds. So uploading the file was taking 21 out of 24 seconds.

Again I brought this up in the IRC chat. Someone told me that nobody was going to take me seriously because I wasn't running Django the 'preferred' way, i.e., on mod_python/Apache. I really didn't think this was the limiting factor, but I installed Apache and mod_python and got it all set up anyway. Again, it took 24 seconds for this file to upload and process. That was 8x as long as my Pylons app running on its dinky little WSGI Python server.

At this point I was able to narrow down the issue:

* it had to do with Django's upload process
* it was an equal problem on any server, whether Django's dev server, the Paste server, or Apache

I ran some profiling in order to narrow the problem even further. The first link is a profile of the view that displays the form. This view actually doesn't do much; as I said, it pretty much just displays the form. When the form is submitted via a POST request, it is sent to a second view (the second link). This is where the upload takes place, the processing happens, and the stats are finally displayed.

http://paste.e-scribe.com/1564/
http://paste.e-scribe.com/1565/

Someone suggested that an already pending patch would fix the problem. Ticket 1484, which has been superseded by Ticket 2070 (http://code.djangoproject.com/ticket/2070), has to do with streaming uploads. This afternoon I applied the most recent patch in Ticket 2070, and surprisingly, although the patch worked, it didn't have any effect on the upload issue. Still the same 24 seconds.
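For anyone wanting to produce a profile similar to the ones linked above, here is a minimal sketch using the standard library's `cProfile` and `pstats`. It is not the script used for those pastes: `profile_call` and `fake_upload_view` are hypothetical names, and the fake view only reads an in-memory payload in chunks as a stand-in for the real POST view.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, report_text).

    The report lists the top entries sorted by cumulative time, which is
    usually the quickest way to see which call dominates a slow request.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


def fake_upload_view(payload, chunk_size=64 * 1024):
    """Hypothetical stand-in for the POST view: read the body in chunks."""
    stream = io.BytesIO(payload)
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
    return total


if __name__ == "__main__":
    # Profile the stand-in view on a 1 MB in-memory payload.
    size, report = profile_call(fake_upload_view, b"x" * (1024 * 1024))
    print("bytes read:", size)
    print(report)
```

In a real view, the function passed to `profile_call` would be the view itself (or the part of it that handles the upload), so that the request-parsing code shows up in the report.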
I also discovered some other strange stuff. The 6 MB file which I had been uploading was a DBase file. I uploaded a 7 MB Excel file, and it took 17 seconds. I uploaded a 1 MB Excel file and it took 2 seconds. I tried to upload a 13 MB CSV file and it was at 70+ seconds and still not finished. There doesn't seem to be any common pattern in all this. The file type really shouldn't make any difference, because as I said earlier, both my Pylons app and my Django app use the same outside library in the same way to convert the file.

So I'm a bit stuck here. I'd love to use Django, but I cannot have it running 3x slower than another Python framework. We do a lot of file processing here. Hopefully with all this data someone will be able to come up with some idea of what the problem might be and what solution can be applied.

Thanks,
jp