Re: Http handler may assign urlresolver-related data to request
I added a patch to https://github.com/django/django/pull/399 -- let me know what you think. If I don't get any negative feedback, I'll commit it before the feature freeze.

Cheers,
Florian

On Tuesday, September 25, 2012 4:00:48 PM UTC+2, Florian Apolloner wrote:
> Hi Benoit,
>
> as a matter of fact I want to add that to 1.5, and I started playing with
> a small testapp to see what's needed:
> https://github.com/apollo13/django-locale-switcher -- My conclusion is
> also that stuffing the resolver_match on the request would be the best
> option. We have another five days till feature freeze and my time is
> currently needed somewhere else, so I probably only have time the last 1 or
> 2 days. If you can open a pull request with a fix + tests + docs I'll
> happily review and apply it. If you don't, I still might be able to do it
> the last days but that depends on my time, so no promises there.
>
> Looking forward to see some code from you,
> Florian

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To view this discussion on the web visit https://groups.google.com/d/msg/django-developers/-/FlEX2jlSnuoJ.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.
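The idea in this thread, having the HTTP handler keep the URL resolver's result on the request, can be sketched without Django at all. Everything below (the `Request` and `ResolverMatch` classes, `resolve`, `handle`, and the URL table) is a hypothetical stand-in chosen for illustration; it shows the shape of the proposal, not the code in the pull request.

```python
import re
from collections import namedtuple

# Hypothetical stand-ins; these are not Django's classes.
ResolverMatch = namedtuple("ResolverMatch", "func kwargs url_name")


class Request(object):
    def __init__(self, path):
        self.path = path
        self.resolver_match = None  # filled in by the handler


def article_detail(request, slug):
    # The view can see how it was reached through request.resolver_match.
    return "%s via %s" % (slug, request.resolver_match.url_name)


URLPATTERNS = [
    (re.compile(r"^/articles/(?P<slug>[\w-]+)/$"), article_detail, "article-detail"),
]


def resolve(path):
    """Return a ResolverMatch for the first pattern that matches path."""
    for pattern, func, name in URLPATTERNS:
        match = pattern.match(path)
        if match:
            return ResolverMatch(func, match.groupdict(), name)
    raise LookupError("no pattern matches %r" % path)


def handle(request):
    # Resolve once, remember the result on the request, then call the view.
    request.resolver_match = resolve(request.path)
    match = request.resolver_match
    return match.func(request, **match.kwargs)
```

With the match stored on the request, later code such as a locale switcher could rebuild the current URL from the stored name and keyword arguments instead of resolving the path a second time.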
Schema Alteration update
So, the patch [1] is looking alright, but after some consideration I think it's going to be best to leave this until just after the 1.5 branch has happened and then merge it in as part of the 1.6 cycle.

My reasoning is thus:

 - The whole point of getting something into 1.5 was so I could build migrations as a third-party app. Since I now plan to make it more tightly integrated, that's no longer useful.
 - The current patch is a bit ugly with the way it deals with AppCache - discussion has led to a better, less hacky solution of implementing models in a way where they can register into a different appcache.
 - It makes more sense to have the schema alteration, Field API changes and migration runners all come out at once as otherwise it's not really a user-facing solution.

Thus, I'd like to wait until 1.5 is branched off, and then merge in an improved patch which has better support for making models that don't register themselves, following that up at some point in the 1.6 cycle with patches that implement a new field description API (so fields can be serialised in a more human-readable way than pickle) and then finally the migration runner/dependency solver, meaning 1.6 would ship with a full migrations system in place.

Input on this plan is welcome - I just didn't want to rush something in when it can be more polished and useful next release! It also means that this can hopefully be integrated more smoothly with app-loading when that lands.

Andrew
Feature request: collectstatic shouldn't recopy files that already exist in destination
Hey all!

This is a feature request / proposal (one which I'm willing to build out, given that I've already developed a solution for my own uploader).

I run a consulting business that helps small startups build initial MVPs. When the time ultimately comes to deciding how to store static assets, my preference (as is that of many others) is to use Amazon S3, with Amazon CloudFront acting as a CDN to the appropriate buckets. For the purposes of this ticket, s/S3/your object storage of choice/g.

Now, Django has an awesome mechanism for making sure static assets are up on S3. With the appropriate static storage backend, running ./manage.py collectstatic just searches through your static media folders and copies the files.

The problem I've run into is that collectstatic copies all files, regardless of whether they already exist on the destination. Uploading 5-10MB of files is pretty wasteful (and time consuming) when none of the files have changed and no new files have been added.

As I wrote in the trac ticket (https://code.djangoproject.com/ticket/19021), my current solution was to write a management command that does essentially the same thing that collectstatic does. But, there are a few differences. Here's a rundown (copied straight from the trac ticket).

> I currently solve this problem by creating a file containing metadata of
> all the static media at the root of the destination. This file is a JSON
> object that contains file paths as keys and checksum as values. When an
> upload is started, the uploader checks to see if the file path exists as a
> key in the dictionary. If it does, it checks to see if the checksums have
> changed. If they haven't changed, the uploader skips the file. At the end
> of the upload, the checksum file is updated on the destination.

I'll contribute the patch. I know there is not a lot of time before the feature freeze, but I'm willing to make this happen if there's interest.

If we don't want to change behavior, perhaps adding a flag such as --skip-unchanged-files to the collectstatic command is the way to go?

All the best,
Dan
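The checksum-manifest approach described in the ticket excerpt can be sketched roughly as follows. This is a minimal illustration, not the management command itself: the function names, the `upload` callable, and the choice of SHA-256 are assumptions, and the manifest is kept at a local path here, whereas the message stores it at the root of the destination.

```python
import hashlib
import json
import os


def file_checksum(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sync_with_manifest(source_files, manifest_path, upload):
    """Upload only files whose checksum differs from the stored manifest.

    source_files maps destination paths to local paths; upload is a
    hypothetical callable (dest_path, local_path) supplied by the caller.
    Returns the list of destination paths that were uploaded.
    """
    manifest = {}
    if os.path.exists(manifest_path):
        with open(manifest_path) as handle:
            manifest = json.load(handle)

    uploaded = []
    for dest_path, local_path in sorted(source_files.items()):
        checksum = file_checksum(local_path)
        if manifest.get(dest_path) == checksum:
            continue  # unchanged since the last run, skip it
        upload(dest_path, local_path)
        manifest[dest_path] = checksum
        uploaded.append(dest_path)

    # Write the updated manifest back once, at the end of the run.
    with open(manifest_path, "w") as handle:
        json.dump(manifest, handle, indent=2, sort_keys=True)
    return uploaded
```

Reading and writing the manifest once per run is the point of the design: the per-file decision needs no network round trip.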
Re: New predicate functionality for Q objects
So a number of issues have come up in the review of this feature that I'd like to summarize here.

The first boils down to the feature basically being another case of object-relational impedance mismatch:

http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch

That is, a lookup matching against an instance in memory will never completely mimic the behavior of that lookup when generating a DB query through the ORM. Most of the common use cases can probably be handled, but there are probably a number of edge cases lurking that would add some degree of maintenance burden were this feature to land.

The second issue is that this adds more code that relies on a currently imperfect lookup system. Current query_terms, and now, for predicates, matching_functions live as part of the query object - a non-public object that, in part, determines what is a valid lookup. This all could be refactored to be more associated with the fields themselves. This is mostly an internal issue, as I feel the feature could always be refactored to take advantage of improvements to the internals without changing the public API of the predicate feature. While explicit manager passing would probably no longer be required with improved internals, it could still be supported/ignored, and the explicit use of a manager is already in and of itself a pretty uncommon edge case requirement in the current implementation. For a ticket that tracks the lookup refactoring ideas, see:

https://code.djangoproject.com/ticket/16187

Finally - along with the ORM mismatch issue, there exists some potential for abuse of this feature that would result in very poor performance. The question is whether the documentation can make this clear enough that reasonable people won't shoot themselves in the foot. I'm not overly concerned about giving people enough rope to hurt themselves - but even a reasonable dev not doing any profiling could find themselves using this feature with either large arrays of instances, or with extensive conditions that follow deep relationships where the ORM and querysets would do a better job leveraging your database. Exactly where that line exists depends on too many factors to lay out clearly in documentation.

So that is an update on where this feature stands. If it does not make it into 1.5, I can probably factor out most of the improvements back into the external django-predicate package, with a couple of somewhat ugly hacks to support the GIS related matches. There was some brief debate on IRC about how Django can best unearth edge case bugs in features that are candidates for inclusion into core, without the exposure that being in core offers.

Thanks to everyone who has participated in the discussion so far - most of which is occurring on the PR, not the ticket:

https://github.com/django/django/pull/388

-Preston

On Saturday, September 22, 2012 12:55:26 PM UTC-7, ptone wrote:
> I've implemented predicate test like functionality for Q objects.
>
> https://code.djangoproject.com/ticket/18931
>
> In brief, this lets you define a condition in a Q object and then test
> whether a model instance matches the condition.
>
> I believe this to be a relatively complete patch, and would appreciate any
> review people can offer.
>
> To be clear, a review of the documentation as a user is also helpful, so
> don't be shy ;-)
>
> I'm hoping to land this for 1.5
>
> Thanks,
>
> -Preston
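To make the "lookup against an in-memory instance" idea concrete, here is a small sketch in plain Python. It is not the patch: the `LOOKUPS` table and the `matches` function are hypothetical names, and only a handful of lookups are mapped. It also hints at the mismatch discussed above, since `icontains` here follows Python's case folding while a database would follow its own collation rules.

```python
import operator

# A few lookup names mapped to plain Python comparisons. This table is
# illustrative only; it is not the patch's matching_functions registry.
LOOKUPS = {
    "exact": operator.eq,
    "gt": operator.gt,
    "lt": operator.lt,
    "icontains": lambda value, needle: needle.lower() in value.lower(),
}


def matches(instance, **conditions):
    """Return True if instance satisfies every 'attr__lookup=value' condition."""
    for key, expected in conditions.items():
        parts = key.split("__")
        # A trailing known lookup name is the comparison; default to exact.
        lookup = parts.pop() if parts[-1] in LOOKUPS else "exact"
        value = instance
        for name in parts:  # follow relations attribute by attribute
            value = getattr(value, name)
        if not LOOKUPS[lookup](value, expected):
            return False
    return True
```

Each `getattr` on a relation would be a potential extra query on a real model instance, which is one way the performance concern about deep relationships could show up in practice.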
Re: Feature request: collectstatic shouldn't recopy files that already exist in destination
I like this feature and have recently been thinking of implementing something like it myself. +1 for the feature request.

On 9/27/12, Dan Loewenherz wrote:
> Hey all!
>
> This is a feature request / proposal (one which I'm willing to build out,
> given that I've already developed a solution for my own uploader).
>
> [...]

--
Sent from my mobile device
Re: Django Oracle backend vs. numbers
Hi all,

On Sunday 23 September 2012 15:05:21 Anssi Kääriäinen wrote:
> Doing final polish for Ian's patch and providing benchmark results for it
> will get this patch closer to commit.

I made some initial benchmarks, and they do not reflect the change I've seen in the actual application. I don't yet know what to make of this.

With this model:

    from django.db import models, transaction

    class AllSorts(models.Model):
        start = models.IntegerField()
        square = models.IntegerField()
        sqrt = models.FloatField()
        cubicroot = models.DecimalField(decimal_places=20, max_digits=25)
        name = models.CharField(max_length=100)
        slug = models.SlugField()

Calling this function for fetching:

    @transaction.commit_on_success
    def getAllSorts():
        errors = 0
        print "Getting %d objects" % AllSorts.objects.all().count()
        for alls in AllSorts.objects.all():
            if alls.cubicroot.is_nan():
                errors += 1
        print "Got %d errors in the decimal" % errors

I made 10,000 records. Getting them all (and making sure the decimal did not return a NaN) takes ~1.95 seconds on the old code, ~1.75 seconds on the new code, and ~1.6 seconds on the new-new code (where we use a new float_is_enough option, which tells the backend to only use Decimals for DecimalFields).

Adding a calculated expression:

    @transaction.commit_on_success
    def getAllSorts():
        errors = 0
        print "Getting %d objects" % AllSorts.objects.all().count()
        for alls in \
                AllSorts.objects.all().extra(select={'e':'"START"+(2*"CUBICROOT")'}):
            if alls.cubicroot.is_nan():
                errors += 1
        print "Got %d errors in the decimal" % errors

Did not change the results significantly, despite my expectations.

I'm not sure yet what we did in the application that made this make a huge difference. I see a significant difference here, but not an amazing one (though ~20% is nothing to scoff at). Perhaps making more requests of fewer rows, counter-intuitively, makes more of a difference. Perhaps more strings. But I know I saw a much bigger difference than this.

The new-new code (sans documentation, at this point) is available for your review and benchmarking as Pull Request 402, or the oracle-float-exprs branch in my fork. The benchmarking project is attached to this message.

Thanks for your attention,
Shai.

bench.tgz Description: application/compressed-tar
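As a rough way to see why skipping Decimal conversion might matter, the conversion step can be timed on its own, away from any database. This sketch is an assumption-laden illustration, not the attached benchmark project: it involves neither the Oracle backend nor the `float_is_enough` option, and the names `as_decimals`, `as_floats` and `best_time` are made up for the example.

```python
import timeit
from decimal import Decimal

# 10,000 cube roots as strings, loosely mirroring the size of the test table.
RAW_VALUES = [str(n ** (1.0 / 3)) for n in range(1, 10001)]


def as_decimals(values):
    """Convert every raw value to Decimal."""
    return [Decimal(v) for v in values]


def as_floats(values):
    """Convert every raw value to float."""
    return [float(v) for v in values]


def best_time(func, repeat=5):
    """Best wall-clock time of several single runs, in seconds."""
    return min(timeit.repeat(lambda: func(RAW_VALUES), number=1, repeat=repeat))


if __name__ == "__main__":
    print("Decimal: %.4fs" % best_time(as_decimals))
    print("float:   %.4fs" % best_time(as_floats))
```

Numbers from a micro-benchmark like this say nothing about round trips or row fetching, so they would not explain the larger difference seen in the application.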
Re: Feature request: collectstatic shouldn't recopy files that already exist in destination
On Thu, Sep 27, 2012 at 6:51 PM, Dan Loewenherz wrote:
> Hey all!
>
> This is a feature request / proposal (one which I'm willing to build out,
> given that I've already developed a solution for my own uploader).
>
> [...]
>
> I'll contribute the patch. I know there is not a lot of time before the
> feature freeze, but I'm willing to make this happen if there's interest.

Yes, please. I've been wanting this myself. This should IMHO be the default for collectstatic, with a flag to *force* copying all files.

--
Anders Steinlein
Eliksir AS
http://e5r.no
E-mail: and...@e5r.no
Mobile: +47 926 13 069
Twitter: @asteinlein
Re: Feature request: collectstatic shouldn't recopy files that already exist in destination
Good idea, but shouldn't it be a per-storage thing? Perhaps this could be done with a couple of callbacks in the collectstatic run:

 - Before collectstatic starts, so the storage backend can pick up its inventory from the remote
 - One called for each file that would be copied, and that can veto a copy operation
 - At the end of the collectstatic run, to update the inventory

That would make things customisable even for strange setups like storing files in a database or whatnot?

Cheers,
mjl
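The three callbacks proposed above could look something like the following sketch. None of these hooks exist in Django; `InventoryStorageHooks`, `pre_collect`, `should_copy`, `post_collect` and `collect` are hypothetical names, and a plain dict stands in for whatever the remote inventory would really be.

```python
class InventoryStorageHooks(object):
    """Hypothetical hook set; none of these methods exist in Django."""

    def __init__(self, remote_inventory):
        self._remote = remote_inventory  # stands in for a fetch from the remote
        self.inventory = {}

    def pre_collect(self):
        # Called once before the run: load the remote inventory.
        self.inventory = dict(self._remote)

    def should_copy(self, path, fingerprint):
        # Called per file: returning False vetoes the copy.
        return self.inventory.get(path) != fingerprint

    def post_collect(self, copied):
        # Called once at the end: record what was copied.
        self.inventory.update(copied)
        self._remote.update(self.inventory)


def collect(files, hooks, copy):
    """files maps path -> fingerprint; copy is a caller-supplied callable."""
    hooks.pre_collect()
    copied = {}
    for path, fingerprint in sorted(files.items()):
        if hooks.should_copy(path, fingerprint):
            copy(path)
            copied[path] = fingerprint
    hooks.post_collect(copied)
    return sorted(copied)
```

Because the veto lives in the hook object rather than in the command, a backend could base it on checksums, timestamps, or a database row without the collecting loop changing.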
Re: Feature request: collectstatic shouldn't recopy files that already exist in destination
On Thu, Sep 27, 2012 at 12:51 PM, Dan Loewenherz wrote:
> The problem I've run into is that collectstatic copies all files,
> regardless of whether they already exist on the destination.

No, as noted in the ticket, which has been closed needsinfo, staticfiles already only copies modified files. And I don't think that's a recently-added feature, so I suspect the behavior you are seeing has more to do with the S3 side than the staticfiles code. This post:

http://stackoverflow.com/questions/6618013/django-staticfiles-and-amazon-s3-how-to-detect-modified-files

looks enlightening, though even that is not all that new. It sounds like the problem here is efficiently getting the last-modified time for the files out in S3, not with staticfiles.

Karen
Re: Feature request: collectstatic shouldn't recopy files that already exist in destination
Just updated the ticket.

As I commented, the heuristic for checking if a file has been modified lies in line 282 of collectstatic.py:

    if not prefixed_path in self.copied_files:
        return self.log("Skipping '%s' (already copied earlier)" % path)

https://github.com/django/django/blob/master/django/contrib/staticfiles/management/commands/collectstatic.py#L282

This seems off, since a path may stay the same but a file's contents may change.

Also, the existing functionality doesn't work when multiple people are running collectstatic (or the same person is running it on multiple computers).

Dan

On Thu, Sep 27, 2012 at 3:12 PM, Karen Tracey wrote:
> [...]
Re: Feature request: collectstatic shouldn't recopy files that already exist in destination
Hi Dan,

On 09/27/2012 04:47 PM, Dan Loewenherz wrote:
> Just updated the ticket.
>
> As I commented, the heuristic for checking if a file has been modified
> lies in line 282 of collectstatic.py:
>
>     if not prefixed_path in self.copied_files:
>         return self.log("Skipping '%s' (already copied earlier)" % path)
>
> https://github.com/django/django/blob/master/django/contrib/staticfiles/management/commands/collectstatic.py#L282
>
> This seems off, since a path may stay the same but a file's contents may
> change.

That's not checking whether the file has been modified, that's checking whether the same source file was previously copied in the same collectstatic run (due to overlapping/duplicate file sources of some kind).

The check for modification date is up on line 234 in the delete_file method, which is called by both link_file and copy_file.

Carl
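A modification-date check of the kind Carl describes can be illustrated on the local filesystem. This is a simplified sketch, not the code in collectstatic.py: `copy_if_newer` is a hypothetical helper, and it talks to `os` and `shutil` directly rather than to a storage backend.

```python
import os
import shutil


def copy_if_newer(source, target):
    """Copy source over target unless target is at least as new.

    Returns True if a copy happened. Only modification times are compared,
    never the contents.
    """
    if os.path.exists(target):
        if os.path.getmtime(target) >= os.path.getmtime(source):
            return False  # target is up to date by timestamp, skip it
        os.remove(target)
    # copy2 also copies the modification time, so a repeat call is a no-op.
    shutil.copy2(source, target)
    return True
```

On a remote store the `getmtime` call on the target becomes a network request per file, which is where the cost mentioned elsewhere in this thread would come from.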
Re: Schema Alteration update
On Wed, Sep 26, 2012 at 7:14 PM, Andrew Godwin wrote:
> So, the patch [1] is looking alright, but after some consideration I think
> it's going to be best to leave this until just after the 1.5 branch has
> happened and then merge it in as part of the 1.6 cycle.
>
> My reasoning is thus:
>
> - The whole point of getting something into 1.5 was so I could build
> migrations as a third-party app. Since I now plan to make it more tightly
> integrated, that's no longer useful.

Have I missed part of the discussion here? At DjangoCon, South was still going to exist (as the "smarts" part of the problem) -- has this changed?

> - The current patch is a bit ugly with the way it deals with AppCache -
> discussion has led to a better, less hacky solution of implementing models
> in a way where they can register into a different app cache.

This point by itself strikes me as a good reason to punt to 1.6; Preston's patch for the app refactor is *almost* ready, and it's going to have some pretty profound implications for this point.

> - It makes more sense to have the schema alteration, Field API changes and
> migration runners all come out at once as otherwise it's not really a
> user-facing solution.

Agreed.

> Thus, I'd like to wait until 1.5 is branched off, and then merge in an
> improved patch which has better support for making models that don't
> register themselves, following that up at some point in the 1.6 cycle with
> patches that implement a new field description API (so fields can be
> serialised in a more human-readable way than pickle) and then finally the
> migration runner/dependency solver, meaning 1.6 would ship with a full
> migrations system in place.
>
> Input on this plan is welcome - I just didn't want to rush something in when
> it can be more polished and useful next release! It also means that this can
> hopefully be integrated more smoothly with app-loading when that lands.

I has a sad, but I understand why. Sounds like a reasonable approach to me.

Russ %-)
Re: Schema Alteration update
On Fri, Sep 28, 2012 at 4:55 AM, Russell Keith-Magee wrote:
> Have I missed part of the discussion here? At DjangoCon, South was
> still going to exist (as the "smarts" part of the problem) -- has this
> changed?

Obviously nothing's really decided, but I've been asking Andrew to push for getting a full-blown solution into core. I just don't see the point in beating around the bush any more. The lack of a built-in schema migration tool in Django is a major wart, so let's just fix it.

Jacob
Re: Schema Alteration update
On Fri, Sep 28, 2012 at 8:29 AM, Jacob Kaplan-Moss wrote:
> On Fri, Sep 28, 2012 at 4:55 AM, Russell Keith-Magee wrote:
>> Have I missed part of the discussion here? At DjangoCon, South was
>> still going to exist (as the "smarts" part of the problem) -- has this
>> changed?
>
> Obviously nothing's really decided, but I've been asking Andrew to
> push for getting full-blown solution into core. I just don't see the
> point in beating around the bush any more. The lack of a built-in
> schema migration tool in Django is a major wart, so let's just fix it.

No disagreement from me that this is a big wart that needs to be addressed. I just missed the memo about us merging the smarts bit into trunk.

Russ %-)
Re: Feature request: collectstatic shouldn't recopy files that already exist in destination
On Thu, Sep 27, 2012 at 4:13 PM, Carl Meyer wrote:
> [...]
>
> That's not checking whether the file has been modified, that's checking
> whether the same source file was previously copied in the same
> collectstatic run (due to overlapping/duplicate file sources of some kind).
>
> The check for modification date is up on line 234 in the delete_file
> method, which is called by both link_file and copy_file.

Thanks, I missed that.

I still see an issue here. In any sort of source control, when a user updates their repo, local files that were updated remotely show up as modified at the time the repo is cloned or updated, not when the file was actually last saved by the last author. You then have the same scenario I pointed to earlier: when multiple people work on a project, they will re-upload the same files multiple times.

Don't get me wrong here--I'm happy to see there is some sort of check here to avoid collectstatic'ing the same file, but I think it's warranted to push back on the use of last modified as the heuristic. I think using a checksum would work much better, and would solve the problem this is trying to solve in a much more foolproof way.

With all this said, I don't think this logic belongs in a "delete_file" method. IMO it'd be better to separate out the logic of "does this file already exist, if so, skip it" from "delete this file if it exists on the target" (delete_file).

@Karen--thanks for digging up that SO post. It actually is super relevant here. As mentioned, when uploading files to S3, it's quite time consuming to perform a network round trip to pick up file metadata (such as last modified time) for each file you're uploading. This was the initial reason I chose to store this data in a single location that I'd only need to grab once.

Dan
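The difference between the two heuristics can be demonstrated in a few lines. This sketch simulates a fresh checkout by bumping a file's modification time without touching its bytes; `sha256_of` and `compare_after_touch` are hypothetical helpers, and SHA-256 is just one possible choice of checksum.

```python
import hashlib
import os
import tempfile


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def compare_after_touch(content=b"body { margin: 0; }"):
    """Return (mtime_changed, checksum_changed) after bumping a file's mtime."""
    handle, path = tempfile.mkstemp()
    try:
        os.write(handle, content)
        os.close(handle)
        old_mtime, old_sum = os.path.getmtime(path), sha256_of(path)
        # Simulate a fresh checkout: same bytes, newer timestamp.
        os.utime(path, (old_mtime + 3600, old_mtime + 3600))
        return (os.path.getmtime(path) != old_mtime, sha256_of(path) != old_sum)
    finally:
        os.remove(path)
```

A timestamp-based check would treat this file as modified and upload it again, while a checksum-based check would skip it.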
Re: Feature request: collectstatic shouldn't recopy files that already exist in destination
+1 I like this implementation.