-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi all,
In the spirit of making Django behave better as a Python library (c.f. Glyph's keynote at djangocon.us), I'd like to finally tackle removing the sys.path hacks in django.core.management.setup_environ. I'll give the full detailed rundown here on the current behavior and how I propose to change it. Fortunately, the fix isn't that complicated, and I think it's a no-brainer. For the impatient, you can skip straight to my proposed patch [2]. The tl;dr summary is that I think with some small changes to manage.py and the default project layout, we can fix some very common bugs and deployment problems, dramatically reduce the extent to which manage.py is "unknown magic" for newcomers to Django, and take a significant step towards being "just a Python library." This issue is tracked in ticket #15372 [1], though there's a lot more detailed info in this message than on that ticket. The problem =========== Right now "django-admin.py startproject mysite" generates this: mysite/ __init__.py manage.py settings.py urls.py This layout is the root of the problem, because it commingles manage.py (a command-line script entry point that should not be imported) with importable modules inside a package. When Python runs a command-line script it puts the script's directory onto sys.path (even if you run "python /full/path/to/mysite/manage.py" from somewhere else entirely). So with this layout, via manage.py, "import settings" and "import urls" will both work, and if we add any other modules or packages to this directory, they are also importable as top-level modules/packages. This is a standard feature of Python. But "mysite" is a package, and Django wants us to be able to import "mysite.settings" and "mysite.urls" (in fact, the latter is the default ROOT_URLCONF setting). In order to make that possible, setup_environ does some magic: it finds the parent directory of "mysite", temporarily adds it to sys.path, imports mysite, and then reverts sys.path. Now we can either "import settings" or "import mysite.settings", and both will work. Same is true for any other modules or packages (e.g. apps) we add under mysite/. How nice! Not really. This bit of clever causes a boat-load of problems: 1. Python's importer doesn't know that the same module imported under two different names is the same module, so it imports it twice. Which means module-level code can run twice, which causes really confusing bugs. 2. People write code that imports things inconsistently (sometimes with the project prefix, sometimes without it), and then when they go to deploy (without manage.py or setup_environ), their imports break and they don't understand why. This comes up frequently in #django. In order to fix it they either have to make all their imports consistent, or add both the inner and outer directories to sys.path/PYTHONPATH in their deployment setup. Inconsistent imports should fail fast, not when you move to production. 2a. Even worse, our tutorial now encourages people to make their imports inconsistent! ROOT_URLCONF is "mysite.urls" but we tell people to use just "polls" rather than "mysite.polls" - so any tutorial-based project now requires the double import paths in order to even run. 3. Since it's not obvious what Django is doing, people don't always realize that the name of the outer directory matters, they think it's just a container (which is what it ought to be). If they commit the startproject layout at the top of a VCS repo, then check it out under a different name, imports break. (If they check it out under a name that isn't a valid Python module name, setup_environ breaks entirely). 4. The temporary addition to sys.path can cause unrelated packages adjacent to the project to be imported, if the import of the project module triggers an import of a name that happens to exist adjacent. This is extremely difficult to track down; since Django restores sys.path, it appears the adjacent package is somehow imported without even being on sys.path. This can catch very experienced Python devs; I once spent an hour on IRC working with Ned Batchelder to track this down in his project. 5. Understanding Python import paths is hard enough for beginners. The Django tutorial could help them on the road to understanding, with no loss in usability; pulling a trick like this behind the scenes helps them stay confused instead. The solution ============ The key is to fix the "startproject" layout: mysite/ manage.py mysite/ __init__.py settings.py urls.py The outer "mysite/" is now just a container, not a Python package. It can be renamed without consequence. The inner "mysite/" is the Python package, and contains settings and urls. This means they are no longer importable as top-level modules; they must always be "mysite.settings" and "mysite.urls", which is better than having them pollute the top-level namespace. Python's built-in sys.path handling for scripts means that sys.path will be correct using manage.py as the entry-point (or the wsgi.py entry point that may be added shortly for #16360), and people not using either of those entry points only need to add one sys.path entry rather than two. With that simple change, setup_environ and execute_manager (in django.core.management) both become superfluous, and manage.py can be made much simpler. Since it no longer needs to muck with sys.path, its only remaining function is to set a default value for DJANGO_SETTINGS_MODULE, if it isn't already set. This also gets rid of the double import of the settings module that manage.py currently causes, which Graham Dumpleton discusses here [3]. And it's now obvious from inspection what exactly manage.py does, which certainly wasn't true before. All win so far. But what about... Backwards compatibility - ----------------------- For new projects, there's no backward-compat issue. For existing projects that upgrade to Django 1.4, we leave setup_environ and execute_manager in place (so the old-style manage.py will continue to work as it always did), but we start them on a deprecation path. When the project decides to make the update (possibly when they read the 1.4 release notes, probably when they start getting loud DeprecationWarning in 1.5), here's what they need to do: 1. Replace the contents of manage.py with the new version (which is five lines long and can be included in its entirety in the release notes). 2. Make their imports consistent. This means that anything imported with the project-module prefix needs to live inside the project module, and anything imported as a top-level module (that isn't a separately-installed dependency) needs to live outside the project module next to manage.py. And, of course, a given module needs to consistently be imported one way or the other. I think this is a reasonable amount of work for a deprecation upgrade, and a good trade-off for the benefits. Really, step 2 is cleanup of broken code where manage.py is currently hiding its brokenness. Other issues ============ What to do about startapp? - -------------------------- Currently, startapp always creates the new app inside the "project directory", currently defined as "the directory where the settings module lives" (and found via a combination of baling wire and ugly hackery). If we want to maintain this behavior of startapp, we need to change the tutorial back to using "mysite.polls" in imports rather than "polls". I have a version of my proposed patch that does this [4]. My preference, however, and what I've implemented in my preferred patch [2], is to change startapp so that it always creates the app in the current directory, rather than jumping through hoops to find the "project directory". This makes startapp useful for starting reusable apps as well as project-tied apps, and allows us to keep "polls" in the tutorial rather than "mysite.polls" (because if the tutorial tells the user to run "python manage.py startapp polls", the polls app will be created next to manage.py, which means it's imported as a top-level package). Removing the restrictions on startproject - ----------------------------------------- My patch also removes the current restriction that "startproject" cannot be run if there is a DJANGO_SETTINGS_MODULE environment variable. I don't think there's any good reason for this restriction, and it can be really irritating and confusing. (This also means you can now run startproject via manage.py, which doesn't make a ton of sense, but also doesn't actually hurt anything). What's the impact on adding a standard WSGI entrypoint? - ------------------------------------------------------- #16360 [5] has a patch to add a "wsgi.py" file to the default project, which will serve as a standard WSGI entrypoint that can be used both by runserver and by production WSGI servers. I love the idea and the patch - - except that its approach to easing the deployment path is to extend the current sys.path-hacking of setup_environ beyond just manage.py and into every WSGI deployment. But if we first fix the sys.path-hacking as I propose, the wsgi.py file will go next to manage.py in the default project layout. Since WSGI servers chdir() to the directory of the wsgi file you point them at, no sys.path additions (whether manually or magically in Django) will be needed in order to deploy a Django project under a production WSGI server. Again, win. Next steps ========== Unless there are significant objections, I'd like to commit this in the next few days and get it into 1.4. I'll also modify the patch on #16360 to take advantage of the new layout; hopefully that can get into 1.4 as well. Let me know if you see something I've overlooked! Thanks, Carl [1] https://code.djangoproject.com/ticket/15372 [2] https://github.com/django/django/pull/62 [3] http://blog.dscpl.com.au/2010/03/improved-wsgi-script-for-use-with.html [4] https://github.com/carljm/django/compare/master...t15372-app-in-project [5] https://code.djangoproject.com/ticket/16360 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk6TbDIACgkQ8W4rlRKtE2dhCQCgzoXxt245wB7UVkaDeeK0U7NO 2hsAn2Wx90fdbJte2lPfW4onyoIqbyf0 =MUge -----END PGP SIGNATURE----- -- You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to django-developers@googlegroups.com. To unsubscribe from this group, send email to django-developers+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.