Hi,

>2. Once this hits the trunk we'll need documentation about any
>backwards incompatibilities this introduces and what the upgrade path
>looks like (in the style of the docs on the
>BackwardsIncompatibleChanges page). The translation document is great

Actually, I don't know of any backwards-incompatible changes. I think it should be possible to just run your project after switching to the i18n branch without any _required_ changes.

>3. I don't like that the translation context can be specified in the
>GET string (i.e. http://www.example.com/?django_language=en). I hate ...
>whatever, but I'd rather that be relegated to a simple view (which
>sets the language and then redirect back to the previous page) then
>have it be an aspect of the middleware as it is now.

Done. I added a view that can easily be hooked into a project. It redirects to an explicitly given URL, or to the URL in the referrer header, or to / (if neither is given). The view stores the language in either the cookie or the session and then just redirects.

>Similarly,
>exposing LANGUAGES in DjangoContext seems like overkill; a template
>tag seems like the best idea::

Done. Without the "load i18n", though - it just went into defaulttags.py. I added a unit test for it, too.

>4. I don't like that you have to explicitly provide the verbose_name
>for fields if you want the field names translated; we specifically
>removed verbose_name as a required attribute because it was a PITA,
>so if there's a way around doing::
>
>    name = meta.CharField(_('Name'))

Sorry, no can do. At least there is no really clean way to do it. The reason is that I need the _() hooks for the xgettext tool. It searches explicitly for function calls with a given name and string contents, so I can't just use whatever stands to the left of the = operator in the assignment. A change like this would require writing our own xgettext replacement - but then what about all those other modules that happen to use the same syntax but don't require a translation in that place? If people want translated modules, they need to provide translation hooks.
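
To illustrate why explicit markers matter, here is a rough Python sketch of a marker-based string puller. It is not how xgettext itself works (xgettext has its own parser); I use Python's ast module as a stand-in, and the names extract_marked_strings and SOURCE are made up for this example. The only point is that a string is found because it sits inside a _() call, not because of what it is assigned to.

```python
import ast

# Sample model source; it is only parsed, never executed, so neither
# meta nor _ has to exist.
SOURCE = """
name = meta.CharField(_('Name'))
slug = meta.CharField(maxlength=50)
"""


def extract_marked_strings(source, marker="_"):
    """Return the string literals passed as first argument to marker()."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == marker
            and node.args
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
        ):
            found.append(node.args[0].value)
    return found


print(extract_marked_strings(SOURCE))
```

The slug field has no marker, so nothing is extracted for it. Guessing a label from the left-hand side would mean guessing for every assignment in every module.
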

The only way out would be to do model introspection in addition to the xgettext run - but that's problematic, too, because you can only introspect models that are loaded. If I want to create translations, for example, I might not have an installed and _running_ Django at all; I might just have a checkout of the Subversion tree. So introspection is out, too. The main problem here is that I need to be able to pull out all translation hooks just by inspecting the source code. And so I need explicit and unambiguous markers.

>5. Similarly, I don't like the changes to the template code.

Yeah, that part is the biggest problem with translations. The main problem here is: translation of source code is easy, translation of templates is hard. That's because templates are "turned around" with regard to the code/text ratio: more text, less (hopefully) code.

>    <title>{{ _('Title') }}</title>

Again, this is needed to mark the part that needs to be translated. If I didn't mark that part but just used plain HTML, the string puller would need to know about HTML. But what if the user writes XHTML templates? Or a YAML template? Or a simple CSV format?

>And *I* barely understand what::
>
>    <p>{% i18n ngettext('There is %(count)d file', 'There are %
>(count)d files', files|count) %}</p>

Yes, that one usually puts people off when they first work with translations. The problem here: string translation is easy, pluralization is a real bastard. The main reason why pluralization needs this rather clumsy format with two _full_ translations (and not just partial translations) is that languages handle pluralization differently. Some languages have only two forms: singular and plural. But even those languages might behave differently with regard to zero: what if there are _no_ objects? Is zero a singular, or is it a plural? Depends on your language. But then there are languages that have a different concept.
I call it "troll counting", after Terry Pratchett's nice description of how trolls count: one, two, three, many. :-) There are many languages that not only have a singular and a plural, but also have special forms for one element, two elements, and more than two elements. So you need to provide the sentence and the count. gettext asks for the singular and plural form because it takes the stand that the source language must be English (or at least a language whose pluralization is identical to that of English). So you provide the sentence in singular and plural form, plus the count. The translators provide the forms their language needs, along with a tag in the .po file that tells the system how to do the different pluralization.

And if you think that's a rather weird case, have a look at the "sr" language file in conf/locale/sr/LC_MESSAGES/django.po - it uses pluralization. It provides the "Plural-Forms" header for that purpose, which defines the rules for when to pull which plural form. I don't even know whether the Python gettext library supports that fully, as it looks rather complex ...

>is doing, and "leaking" %-style string formatting into the template
>code seems ugly. Specifically, I'd like the template language to be
>as losely coupled to Python as possible; I'd like implementations of
>the template language on other platforms to be possible.

We need to pull text flow and named parameters together. Another problem with translations is that languages order words differently. Ever heard a native speaker of an Asian language speak English with a funny Yoda-style word order? It's because the order of words and concepts is different. It's even very different between English and German - and those languages are quite close to each other. So you need to put parameters into the translation strings by name. That's needed so that translators can reorder the whole sentence.
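
A small sketch of both points, using the gettext module from the Python standard library. NullTranslations is the no-catalog fallback, so it simply applies the English rule (singular for exactly 1, plural otherwise); a real catalog would pick the form according to its Plural-Forms header. The German string is a hypothetical translation I made up to show reordering, and files_message, greet and CATALOG are illustrative names, not part of the branch.

```python
import gettext

# No catalog loaded: NullTranslations falls back to the English rule,
# i.e. the singular msgid for n == 1 and the plural msgid otherwise.
translations = gettext.NullTranslations()


def files_message(count):
    # Both full sentences plus the count go to ngettext; the chosen
    # template is interpolated only afterwards, by name.
    template = translations.ngettext(
        "There is %(count)d file",
        "There are %(count)d files",
        count,
    )
    return template % {"count": count}


# Hypothetical catalog entry: the translator moved %(site)s in front of
# %(name)s, which only works because the placeholders are named.
CATALOG = {
    "Hello, %(name)s, welcome to %(site)s!":
        "Willkommen bei %(site)s, %(name)s!",
}


def greet(name, site, catalog=CATALOG):
    msgid = "Hello, %(name)s, welcome to %(site)s!"
    return catalog.get(msgid, msgid) % {"name": name, "site": site}


print(files_message(1))
print(files_message(0))
print(greet("Georg", "example.com"))
```

Note that zero gets the plural sentence here only because the fallback follows the English rule. With positional %s placeholders, the reordered German sentence would put the values in the wrong slots.
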

You can't just construct a translation out of translated parts - that's one of the biggest problems. You always _must_ take semantic blocks and translate them in one go. So the only option is some kind of string interpolation. Whether the syntax is Python's string interpolation or some faked Django template interpolation is irrelevant for execution. But it is relevant for editors, because there are editors that support these formats. That's why the .po files carry funny comments like "#, python-format" - that's a tag that tells the system how to handle those strings. Strings that have similar text but different placeholders will be handled automatically by some tools. The gettext tools themselves do that: they produce "fuzzy" translations when there is a string that is - after removing all placeholders - similar to some other string. In those cases they just pull the translation from that other string. So changing this format might break some nice effects for translators and maybe make their work harder. It would be a deviation from the gettext formats, and I am especially reluctant to do _that_ (yes, gettext has explicit and official support for Python string interpolation syntax).

>    <title>{% translate "Title" %}</title>

That's already possible with {% i18n _('Title') %} - the {{ _('Title') }} form is just a shortcut for that, because it became a bit tedious to add all those i18n tag invocations :-) And since there are other tags that can take string constants, I thought it would be best to allow the i18n translated-string syntax everywhere we have strings. At least everywhere it's easy to add, like in resolve_variable and resolve_variable_with_filters (the tag itself would still need to be aware of possible translated strings - that's a problem that stems from the fact that Django doesn't have a central tag-bit parser).
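
Roughly, the idea for string constants looks like the sketch below. This is not the actual resolve_variable code; resolve_token, its two regexes and the translate parameter are names invented for illustration, and filters are left out entirely.

```python
import re

# _('text') or _("text"): a string constant marked for translation.
_I18N_CONSTANT = re.compile(r"""^_\((['"])(.*)\1\)$""")
# 'text' or "text": a plain string constant.
_PLAIN_CONSTANT = re.compile(r"""^(['"])(.*)\1$""")


def resolve_token(token, context, translate=lambda s: s):
    """Resolve one tag argument: translated constant, constant, or variable."""
    match = _I18N_CONSTANT.match(token)
    if match:
        # Marked constant: hand the bare text to the translation function.
        return translate(match.group(2))
    match = _PLAIN_CONSTANT.match(token)
    if match:
        # Unmarked constant: use it as-is.
        return match.group(2)
    # Anything else is treated as a variable name looked up in the context.
    return context[token]
```

Any tag that resolves its arguments through one function like this gets translated constants for free; tags that parse their own bits still have to handle the _() form themselves.
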

>    <p>{% translate %}Hello, {{ name }}, welcome to {{ site }}!{%
>endtranslate %}</p>
>    <p>{% translate %}There are {{ count }} {% pluralize count
>"file" "files" %}{% endtranslate %}</p>

I could do that - I already have a template transformer that turns the templates into something grokkable by xgettext. But as I said, the main problem here would be the deviation from the gettext syntax. And there would be another problem: the part inside the translate block (or some i18n block in my case) would first have to be collected _without_ rendering the interpolation elements, because we can only translate strings with placeholders. And what should happen if there are things like other block tags inside the block tag? Like this:

    {% translate %}
    This is a sentence {% if doit %}with something weird{% else %}without it{% endif %}
    {% endtranslate %}

How should I handle this? I would have to pull _all_ text within a translate block tag together - uninterpreted! - as one string and store it in the .po files for translation. And what if that inner block tag is another translate tag?

And what would be a complete non-option is the pluralize thingy - in the light of different pluralizations it _can't_ possibly work. Actually I really dislike the pluralize tag ;-) We would still need something like:

    {% translate %}
    this is the singular case
    {% plural %}
    this is the plural case
    {% endtranslate %}

But then there are still the problems of what to do with the inner block and the block tags that might be in there. The behaviour of the translate tag would be weirdly different from other tags: it wouldn't run its inner nodelists through the template engine, but would first pull them together into a string, run that string through the translation engine, then reparse the resulting string and run _that_ through the template engine. Doesn't sound like an afternoon project ;-)

>Thoughts?

Yup. Loads of them. ;-)

bye, Georg
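
P.S. To make the "collect first, translate, then render" order concrete, here is a minimal Python sketch for the simple case where the block contains only {{ variable }} references. It is a stand-in under stated assumptions, not code from the branch: block_to_msgid and render_translated_block are hypothetical names, the final % interpolation replaces the real "reparse and render through the template engine" step, and nested block tags (the hard part) are not handled at all.

```python
import re

# Matches simple {{ name }} references; filters and block tags are ignored.
_VARIABLE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def block_to_msgid(block_source):
    """Turn unrendered block text into a python-format msgid."""
    # Escape literal percent signs first so they survive interpolation.
    escaped = block_source.strip().replace("%", "%%")
    return _VARIABLE.sub(r"%(\1)s", escaped)


def render_translated_block(block_source, context, translate=lambda s: s):
    # 1. collect the uninterpreted text, 2. translate it as one unit,
    # 3. only then fill in the values.
    msgid = block_to_msgid(block_source)
    translated = translate(msgid)
    return translated % context


print(block_to_msgid("Hello, {{ name }}, welcome to {{ site }}!"))
```

The msgid that ends up in the .po file is the placeholder version, so translators still see the familiar python-format string.
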