Sjur Nørstebø Moshagen wrote:
Yesterday Upayavira wrote:
Introducing the I18NMatcher
---------------------------
Here's a sample sitemap snippet:
<map:match pattern="**.html">
  <map:match type="i18n" src="content/*/{1}.xml">
    <map:generate src="{source}"/>
    <map:transform src="foo.xsl"/>
    <map:transform type="i18n">
      <map:parameter name="locale" value="{locale}"/>
    </map:transform>
    <map:serialize type="html"/>
  </map:match>
</map:match>
Once an ordinary wildcard matcher has done its job, in comes the i18n matcher. Its job is to see whether it can find a suitable source document for the requested page. The * is used to symbolise the place where the locale is to be placed. If a match is successful, it will make sitemap variables available for the source that was found, and the locale that matched.
Now, this seems to be quite in keeping with the Cocoon sitemap model, and gives some rather nice, flexible functionality.
What do you all think?
This looks great! Exactly what I need;)
Firstly, thank you for your thorough response!
A couple of small questions for clarification:
1) The requested locale might be complex in the sense that it contains more information than merely the preferred language (e.g. en_GB, de_AU, sv_FI, ru_EE-KOI8). The {source} returns the best match given the available documents, but what would the {locale} be, given that {source} is not a complete match for the requested locale? One beneficial approach would IMHO be to return the "complete" locale that matched the returned document. To illustrate:
The locales requested as above: en_GB, de_AU, sv_FI, ru_EE-KOI8
Available documents: content/de/foo.xml, content/sv/foo.xml
{source} = content/de/foo.xml
{locale} = de_AU
The way it is currently coded, if de_AU was not found, it wouldn't find de. Which is not ideal. I'll extend it so that for ru_EE-KOI8 it will try:
* ru_EE-KOI8
* ru_EE
* ru
in turn. Is that the correct behaviour?
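The fallback order described above could be sketched roughly like this. This is only an illustration, not the matcher's actual code: the class and method names are made up, and it assumes the locale strings use '_' before the country and '-' before the encoding, as in the examples in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Cocoon: builds the lookup order for one locale.
final class LocaleFallback {

    // Returns the locale followed by progressively less specific variants,
    // e.g. "ru_EE-KOI8" -> [ru_EE-KOI8, ru_EE, ru].
    static List<String> candidates(String locale) {
        List<String> result = new ArrayList<>();
        String current = locale;
        while (!current.isEmpty()) {
            result.add(current);
            // Drop the last component: whichever of '-' or '_' occurs last.
            int cut = Math.max(current.lastIndexOf('-'), current.lastIndexOf('_'));
            if (cut < 0) {
                break; // only the bare language is left
            }
            current = current.substring(0, cut);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(candidates("ru_EE-KOI8"));
    }
}
```

An encoding name that itself contains a '-' would produce an extra intermediate candidate with this approach, so a real implementation would probably want to parse the locale into its parts first.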
This way it is possible for the subsequent i18n processing to cater for country-(or whatever)-specific requirements in menus etc. without the document itself necessarily being adjusted for such variation.
Then, we can have {matched-locale} being the one that was actually matched, e.g. ru_EE, and {full-locale} being the full, original locale that caused the match, e.g. ru_EE-KOI8. That's easy, it is just a question of choosing decent names.
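To make the {matched-locale} / {full-locale} distinction concrete, here is a rough sketch of how a lookup over the requested locales might produce all three values. Everything in it is an assumption for illustration: the class, the Match holder and the find method are invented names, and a plain set of paths stands in for whatever source resolution the real matcher uses.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: none of these names come from Cocoon or the actual matcher.
final class I18nMatchSketch {

    // Holds the three values discussed above.
    static final class Match {
        final String source;        // document that was found
        final String matchedLocale; // locale that actually matched, e.g. "de"
        final String fullLocale;    // originally requested locale, e.g. "de_AU"

        Match(String source, String matchedLocale, String fullLocale) {
            this.source = source;
            this.matchedLocale = matchedLocale;
            this.fullLocale = fullLocale;
        }
    }

    // Tries each requested locale in preference order, falling back from the
    // most to the least specific form; returns null if nothing matches.
    static Match find(String pattern, List<String> requested, Set<String> available) {
        for (String full : requested) {
            String candidate = full;
            while (!candidate.isEmpty()) {
                // The '*' in the pattern marks where the locale goes.
                String source = pattern.replace("*", candidate);
                if (available.contains(source)) {
                    return new Match(source, candidate, full);
                }
                int cut = Math.max(candidate.lastIndexOf('-'), candidate.lastIndexOf('_'));
                if (cut < 0) {
                    break; // nothing less specific left to try
                }
                candidate = candidate.substring(0, cut);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Set<String> available = new HashSet<>(
                Arrays.asList("content/de/foo.xml", "content/sv/foo.xml"));
        Match m = find("content/*/foo.xml",
                Arrays.asList("en_GB", "de_AU", "sv_FI", "ru_EE-KOI8"), available);
        System.out.println(m.source + " " + m.matchedLocale + " " + m.fullLocale);
    }
}
```

With the example from this thread, the first document found is content/de/foo.xml, with "de" as the matched locale and "de_AU" as the full locale.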
2) What happens if the locale of the returned document is not available in the rest of the i18n chain? There are at least two possible solutions:
- just use the default locale
- provide in {locale} not only the returned locale of the document, but the whole list of preferred locales _from the document match on_. In the example above that would have given:
{locale} = de_AU, sv_FI, ru_EE-KOI8
That would be {locales}.
Pros: This might be considered a better approach by the user, as the further i18n processing will return locales more preferred than the default.
Cons: This might appear confusing to the user, as there might not seem to be any connection between the locale of the document and the locale of the surrounding widgets (menus, dates, etc.).
One could even let this choice be an option, and let the webmaster/developer decide - in the end, it is up to her/him to decide whether such a situation will arise in the first place;) (by allowing document languages that are not available in other parts of the site/webapp/i18n processing).
Making various bits of the locale available would allow the site developer to handle this as they please: {lang}, {locale}, {encoding}, etc., as well as the above.
3) How do you define the 'site default locale'? In Cocoon's existing i18n it is done through a translation file without locale tags, but that's hardly appropriate in this case. At the same time it would be nice to define the default language within Cocoon, and not rely on external settings (e.g. in the servlet, web server, etc.). Suggestions?
The site default locale is defined within the config of the matcher at the top of the sitemap. It could easily be overridden with a <map:parameter name="default-locale" value="xxxx"/> within the matcher itself.
When I've finished implementing this, I'll go on to extend the CLI to be able to work effectively with this kind of i18n site, enabling it to crawl a site for each of a range of locales. But that's for another time.
Looking forward to it;)))
Let's get this first bit working, first!
Great work, Upayavira, keep it up!
Thanks. I'm glad to have got this one off the ground at last, and to have finally worked out a way that isn't just way too complicated for sitemap developers to understand.
Regards, Upayavira
