In article <[EMAIL PROTECTED]>, Boris Zbarsky <[EMAIL PROTECTED]> wrote:
> Henri Sivonen wrote:
> > More often in practice such anchors get styled as links which isn't cool
> > at all. Depends on the style sheet in use, of course.
>
> This will be OUR stylesheet. I trust we can avoid such mistakes.
>
> > Suggested replacement:
> > <li><p><strong>Use the ISO-8859-1 aka. Latin-1 character
> > encoding.</strong>
>
> Why is this there, if I may ask?

Back in 1998, almost everyone creating an English-language site was using
US-ASCII or ISO-8859-1, because ISO-8859-1 was *the* character encoding for
HTML before anything else was officially taken into account. The original
requirement is mostly about not using the non-ISO CP1252 characters that
Windows users might accidentally insert into documents, and about warning
clueless Mac users who don't realize MacRoman is not ISO-8859-1.

> Why not use something like UTF8 instead?

Some reasons for not using UTF-8 on www.mozilla.org:

* Many American and European contributors use Emacs but haven't bothered to
  figure out how to make it use UTF-8. If UTF-8 documents were allowed,
  Emacs users could easily introduce invalid byte sequences into the files.

* Currently the pages served by www.mozilla.org don't come with a proper
  charset parameter, which is bad. Using any encoding besides ISO-8859-1
  while at the same time banning the <meta> thingy would make matters even
  worse, because then even Americans and Western Europeans would have an
  unpleasant encounter with the Character Encoding menu (which would not
  need to exist if people used HTTP features right). Of course, the right
  way to approach the issue would be migrating to Apache with
  contributor-writable .htaccess files, but I've been around for long
  enough to remember the time when Gerv was drafting the newsgroup reorg
  document, so I'm not overly optimistic.

* Doctor isn't UTF-8-aware. It isn't ISO-8859-1-aware, either. (Dodging
  this same issue early on has come back to haunt Bugzilla later...)
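The first point above is easy to demonstrate: bytes that are perfectly valid
ISO-8859-1 need not be valid UTF-8, so an editor that saves Latin-1 into a
nominally UTF-8 file silently produces an invalid byte sequence. A minimal
sketch (the file contents here are illustrative, not from any actual
mozilla.org page):

```python
# 0xF6 is "ö" in ISO-8859-1. As a lone byte it is an invalid
# UTF-8 sequence (0xF6 would have to start a 4-byte sequence).
latin1_bytes = "Mot\u00f6rhead".encode("iso-8859-1")

try:
    latin1_bytes.decode("utf-8")
    print("decoded fine")
except UnicodeDecodeError:
    print("invalid UTF-8 byte sequence")  # this branch is taken
```

A UTF-8 decoder, unlike an ISO-8859-1 decoder, can therefore *detect* most
mislabeled files, which is exactly why mixing the two silently is so painful
for readers stuck with the Character Encoding menu.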
(Personally, if I were to write a content management system from scratch
now, I'd go with UTF-8 all the way.)

> If I'm authoring a Mozilla.org page and need some non-English
> text on it (eg testcases for rendering or something), am I supposed to
> encode every single char as an entity?

If you need only some non-English text, then using NCRs is workable and even
appropriate given the problems with UTF-8 outlined above. Test cases are out
of the scope of the style guide. For documents in a language other than
English (such documents are often hosted elsewhere anyway), I'd go with
UTF-8 or an encoding commonly used for the language *and* I'd use the <meta>
thingy (assuming that no migration to Apache has happened).

> > | Unfortunately, that tag makes 3.0-vintage Navigators load
> > | the document twice and generally lose their minds.
> >
> > Come on. That's an obsolete excuse.
>
> This goes in the category of doing things you know will break browsers
> for no good reason other than "I can do it."  We should not be doing
> such things, imo.

Using the <meta> thingy would be useful. Considering that the real HTTP
headers will continue to be broken indefinitely, the <meta> workaround is
the only thing that helps relieve users of using the Character Encoding
menu. (People who routinely read badly served non-ISO-8859-1 pages may not
have ISO-8859-1 or Windows-1252 as the default.)

> > | Add meta description and keywords to help indexing.
> >
> > What's the concrete use case that justifies this requirement?
>
> The fact that we may want to write

Is it appropriate to require author effort until the piece of software
justifying the requirement has actually been written?

> an indexing tool that does a better
> job of searching _documentation_ in particular than Google does.  Google
> indexes a whole lot of non-documentation crap on the Mozilla.org site.

PageRank should take care of less relevant documents appearing later in the
results.
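For the record, "encoding every single char" with NCRs is mechanical enough
to script. A hypothetical helper (the function name is mine, not part of any
mozilla.org tooling) that escapes everything outside US-ASCII as a numeric
character reference, so the file itself stays safe in any ASCII-compatible
encoding:

```python
def to_ncr(text: str) -> str:
    """Replace every non-ASCII character with a decimal NCR (&#NNN;)."""
    return "".join(c if ord(c) < 128 else f"&#{ord(c)};" for c in text)

print(to_ncr("naïve café"))  # na&#239;ve caf&#233;
```

This is fine for a sprinkling of foreign words; for a document that is
mostly non-English text it quickly becomes unreadable in the editor, which
is why I'd switch to a real encoding plus the <meta> thingy in that case.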
> Google doesn't use such metadata because out in the wide world it is
> unreliable.  On our own website, it will be reliable, since we control
> all of it.

It won't be reliable, because authors are too lazy to include useful
metadata and forget to keep it up to date. Moreover, keywords without a
controlled thesaurus of accepted keywords aren't particularly useful, and
badgering both authors *and* the people doing the searches to use one
properly is too difficult. (If you wanted to look up some documentation,
would you want to learn the controlled set of keywords first?)

-- 
Henri Sivonen
[EMAIL PROTECTED]
http://www.iki.fi/hsivonen/
Mozilla Web Author FAQ: http://mozilla.org/docs/web-developer/faq.html
