On Mon, May 19, 2014 at 01:17:00AM -0500, Bruce Dubbs wrote:
> Ken Moffat wrote:
> >  Somebody asked on lfs-dev about changing LFS to produce UTF-8 html.
> >I think we ought to do that, but arguably we have a greater need in
> >BLFS - the changelog entry for 31st March includes Igor's surname,
> >but the page is described as 8859-1 so firefox defaults to
> >displaying Å½ivkoviÄ‡ instead of Živković (that can be changed in
> >View -> Character Encoding, but it needs to be done for each page).
> >
> >  The encoding is in each xml file, and also in some stylesheets.
> >The following looks as if it does the right thing:
> >
> >  In the BOOK/ directory (trunk/BOOK), or svn copy trunk/BOOK
> >branches/unicode for any editors who want to try this -
> >
> >find -type f | xargs sed -i 's/encoding="ISO-8859-1"/encoding="UTF-8"/'
> >
> >  Note: this does NOT change the stylesheets, or the images, which
> >are still marked as 8859-1.  The regular html is now all unicode, so
> >firefox knows how to handle any non-ASCII characters. I don't have
> >the tools to attempt to build the PDF.
> >
> >  Guys, I think we ought to do this, what do you think (for BLFS) ?
> 
> Is this needed on all pages or would just the Changelog page do?  I don't
> believe there is any requirement for non-ASCII characters except on the
> changelog page.
> 
> I will note that when I change the page to UTF-8, I get Živkovi�‡.  I'm not
> sure that will come out correctly in this email, but it's all correct except
> the last acute-c.  Perhaps that's just a font issue on my system.
> 
>   -- Bruce

 For me, the final c-acute of Igor's surname in your pasted text has
become two characters: a reverse-video question mark, which is U+FFFD
(the replacement character, used to indicate invalid input), followed
by a double dagger.
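
 The double dagger is at least consistent with the second byte of the
UTF-8 c-acute (0xC4 0x87) being read as Windows-1252, where 0x87 is
the double dagger.  A small sketch to show that, assuming an iconv
that knows the name CP1252; the helper as_cp1252 is made up for this
example and is not part of the book's tooling:

```shell
# Reinterpret raw bytes as Windows-1252 and print the result as UTF-8.
# $1 is a printf format string, so octal escapes such as \304 are expanded.
as_cp1252() {
    printf "$1" | iconv -f CP1252 -t UTF-8
}

# 0xC4 0x87 (octal 304 207) is the UTF-8 encoding of c-acute.
as_cp1252 '\304\207'; echo    # prints: Ä‡
```

 This does not explain where the U+FFFD in place of the first byte
came from, only the double dagger.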

 That suggests that something in the rendering chain is not
producing correct UTF-8.  I guess you have tidy installed?  If so,
Lei Tong later reported that he had to change input-encoding and
output-encoding in tidy.conf from latin1 to utf8, and also change
stylesheets/lfs-xsl/chunk-slave.xsl from select="'ISO-8859-1'" to
select="'UTF-8'".
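
 As a rough sketch of those two changes (untested against the real
tree): this assumes GNU sed, that tidy.conf sits in the top-level
book directory, and that it uses "key: value" lines.  The function
name switch_to_utf8 is only for illustration:

```shell
# Sketch only: switch tidy and the chunking stylesheet from Latin-1 to UTF-8.
# Assumes tidy.conf is in the directory given as $1 and uses "key: value"
# lines; the stylesheet path is the one Lei Tong reported.
switch_to_utf8() {
    book_dir=$1

    # tidy.conf: input-encoding and output-encoding, latin1 -> utf8
    sed -i \
        -e 's/^\(input-encoding:[[:space:]]*\)latin1/\1utf8/' \
        -e 's/^\(output-encoding:[[:space:]]*\)latin1/\1utf8/' \
        "$book_dir/tidy.conf"

    # chunk-slave.xsl: select="'ISO-8859-1'" -> select="'UTF-8'"
    sed -i "s/select=\"'ISO-8859-1'\"/select=\"'UTF-8'\"/" \
        "$book_dir/stylesheets/lfs-xsl/chunk-slave.xsl"
}
```

 Run it as "switch_to_utf8 ." from the book directory, ideally in a
scratch copy or branch, then check the result with svn diff before
rendering.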

 I agree that non-ASCII text in the rest of the book is, for the
moment, unlikely.  But I dislike the idea of different encodings in
different parts of the book.  If tidy or the stylesheet is involved
in the error you are seeing, that implies that the whole book needs
to be changed.

ĸen
-- 
das eine Mal als Tragödie, dieses Mal als Farce
