As someone with 10+ years of experience with CSS, XHTML 1.x, and now HTML 5, I have to ask which versions of the XHTML specification you plan to support.

I would assume you would target XHTML 1.1 Strict and leave XHTML 1.1 Modular alone, as we've all moved on to HTML 5.

Which brings me to my next point: pdftohtml should also be able to output HTML 5, and since it's on all platforms, perhaps it could use the WebKit HTML 5 parser, especially since GTK+ and Qt are both all in. The GTK+ port is even modularizing its work, separating out the JavaScript engine so that it can be reused by other GTK+ projects.

From the WebKitGTK+ ChangeLog:

2011-06-20  Carlos Garcia Campos <[email protected]>

        Reviewed by Xan Lopez.

        [GTK] Split libWebCore into two libWebCore and libWebCoreGtk
        https://bugs.webkit.org/show_bug.cgi?id=60539

        * GNUmakefile.am: Link to libWebCoreGtk.la too.

================
WebKitGTK+ 1.5.1
================

What's new in WebKitGTK+ 1.5.1?

  - The JSC library is now available independently. It's called
    "libjavascriptcoregtk", and it comes with its own pkg-config file.
  - New spellchecking APIs, useful to implement spellchecking features
    in the UAs.
  - New DOM methods to check if editable areas have been modified by
    the user (webkit_dom_html_{input,text_area}_is_edited).
  - Lots of improvements in the WebKit2GTK+ port.
  - Lots of bugfixes.

Since XHTML is a good citizen with HTML 5, I'd assume information on the WebKit HTML 5 parser would be useful for the long haul.

http://www.webkit.org/blog/1273/the-html5-parsing-algorithm/

If I'm off base, just ignore this.

Sincerely Yours,

Marc J. Driftmeyer


On 06/21/2011 07:47 PM, Josh Richardson wrote:
Experienced web developers always separate their CSS from their HTML file. This makes maintaining and overriding the styling much easier, and keeps the HTML file itself (nearly) completely focused on content and semantics.

In the complex mode, I would like to separate out the styling into a separate CSS file, referenced from the output HTML file. Any objections to this?

I am also cleaning up the tags so that they are all balanced and valid XHTML, and hence XML-compliant. Once this is done and the CSS is separated out, I'm not sure there is still a need for a separate -xml mode for pdftohtml. Thoughts on this?
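
To illustrate the two ideas above (external stylesheet, balanced XML-compliant tags), here is a minimal Python sketch. It is not pdftohtml code: the function names, the "para" class, and the "out.css" file name are all hypothetical, and it only checks well-formedness, not validity against an XHTML DTD.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_page(css_href: str, body_text: str) -> str:
    """Return a tiny XHTML page whose styling lives in an external CSS file.

    Hypothetical helper for illustration only; the "para" class is made up.
    """
    return (
        f'<html xmlns="{XHTML_NS}">\n'
        "<head>\n"
        "<title>Sample output</title>\n"
        # Styling is referenced, not inlined; the tag is self-closed for XML.
        f'<link rel="stylesheet" type="text/css" href={quoteattr(css_href)}/>\n'
        "</head>\n"
        f'<body><p class="para">{escape(body_text)}</p></body>\n'
        "</html>\n"
    )


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML (every tag balanced)."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    page = build_page("out.css", "a < b")
    print(is_well_formed(page))
    print(is_well_formed("<p><b>unbalanced</p>"))
```

If output like this always parses with a plain XML parser, downstream tools can consume it directly, which is the reasoning behind questioning a separate -xml mode.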

Thanks, --josh


_______________________________________________
poppler mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/poppler

--
Marc J. Driftmeyer
Email :: [email protected]
Web :: http://www.reanimality.com
Cell :: (509) 435-5212

