The current system appears to use system-available fonts as 
approximations for whatever font is in the PDF.  For pdftohtml, I am 
considering adding a preferred behavior:

1.  Extract the original font from the PDF
2.  Create a font file for that font
3.  Reference the font file, using "@font-face" in the generated HTML.
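To make step 3 concrete, here is a minimal sketch in Python of the CSS the generated HTML might carry. The helper names (font_file_name, font_face_rule) and the file naming scheme are only assumptions for illustration; none of this is existing pdftohtml code, and it leaves out the extraction and conversion in steps 1 and 2.

```python
import re


def font_file_name(pdf_font_name, extension="otf"):
    """Derive a filesystem-safe file name from a PDF font name.

    Hypothetical naming scheme: any character outside [A-Za-z0-9_-]
    (for example the "+" after a subset tag) becomes "_".
    """
    safe = re.sub(r"[^A-Za-z0-9_-]", "_", pdf_font_name)
    return f"{safe}.{extension}"


def font_face_rule(family, font_url, fmt="opentype"):
    """Return a CSS @font-face rule binding `family` to the file at `font_url`."""
    # Escape backslashes and double quotes so the family name stays a valid CSS string.
    escaped = family.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "@font-face {\n"
        f'  font-family: "{escaped}";\n'
        f'  src: url("{font_url}") format("{fmt}");\n'
        "}\n"
    )


if __name__ == "__main__":
    name = "ABCDEF+Times-Roman"  # made-up example of a subset font name
    print(font_face_rule(name, font_file_name(name)))
```

The text spans in the HTML would then set font-family to the same family name, so the browser picks up the extracted font instead of a system approximation.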

This should give us an exact representation of the original font in the PDF, 
though it will only work in modern browsers, since older browsers don't 
support "@font-face".  For IE, I'll have to convert the font to EOT, and for 
the other browsers I'll probably use regular OpenType (not TrueType) format.
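Serving EOT to IE and OpenType to everyone else can be done from a single rule. The sketch below, in Python, emits a widely circulated @font-face convention: a bare EOT source first, then a second src list with format() hints. The function name and file paths are hypothetical, and I have not verified how every browser version treats the "?#iefix" suffix, so take it as a starting point rather than a tested recipe.

```python
def dual_format_font_face(family, eot_url, otf_url):
    """Return an @font-face rule offering an EOT file and an OpenType file.

    Assumption: browsers that only understand EOT use the first `src`,
    while the others choose from the second `src` list by format() hint.
    """
    return (
        "@font-face {\n"
        f'  font-family: "{family}";\n'
        f'  src: url("{eot_url}");\n'
        f'  src: url("{eot_url}?#iefix") format("embedded-opentype"),\n'
        f'       url("{otf_url}") format("opentype");\n'
        "}\n"
    )


if __name__ == "__main__":
    # Hypothetical file locations, relative to the generated HTML.
    print(dual_format_font_face("MyFont", "fonts/MyFont.eot", "fonts/MyFont.otf"))
```

Putting both formats in one rule keeps the generated HTML free of browser sniffing; the cost is writing two font files per extracted font.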

If I only use the extracted font to display the original document in its 
original form, and not to draw additional glyphs in any document, I believe 
I'll be in compliance with "fair use" and digital copyright rules for the font.

Does anyone see an issue with the approach, or have any advice?  For instance, 
I'm not sure how much luck I'll have converting fonts, especially Type 3 
fonts, to OpenType/EOT.

Thanks, --josh
_______________________________________________
poppler mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/poppler