On Thursday, 22 September 2011, Leonard Rosenthol wrote:
> Boy, your lawyer needs to read up on IP law :).
>
> Since you do NOT have a license for the font data contained in the PDF,
> your software has NO RIGHTS to use that information for anything other
> than rendering the glyphs in the PDF. You certainly have NO rights to
> convert the format - in fact, doing so is a clear and distinct violation
> of the font licenses.
>
> As such, if your patches to pdf2html extract the font data for use in the
> HTML - I STRONGLY recommend that the code NOT be accepted into the master
> repository.

We do already have code that extracts the font stream. Saying this is illegal
is insane; after all, it is just a series of bits in a given file. Are you
saying cat or less or vi are illegal?

Albert

>
> Leonard
>
> On 9/22/11 6:40 PM, "Josh Richardson" <[email protected]> wrote:
> >I'm not a lawyer, but I did check with one. I don't think software can
> >violate your IP/licenses, at least as long as that software doesn't
> >contain unauthorized copyrighted material -- which pdftohtml does not
> >AFAIK -- I certainly didn't add any to it.
> >
> >Best, --josh
> >
> >On 9/22/11 3:08 PM, "Leonard Rosenthol" <[email protected]> wrote:
> >>I can't recall what you said about this in the past, but since I was
> >>just dealing with it today.
> >>
> >>What do you do about embedded fonts?
> >>
> >>As my company (Adobe) sells/creates fonts, I want to make sure that
> >>pdftohtml won't be violating our IP/licenses.
> >>
> >>Thanks in advance,
> >>Leonard
> >>
> >>On 9/22/11 5:51 PM, "Josh Richardson" <[email protected]> wrote:
> >>>On 9/22/11 12:20 PM, "Jonathan Kew" <[email protected]> wrote:
> >>>>More generally, it is not possible to recreate useful XHTML (or
> >>>>similar) documents from arbitrary PDF files with anything like 100%
> >>>>reliability, because many PDF files do not contain adequate
> >>>>information to accurately map the rendered glyphs back to correct
> >>>>Unicode text, or to reliably reconstruct the proper flow of text.
> >>>>Constructs such as ActualText may help, but are often lacking from
> >>>>real-world PDF documents.
> >>>
> >>>W.r.t. rendering glyphs, we get around the problem of missing unicode
> >>>mappings by taking any glyph without a unicode mapping and assigning
> >>>it an offset in the private space of Unicode. This produces the
> >>>correct visual result in the XHTML, but not a full semantic
> >>>representation.
> >>>If someone's interested, they could get the semantics right too by
> >>>pattern-matching the glyph against an appropriate Unicode font.
> >>>
> >>>W.r.t. the flow of text, there have been other threads on this topic,
> >>>but pdftohtml does make some attempt, and I believe it's possible to
> >>>do this to a high degree of accuracy, maybe >99% -- that said, no one
> >>>has done it yet, so either it's harder than I think, or no one has
> >>>cared enough to really try (and I still fall into that camp.)
> >>>
> >>>Best, --josh
> >>>
> >>>_______________________________________________
> >>>poppler mailing list
> >>>[email protected]
> >>>http://lists.freedesktop.org/mailman/listinfo/poppler
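
For readers following the glyph-mapping point quoted above, here is a rough sketch of the idea Josh describes: glyphs that have no Unicode mapping are given consecutive code points in the Private Use Area. This is not pdftohtml's actual code. The function name, the glyph names, and the (name, code point) input format are assumptions made up for illustration, and the sketch only uses the Basic Multilingual Plane range U+E000 to U+F8FF.

```python
# Hypothetical sketch, not taken from poppler/pdftohtml.
PUA_START = 0xE000  # first code point of the BMP Private Use Area
PUA_END = 0xF8FF    # last code point of the BMP Private Use Area


def assign_code_points(glyphs):
    """Map each glyph name to a character.

    `glyphs` is a list of (glyph_name, code_point_or_None) pairs.
    Glyphs with a known code point keep it; the rest receive the next
    unused Private Use Area code point, in order of first appearance.
    """
    mapping = {}
    next_pua = PUA_START
    for name, code_point in glyphs:
        if name in mapping:
            continue  # each glyph is assigned only once
        if code_point is None:
            if next_pua > PUA_END:
                raise ValueError("ran out of private-use code points")
            code_point = next_pua
            next_pua += 1
        mapping[name] = chr(code_point)
    return mapping


# Made-up glyph names: one mapped glyph and two unmapped ones.
glyphs = [("A", 0x41), ("g123", None), ("f_i", None)]
mapping = assign_code_points(glyphs)
for name, char in mapping.items():
    print(f"{name}: U+{ord(char):04X}")
```

Text built from such a mapping displays correctly only with the extracted font, and it carries no meaning for search or copy and paste, which is the "not a full semantic representation" caveat in the quoted message.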
