Sure. From what I can reproduce, there are (I believe) two issues here:
1) Justification is more complicated in WebKit because optimizeLegibility
does not (really) work there, and because WebKit handles decimal values poorly
in word-spacing and not at all in letter-spacing.
2) Due to kerning (I can send you a screenshot comparing the two texts overlaid
in Photoshop) / letter-spacing / word-spacing (?), lines are much longer in
WebKit. Hence, if you have for example "footnotes" as in this PDF, they don't
end up at the right place in the text. This is all the more so with a PDF
exported from InDesign, where there may be "metrics" which cause some text to
overlap other text; you can always remove all metrics before exporting to PDF,
which avoids part of the issue.
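
A minimal sketch of one possible workaround for the letter-spacing point in 1),
assuming the renderer drops the fractional part of letter-spacing as described
above. It is not taken from pdftohtml, and the function name
`integer_spacings` is hypothetical. It spreads a fractional per-gap spacing
over whole-pixel values by carrying the rounding error forward, so the total
width stays within about half a pixel of the intended one:

```python
from typing import List


def integer_spacings(spacing_px: float, gaps: int) -> List[int]:
    """Spread a fractional per-gap spacing over `gaps` gaps as whole pixels.

    Hypothetical helper: the rounding error of each gap is carried into
    the next one, so the sum stays close to spacing_px * gaps.
    """
    result: List[int] = []
    carried = 0.0  # spacing owed so far that has not been emitted yet
    for _ in range(gaps):
        carried += spacing_px
        step = round(carried)  # whole pixels to emit for this gap
        result.append(step)
        carried -= step        # keep the remainder for the next gap
    return result


if __name__ == "__main__":
    # Five gaps that should each be 0.4px wide.
    print(integer_spacings(0.4, 5))
```

Each glyph (or run of glyphs) could then get its own integer letter-spacing
value, at the cost of more markup.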

NB: I can't get the extracted fonts to work, but I can send them to you in OTF
if you want (I don't know whether extraction is failing because of my
installation?)

PDF file: BugWebkit.pdf (http://cl.ly/0L3g2I1r3G2a0T0o3622)

--  
Clément Wehrung
06 88 10 65 91

On Wednesday, 26 October 2011 at 14:35, Clément Wehrung wrote:

> You can see the issue better here (Firefox vs. Safari on Mac/iOS):
>  
> http://dev.nurves.com/pdf2html/-6.html
>  
> See the footnotes.
>  
> WebKit.png (http://cl.ly/3c1B2V1X2u2C2f0M2L0L)
> Firefox.png (http://cl.ly/0Q111C3u2g3T2U1D3U2u)
> --  
> Clément Wehrung
> 06 88 10 65 91
>  
>  
>  
> On Wednesday, 26 October 2011 at 14:26, Clément Wehrung wrote:
>  
> > Hi Josh,
> >  
> > Thanks for all this. I'm already looking at the code, but I've run into
> > some issues with WebKit rendering compared to Firefox (where it looks
> > really amazing!). I know WebKit has a bug with letter-spacing (it does not
> > take decimal values into account), but there's more to it, since
> > text-rendering:optimizeLegibility; only partly works. I'm trying to see
> > how we could keep text boxes from ending up on top of one another. I can
> > show you some screenshots if you want.
> >  
> > Btw, why did you choose not to use only the background image for all
> > graphics? Is it in order to place some images over text?
> >  
> > Thanks,  
> >  
> > Clement  
> >  
> > --  
> > Clément Wehrung
> > 06 88 10 65 91
> >  
> > On Tuesday, 25 October 2011 at 00:41, Josh Richardson wrote:
> >  
> > > Ok, sent you a read-only access invitation for now.  Thanks for your
> > > offer to help.  Here is my list of bigger issues to give you a flavor;
> > > a lot of fun things to do.  Let me know what you want to do with
> > > pdftohtml!
> > >  
> > > Translate drawing operations into canvas with SVG
> > > Find better way to calculate vertical positioning, by looking at browser 
> > > source code
> > > z-index handling -- currently text is never masked by graphics
> > > Algorithmic extraction of TOC
> > > Algorithmic extraction of page numbering (Alec may be working on this)
> > > Algorithmic identification of chapters
> > > Right-to-left text, proper display (e.g. Arabic, Hebrew)
> > > Algorithmic detection of text flow (Stephen may be working on this)
> > > Detection / removal of duplicate images
> > > Jpg vs. png selection; automatically choose the best format for each image
> > >  
> > >  
> > > --josh
> > >  
> > > From:  Clément Wehrung <[email protected]>
> > > Date:  Mon, 24 Oct 2011 15:27:23 -0700
> > > To:  Josh Richardson <[email protected]>
> > > Cc:  "[email protected]" <[email protected]>, Alec Taylor
> > > <[email protected]>
> > > Subject:  Re: [poppler] pdftohtml does not preserve fonts
> > >  
> > > Sure! Do you have a link to the repo so that I can already have a look
> > > (I haven't figured out which one it is yet)? I'm really interested in
> > > helping you; if you need something on any specific topic, don't
> > > hesitate. Many thanks again,
> > >  
> > > Clément
> > >  
> > >  
> > > On Mon, Oct 24, 2011 at 8:01 PM, Josh Richardson <[email protected]> wrote:
> > > > Can you give me a couple of days?  I want to try to get a repo hosted
> > > > on, e.g., Bitbucket, which is connected to my repo, so that it's easier
> > > > to keep everything in sync.  Alec Taylor set up a repo there already,
> > > > which you can use to get an immediate snapshot if needed.
> > > >  
> > > >  Best, --josh
> > > >  
> > > >  On 10/24/11 10:45 AM, "iclems" <[email protected]> wrote:
> > > >  
> > > > >
> > > > >Dear Josh,
> > > > >
> > > > >I'm working on a pdftohtml project which requires font preservation, so
> > > > >I'd be really interested in getting this too. Do you think it's possible?
> > > > >
> > > > >Thanks,
> > > > >
> > > > >Clement
> > > > >[email protected]
> > > > >
> > > > >
> > > > >Josh Richardson wrote:
> > > > >>
> > > > >> Preserving fonts is not integrated into the master repository yet.  If
> > > > >> you like, I can send you a patched version of Poppler which will do it.
> > > > >> You'll still have to run your own process (like FontForge) to convert
> > > > >> the fonts into a web-usable format, but it's straightforward as long as
> > > > >> the fonts have a mapping to Unicode, and doable even without.
> > > > >>
> > > > >> --josh
> > > > >>
> > > > >> From: M Naveed Akram <[email protected]>
> > > > >> Date: Fri, 30 Sep 2011 06:52:14 -0700
> > > > >> To: "[email protected]" <[email protected]>
> > > > >> Subject: [poppler] pdftohtml does not preserve fonts
> > > > >>
> > > > >> Hi,
> > > > >>
> > > > >> I have been using the 0.16 release of poppler-utils, but I am facing a
> > > > >> problem: when converting PDF to HTML using pdftohtml, it does not
> > > > >> preserve fonts in the output HTML. How can I solve this issue? Please
> > > > >> help.
> > > > >>
> > > > >>
> > > > >> _______________________________________________
> > > > >> poppler mailing list
> > > > >> [email protected] (mailto:[email protected])
> > > > >> http://lists.freedesktop.org/mailman/listinfo/poppler
> > > > >>
> > > > >>
> > > > >
> > > > >--
> > > > >View this message in context:
> > > > >http://old.nabble.com/pdftohtml-does-not-preserve-fonts-tp32569116p32712084.html
> > > > >Sent from the Free Desktop - poppler mailing list archive at Nabble.com.
> > > > >
> > > > >
> > > >  
> > >  
> >  
>  
