On Sunday 05 October 2008, Warren Toomey wrote:
> pdftohtml used to have a "raw" mode which has been removed. In "raw" mode,
> text from a PDF document is processed in the order that it occurs. However,
> the current version of pdftohtml reorders the text to be in increasing
> y-value, i.e. from the top of a page going down to the bottom.
>
> This text reordering plays merry havoc with multi-column pages, as the text
> from the columns becomes interleaved instead of remaining separate.
> The attached patch restores the -raw command-line option to pdftohtml. The
> program retains its current behaviour if the -raw option is not used, but
> reverts to the "text as it appears" behaviour with the -raw option enabled.
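
The interleaving described above can be illustrated with a small sketch. This is not pdftohtml's code. The fragment tuples, coordinates and function names are hypothetical, chosen only to show how sorting by y-value mixes two columns while content-stream ("raw") order keeps them apart.

```python
# Hypothetical text fragments as (x, y, text), listed in the order they
# occur in the PDF content stream: left column first, then right column.
# Assumption for this sketch: y grows from the top of the page downward.
fragments = [
    (50, 100, "left line 1"),
    (50, 120, "left line 2"),
    (300, 100, "right line 1"),
    (300, 120, "right line 2"),
]


def raw_order(frags):
    """Return the text in the order it occurs in the document."""
    return [text for _x, _y, text in frags]


def y_sorted_order(frags):
    """Return the text sorted by increasing y, then x (top to bottom)."""
    return [text for _x, _y, text in sorted(frags, key=lambda f: (f[1], f[0]))]


if __name__ == "__main__":
    print(raw_order(fragments))       # columns stay separate
    print(y_sorted_order(fragments))  # columns become interleaved
```

With the sample data, the raw order keeps both lines of the left column together, while the y-sorted order alternates between the left and right columns.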

I've looked at all the pdftohtml tarballs available at
http://sourceforge.net/project/showfiles.php?group_id=45839 and none of them
exposed the raw option to the user. Are you sure it is OK to enable it?

Albert

>
> Cheers,
>         Warren


_______________________________________________
poppler mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/poppler
