On 23 Sep 2011, at 12:44, Peter A. Kerzum wrote:

> Actually consistent To-Unicode mapping should be a good compromise, as higher
> level software can really segment text into regions of different languages
> based solely on their alphabets and then detect and correct text flow for
> each particular region
>
> This way the example
>
> english WERBEH
>
> should generally work, being decomposed into 2 regions with the latter reversed

But what is the order of those "2 regions"? You cannot tell unless you have some higher-level info... the purely visual presentation is inherently ambiguous.

JK
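
To make the ambiguity concrete, here is a minimal sketch in Python. It is only a toy model and not anything poppler does: uppercase ASCII stands in for a right-to-left script (as in the "english WERBEH" example), and the helper names is_rtl, regions_from_visual and render are made up for illustration. Real code would look at unicodedata.bidirectional() and follow the Unicode bidi algorithm rather than this run-level simplification.

```python
from itertools import groupby


def is_rtl(ch):
    # Assumption: uppercase ASCII stands in for a right-to-left script,
    # following the "english WERBEH" example. Real code would consult
    # unicodedata.bidirectional(ch) instead.
    return ch.isupper()


def regions_from_visual(visual):
    """Split visually ordered text into same-direction regions and
    reverse the right-to-left ones, as proposed in the quoted mail."""
    regions = []
    for rtl, words in groupby(visual.split(), key=lambda w: is_rtl(w[0])):
        run = " ".join(words)
        regions.append(run[::-1] if rtl else run)
    return regions


def render(logical_regions, paragraph_rtl):
    """Toy renderer: reverse each RTL region, then lay the regions out
    right to left if the paragraph direction is RTL."""
    runs = [r[::-1] if is_rtl(r[0]) else r for r in logical_regions]
    if paragraph_rtl:
        runs.reverse()
    return " ".join(runs)


# Recovering the regions themselves works fine.
regions = regions_from_visual("english WERBEH")

# But two different logical texts give the same visual line.
ltr_reading = render(["english", "HEBREW"], paragraph_rtl=False)
rtl_reading = render(["HEBREW", "english"], paragraph_rtl=True)
```

In this model both ltr_reading and rtl_reading come out as the same visual string, so the visual line alone cannot say whether "english" or "HEBREW" comes first in reading order. The paragraph direction is exactly the higher-level info that is missing.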
