I'm running Cuneiform 1.0 and, as others have already mentioned, I also see 
a dash being translated into three characters: â€”
(that's U+00E2, U+20AC, U+201D).
This happens not only with the "smarttext" format, but also with "html", 
"hocr" and "text".
With Czech, Dutch, English, French, German, etc., Cuneiform produces â€” 
(the three characters listed above).
However, with Bulgarian, Russian, etc., Cuneiform produces вЂ” 
(that's U+0432, U+0402, U+201D).
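
Both sequences are what the UTF-8 encoding of an em dash (U+2014, bytes 
0xE2 0x80 0x94) looks like when those bytes are read as a single-byte 
Windows code page. The Python sketch below only reproduces the code points 
reported above. It assumes, without confirmation from the Cuneiform source, 
that the Western languages go through Windows-1252 and the Cyrillic ones 
through Windows-1251; the helper name code_points is made up for 
illustration.

```python
# Sketch: reproduce the reported characters by misreading the UTF-8 bytes
# of an em dash as a single-byte Windows code page.
# Assumption: cp1252 for Western languages, cp1251 for Cyrillic ones.

def code_points(text):
    """Return the code points of text in U+XXXX notation."""
    return ["U+%04X" % ord(ch) for ch in text]

# UTF-8 encoding of U+2014 (em dash): the bytes 0xE2 0x80 0x94.
em_dash_bytes = "\u2014".encode("utf-8")

# The same three bytes, interpreted as two different single-byte encodings.
western = em_dash_bytes.decode("cp1252")
cyrillic = em_dash_bytes.decode("cp1251")

print([hex(b) for b in em_dash_bytes])
print(code_points(western))
print(code_points(cyrillic))
```

If that assumption holds, the recognised character itself is correct and 
only the final byte-to-text conversion is wrong.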

** Attachment added: "input image for Cuneiform, producing three strange 
characters instead of a dash"
   
https://bugs.launchpad.net/cuneiform-linux/+bug/324256/+attachment/1488954/+files/1976-11.pag40.pbm

-- 
Strange symbols instead '-' in smart text output
https://bugs.launchpad.net/bugs/324256
You received this bug notification because you are a member of Cuneiform
Linux, which is the registrant for Cuneiform for Linux.

Status in Linux port of Cuneiform: New

Bug description:
At lines 1, 33 and 55 in the smarttext output (see attachment). It looks 
like a bad substitution (0xE2, 0x80, 0x94) for the dash.

I use 'smarttext' output format.


