On 2020-11-02 20:41:02 +0100, Vincent Lefevre wrote:
> With en_US.utf8, no issues:
>
> $ export LC_ALL=en_US.utf8
> $ echo a─b | iconv -f utf-8 -t ascii//TRANSLIT
> a-b
>
> But with C.UTF-8, the character "─" is regarded as invalid:
>
> $ export LC_ALL=C.UTF-8
> $ echo a─b | iconv -f utf-8 -t ascii//TRANSLIT
> aiconv: illegal input sequence at position 1

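The comparison quoted above can be repeated in one go with a small loop. This is only a sketch: try_locale is a hypothetical helper name, and it assumes that both locales are actually installed (a missing locale normally makes the program fall back to the C locale without any error, which would hide the problem). The test string is the same "a─b", written as octal bytes so that the script itself stays pure ASCII.

```shell
#!/bin/sh
# Sketch: run the same transliteration under several locales.
# try_locale is a hypothetical helper, not part of iconv or glibc.
try_locale() {
    # "a─b" as UTF-8 bytes: U+2500 is E2 94 80, i.e. octal 342 224 200.
    out=$(printf 'a\342\224\200b\n' |
          LC_ALL=$1 iconv -f utf-8 -t ascii//TRANSLIT 2>&1)
    status=$?   # exit status of iconv (last command of the pipeline)
    printf '%s: status=%s output=%s\n' "$1" "$status" "$out"
}

# Locale names taken from the report; they must exist on the system.
for loc in en_US.utf8 C.UTF-8; do
    try_locale "$loc"
done
```

Each line shows the locale name, the exit status of iconv and its combined stdout/stderr, so a locale that rejects the character stands out by its non-zero status.
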
Note that this is not related to the charset used by the file, but to
the locale under which iconv runs. For instance, there is no issue
converting a UTF-8 file under the C locale, but using C.UTF-8 instead
yields an error:

$ echo a─b | LC_ALL=C iconv -f utf-8 -t ascii//TRANSLIT
a-b
$ echo a─b | LC_ALL=C.UTF-8 iconv -f utf-8 -t ascii//TRANSLIT
aiconv: illegal input sequence at position 1

-- 
Vincent Lefèvre <vinc...@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)