Thank you very much for explaining this. I had indeed overlooked the
use of encoding in 'file'. I also appreciate how unsatisfactory
guessing at the encoding can be, and that scanning the entire file is
not appropriate for large files or general connections.
Sorry that 'burden' came across as nega

I think you need to delimit a bit more what you want to do. It is
difficult in general to tell what encoding a text file is in, and very
much harder if this is a data file containing only a small proportion of
non-ASCII text, which might not even be words in a human language (but
abbreviations

R-developers,
I'm looking for some 'best practices', or perhaps an upstream solution
(I have a sense of déjà vu about this, so apologies if it has already been asked).
Problems occur when a file is encoded as latin1, but the user has a
UTF-8 locale (or I guess more generally when the input locale does not
match
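
To make the mismatch concrete, here is a minimal sketch in R. It assumes the file is known to be latin1; the temporary file and the variable names are invented for illustration and do not come from the thread. It shows what arrives when no encoding is declared, and two ways one might handle it when the source encoding is known.

```r
# Hypothetical example file: the bytes of "café" in latin1, plus a newline.
tmp <- tempfile(fileext = ".txt")
writeBin(as.raw(c(0x63, 0x61, 0x66, 0xe9, 0x0a)), tmp)

# Read without declaring an encoding: the bytes come through unchanged.
raw_line <- readLines(tmp)
bad <- !validUTF8(raw_line)   # a lone 0xe9 byte is not a valid UTF-8 sequence

# Option 1: convert after reading, given that the source encoding is known.
fixed <- iconv(raw_line, from = "latin1", to = "UTF-8")

# Option 2: declare the encoding on the connection, so that input is
# re-encoded to the native encoding as it is read.
con <- file(tmp, encoding = "latin1")
declared <- readLines(con)
close(con)

unlink(tmp)
```

Neither option helps when the encoding is not known in advance. A check such as validUTF8() can only say that the input is not valid UTF-8; it cannot say which encoding it actually is.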