This library purports to be a way to approach the problem ... https://www.autoitconsulting.com/site/development/utf-8-utf-16-text-encoding-detection-library/
UTF-8 and UTF-16 Text Encoding Detection Library
by Jonathan Bennett | Aug 23, 2014 | Development

This post shows how to detect UTF-8 and UTF-16 text and presents a fully functional C++ and C# library that can be used to help with the detection.

I recently had to upgrade the text file handling feature of AutoIt to better handle text files where no byte order mark (BOM) was present. The older version of code I was using worked fine for UTF-8 files (with or without BOM) but it wasn't able to detect UTF-16 files without a BOM. I tried to use the IsTextUnicode Win32 API function but this seemed extremely unreliable and wouldn't detect UTF-16 Big-Endian text in my tests.

Note, especially for UTF-16 detection, there is always an element of ambiguity. This post by Raymond shows that however you try and detect encoding there will always be some sequence of bytes that will make your guesses look stupid.

Here are the detection methods I'm currently using for the various types of text file. The order of the checks I perform is:

    BOM
    UTF-8
    UTF-16 (newline)
    UTF-16 (null distribution)

 :
 :

-- Mike Bianchi
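
For anyone who wants to experiment with the check order quoted above, here is a minimal Python sketch. It is not the API or the code of the linked C++/C# library. The function names, the returned labels, the rule that a NUL byte rules out UTF-8, and the 0.7/0.1 thresholds are all assumptions made for illustration.

```python
import codecs

# Hypothetical labels and helper names; not taken from the linked library.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-bom"),
    (codecs.BOM_UTF16_LE, "utf-16-le-bom"),
    (codecs.BOM_UTF16_BE, "utf-16-be-bom"),
]


def check_bom(data):
    """Step 1: look for a byte order mark at the start of the data."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def check_utf8(data):
    """Step 2: accept data that decodes as UTF-8 and has no NUL bytes.

    Assumption: a NUL byte is treated as a hint of UTF-16, so it
    disqualifies UTF-8 here even though NUL is technically valid UTF-8.
    """
    if 0 in data:
        return None
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return None
    return "ascii" if all(b < 0x80 for b in data) else "utf-8"


def check_utf16_newlines(data):
    """Step 3: count 16-bit newlines in each byte order."""
    le = be = 0
    for i in range(0, len(data) - 1, 2):
        pair = data[i:i + 2]
        if pair == b"\n\x00":
            le += 1
        elif pair == b"\x00\n":
            be += 1
    if le and not be:
        return "utf-16-le"
    if be and not le:
        return "utf-16-be"
    return None


def check_utf16_nulls(data, high=0.7, low=0.1):
    """Step 4: compare how often NUL appears at even vs. odd offsets.

    Mostly-Latin UTF-16 text has NUL in the high byte of most code units.
    The thresholds are arbitrary illustrative values.
    """
    pairs = len(data) // 2
    if pairs == 0:
        return None
    even = sum(1 for i in range(0, pairs * 2, 2) if data[i] == 0)
    odd = sum(1 for i in range(1, pairs * 2, 2) if data[i] == 0)
    if odd / pairs > high and even / pairs < low:
        return "utf-16-le"  # high byte comes second
    if even / pairs > high and odd / pairs < low:
        return "utf-16-be"  # high byte comes first
    return None


def detect(data):
    """Run the checks in the quoted order and return the first match."""
    for check in (check_bom, check_utf8, check_utf16_newlines, check_utf16_nulls):
        result = check(data)
        if result:
            return result
    return None
```

Calling detect("hello".encode("utf-16-le")) falls through to the null-distribution check, while detect("a\nb".encode("utf-16-be")) is caught one step earlier by the newline check. As the quoted post warns, any heuristic of this kind can be fooled by some byte sequence.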