On 3/27/24 6:24 PM, Bruno Haible wrote:
> Thanks! Applied, with one tweak: Let's continue to use 'rb' and 'wb' as
> file open() modes, not 'r' and 'w'. If gnulib-tool ever gets used on
> Windows, we don't want the trouble caused by Windows CRLF newlines.
> We want all generated files to use Unix LF newlines. (Some of the
> constants.nlconvert nonsense will have to go away as well.)

Oops, I've been using the standard open() function since I'm not too
familiar with the 'codecs' module. I believe they work a bit
differently. With open(), using binary mode together with
encoding='utf-8' fails:

    with open('test.txt', 'wb', encoding='utf-8') as file:
        file.write('abc')

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: binary mode doesn't take an encoding argument

I pass encoding='utf-8' explicitly because the default value of the
encoding argument is None. I assume in that case the choice is left up
to the operating system (as far as I can tell, Python then uses the
locale's preferred encoding). I know previous versions of Windows liked
UTF-16, but maybe it is different now.

The codecs module doesn't seem to have that restriction, but Python
says that the regular open() and 'io' module should be used for text
files [1].

From the documentation for open(), it seems the best way to deal with
this when reading files is [2]:

    # Accepts '\n', '\r', '\r\n' as newline.
    with open('file.txt', 'r', encoding='utf-8') as file:
        data = file.read()

And then for writing files:

    # Write files with '\n' as newline character.
    with open('file.txt', 'w', encoding='utf-8', newline='\n') as file:
        file.write(data)

These changes are pretty simple though. I can get a Windows virtual
machine running at some point to test these changes.

[1] https://docs.python.org/3/library/functions.html#open
[2] https://docs.python.org/3/library/functions.html#open

Collin
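
P.S. For anyone who wants to check the behavior described above without
a Windows machine, here is a minimal sketch. The helper names
(write_text_lf, read_text, read_bytes) and the temporary file are made
up for illustration and are not part of gnulib-tool. It inspects the
raw bytes to confirm that newline='\n' leaves '\n' untranslated on
write, and it simulates a CRLF file by writing the bytes directly.

```python
import os
import tempfile


def write_text_lf(path, data):
    # Text mode with newline='\n': '\n' is written as-is on every
    # platform, so no CRLF translation happens on Windows.
    with open(path, 'w', encoding='utf-8', newline='\n') as file:
        file.write(data)


def read_text(path):
    # Text mode with the default newline=None: '\n', '\r' and '\r\n'
    # in the file are all returned as '\n'.
    with open(path, 'r', encoding='utf-8') as file:
        return file.read()


def read_bytes(path):
    # Binary mode, used here only to look at the exact bytes on disk.
    with open(path, 'rb') as file:
        return file.read()


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'test.txt')

    # Binary mode rejects an encoding argument before opening the file.
    try:
        open(path, 'wb', encoding='utf-8')
        binary_mode_error = None
    except ValueError as error:
        binary_mode_error = str(error)

    # Writing through the helper should produce LF-only bytes.
    write_text_lf(path, 'abc\ndef\n')
    raw = read_bytes(path)

    # Simulate a file that was saved with Windows CRLF newlines.
    with open(path, 'wb') as file:
        file.write(b'abc\r\ndef\r\n')
    text = read_text(path)

print(binary_mode_error)
print(raw)
print(repr(text))
```

If this holds up on a real Windows system too, then text mode with an
explicit newline='\n' should give the same LF-only output that 'wb' was
meant to guarantee, while still allowing encoding='utf-8'.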