On Tue, 1 Jul 2025 00:01:21 GMT, Shaojin Wen <s...@openjdk.org> wrote:
>> BufferedWriter -> OutputStreamWriter -> StreamEncoder
>>
>> In this call chain, BufferedWriter has a char[] buffer, and StreamEncoder
>> has a ByteBuffer. There are two layers of cache here, or the BufferedWriter
>> layer can be removed. And when charset is UTF8, if the content of
>> write(String) is LATIN1, a conversion from LATIN1 to UTF16 and then to
>> LATIN1 will occur here.
>>
>> LATIN1 -> UTF16 -> UTF8
>>
>> We can improve BufferedWriter. When the parameter Writer instanceof
>> OutputStreamWriter is passed in, remove the cache and call it directly. In
>> addition, improve write(String) in StreamEncoder to avoid unnecessary
>> encoding conversion.
>
> Shaojin Wen has updated the pull request incrementally with one additional
> commit since the last revision:
>
>   Revert "BufferedWriter buffer use StringBuilder"
>
>   This reverts commit da902ca0b0bd6acc003deb8ad1ca0d6485a29a27.

My initial impression was that the point of this PR was that BufferedWriter was forcing the conversion to UTF-16, and that bypassing it would avoid a conversion. However, it seems that it is actually the StreamEncoder/CharsetEncoder that forces it, and that the conversion to UTF-16 is required for optimal encoder performance.

The result of this PR, then, seems to be that with an OutputStreamWriter as the target (and maybe only for specific character encodings) BufferedWriter no longer buffers, but delegates that responsibility to the OutputStreamWriter (and its StreamEncoder).

Are there any scenarios where wrapping an OutputStreamWriter with a BufferedWriter makes sense? Is it only to control the buffer size? If so, should OutputStreamWriter itself just allow consumers to control the buffer size? (And then just change the doc of OutputStreamWriter to discourage the use of BufferedWriter, and change PrintWriter to not [create this combo](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/io/PrintWriter.java#L167).)

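
To make the question concrete, here is a minimal sketch of the two setups being compared. The class and method names are made up for illustration, and it only checks that both paths produce the same bytes; it says nothing about which one is faster.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the PR.
public class WriterChainDemo {

    // The usual combo: BufferedWriter (char[] buffer) wrapping an
    // OutputStreamWriter (whose StreamEncoder has its own ByteBuffer).
    static byte[] encodeBuffered(String s) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new BufferedWriter(
                new OutputStreamWriter(out, StandardCharsets.UTF_8))) {
            w.write(s);
        }
        return out.toByteArray();
    }

    // The same write without the BufferedWriter layer.
    static byte[] encodeDirect(String s) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            w.write(s);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // All chars are <= 0xFF, so the String content fits in Latin-1.
        String latin1 = "caf\u00e9";
        byte[] buffered = encodeBuffered(latin1);
        byte[] direct = encodeDirect(latin1);
        System.out.println(Arrays.equals(buffered, direct));
    }
}
```

Both paths should yield identical output, so the only observable difference for a caller is where the buffering happens and how large the buffer is.
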
Should the various encoders be optimized to work with a StringCharBuffer? Perhaps only if backed by a String or AbstractStringBuilder? It seems that there could be more target character encodings beyond UTF-8 and UTF-16 (e.g. ASCII, ISO-8859-1, Cp1252) that could benefit from knowing up front whether the source is Latin-1. It feels strange to place the optimizations for specific character encodings directly in StreamEncoder.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26022#issuecomment-3024309269