On Tue, 1 Jul 2025 00:01:21 GMT, Shaojin Wen <s...@openjdk.org> wrote:
>> BufferedWriter -> OutputStreamWriter -> StreamEncoder
>>
>> In this call chain, BufferedWriter has a char[] buffer, and StreamEncoder
>> has a ByteBuffer. There are two layers of cache here, or the BufferedWriter
>> layer can be removed. And when charset is UTF8, if the content of
>> write(String) is LATIN1, a conversion from LATIN1 to UTF16 and then to
>> LATIN1 will occur here.
>>
>> LATIN1 -> UTF16 -> UTF8
>>
>> We can improve BufferedWriter. When the parameter Writer instanceof
>> OutputStreamWriter is passed in, remove the cache and call it directly. In
>> addition, improve write(String) in StreamEncoder to avoid unnecessary
>> encoding conversion.
>
> Shaojin Wen has updated the pull request incrementally with one additional
> commit since the last revision:
>
>   Revert "BufferedWriter buffer use StringBuilder"
>
>   This reverts commit da902ca0b0bd6acc003deb8ad1ca0d6485a29a27.

This latest prototype looks great! It means that we can get rid of the old `BufferedImpl` by using `WriterImpl` as the new code, and remove `StreamEncoder.UTF8Impl`.

I think this prototype can be split this way:

1. Update ArrayEncoder to pass `dp`, open up StringBuilder in JLA, and make BufferedWriter + StreamEncoder use ArrayEncoder. We can use a benchmark writing encodings like CESU for a first step proof of concept.
2. Make UTF8/ISO88591 array encoders. This will open up a few String UTF8 encoding methods in JLA.
3. More array encoders. For example, GB18030 gets a new array encoder in your patch.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26022#issuecomment-3028104142
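
For anyone following along, here is a minimal sketch of the call chain under discussion, using only public `java.io` APIs. It is not code from the patch: the class name `WriterChainSketch` and both helper methods are made up for illustration. It only shows that the layered path (String chars copied into BufferedWriter's `char[]`, then encoded into bytes by OutputStreamWriter) yields the same UTF-8 bytes as encoding the String in one step, which is why skipping the intermediate UTF16 copy should be safe in principle.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration class; not part of the JDK or of the PR.
public class WriterChainSketch {

    // Layered path: BufferedWriter -> OutputStreamWriter -> (internal encoder).
    // The String's contents pass through a char[] buffer before being encoded.
    static byte[] viaWriterChain(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new BufferedWriter(
                new OutputStreamWriter(out, StandardCharsets.UTF_8))) {
            w.write(s);
        } catch (IOException e) {
            // ByteArrayOutputStream does not fail in practice; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // One-step path: encode the String directly to UTF-8 bytes.
    static byte[] direct(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String latin1Content = "caf\u00e9"; // every char fits in Latin-1
        byte[] a = viaWriterChain(latin1Content);
        byte[] b = direct(latin1Content);
        System.out.println(Arrays.equals(a, b));
    }
}
```

This says nothing about performance; the benchmark mentioned in step 1 would be the place to measure the actual cost of the extra buffer layer.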