Re: RFR: 8361018: Re-examine buffering and encoding conversion in BufferedWriter [v6]

Chen Liang Wed, 02 Jul 2025 07:33:49 -0700

On Tue, 1 Jul 2025 00:01:21 GMT, Shaojin Wen <s...@openjdk.org> wrote:


>> BufferedWriter -> OutputStreamWriter -> StreamEncoder
>> 
>> In this call chain, BufferedWriter has a char[] buffer and StreamEncoder 
>> has a ByteBuffer, so there are two layers of buffering, and the 
>> BufferedWriter layer can be removed. Also, when the charset is UTF8 and 
>> the content passed to write(String) is LATIN1, a conversion from LATIN1 
>> to UTF16 and then to UTF8 occurs:
>> 
>> LATIN1 -> UTF16 -> UTF8
>> 
>> We can improve BufferedWriter. When the Writer passed in is an instance 
>> of OutputStreamWriter, skip the char[] buffer and call it directly. In 
>> addition, improve write(String) in StreamEncoder to avoid unnecessary 
>> encoding conversion.
>
> Shaojin Wen has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Revert "BufferedWriter buffer use StringBuilder"
>   
>   This reverts commit da902ca0b0bd6acc003deb8ad1ca0d6485a29a27.

This latest prototype looks great! It means that we can get rid of the old 
`BufferedImpl` by using `WriterImpl` as the new code, and remove 
`StreamEncoder.UTF8Impl`.

I think this prototype can be split this way:
1. Update ArrayEncoder to pass `dp`, open up StringBuilder in JLA, and make 
BufferedWriter + StreamEncoder use ArrayEncoder. As a first-step proof of 
concept, we can use a benchmark that writes with an encoding like CESU.
2. Make UTF8/ISO88591 array encoders. This will open up a few String UTF8 
encoding methods in JLA.
3. More array encoders. For example, GB18030 gets a new array encoder in your 
patch.
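
To make the LATIN1 -> UTF16 -> UTF8 point from the description concrete, here 
is a minimal standalone sketch of encoding Latin-1 bytes straight to UTF-8 
without an intermediate char[]. The class `Latin1ToUtf8Sketch` and its 
`encode` method are hypothetical names for illustration only; this is not the 
`ArrayEncoder` interface or any code from the patch, and `dp` here is just a 
local destination index.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; not JDK-internal code and not part of the PR.
public class Latin1ToUtf8Sketch {

    /** Encodes ISO-8859-1 bytes as UTF-8 without materializing a char[]. */
    public static byte[] encode(byte[] latin1) {
        // Worst case: every input byte is >= 0x80 and needs two UTF-8 bytes.
        byte[] dst = new byte[latin1.length * 2];
        int dp = 0; // destination position
        for (byte b : latin1) {
            if (b >= 0) {
                // ASCII range: the UTF-8 byte is identical to the Latin-1 byte.
                dst[dp++] = b;
            } else {
                // U+0080..U+00FF: two-byte UTF-8 sequence 110xxxxx 10xxxxxx.
                int c = b & 0xFF;
                dst[dp++] = (byte) (0xC0 | (c >> 6));
                dst[dp++] = (byte) (0x80 | (c & 0x3F));
            }
        }
        return Arrays.copyOf(dst, dp);
    }

    public static void main(String[] args) {
        String s = "caf\u00e9";
        byte[] latin1 = s.getBytes(StandardCharsets.ISO_8859_1);
        byte[] direct = encode(latin1);
        // Reference result produced by the regular String encoding path.
        byte[] viaString = s.getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(direct, viaString));
    }
}
```

The sketch allocates for the worst case and trims at the end; a real array 
encoder would more likely write into a caller-supplied destination array at a 
given offset, which is presumably why passing `dp` matters in step 1.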

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26022#issuecomment-3028104142
