New submission from Ma Lin <malin...@163.com>:

BufferedReader's constructor has a `buffer_size` parameter. It is the size of the internal buffer described in the doc of BufferedReader [1]:

    When reading data from BufferedReader object, a larger amount of data may be requested from the underlying raw stream, and kept in an internal buffer.

When calling BufferedReader.read(size):

1. When `size` is a positive number, it reads `buffer_size` bytes from the underlying stream. This is the expected behavior.

2. When `size` is -1, it tries to call the underlying stream's readall() function [2]. In this case `buffer_size` is not respected.

The underlying stream may be a `RawIOBase`, whose readall() function reads `DEFAULT_BUFFER_SIZE` bytes in each read [3]. `DEFAULT_BUFFER_SIZE` is currently only 8 KB, which makes BufferedReader.read(-1) very inefficient. Reading `buffer_size` bytes each time would give the expected performance.

The attached file demonstrates this problem.

[1] doc of BufferedReader:
https://docs.python.org/3/library/io.html#io.BufferedReader

[2] BufferedReader.read(-1) tries to call underlying stream's readall() function:
https://github.com/python/cpython/blob/v3.9.0b5/Modules/_io/bufferedio.c#L1538-L1542

[3] RawIOBase.readall() reads DEFAULT_BUFFER_SIZE each time:
https://github.com/python/cpython/blob/v3.9.0b5/Modules/_io/iobase.c#L968-L969

----------
components: IO
files: demo.py
messages: 374652
nosy: malin
priority: normal
severity: normal
status: open
title: Inefficient BufferedReader.read(-1)
type: performance
versions: Python 3.10
Added file: https://bugs.python.org/file49354/demo.py

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue41452>
_______________________________________
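
For readers without access to the attachment, the difference between the two read paths described in this report can be observed with a small sketch like the one below. It is not the attached demo.py. `CountingRaw`, `measure`, `TOTAL` and `BUFFER_SIZE` are hypothetical names introduced only for illustration, and the request sizes printed depend on the Python version and implementation in use.

```python
import io

TOTAL = 1024 * 1024        # bytes served by the fake stream (assumed value)
BUFFER_SIZE = 256 * 1024   # buffer_size passed to BufferedReader (assumed value)


class CountingRaw(io.RawIOBase):
    """Hypothetical raw stream that records the size of every readinto() request."""

    def __init__(self, total):
        super().__init__()
        self._remaining = total
        self.request_sizes = []

    def readable(self):
        return True

    def readinto(self, b):
        # len(b) is how many bytes the caller asked for in this call.
        self.request_sizes.append(len(b))
        n = min(len(b), self._remaining)
        b[:n] = b"x" * n
        self._remaining -= n
        return n  # 0 signals end of stream


def measure(size):
    """Call BufferedReader.read(size) on a fresh stream; return (data, request sizes)."""
    raw = CountingRaw(TOTAL)
    reader = io.BufferedReader(raw, buffer_size=BUFFER_SIZE)
    data = reader.read(size)
    return data, raw.request_sizes


if __name__ == "__main__":
    for size in (1, -1):
        data, sizes = measure(size)
        print(f"read({size}): got {len(data)} bytes, "
              f"{len(sizes)} raw requests, largest request {max(sizes)} bytes "
              f"(DEFAULT_BUFFER_SIZE={io.DEFAULT_BUFFER_SIZE}, "
              f"buffer_size={BUFFER_SIZE})")
```

If read(-1) behaves as described above, the second line of output shows many raw requests whose size matches `DEFAULT_BUFFER_SIZE` rather than `buffer_size`.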