Re: Memory leak in function with local array
On 5/2/21 8:19 AM, thomas.gren...@aalto.fi wrote: Bash Version: 5.1 Patch Level: 4 Release Status: release Description: There seems to be a memory leak in bash when a local array is declared within a function. I also tested this on "GNU bash, version 5.0.17(1)-release (x86_64-redhat-linux-gnu)" and that one works without any leaks. Repeat-By: #!/bin/bash leak() { local arr=("leak"); } while :; do leak; done Thanks for the report. This was fixed back in January as the result of https://lists.gnu.org/archive/html/bug-bash/2021-01/msg00082.html. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: Why does "mapfile -d delim" (delim != '\n') use unbuffered read?
On 5/2/21 9:51 AM, Koichi Murase wrote: Maybe I'm asking a stupid question, but, as in the subject, why does the builtin "mapfile -d delim" use unbuffered read when delim != '\n'? It's the shell being careful in the general case. You need to guarantee behavior in all of the cases where read(2) will not return until it sees a newline. Can we use buffered read for seekable file descriptors the same as the `delim == '\n'' case? You should be able to, yes. This treatment of `mapfile' for "delim != '\n'" exists since the mapfile delimiter is first introduced by commit 25a0eacfe "commit bash-20140625 snapshot". Would it be a problem to change to the buffered read also for non-LF delimiters? If we could remove the above two lines (i.e., if (delim != '\n') unbuffered_read = 1;), I'd be very happy... Try it out and see. Chet -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: Why does "mapfile -d delim" (delim != '\n') use unbuffered read?
On 5/3/21 10:14 AM, Chet Ramey wrote: This treatment of `mapfile' for "delim != '\n'" exists since the mapfile delimiter is first introduced by commit 25a0eacfe "commit bash-20140625 snapshot". Would it be a problem to change to the buffered read also for non-LF delimiters? If we could remove the above two lines (i.e., if (delim != '\n') unbuffered_read = 1;), I'd be very happy... Try it out and see. I should have mentioned up front that this will not work in the general case on at least macOS, the BSDs, and (I believe) AIX. It won't work on any system that doesn't return -1/ESPIPE when you try to lseek on a terminal device. Glibc does; the above systems don't; POSIX makes the behavior undefined. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: Why does "mapfile -d delim" (delim != '\n') use unbuffered read?
2021年5月4日(火) 3:40 Chet Ramey : > On 5/3/21 10:14 AM, Chet Ramey wrote: > >> This treatment of `mapfile' for "delim != '\n'" exists since the > >> mapfile delimiter is first introduced by commit 25a0eacfe "commit > >> bash-20140625 snapshot". Would it be a problem to change to the > >> buffered read also for non-LF delimiters? If we could remove the above > >> two lines (i.e., if (delim != '\n') unbuffered_read = 1;), I'd be very > >> happy... > > > > Try it out and see. Actually, I had tried it with my Linux before I sent my previous mail, and it seemed to work without problems. But I was not sure if it works in every situation so asked the question. > I should have mentioned up front that this will not work in the general > case on at least macOS, the BSDs, and (I believe) AIX. > > It won't work on any system that doesn't return -1/ESPIPE when you try > to lseek on a terminal device. Glibc does; the above systems don't; > POSIX makes the behavior undefined. Oh, I see. Thank you for the explanation. I was somehow assuming that seekable file descriptors can be always correctly detected by the previous lines (builtins/mapfile.def L188). Can we check __GLIBC__ to detect glibc lseek and safely skip the check on the delimiter? Something like #ifndef __GLIBC__ /* Use the buffered read only when the delimiter is a newline because unseekable fd's may not be detected by non-glibc lseek(2). */ if (delim != '\n') unbuffered_read = 1; #endif Thank you, Koichi
Re: Why does "mapfile -d delim" (delim != '\n') use unbuffered read?
I'm still thinking about whether it is possible to use buffered read in more cases of mapfile. (1) Suggestion: use buffered read for a regular file for delim != '\n'? > It won't work on any system that doesn't return -1/ESPIPE when you try > to lseek on a terminal device. I found a related interesting discussion in a FreeBSD mailing list: https://lists.freebsd.org/pipermail/freebsd-hackers/2011-November/036863.html The BUGS section mentioned in this FreeBSD discussion can be found on the FreeBSD man page: https://www.freebsd.org/cgi/man.cgi?query=lseek&sektion=2 lseek(2) may succeed on non-seekable streams including some terminal device (mentioned in Chet's reply) and a tape device (mentioned in the above FreeBSD discussion). According to the discussion, BSD tar has changed to use lseek(2) for the seek capability test in FreeBSD only when the file descriptor is a regular file or a block device. The current devel branch (builtins/mapfile.def L187..L194) is equivalent to the following code: #ifndef __CYGWIN__ if (delim == '\n') unbuffered_read = (lseek (fd, 0L, SEEK_CUR) < 0) && (errno == ESPIPE); else unbuffered_read = 1; /* #1 */ #else unbuffered_read = 1; #endif Here I suggest modifying the line #1 in the following way so as to switch to the buffered read when the file descriptor is a regular file and lseek(2) succeeds on it. (BSD tar seemed to treat block devices the same as regular files, but I'm afraid that the above-mentioned tape device might be implemented as a block device, so I just allowed the buffered read only for regular files): #ifndef __CYGWIN__ if (delim == '\n') unbuffered_read = (lseek (fd, 0L, SEEK_CUR) < 0) && (errno == ESPIPE); else { struct stat st; unbuffered_read = 1; if (fstat (fd, &st) >= 0 && S_ISREG(st.st_mode) && lseek (fd, 0L, SEEK_CUR) >= 0) unbuffered_read = 0; } #else unbuffered_read = 1; #endif (2) Suggestion: use unbuffered_read only when the options `-n/-C' are used? It seems that the unbuffered read is only needed when `mapfile' is supplied with the option `-n max_nlines' or the option `-C callback'. In the other cases, `mapfile' anyway reads the stream till the end without stopping, so it seems that we don't have to care about the unprocessed data in the read buffer. What do you think of turning on unbuffered_read only when the option `-n' or the option `-C' is specified? (3) Suggestion: change the order of `run_callback' and `zsyncfd'? I noticed that `zsyncfd' is called before `run_callback': builtins/mapfile.def L215..L223 > /* Has a callback been registered and if so is it time to call it? */ > if (callback && line_count && (line_count % callback_quantum) == 0) > { > run_callback (callback, array_index, line); > > /* Reset the buffer for bash own stream. */ > if (unbuffered_read == 0) > zsyncfd (fd); > } I feel it is better to call `zsyncfd' before `run_callback'. If the `callback' reads data from the same file descriptor when there is unprocessed data in the read buffer, something strange will happen. `zsyncfd' will just shift the stream position backward by the same number of bytes as the unprocessed data. Since it is more natural that `callback' finds the unprocessed data in the stream, I think `zsyncfd' is better to be called before `run_callback'. I attach a combined patch for (1)--(3). This time, I didn't include the special treatment on __GLIBC__, but maybe we can skip the check of S_ISREG (st.st_mode) if __GLIBC__ is defined. Remark: In the current devel branch, when delim == '\n' and lseek(2) fails with errno != ESPIPE, `mapfile' still performs buffered read. I kept this behavior in the attached patch, but maybe this could also be changed. I doubt that we could still expect that `zsyncfd' will work when lseek(2) fails (even though errno != ESPIPE). 0001-mapfile-buffered-read-in-more-cases.patch Description: Binary data