On 5/18/15 5:45 AM, Vladimir Marek wrote:
> Hi,
>
> Our customer updated from bash 3.0.16 to 3.2.57 and claims that it
> caused decreased performance. The report is slightly vague and says that
> bash now consumes more memory and is slower after running ~ 30000
> commands. My current belief (and I want to have it confirmed form
> customer) is that the problem is that bash is now eating more CPU
> cycles.
>
> I did some experiments and tracked it down to the bgp_delete function.
> Once the number of records in 'bgpids' list reaches 30k (== ulimit -u on
> Solaris), every new allocated pid results in bgp_delete call to make
> sure that the list does not contain duplicate entries. And that means
> walking all 30k entries in single linked list because the new pid is
> most probably unique. I believe that customer use case involves rapid
> execution of external commands so the 'bgpids' is traversed many times
> and the performance hit becomes significant.
>
> I am thinking about improving the situation. Number of options come into
> mind. In order to get the fix to the official sources I would like to
> discuss the approach in advance.
>
>
> a) decreasing size of the 'bgpids' list. Why do we need 30k entries if
> we don't trust that the IDs are unique? Maybe configuration or runtime
> option?
I've thought about it. Posix only requires saving the statuses of the last
CHILD_MAX asynchronous pids. The bash code has traditionally saved
the statues of the last `ulimit -u' processes, regardless of whether
they're asynchronous. I changed the bash development version to save
only asynchronous jobs to the bgpids list about a month ago.
>
> b) create hash map containing the IDs to decrease the search time. We
> still need to maintain linked list to know which PID is oldest (to be
> removed). The linked list might benefit from being double linked one
> so that we are able to remove elements without traversing it again.
http://lists.gnu.org/archive/html/bug-bash/2015-04/msg00075.html
We'll see if the change to save only asynchronous pids makes enough of
a difference. I've attached that patch if you want to look at it; line
numbers will certainly vary.
> c) some means to clear the 'bgpids' list during runtime.
`wait' with no arguments discards all saved status values.
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/wait.html#tag_20_153
`disown -a' will also work.
Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU [email protected] http://cnswww.cns.cwru.edu/~chet/
*** bash-20150410/jobs.c 2015-04-17 14:32:17.000000000 -0400
--- bash-20150417/jobs.c 2015-04-15 21:28:08.000000000 -0400
***************
*** 1044,1051 ****
return;
! if ((dflags & DEL_NOBGPID) == 0)
{
proc = find_last_proc (job_index, 0);
- /* Could do this just for J_ASYNC jobs, but we save all. */
if (proc)
bgp_add (proc->pid, process_exit_status (proc->status));
--- 1044,1050 ----
return;
! if ((dflags & DEL_NOBGPID) == 0 && (temp->flags & (J_ASYNC|J_FOREGROUND)) == J_ASYNC)
{
proc = find_last_proc (job_index, 0);
if (proc)
bgp_add (proc->pid, process_exit_status (proc->status));