I'm all for such a change.

On Sat, Mar 11, 2023, 12:42 PM zju <21625...@zju.edu.cn> wrote:
> > On Mar 11, 2023, at 10:50, zju <21625...@zju.edu.cn> wrote:
> >
> >> On Mar 11, 2023, at 06:17, Chet Ramey <chet.ra...@case.edu> wrote:
> >>
> >> On 3/10/23 11:37 AM, zju wrote:
> >>
> >>> I have already set the maximum number of processes available to a
> >>> single user with "ulimit -Su".
> >>> But the memory occupied by the bash processes kept increasing, which
> >>> eventually triggers the OOM killer. This is the key issue.
> >>
> >> If you don't see an error message from bash about fork failing, then fork
> >> hasn't failed, and the processes continue to run.
> >>
> >> --
> >> ``The lyf so short, the craft so long to lerne.'' - Chaucer
> >> ``Ars longa, vita brevis'' - Hippocrates
> >> Chet Ramey, UTech, CWRU c...@case.edu http://tiswww.cwru.edu/~chet/
> >
> > Thanks for your reply.
> >
> > I want to describe this problem more clearly:
> >
> > 1. I set "ulimit -Su 30" to observe the situation;
> >
> > 2. exec
> > 1. Actually, there are already many error messages:
> > -bash: fork: retry: Resource temporarily unavailable
> > -bash: fork: retry: Resource temporarily unavailable
> >
> > 2. Run the fork bomb in bash:
> > [parallels@fedora ~]$ :() { :|:& };:
> >
> > 3. The following errors soon appear:
> > -bash: fork: retry: Resource temporarily unavailable
> > -bash: fork: retry: Resource temporarily unavailable
> >
> > 4. I observed the situation from another terminal.
> > As the ps output below shows, process 250229 (bash) keeps spawning new
> > child processes whose RSS is larger than that of the earlier ones
> > (2096 -> 2112). So the memory occupied by the bash processes keeps
> > growing even though the number of bash processes stays the same. If each
> > bash process grew to 8M, 500 bash processes would occupy about 4G in total.
> >
> > And because each individual bash process occupies little memory, the OOM
> > killer would not pick bash as its first target to kill.
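A quick way to put numbers on the growth described above is to total the RSS column of the `ps aux` output rather than compare rows by eye. This is only a minimal sketch, assuming the standard `ps aux` column layout (RSS in KiB in column 6, command in column 11); `sum_bash_rss` is a hypothetical helper name of mine, not an existing tool.

```shell
# sum_bash_rss: hypothetical helper. Reads `ps aux`-style rows on stdin and
# totals the RSS column (field 6, KiB) for rows whose command ends in "bash".
sum_bash_rss() {
    awk '$11 ~ /bash$/ { n++; kib += $6 }
         END { printf "%d processes, %d KiB RSS total\n", n, kib }'
}

# Three rows copied from the listing in this thread.
sum_bash_rss <<'EOF'
paralle+ 250229 0.0 0.1 224500 3828 pts/0 S+ 09:39 0:00 -bash
paralle+ 255620 0.0 0.1 224368 2096 pts/0 S 10:10 0:00 -bash
paralle+ 255628 0.0 0.1 224368 2104 pts/0 S 10:10 0:00 -bash
EOF
```

On a live system the same function can be fed with `ps aux | grep paralle | sum_bash_rss`, run twice a few seconds apart. Keep in mind that RSS counts pages shared between forked processes once per process, so the total overstates the physical memory actually in use.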
> > So is it possible to limit the continuous growth of the memory occupied
> > by the child processes?
> >
> >> In regard to OOM, if the goal is to prevent fork bombs, the system
> >> administrator would need to set a hard limit on "ulimit -u", "The
> >> maximum number of processes available to a single user" as well as
> >> "ulimit -d", "The maximum size of a process's data segment". Changing
> >> the behavior of bash alone could not prevent an attacker from forcing
> >> OOM, it would just require the attacker to be more sophisticated.
> >
> > Or is there any way to avoid this problem?
> > I used to rely on "ulimit -Su" to limit the processes with bash-5.0, but
> > that no longer works.
> > I doubt that using "ulimit -d" together with "ulimit -u", as Worley
> > suggested, would avoid this problem, since the RSS may be occupied by the
> > stack rather than the data segment.
> >
> > Looking forward to your reply!
> >
> > [root@fedora ~]# ps aux | grep bash | grep paralle
> > paralle+ 250229 0.0 0.1 224500 3828 pts/0 S+ 09:39 0:00 -bash
> > paralle+ 255620 0.0 0.1 224368 2096 pts/0 S 10:10 0:00 -bash
> > paralle+ 255621 0.0 0.1 224368 2096 pts/0 S 10:10 0:00 -bash
> > paralle+ 255622 0.0 0.1 224368 2096 pts/0 S 10:10 0:00 -bash
> > paralle+ 255623 0.0 0.1 224368 2096 pts/0 S 10:10 0:00 -bash
> > paralle+ 255624 0.0 0.1 224368 2096 pts/0 S 10:10 0:00 -bash
> > paralle+ 255625 0.0 0.1 224368 2096 pts/0 S 10:10 0:00 -bash
> > paralle+ 255626 0.0 0.1 224368 2096 pts/0 S 10:10 0:00 -bash
> > paralle+ 255627 0.0 0.1 224368 2096 pts/0 S 10:10 0:00 -bash
> > paralle+ 255628 0.0 0.1 224368 2104 pts/0 S 10:10 0:00 -bash
> > paralle+ 255629 0.0 0.1 224368 2096 pts/0 S 10:10 0:00 -bash
> > paralle+ 255630 0.0 0.1 224368 2104 pts/0 S 10:10 0:00 -bash
> > paralle+ 255631 0.0 0.1 224368 2100 pts/0 S 10:10 0:00 -bash
> > paralle+ 255632 0.0 0.1 224368 2100 pts/0 S 10:10 0:00 -bash
> > paralle+ 255633 0.0 0.1 224368 2100 pts/0 S 10:10 0:00 -bash
> > paralle+ 255634 0.0 0.1 224368 2096 pts/0 S 10:10 0:00 -bash
> > paralle+ 255635 0.0 0.1 224368 2096 pts/0 S 10:10 0:00 -bash
> > paralle+ 255636 0.0 0.1 224368 2096 pts/0 S 10:10 0:00 -bash
> > paralle+ 255637 0.0 0.1 224368 2096 pts/0 S 10:10 0:00 -bash
> > paralle+ 255638 0.0 0.1 224368 2096 pts/0 S 10:10 0:00 -bash
> > paralle+ 255639 0.0 0.1 224368 2096 pts/0 S 10:10 0:00 -bash
> > paralle+ 255640 0.0 0.1 224368 2096 pts/0 S 10:10 0:00 -bash
> > paralle+ 255641 0.0 0.1 224368 2096 pts/0 S 10:10 0:00 -bash
> > paralle+ 255642 0.0 0.1 224368 2096 pts/0 S 10:10 0:00 -bash
> > paralle+ 255643 0.0 0.1 224368 2096 pts/0 S 10:10 0:00 -bash
> > paralle+ 255644 0.0 0.1 224368 2096 pts/0 S 10:10 0:00 -bash
> > paralle+ 255645 0.0 0.1 224368 2096 pts/0 S 10:10 0:00 -bash
> > paralle+ 255646 0.0 0.1 224368 2096 pts/0 S 10:10 0:00 -bash
> > paralle+ 255647 0.0 0.1 224368 2096 pts/0 S 10:10 0:00 -bash
> > paralle+ 255648 0.0 0.1 224368 2104 pts/0 S 10:10 0:00 -bash
> >
> > [root@fedora ~]# ps aux | grep bash | grep paralle
> > paralle+ 250229 0.0 0.1 224500 3828 pts/0 S+ 09:39 0:00 -bash
> > paralle+ 255708 0.0 0.1 224368 2108 pts/0 S 10:10 0:00 -bash
> > paralle+ 255709 0.0 0.1 224368 2108 pts/0 S 10:10 0:00 -bash
> > paralle+ 255712 0.0 0.1 224368 2112 pts/0 S 10:10 0:00 -bash
> > paralle+ 255716 0.0 0.1 224368 2112 pts/0 S 10:10 0:00 -bash
> > paralle+ 255723 0.0 0.1 224368 2112 pts/0 S 10:10 0:00 -bash
> > paralle+ 255724 0.0 0.1 224368 2112 pts/0 S 10:10 0:00 -bash
> > paralle+ 255726 0.0 0.1 224368 2112 pts/0 S 10:10 0:00 -bash
> > paralle+ 255728 0.0 0.1 224368 2116 pts/0 S 10:10 0:00 -bash
> > paralle+ 255729 0.0 0.1 224368 2112 pts/0 S 10:10 0:00 -bash
> > paralle+ 255730 0.0 0.1 224368 2116 pts/0 S 10:10 0:00 -bash
> > paralle+ 255731 0.0 0.1 224368 2112 pts/0 S 10:10 0:00 -bash
> > paralle+ 255733 0.0 0.1 224368 2116 pts/0 S 10:10 0:00 -bash
> > paralle+ 255735 0.0 0.1 224368 2116 pts/0 S 10:10 0:00 -bash
> > paralle+ 255736 0.0 0.1 224368 2112 pts/0 S 10:10 0:00 -bash
> > paralle+ 255737 0.0 0.1 224368 2112 pts/0 S 10:10 0:00 -bash
> > paralle+ 255738 0.0 0.1 224368 2112 pts/0 S 10:10 0:00 -bash
> > paralle+ 255739 0.0 0.1 224368 2112 pts/0 S 10:10 0:00 -bash
> > paralle+ 255740 0.0 0.1 224368 2116 pts/0 S 10:10 0:00 -bash
> > paralle+ 255741 0.0 0.1 224368 2112 pts/0 S 10:10 0:00 -bash
> > paralle+ 255742 0.0 0.1 224368 2116 pts/0 S 10:10 0:00 -bash
> > paralle+ 255743 0.0 0.1 224368 2112 pts/0 S 10:10 0:00 -bash
> > paralle+ 255744 0.0 0.1 224368 2116 pts/0 S 10:10 0:00 -bash
> > paralle+ 255745 0.0 0.1 224368 2112 pts/0 S 10:10 0:00 -bash
> > paralle+ 255746 0.0 0.1 224368 2116 pts/0 S 10:10 0:00 -bash
> > paralle+ 255747 0.0 0.1 224368 2112 pts/0 S 10:10 0:00 -bash
> > paralle+ 255748 0.0 0.1 224368 2112 pts/0 S 10:10 0:00 -bash
> > paralle+ 255749 0.0 0.1 224368 2120 pts/0 S 10:10 0:00 -bash
> > paralle+ 255750 0.0 0.1 224368 2112 pts/0 S 10:10 0:00 -bash
> > paralle+ 255751 0.0 0.1 224368 2120 pts/0 S 10:10 0:00 -bash
>
> I tried to dig up further information about this problem.
> The change was made in this commit:
> https://git.savannah.gnu.org/cgit/bash.git/commit/?h=devel&id=ea31c00845c858098d232bd014bf27b5a63a668b
>
> After that change, the parent bash[1509865] could not kill the child
> bash[1509872] with SIGTERM; the log is listed below.
> Mar 11 16:49:24 localhost.localdomain [1509762]: [ulimit -Su 10] return code=[0], execute success by [syy(uid=1000)] from [pts/0 (9.2.65.118)]
> Mar 11 16:49:30 localhost.localdomain [1509833]: [clear] return code=[0], execute success by [root(uid=0)] from [pts/1 (9.2.65.118)]
> Mar 11 16:49:33 localhost.localdomain kernel: copy_process: 443 callbacks suppressed
> Mar 11 16:49:33 localhost.localdomain kernel: task bash (pid 1509866) alloc pid failed.pid number has exceeded 10 in user processes
> ...
> Mar 11 16:49:48 localhost.localdomain sysmonitor[1125]: sysmonitor[1125]: comm:bash exe:bash[1509865](parent comm:systemd parent exe:systemd[1]) send SIGTERM to comm:bash exe:bash[1509872].
> ...
> Mar 11 16:49:48 localhost.localdomain sysmonitor[1125]: sysmonitor[1125]: comm:bash exe:bash[1509865](parent comm:systemd parent exe:systemd[1]) send SIGTERM to comm:bash exe:bash[1509865].
> Mar 11 16:49:48 localhost.localdomain [1510037]: [:(){ :|:& };:] return code=[0], execute success by [syy(uid=1000)] from [pts/0 (9.2.65.118)]
>
> Was the SIGTERM sent by the function terminate_current_pipeline in jobs.c?
> If so, should we use a different signal for the kill, since SIGTERM can
> no longer kill the child bash?
> The alternative is to find out why the memory occupied by the bash
> processes keeps growing, but unfortunately that is quite difficult for me.
> I made a patch to express what I mean more clearly:
>
> From 467269c690efb24ade5ae9cdb9b9a25c1452a5a4 Mon Sep 17 00:00:00 2001
> From: Yangyang Shen <21625...@zju.edu.cn>
> Date: Sat, 11 Mar 2023 06:34:08 -0500
> Subject: [PATCH] change sigterm to sigkill as bash child will ingnore sigterm
>
> ---
>  jobs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/jobs.c b/jobs.c
> index 7c3b6e8..d9d6a7e 100644
> --- a/jobs.c
> +++ b/jobs.c
> @@ -1643,7 +1643,7 @@ terminate_current_pipeline ()
>  {
>    if (pipeline_pgrp && pipeline_pgrp != shell_pgrp)
>      {
> -      killpg (pipeline_pgrp, SIGTERM);
> +      killpg (pipeline_pgrp, SIGKILL);
>        killpg (pipeline_pgrp, SIGCONT);
>      }
>  }
> --
> 2.19.1
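
For comparison with the patch, here is a rough shell sketch of a middle ground: send SIGTERM first and fall back to SIGKILL only if the process is still alive after a grace period. It is an illustration in bash, not a proposed change to jobs.c. The helper name `term_then_kill` and the one-second grace period are my own assumptions, and the `bash -c` child only stands in for a process that ignores SIGTERM; I have not verified that the children in the log above ignore it for the same reason.

```shell
# term_then_kill: hypothetical helper, not part of bash.
# Sends SIGTERM, waits a grace period, then sends SIGKILL if still alive.
term_then_kill() {
    local pid=$1 grace=${2:-1}
    kill -TERM "$pid" 2>/dev/null || return 0   # already gone
    sleep "$grace"
    if kill -0 "$pid" 2>/dev/null; then         # still alive: escalate
        kill -KILL "$pid" 2>/dev/null || true
    fi
}

# Stand-in child: ignore SIGTERM, then exec sleep (ignored signals survive exec).
bash -c 'trap "" TERM; exec sleep 30' &
child=$!
sleep 0.5                                       # let the child install its trap

term_then_kill "$child" 1

status=0
wait "$child" || status=$?
echo "exit status: $status"
```

If the child really ignores SIGTERM, I would expect this to print an exit status of 137 (128 + 9 for SIGKILL) rather than 143 (128 + 15 for SIGTERM). The cost of this approach compared with sending SIGKILL immediately is the extra grace period, which matters when many processes have to be cleaned up.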