Package: apache2.2-common
Version: 2.2.22-13
Severity: normal

See attached graph.png. The 1+ GB memory plateau is due to apache, which should normally be using more like 10 MB. I noticed this, and restarted it. A few hours later it happened again. At that point, I was using mpm-worker; I switched to mpm-prefork, and made each process only serve 1000 requests. Shortly after, it happened again. Note the abrupt slope; this is no slow leak.

My server only serves static files and runs a few cgi scripts. No php etc.

The problem turned out to be caused by "sosospider", a Chinese web spider, which apparently ignores robots.txt[1]. It wandered into my gitweb (which is of course blocked from being spidered by robots.txt), and proceeded to try to download multiple tarball snapshots of a 300 MB git repository at once.

git.kitenet.net 123.151.139.212 - - [11/May/2013:01:22:40 -0400] "GET /?p=avianaquamiser.com;a=snapshot;h=4612ba9206187b86d1403b641dbc5fa00af19d93;sf=tgz HTTP/1.1" 200 142434304 "http://git.kitenet.net/" "Mozilla/5.0 (compatible; Sosospider/2.0; +http://help.soso.com/webspider.htm)"
git.kitenet.net 123.151.139.212 - - [11/May/2013:01:23:10 -0400] "GET /?p=avianaquamiser.com;a=snapshot;h=70adc391bbf0e96b0b7ed021852817c372ca7b8f;sf=tgz HTTP/1.1" 200 - "http://git.kitenet.net/" "Mozilla/5.0 (compatible; Sosospider/2.0; +http://help.soso.com/webspider.htm)"
git.kitenet.net 123.151.139.212 - - [11/May/2013:01:23:01 -0400] "GET /?p=avianaquamiser.com;a=snapshot;h=61401a7e229fb16878be9602d38032da05db1f90;sf=tgz HTTP/1.1" 200 7839744 "http://git.kitenet.net/" "Mozilla/5.0 (compatible; Sosospider/2.0; +http://help.soso.com/webspider.htm)"
git.kitenet.net 123.151.139.212 - - [11/May/2013:01:23:30 -0400] "GET /?p=avianaquamiser.com;a=snapshot;h=7fde76d445040bc8cd2313d283d63fd1a955963e;sf=tgz HTTP/1.1" 200 2695168 "http://git.kitenet.net/" "Mozilla/5.0 (compatible; Sosospider/2.0; +http://help.soso.com/webspider.htm)"

This causes gitweb to be very active, but somehow this also makes apache's memory use balloon up quite high. In top, I saw multiple apache processes over 50 MB each. Prefork was a worse choice; the server became nonresponsive and I had to reboot it.

I have configured gitweb with $feature{'snapshot'}{'default'} = []; and blacklisted this spider's address space, so I hope I will not see this again.

I don't understand why apache is using all that memory.

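For anyone wanting to check their own logs for the same pattern, here is a rough sketch that tallies gitweb snapshot requests per client address. It assumes the log layout of the lines quoted above (vhost, address, timestamp, request, status, bytes, referer, user agent). The regular expression and the function name summarize_snapshots are my own illustration, not anything shipped with apache or gitweb.

```python
import re
from collections import defaultdict

# Assumed layout, inferred from the log lines quoted in this report:
# vhost ip ident user [timestamp] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<vhost>\S+) (?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def summarize_snapshots(lines):
    """Return {ip: (snapshot request count, total bytes logged)}."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        m = LOG_RE.match(line.strip())
        # Skip lines that do not parse or are not gitweb snapshot requests.
        if m is None or "a=snapshot" not in m.group("request"):
            continue
        entry = totals[m.group("ip")]
        entry[0] += 1
        # "-" means no byte count was logged for that request.
        if m.group("bytes") != "-":
            entry[1] += int(m.group("bytes"))
    return {ip: tuple(counts) for ip, counts in totals.items()}
```

Passing an open log file (any iterable of lines) to summarize_snapshots gives one entry per client address. An address with many snapshot requests in a short window is a likely candidate for blocking.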
Could it be trying to buffer the cgi's output? If apache mallocs a lot of memory for such a buffer, will it ever free it? Perhaps sosospider is doing additional evil things beyond ignoring robots.txt, that cause this behavior.

IMHO gitweb should not come configured this way by default, but the apache behavior is especially concerning.

[1] The spider's website claims (translated from Chinese) "once a rule forbidding access has been added to robots.txt, sosospider will follow the rule and stop crawling the corresponding pages/site" but I get the impression from looking up this spider that they're lying or incompetent. My robots.txt file has been in place for 5 years.

-- Package-specific info:
List of enabled modules from 'apache2 -M':
  alias auth_basic authn_file authz_default authz_groupfile authz_host
  authz_user autoindex cgi deflate dir env expires include mime
  negotiation reqtimeout rewrite setenvif status userdir

-- System Information:
Debian Release: 7.0
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'testing')
Architecture: i386 (i686)

Kernel: Linux 3.2.0-4-686-pae (SMP w/2 CPU cores)
Locale: LANG=en_US.utf8, LC_CTYPE=en_US.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages apache2 depends on:
ii  apache2-mpm-prefork  2.2.22-13
ii  apache2.2-common     2.2.22-13

apache2 recommends no packages.

apache2 suggests no packages.

Versions of packages apache2.2-common depends on:
ii  apache2-utils  2.2.22-13
ii  apache2.2-bin  2.2.22-13
ii  lsb-base       4.1+Debian9
ii  mime-support   3.52-2
ii  perl           5.14.2-20
ii  procps         1:3.3.4-2

Versions of packages apache2.2-common recommends:
ii  ssl-cert  1.0.32

Versions of packages apache2.2-common suggests:
pn  apache2-doc                       <none>
ii  apache2-suexec                    2.2.22-13
ii  chromium [www-browser]            25.0.1364.160-1
ii  epiphany-browser [www-browser]    3.4.2-2.1
ii  iceweasel [www-browser]           10.0.12esr-1+nmu1
ii  konqueror [www-browser]           4:4.8.4-2
ii  lynx-cur [www-browser]            2.8.8dev.15-2
ii  w3m [www-browser]                 0.5.3-8

-- no debconf information

-- 
see shy jo
<<attachment: graph.png>>