You could try actually looking at the errata list first...
006: RELIABILITY FIX: November 21, 2004 All architectures
Fix for transmit side breakage on macppc and mbuf leaks with xl(4).
I wonder what that is for..
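
If you want to confirm the leak is gone once the patch is applied, comparing the counters from `netstat -m` before and after a dump is enough. Below is a rough sketch, not a polished tool: the function names are made up for illustration, and the regular expressions assume the output layout shown in the quoted message, which may differ on other releases.

```python
import re

def parse_netstat_m(text):
    """Pull the mbuf and cluster counters out of `netstat -m` output.

    Assumes the layout seen in the quoted OpenBSD 3.6 output; other
    releases may word these lines differently.
    """
    stats = {}
    m = re.search(r"(\d+) mbufs? in use", text)
    if m:
        stats["mbufs"] = int(m.group(1))
    m = re.search(r"(\d+)/(\d+)/(\d+) mbuf clusters in use", text)
    if m:
        stats["clusters"], stats["peak"], stats["max"] = map(int, m.groups())
    return stats

def growth(before, after):
    """Return how much the in-use counters grew between two snapshots."""
    return {k: after[k] - before[k] for k in ("mbufs", "clusters")}

# Snapshots copied from the first BEFORE/AFTER traces in the quoted message.
BEFORE = """40 mbufs in use:
33/46/40000 mbuf clusters in use (current/peak/max)
"""
AFTER = """437 mbufs in use:
146/188/40000 mbuf clusters in use (current/peak/max)
"""

if __name__ == "__main__":
    print(growth(parse_netstat_m(BEFORE), parse_netstat_m(AFTER)))
    # prints {'mbufs': 397, 'clusters': 113}
```

On a patched system you would expect the growth after each dump to settle near zero once traffic stops, rather than climbing with every run.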
On Thu, Jun 09, 2005 at 09:10:17PM -0500, R Ginn wrote:
> Hi,
> OpenBSD 3.6 (I'm running i386) seems to have a memory leak in its use
> of mbufs for network traffic. The default number of
> mbuf clusters (kern.maxclusters) is fine until I run a series of
> dump commands to a tape drive on a remote system. After the dump
> completes, the number of mbufs in use remains high. Each time I
> run another dump, the number climbs. Soon I run out of them and
> the system locks all ethernet traffic (which hangs all the other
> systems depending on this one). Increasing the kern.maxclusters
> at this point unlocks the system (although the dump terminates at
> that point).
>
> Fortunately, when it hangs, it spits out a message to indicate
> that it ran out of mbuf clusters and to increase kern.maxclusters.
> BTW, kudos to whoever put that message and suggestion in, it is
> a great/necessary feature that is so often missing in products.
>
> Note that after the dump completes, there are no extra processes
> left (the # of processes before I run the rdump = the # of processes
> after the rdump completes).
>
> I checked with ipcs to see if dump was using any shared memory
> but, as expected, it wasn't, and no segments were in use.
>
> Here is the dump command being used:
>
> dump 0udbsf 54000 64 96000 [EMAIL PROTECTED]:/dev/nrst0 /
>
> Before the dump, 40 mbufs and 33 mbuf clusters are in use.
> After the dump, 437 mbufs and 146 mbuf clusters are in use.
> Before a 2nd dump, 438 mbufs and 148 mbuf clusters are in use.
> After a 2nd dump, 4329 mbufs and 1197 mbuf clusters are in use.
> Before a 3rd dump, 4330 mbufs and 1199 mbuf clusters are in use.
> After a 3rd dump, 8545 mbufs and 2325 mbuf clusters are in use.
>
> BTW, the first dump here is for "/" and the 2nd dump is
> for "/usr" ("/usr" is about 10x bigger than "/"). To
> eliminate the case where the issue is just the highwater
> mark, the 3rd dump above is an identical dump of "/usr".
>
> So, since neither dump nor anything else extra is running after the
> dump completes, I don't know why the system is "using" more mbufs
> after it completes.
>
> I noticed that a wireless driver had an mbuf leak. So, in case
> it's relevant, I am using the xl(4) ethernet driver.
>
> So, is this a memory/mbuf leak in the kernel? Am I doing something
> wrong? Is there anything I can do to "clean up" after each dump?
> My current work-around is to set a very large (40,000) maxclusters
> value and reboot the system after each set of dumps but that really
> rubs me the wrong way -- this is a UNIX(y 8-) system after all ...
>
> I've provided some traces below.
>
> Thanks,
> Rob Ginn
> [EMAIL PROTECTED]
>
> BEFORE I run a remote dump (but after a reboot)
> ===============================================
>
> Script started on Thu Jun 9 16:14:17 2005
> demo# netstat -m
> 40 mbufs in use:
> 35 mbufs allocated to data
> 1 mbuf allocated to packet headers
> 4 mbufs allocated to socket names and addresses
> 33/46/40000 mbuf clusters in use (current/peak/max)
> 112 Kbytes allocated to network (67% in use)
> 0 requests for memory denied
> 0 requests for memory delayed
> 0 calls to protocol drain routines
> demo# ps xa
> PID TT STAT TIME COMMAND
> 1 ?? Is 0:00.04 /sbin/init
> 21191 ?? Is 0:00.03 syslogd: [priv] (syslogd)
> 26414 ?? I 0:00.09 syslogd -a /var/empty/dev/log
> 30515 ?? Is 0:00.01 pflogd: [priv] (pflogd)
> 16850 ?? Is 0:00.01 portmap
> 28016 ?? I 0:00.32 pflogd: [running] -s 116 -f /var/log/pflog (pflogd)
> 10782 ?? I 0:00.05 ypserv
> 4580 ?? Is 0:00.30 ypbind
> 26954 ?? Is 0:00.01 mountd
> 20553 ?? Is 0:00.01 nfsd: master (nfsd)
> 11934 ?? IL 0:00.00 nfsd: server (nfsd)
> 14637 ?? IL 0:00.40 nfsd: server (nfsd)
> 6754 ?? IL 0:00.00 nfsd: server (nfsd)
> 15064 ?? IL 0:00.00 nfsd: server (nfsd)
> 16771 ?? Is 0:00.00 rpc.lockd
> 20629 ?? Is 0:00.07 /usr/sbin/dhcpd xl0
> 3712 ?? Is 0:00.01 lpd
> 26612 ?? Is 0:00.02 inetd
> 21469 ?? Is 0:00.42 sendmail: accepting connections (sendmail)
> 24532 ?? Is 0:00.17 /usr/sbin/sshd
> 14769 ?? I 0:00.01 rarpd -a
> 25583 ?? Is 0:00.01 rpc.bootparamd
> 13440 ?? Is 0:00.01 mopd -a
> 10486 ?? Is 0:00.00 /usr/local/adm/bin/rpc.statd
> 23664 ?? Is 0:00.04 cron
> 27922 p0 Is 0:00.02 -bin/csh -i
> 17109 p0 ?+ 0:00.00 ps -xa
> 12055 C0 Is 0:00.07 -csh (csh)
> 31440 C0 I+ 0:00.01 script BEFORE
> 20480 C0 I+ 0:00.01 script BEFORE
> 29807 C1 Is+ 0:00.01 /usr/libexec/getty Pc ttyC1
> 5065 C2 Is+ 0:00.01 /usr/libexec/getty Pc ttyC2
> 6641 C3 Is+ 0:00.01 /usr/libexec/getty Pc ttyC3
> 23297 C5 Is+ 0:00.01 /usr/libexec/getty Pc ttyC5
>
>
> AFTER I run a remote dump
> =========================
>
> Script started on Thu Jun 9 16:16:12 2005
> demo# netstat -m
> 437 mbufs in use:
> 232 mbufs allocated to data
> 201 mbufs allocated to packet headers
> 4 mbufs allocated to socket names and addresses
> 146/188/40000 mbuf clusters in use (current/peak/max)
> 516 Kbytes allocated to network (77% in use)
> 0 requests for memory denied
> 0 requests for memory delayed
> 0 calls to protocol drain routines
> demo# ps xa
> PID TT STAT TIME COMMAND
> 1 ?? Is 0:00.04 /sbin/init
> 21191 ?? Is 0:00.03 syslogd: [priv] (syslogd)
> 26414 ?? I 0:00.10 syslogd -a /var/empty/dev/log
> 30515 ?? Is 0:00.01 pflogd: [priv] (pflogd)
> 16850 ?? Is 0:00.01 portmap
> 28016 ?? I 0:00.32 pflogd: [running] -s 116 -f /var/log/pflog (pflogd)
> 10782 ?? I 0:00.05 ypserv
> 4580 ?? Is 0:00.31 ypbind
> 26954 ?? Is 0:00.01 mountd
> 20553 ?? Is 0:00.01 nfsd: master (nfsd)
> 11934 ?? IL 0:00.00 nfsd: server (nfsd)
> 14637 ?? IL 0:00.41 nfsd: server (nfsd)
> 6754 ?? IL 0:00.00 nfsd: server (nfsd)
> 15064 ?? IL 0:00.00 nfsd: server (nfsd)
> 16771 ?? Is 0:00.00 rpc.lockd
> 20629 ?? Is 0:00.07 /usr/sbin/dhcpd xl0
> 3712 ?? Is 0:00.01 lpd
> 26612 ?? Is 0:00.02 inetd
> 21469 ?? Is 0:00.42 sendmail: accepting connections (sendmail)
> 24532 ?? Is 0:00.17 /usr/sbin/sshd
> 14769 ?? I 0:00.01 rarpd -a
> 25583 ?? Is 0:00.01 rpc.bootparamd
> 13440 ?? Is 0:00.01 mopd -a
> 10486 ?? Is 0:00.00 /usr/local/adm/bin/rpc.statd
> 23664 ?? Is 0:00.04 cron
> 24790 p0 Is 0:00.02 -bin/csh -i
> 6134 p0 ?+ 0:00.00 ps -xa
> 12055 C0 Is 0:00.08 -csh (csh)
> 13116 C0 I+ 0:00.01 script AFTER
> 27577 C0 I+ 0:00.01 script AFTER
> 29807 C1 Is+ 0:00.01 /usr/libexec/getty Pc ttyC1
> 5065 C2 Is+ 0:00.01 /usr/libexec/getty Pc ttyC2
> 6641 C3 Is+ 0:00.01 /usr/libexec/getty Pc ttyC3
> 23297 C5 Is+ 0:00.01 /usr/libexec/getty Pc ttyC5