On Tue, Nov 30, 2010 at 10:13:13AM +0100, Otto Moerbeek wrote:
> On Tue, Nov 30, 2010 at 08:35:46AM +0100, Xavier Beaudouin wrote:
>
> > Hello,
> >
> > I have updated an OpenBGPD router from OpenBSD 4.7 i386 to 4.8 amd64.
> >
> > Now I see new instability like this:
> >
> > Nov 29 21:25:22 core-3 bgpd[28895]: fatal in RDE: path_alloc: Cannot allocate memory
> > Nov 30 02:01:47 core-3 bgpd[5522]: fatal in RDE: up_generate: Cannot allocate memory
> >
> > I have 2 GB of RAM on this machine and a login.conf like this:
> >
> > default:\
> > :path=/usr/bin /bin /usr/sbin /sbin /usr/X11R6/bin /usr/local/bin:\
> > :umask=022:\
> > :datasize-max=1512M:\
> > :datasize-cur=1024M:\
> > :maxproc-max=2048:\
> > :maxproc-cur=1024:\
> > :openfiles-cur=1024:\
> > :stacksize-cur=4M:\
> > :localcipher=blowfish,6:\
> > :ypcipher=old:\
> > :tc=auth-defaults:\
> > :tc=auth-ftp-defaults:
> >
> > This is currently driving me mad, because this router handles more than
> > 130 peers and is still unstable.
> >
> > What is needed to make openbgpd work as it should and shut up?
> >
> > (I am going to add a monit... because on production day this is not
> > acceptable).
> >
> > Xavier
>
> By default, daemons run in the daemon login class. Check that, and also
> check that you do not have a stale /etc/login.conf.db file lying around.
>
> AFAIK, bgpd does not increase its limits to the max, so it does
> not make sense to have different values for -max and -cur.
>
> If these things don't help, analyzing this requires some specific bgpd
> knowledge, which I do not have.
>
Maybe it is time to change the default data size limit in the RDE.
Something like the diff below may help: it makes the RDE raise its own
RLIMIT_DATA to RLIM_INFINITY at startup, before it drops privileges.

bgpd needs quite a bit more (temporary) memory when running with
softreconfig. A lot of additional memory is also needed on reloads and
when large sessions flap, since a flap causes a lot of UPDATE messages.

Side note: bgpd on amd64 needs quite a bit more memory than on i386
because of the 64-bit pointers.
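
To see which data size limit a process really inherits before trying the
diff, a small standalone sketch along these lines may help. It is not
part of bgpd, and the helper names raise_data_limit() and print_limit()
are made up for illustration. Unlike the diff, it only raises the soft
limit to the existing hard limit, which works without root; raising the
hard limit itself needs superuser privileges.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>

#include <stdio.h>

/*
 * Hypothetical helper: raise the soft RLIMIT_DATA limit to the current
 * hard limit. Returns 0 on success, -1 on failure (errno is set by
 * getrlimit/setrlimit). On success *out holds the limits now in effect.
 */
int
raise_data_limit(struct rlimit *out)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_DATA, &rl) == -1)
		return (-1);
	rl.rlim_cur = rl.rlim_max;	/* soft = hard, no privilege needed */
	if (setrlimit(RLIMIT_DATA, &rl) == -1)
		return (-1);
	/* Read the limits back so the caller sees what was applied. */
	if (getrlimit(RLIMIT_DATA, out) == -1)
		return (-1);
	return (0);
}

/* Hypothetical helper: print one limit value, spelling out "unlimited". */
void
print_limit(const char *name, rlim_t v)
{
	if (v == RLIM_INFINITY)
		printf("%s: unlimited\n", name);
	else
		printf("%s: %llu bytes\n", name, (unsigned long long)v);
}
```

Calling raise_data_limit() from a small main() and printing rl.rlim_cur
and rl.rlim_max with print_limit() shows whether the values from
login.conf actually reached the process.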
--
:wq Claudio
Index: rde.c
===================================================================
RCS file: /cvs/src/usr.sbin/bgpd/rde.c,v
retrieving revision 1.302
diff -u -p -r1.302 rde.c
--- rde.c 24 Nov 2010 00:58:10 -0000 1.302
+++ rde.c 30 Nov 2010 10:12:56 -0000
@@ -18,6 +18,8 @@
 #include <sys/types.h>
 #include <sys/socket.h>
+#include <sys/time.h>
+#include <sys/resource.h>
 #include <errno.h>
 #include <ifaddrs.h>
@@ -156,6 +158,7 @@ pid_t
 rde_main(int pipe_m2r[2], int pipe_s2r[2], int pipe_m2s[2], int pipe_s2rctl[2],
     int debug)
 {
+	struct rlimit rl;
 	pid_t pid;
 	struct passwd *pw;
 	struct pollfd *pfd = NULL;
@@ -184,6 +187,13 @@ rde_main(int pipe_m2r[2], int pipe_s2r[2
 	setproctitle("route decision engine");
 	bgpd_process = PROC_RDE;
+
+	if (getrlimit(RLIMIT_DATA, &rl) == -1)
+		fatal("getrlimit");
+	rl.rlim_cur = RLIM_INFINITY;
+	rl.rlim_max = RLIM_INFINITY;
+	if (setrlimit(RLIMIT_DATA, &rl) == -1)
+		fatal("setrlimit");
 	if (setgroups(1, &pw->pw_gid) ||
 	    setresgid(pw->pw_gid, pw->pw_gid, pw->pw_gid) ||