On Tue, Nov 30, 2010 at 11:25:41AM +0100, Claudio Jeker wrote:
> On Tue, Nov 30, 2010 at 10:13:13AM +0100, Otto Moerbeek wrote:
> > On Tue, Nov 30, 2010 at 08:35:46AM +0100, Xavier Beaudouin wrote:
> >
> > > Hello,
> > >
> > > I have updated an OpenBGPD router from OpenBSD 4.7 i386 to 4.8 amd64.
> > >
> > > Now I see new instability like this:
> > >
> > > Nov 29 21:25:22 core-3 bgpd[28895]: fatal in RDE: path_alloc: Cannot allocate memory
> > > Nov 30 02:01:47 core-3 bgpd[5522]: fatal in RDE: up_generate: Cannot allocate memory
> > >
> > > I have 2 GB of RAM on this machine and a login.conf like this:
> > >
> > > default:\
> > > :path=/usr/bin /bin /usr/sbin /sbin /usr/X11R6/bin /usr/local/bin:\
> > > :umask=022:\
> > > :datasize-max=1512M:\
> > > :datasize-cur=1024M:\
> > > :maxproc-max=2048:\
> > > :maxproc-cur=1024:\
> > > :openfiles-cur=1024:\
> > > :stacksize-cur=4M:\
> > > :localcipher=blowfish,6:\
> > > :ypcipher=old:\
> > > :tc=auth-defaults:\
> > > :tc=auth-ftp-defaults:
> > >
> > > This is currently driving me mad, because this router handles more
> > > than 130 peers and is still unstable.
> > >
> > > What is needed to make openbgpd work as it should and shut up?
> > >
> > > (I am going to add a monit... because on production day this is not
> > > acceptable).
> > >
> > > Xavier
> >
> > By default, daemons run in the daemon login class. Check that, and also
> > check that you do not have a stale /etc/login.conf.db file lying around.
> >
> > AFAIK, bgpd does not raise its limits to the max, so it does not make
> > sense to have different values for -max and -cur.
> >
> > If these things don't help, analyzing this requires some specific bgpd
> > knowledge, which I do not have.
> >
>
> Maybe it is time to change the default datalimit in the RDE, so
> something like this may help.
> bgpd needs quite a bit more (temporary) memory when running with
> softreconfig. A lot of additional memory is needed on reloads and when
> large sessions flap, which causes a lot of UPDATE messages.
>
> Side note: bgpd on amd64 needs quite a bit more memory than on i386
> because of the 64-bit pointers.
Two questions:
- why the getrlimit() if you are setting both cur and max?
- isn't it better to set cur to max? Running with no bounds does not feel OK.
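
To illustrate the second question, here is a minimal sketch of raising only
the soft limit to the existing hard limit. It is not part of the patch, and
the helper name raise_data_limit() is hypothetical. With this variant the
getrlimit() call is actually needed, because rlim_max is read and left
untouched, so the login class still bounds the RDE.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>

/*
 * Hypothetical helper, not part of the patch: raise the soft data
 * limit to the current hard limit instead of RLIM_INFINITY.
 * Returns 0 on success, -1 on failure (errno is set by the
 * failing getrlimit()/setrlimit() call).
 */
int
raise_data_limit(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_DATA, &rl) == -1)
		return (-1);
	rl.rlim_cur = rl.rlim_max;	/* hard limit stays as configured */
	if (setrlimit(RLIMIT_DATA, &rl) == -1)
		return (-1);
	return (0);
}
```

In rde_main() this would presumably be called at the same spot as in the
patch, with fatal() on a -1 return.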
-Otto
> --
> :wq Claudio
>
> Index: rde.c
> ===================================================================
> RCS file: /cvs/src/usr.sbin/bgpd/rde.c,v
> retrieving revision 1.302
> diff -u -p -r1.302 rde.c
> --- rde.c 24 Nov 2010 00:58:10 -0000 1.302
> +++ rde.c 30 Nov 2010 10:12:56 -0000
> @@ -18,6 +18,8 @@
>
> #include <sys/types.h>
> #include <sys/socket.h>
> +#include <sys/time.h>
> +#include <sys/resource.h>
>
> #include <errno.h>
> #include <ifaddrs.h>
> @@ -156,6 +158,7 @@ pid_t
> rde_main(int pipe_m2r[2], int pipe_s2r[2], int pipe_m2s[2], int pipe_s2rctl[2],
> int debug)
> {
> + struct rlimit rl;
> pid_t pid;
> struct passwd *pw;
> struct pollfd *pfd = NULL;
> @@ -184,6 +187,13 @@ rde_main(int pipe_m2r[2], int pipe_s2r[2
>
> setproctitle("route decision engine");
> bgpd_process = PROC_RDE;
> +
> + if (getrlimit(RLIMIT_DATA, &rl) == -1)
> + fatal("getrlimit");
> + rl.rlim_cur = RLIM_INFINITY;
> + rl.rlim_max = RLIM_INFINITY;
> + if (setrlimit(RLIMIT_DATA, &rl) == -1)
> + fatal("setrlimit");
>
> if (setgroups(1, &pw->pw_gid) ||
> setresgid(pw->pw_gid, pw->pw_gid, pw->pw_gid) ||