Fairly long mail with loads of boring debug output, skip this if
you're not interested in ospf / route table dumps.

Here's a typical ospfd startup on a machine with a full route table
already present. Note it does 33 full table dumps. It's on my home
hack network, just a couple of ospf speakers, the bgp routes in this
case are from an ebgp multihop session with qualify via default and
nexthop overwritten on all of them, just to give me a decent
number of routes in the table. (Not that this really matters,
just giving a bit of background).


$ sudo ospfd -vd 2>&1 | ts %.s
1311944647.245361 startup
1311944647.248227 orig_rtr_lsa: area 0.0.0.0
1311944647.250257 orig_rtr_lsa: stub net, interface trunk0
1311944647.252789 if_fsm: event UP resulted in action START and changing state 
for interface trunk0 from DOWN to WAIT
1311944647.461167 nbr_fsm: event HELLO_RECEIVED resulted in action 
START_INACTIVITY_TIMER and changing state for neighbor ID 85.158.44.158 from 
DOWN to INIT
1311944647.461348 nbr_fsm: event 2_WAY_RECEIVED resulted in action EVAL and 
changing state for neighbor ID 85.158.44.158 from INIT to 2-WAY
1311944647.463438 if_fsm: event NEIGHBORCHANGE resulted in action NOTHING and 
changing state for interface trunk0 from WAIT to WAIT
1311944647.483172 nbr_fsm: event HELLO_RECEIVED resulted in action 
START_INACTIVITY_TIMER and changing state for neighbor ID 85.158.44.145 from 
DOWN to INIT
1311944647.485408 nbr_fsm: event 2_WAY_RECEIVED resulted in action EVAL and 
changing state for neighbor ID 85.158.44.145 from INIT to 2-WAY
1311944647.487592 if_act_elect: interface trunk0 old dr none new dr 
85.158.44.158, old bdr none new bdr 85.158.44.145
1311944647.489681 nbr_fsm: event ADJ_OK resulted in action EVAL and changing 
state for neighbor ID 85.158.44.145 from 2-WAY to EXSTA
1311944647.491838 nbr_fsm: event ADJ_OK resulted in action EVAL and changing 
state for neighbor ID 85.158.44.158 from 2-WAY to EXSTA
1311944647.494002 orig_rtr_lsa: area 0.0.0.0
1311944647.496179 orig_rtr_lsa: stub net, interface trunk0
1311944647.498037 orig_rtr_lsa: area 0.0.0.0
1311944647.500199 orig_rtr_lsa: stub net, interface trunk0
1311944647.502043 if_fsm: event BACKUPSEEN resulted in action ELECT and 
changing state for interface trunk0 from WAIT to OTHER
1311944647.504205 if_act_elect: interface trunk0 old dr 85.158.44.158 new dr 
85.158.44.158, old bdr 85.158.44.145 new bdr 85.158.44.145
1311944647.506265 if_fsm: event NEIGHBORCHANGE resulted in action ELECT and 
changing state for interface trunk0 from OTHER to OTHER
1311944647.508422 nbr_fsm: event NEGOTIATION_DONE resulted in action SNAPSHOT 
and changing state for neighbor ID 85.158.44.158 from EXSTA to SNAP
1311944647.512110 nbr_fsm: event SNAPSHOT_DONE resulted in action SNAPSHOT_DONE 
and changing state for neighbor ID 85.158.44.158 from SNAP to EXCHG
1311944647.512517 nbr_fsm: event NEGOTIATION_DONE resulted in action SNAPSHOT 
and changing state for neighbor ID 85.158.44.145 from EXSTA to SNAP
1311944647.514394 nbr_fsm: event SNAPSHOT_DONE resulted in action SNAPSHOT_DONE 
and changing state for neighbor ID 85.158.44.145 from SNAP to EXCHG
1311944647.516596 nbr_fsm: event EXCHANGE_DONE resulted in action EXCHANGE_DONE 
and changing state for neighbor ID 85.158.44.158 from EXCHG to LOAD
1311944647.518775 nbr_fsm: event EXCHANGE_DONE resulted in action EXCHANGE_DONE 
and changing state for neighbor ID 85.158.44.145 from EXCHG to LOAD
1311944647.520946 orig_rtr_lsa: area 0.0.0.0
1311944647.523029 orig_rtr_lsa: stub net, interface trunk0
1311944647.524893 nbr_fsm: event LOADING_DONE resulted in action NOTHING and 
changing state for neighbor ID 85.158.44.145 from LOAD to FULL
1311944647.527055 orig_rtr_lsa: area 0.0.0.0
1311944647.528960 orig_rtr_lsa: transit net, interface trunk0
1311944647.531124 nbr_fsm: event LOADING_DONE resulted in action NOTHING and 
changing state for neighbor ID 85.158.44.158 from LOAD to FULL
1311944648.011830 rde_asext_get: 85.158.44.144/28 is net LSA
1311944648.258556 if_fsm: event WAITTIMER resulted in action NOTHING and 
changing state for interface trunk0 from OTHER to OTHER
1311944648.268307 spf_calc: area 0.0.0.0 calculated
1311944650.162326 reloading interface list and routing table
1311944651.691387 rde_asext_get: 85.158.44.144/28 is net LSA
1311944651.695676 reloading interface list and routing table
1311944653.317742 rde_asext_get: 85.158.44.144/28 is net LSA
1311944653.438272 reloading interface list and routing table
1311944654.617898 rde_asext_get: 85.158.44.144/28 is net LSA
1311944654.858517 reloading interface list and routing table
1311944655.824418 rde_asext_get: 85.158.44.144/28 is net LSA
1311944655.948133 reloading interface list and routing table
1311944657.100642 rde_asext_get: 85.158.44.144/28 is net LSA
1311944657.351365 reloading interface list and routing table
1311944658.196634 rde_asext_get: 85.158.44.144/28 is net LSA
1311944658.244793 reloading interface list and routing table
1311944659.411147 rde_asext_get: 85.158.44.144/28 is net LSA
1311944659.528240 reloading interface list and routing table
1311944660.453517 rde_asext_get: 85.158.44.144/28 is net LSA
1311944660.529119 reloading interface list and routing table
1311944661.684422 rde_asext_get: 85.158.44.144/28 is net LSA
1311944661.702691 reloading interface list and routing table
1311944662.620291 rde_asext_get: 85.158.44.144/28 is net LSA
1311944662.650936 reloading interface list and routing table
1311944663.822432 rde_asext_get: 85.158.44.144/28 is net LSA
1311944663.829135 reloading interface list and routing table
1311944664.811774 rde_asext_get: 85.158.44.144/28 is net LSA
1311944664.878562 reloading interface list and routing table
1311944665.742252 rde_asext_get: 85.158.44.144/28 is net LSA
1311944665.849591 reloading interface list and routing table
1311944666.892424 rde_asext_get: 85.158.44.144/28 is net LSA
1311944666.919084 reloading interface list and routing table
1311944667.897500 rde_asext_get: 85.158.44.144/28 is net LSA
1311944667.950069 reloading interface list and routing table
1311944668.834394 rde_asext_get: 85.158.44.144/28 is net LSA
1311944668.839092 reloading interface list and routing table
1311944669.794673 rde_asext_get: 85.158.44.144/28 is net LSA
1311944669.881599 reloading interface list and routing table
1311944670.833773 rde_asext_get: 85.158.44.144/28 is net LSA
1311944670.838419 reloading interface list and routing table
1311944671.701701 rde_asext_get: 85.158.44.144/28 is net LSA
1311944671.706570 reloading interface list and routing table
1311944672.680161 rde_asext_get: 85.158.44.144/28 is net LSA
1311944672.685104 reloading interface list and routing table
1311944673.515843 rde_asext_get: 85.158.44.144/28 is net LSA
1311944673.590037 reloading interface list and routing table
1311944674.343083 rde_asext_get: 85.158.44.144/28 is net LSA
1311944674.467545 reloading interface list and routing table
1311944675.250285 rde_asext_get: 85.158.44.144/28 is net LSA
1311944675.288347 reloading interface list and routing table
1311944676.074780 rde_asext_get: 85.158.44.144/28 is net LSA
1311944676.079273 reloading interface list and routing table
1311944676.873384 rde_asext_get: 85.158.44.144/28 is net LSA
1311944676.935585 reloading interface list and routing table
1311944677.817690 rde_asext_get: 85.158.44.144/28 is net LSA
1311944677.823135 reloading interface list and routing table
1311944678.818398 rde_asext_get: 85.158.44.144/28 is net LSA
1311944678.839772 reloading interface list and routing table
1311944679.760439 rde_asext_get: 85.158.44.144/28 is net LSA
1311944679.764297 reloading interface list and routing table
1311944680.608475 rde_asext_get: 85.158.44.144/28 is net LSA
1311944680.618463 reloading interface list and routing table
1311944681.470117 rde_asext_get: 85.158.44.144/28 is net LSA
1311944681.491439 reloading interface list and routing table
1311944682.472115 rde_asext_get: 85.158.44.144/28 is net LSA
1311944682.533400 reloading interface list and routing table
1311944683.515197 rde_asext_get: 85.158.44.144/28 is net LSA
1311944683.518727 reloading interface list and routing table
1311944684.250171 rde_asext_get: 85.158.44.144/28 is net LSA

As you might expect considering that the kernel locks on
table dumps, this sucks somewhat.

Here's a diff. Rather than reloading directly on receipt of
RTM_DESYNC it arms an evtimer so that it defers reloading until
1sec after the last desync message. (Randomly chosen value and
might want to be larger, but it serves well as a demonstration).

Index: kroute.c
===================================================================
RCS file: /cvs/src/usr.sbin/ospfd/kroute.c,v
retrieving revision 1.90
diff -u -p -r1.90 kroute.c
--- kroute.c    4 Jul 2011 04:34:14 -0000       1.90
+++ kroute.c    29 Jul 2011 13:09:27 -0000
@@ -48,6 +48,7 @@ struct {
        int                     fd;
        struct event            ev;
        u_int                   rdomain;
+       struct event            reload;
 } kr_state;
 
 struct kroute_node {
@@ -102,6 +103,7 @@ int         dispatch_rtmsg(void);
 int            fetchtable(void);
 int            fetchifs(u_short);
 int            rtmsg_process(char *, int);
+void           kr_fib_reload_timer(int, short, void *);
 
 RB_HEAD(kroute_tree, kroute_node)      krt;
 RB_PROTOTYPE(kroute_tree, kroute_node, entry, kroute_compare)
@@ -170,6 +172,8 @@ kr_init(int fs, u_int rdomain)
            kr_dispatch_msg, NULL);
        event_add(&kr_state.ev, NULL);
 
+       evtimer_set(&kr_state.reload, kr_fib_reload_timer, NULL);
+
        return (0);
 }
 
@@ -357,7 +361,14 @@ kr_fib_decouple(void)
 }
 
 void
-kr_fib_reload(void)
+kr_fib_reload_timer(int fd, short event, void *bula)
+{
+       log_info("reloading via timer");
+       kr_fib_reload();
+}
+
+void
+kr_fib_reload()
 {
        struct kroute_node      *krn, *kr, *kn;
 
@@ -1295,6 +1306,7 @@ rtmsg_process(char *buf, int len)
        struct sockaddr_rtlabel *label;
        struct kroute_node      *kr, *okr;
        struct in_addr           prefix, nexthop;
+       struct timeval           tv;
        u_int8_t                 prefixlen, prio;
        int                      flags, mpath;
        u_short                  ifindex = 0;
@@ -1527,7 +1539,11 @@ add:
                         * We lost some routing packets. Force a reload of
                         * the kernel route/interface information.
                         */
-                       kr_fib_reload();
+                       timerclear(&tv);
+                       tv.tv_sec = 1;
+                       log_info("desync; reloading in +1s");
+                       if (evtimer_add(&kr_state.reload, &tv) == -1)
+                               fatal("add_reload_timer");
                        break;
                default:
                        /* ignore for now */


Log output now looks like the below and my keyboard doesn't
keep freezing up until ospfd has worked out what's going on.

- Any thoughts as to why we get RTM_DESYNC for about
15 seconds at startup?

- Is this something we want? (I tend to think yes, there seems
to be little point in immediately causing a bunch more load on
the route socket right when we already know it is flooded).

- If so is this diff the right way to do it? (possibly with
a bit less chatter, at this stage I'd like to know how many
desyncs we see, but that's probably logging a bit more than
we really want in production).



$ sudo ospfd -vd 2>&1 | ts %.s
1311944928.892021 startup
1311944928.892547 orig_rtr_lsa: area 0.0.0.0
1311944928.892639 orig_rtr_lsa: stub net, interface trunk0
1311944928.892732 if_fsm: event UP resulted in action START and changing state 
for interface trunk0 from DOWN to WAIT
1311944928.892817 nbr_fsm: event HELLO_RECEIVED resulted in action 
START_INACTIVITY_TIMER and changing state for neighbor ID 85.158.44.145 from 
DOWN to INIT
1311944928.892901 nbr_fsm: event 2_WAY_RECEIVED resulted in action EVAL and 
changing state for neighbor ID 85.158.44.145 from INIT to 2-WAY
1311944928.892982 if_fsm: event NEIGHBORCHANGE resulted in action NOTHING and 
changing state for interface trunk0 from WAIT to WAIT
1311944928.893062 nbr_fsm: event HELLO_RECEIVED resulted in action 
START_INACTIVITY_TIMER and changing state for neighbor ID 85.158.44.158 from 
DOWN to INIT
1311944928.893140 nbr_fsm: event 2_WAY_RECEIVED resulted in action EVAL and 
changing state for neighbor ID 85.158.44.158 from INIT to 2-WAY
1311944928.893220 if_fsm: event NEIGHBORCHANGE resulted in action NOTHING and 
changing state for interface trunk0 from WAIT to WAIT
1311944929.075417 recv_db_description: packet ignored in state 2-WAY, neighbor 
ID 85.158.44.158
1311944929.075602 recv_db_description: packet ignored in state 2-WAY, neighbor 
ID 85.158.44.145
1311944929.177956 if_act_elect: interface trunk0 old dr none new dr 
85.158.44.158, old bdr none new bdr 85.158.44.145
1311944929.178515 nbr_fsm: event ADJ_OK resulted in action EVAL and changing 
state for neighbor ID 85.158.44.158 from 2-WAY to EXSTA
1311944929.180768 nbr_fsm: event ADJ_OK resulted in action EVAL and changing 
state for neighbor ID 85.158.44.145 from 2-WAY to EXSTA
1311944929.181962 orig_rtr_lsa: area 0.0.0.0
1311944929.182075 orig_rtr_lsa: stub net, interface trunk0
1311944929.182165 orig_rtr_lsa: area 0.0.0.0
1311944929.182251 orig_rtr_lsa: stub net, interface trunk0
1311944929.182336 if_fsm: event BACKUPSEEN resulted in action ELECT and 
changing state for interface trunk0 from WAIT to OTHER
1311944929.182426 nbr_fsm: event NEGOTIATION_DONE resulted in action SNAPSHOT 
and changing state for neighbor ID 85.158.44.145 from EXSTA to SNAP
1311944929.182510 nbr_fsm: event SNAPSHOT_DONE resulted in action SNAPSHOT_DONE 
and changing state for neighbor ID 85.158.44.145 from SNAP to EXCHG
1311944929.182591 nbr_fsm: event EXCHANGE_DONE resulted in action EXCHANGE_DONE 
and changing state for neighbor ID 85.158.44.145 from EXCHG to LOAD
1311944929.183792 orig_rtr_lsa: area 0.0.0.0
1311944929.184244 orig_rtr_lsa: stub net, interface trunk0
1311944929.184366 nbr_fsm: event LOADING_DONE resulted in action NOTHING and 
changing state for neighbor ID 85.158.44.145 from LOAD to FULL
1311944929.605696 rde_asext_get: 85.158.44.144/28 is net LSA
1311944929.745106 if_fsm: event WAITTIMER resulted in action NOTHING and 
changing state for interface trunk0 from OTHER to OTHER
1311944929.745689 spf_calc: area 0.0.0.0 calculated
1311944934.088555 nbr_fsm: event NEGOTIATION_DONE resulted in action SNAPSHOT 
and changing state for neighbor ID 85.158.44.158 from EXSTA to SNAP
1311944934.088948 nbr_fsm: event SNAPSHOT_DONE resulted in action SNAPSHOT_DONE 
and changing state for neighbor ID 85.158.44.158 from SNAP to EXCHG
1311944934.091257 orig_rtr_lsa: area 0.0.0.0
1311944934.091481 orig_rtr_lsa: transit net, interface trunk0
1311944934.091569 nbr_fsm: event EXCHANGE_DONE resulted in action EXCHANGE_DONE 
and changing state for neighbor ID 85.158.44.158 from EXCHG to FULL
1311944934.754859 spf_calc: area 0.0.0.0 calculated
1311944936.644972 desync; reloading in +1s
1311944936.934774 desync; reloading in +1s
1311944937.119760 desync; reloading in +1s
1311944937.196244 desync; reloading in +1s
1311944937.259738 desync; reloading in +1s
1311944937.296499 desync; reloading in +1s
1311944937.692425 desync; reloading in +1s
1311944938.171783 desync; reloading in +1s
1311944938.240240 desync; reloading in +1s
1311944938.401020 desync; reloading in +1s
1311944938.449526 desync; reloading in +1s
1311944938.563503 desync; reloading in +1s
1311944938.988847 desync; reloading in +1s
1311944939.105727 desync; reloading in +1s
1311944939.139481 desync; reloading in +1s
1311944939.179924 desync; reloading in +1s
1311944939.223955 desync; reloading in +1s
1311944939.258888 desync; reloading in +1s
1311944939.293146 desync; reloading in +1s
1311944939.346465 desync; reloading in +1s
1311944939.454273 desync; reloading in +1s
1311944939.524388 desync; reloading in +1s
1311944939.580252 desync; reloading in +1s
1311944939.845603 desync; reloading in +1s
1311944940.130101 desync; reloading in +1s
1311944940.310999 desync; reloading in +1s
1311944940.379659 desync; reloading in +1s
1311944940.540292 desync; reloading in +1s
1311944940.567551 desync; reloading in +1s
1311944940.623952 desync; reloading in +1s
1311944940.656550 desync; reloading in +1s
1311944940.679952 desync; reloading in +1s
1311944940.737146 desync; reloading in +1s
1311944940.816541 desync; reloading in +1s
1311944941.119553 desync; reloading in +1s
1311944941.183786 desync; reloading in +1s
1311944941.289621 desync; reloading in +1s
1311944941.356420 desync; reloading in +1s
1311944941.520677 desync; reloading in +1s
1311944941.837621 desync; reloading in +1s
1311944941.911556 desync; reloading in +1s
1311944942.132253 desync; reloading in +1s
1311944942.306139 desync; reloading in +1s
1311944942.329337 desync; reloading in +1s
1311944942.452721 desync; reloading in +1s
1311944942.923603 desync; reloading in +1s
1311944943.344793 desync; reloading in +1s
1311944943.376447 desync; reloading in +1s
1311944943.514129 desync; reloading in +1s
1311944943.720768 desync; reloading in +1s
1311944943.776535 desync; reloading in +1s
1311944943.801344 desync; reloading in +1s
1311944943.930324 desync; reloading in +1s
1311944944.208302 desync; reloading in +1s
1311944944.225744 desync; reloading in +1s
1311944944.654888 desync; reloading in +1s
1311944944.803416 desync; reloading in +1s
1311944945.359961 desync; reloading in +1s
1311944945.386787 desync; reloading in +1s
1311944945.550980 desync; reloading in +1s
1311944945.618353 desync; reloading in +1s
1311944946.208881 desync; reloading in +1s
1311944946.245657 desync; reloading in +1s
1311944946.570283 desync; reloading in +1s
1311944946.677813 desync; reloading in +1s
1311944946.904857 desync; reloading in +1s
1311944946.929621 desync; reloading in +1s
1311944947.235428 desync; reloading in +1s
1311944947.314593 desync; reloading in +1s
1311944947.352818 desync; reloading in +1s
1311944947.376679 desync; reloading in +1s
1311944947.433288 desync; reloading in +1s
1311944947.479770 desync; reloading in +1s
1311944947.576309 desync; reloading in +1s
1311944947.612431 desync; reloading in +1s
1311944947.649463 desync; reloading in +1s
1311944947.779558 desync; reloading in +1s
1311944947.937614 desync; reloading in +1s
1311944948.002990 desync; reloading in +1s
1311944948.109801 desync; reloading in +1s
1311944948.254401 desync; reloading in +1s
1311944948.535950 desync; reloading in +1s
1311944948.655182 desync; reloading in +1s
1311944948.727690 desync; reloading in +1s
1311944948.754586 desync; reloading in +1s
1311944948.857421 desync; reloading in +1s
1311944948.878237 desync; reloading in +1s
1311944948.916328 desync; reloading in +1s
1311944949.002707 desync; reloading in +1s
1311944949.115686 desync; reloading in +1s
1311944949.302041 desync; reloading in +1s
1311944949.609576 desync; reloading in +1s
1311944949.677969 desync; reloading in +1s
1311944949.708876 desync; reloading in +1s
1311944949.776538 desync; reloading in +1s
1311944949.878124 desync; reloading in +1s
1311944949.984918 desync; reloading in +1s
1311944950.013782 desync; reloading in +1s
1311944950.159249 desync; reloading in +1s
1311944950.278763 desync; reloading in +1s
1311944950.319382 desync; reloading in +1s
1311944950.455139 desync; reloading in +1s
1311944950.455293 desync; reloading in +1s
1311944950.725111 desync; reloading in +1s
1311944951.736701 reloading via timer
1311944951.864051 reloading interface list and routing table
1311944952.553177 rde_asext_get: 85.158.44.144/28 is net LSA


bgpd needs similar RTM_DESYNC handling but until ospfd copes
better, I'd not be too happy to add yet more code doing full
table dumps, this sort of thing is pretty common:

Jul 29 08:39:01 last message repeated 13 times
Jul 29 09:09:43 ospfd[7105]: reloading interface list and routing table
Jul 29 09:10:07 last message repeated 15 times
Jul 29 09:38:44 ospfd[7105]: reloading interface list and routing table
Jul 29 09:41:31 last message repeated 77 times
Jul 29 09:46:18 ospfd[7105]: reloading interface list and routing table
Jul 29 09:46:50 last message repeated 26 times
Jul 29 09:49:07 last message repeated 15 times
Jul 29 09:49:19 last message repeated 5 times
Jul 29 10:38:42 ospfd[7105]: reloading interface list and routing table
Jul 29 10:40:47 last message repeated 42 times
Jul 29 11:38:42 ospfd[7105]: reloading interface list and routing table
Jul 29 11:40:42 last message repeated 21 times
Jul 29 12:09:44 ospfd[7105]: reloading interface list and routing table
Jul 29 12:10:38 last message repeated 18 times
Jul 29 12:38:42 ospfd[7105]: reloading interface list and routing table
Jul 29 12:38:43 ospfd[7105]: reloading interface list and routing table
Jul 29 13:09:49 ospfd[7105]: reloading interface list and routing table
Jul 29 13:10:37 last message repeated 14 times

Reply via email to