Hi Ilia,

On 3/3/26 5:08 PM, Ilia Baikov wrote:
>> Just to be sure I understand. Is traffic from your workloads still
>> affected?
>
> Unfortunately it is still affected. Since ARP is handled differently (the
> ARP flow is dropped at the OVS level), some of the instances have no
> public connectivity.
>
>> Double checking, are you sure you upgraded ovn-northd to use the version
>> from my branch?
>
> 100% sure. I built the packages from source, packed them into deb
> packages and then built the kolla images using a self-hosted repo with
> the compiled deb packages.
>
Thanks for checking!

> ansible all -i multinode -m shell -a "docker exec ovn_northd ovn-northd
> --version" --limit control
>
> us-east-standard-1 | CHANGED | rc=0 >>
> ovn-northd 24.03.8
> Open vSwitch Library 3.3.8
>
> us-east-4 | CHANGED | rc=0 >>
> ovn-northd 24.03.8
> Open vSwitch Library 3.3.8
>

I have a hunch and I think I know what might be happening.

Is it possible that your logical switches don't have
other_config:broadcast-arps-to-all-routers=false?

Also, we're currently flooding all ARP requests coming from the fabric
(entering OVN through the localnet port) to all logical switch ports, so
the 4K resubmit limit gets hit.

I'll update my test patch to cover this last part too while waiting for
your reply on the LS config question above.

Regards,
Dumitru

>
> Regards,
>
> Ilia Baikov
> [email protected]
>
> On 03.03.2026 18:54, Dumitru Ceara wrote:
>> On 3/3/26 3:54 PM, Ilia Baikov wrote:
>>> Hi,
>>>
>> Hi,
>>
>>> Done, the OVN components are now built from your branch and deployed
>>> into the production region where the issue persists. To be more
>>> precise, let's focus on a cluster that runs an L2 network setup, as
>>> this is the best field for testing and reproducing this case without
>>> breaking the other stabilised regions, which run L3.
>>>
>>> ovn-controller 24.03.8
>>> Open vSwitch Library 3.3.8
>>> OpenFlow versions 0x6:0x6
>>> SB DB Schema 20.33.0
>>>
>>> The OVN components were deployed at ~14:11, and then the resubmit
>>> exceptions started appearing (24.03.2 just shows "unrecognized op code
>>> (27)" and so on).
>>>
>> Just to be sure I understand. Is traffic from your workloads still
>> affected?
>>
>>> I've also enabled rconn/vconn dbg for OVN and OVS (later than 14:11,
>>> but it seems vconn/rconn shows something useful for debugging).
>>>
>>> ovn logs starting from 14:11 -
>>> https://gist.githubusercontent.com/frct1/5f99221e1519d1552c8ef16a7ec8ee52/raw/147e9a171e538f9cd837008181272437b1c7ed37/ovn.log
>>> ovs logs starting from 14:11 -
>>> https://gist.githubusercontent.com/frct1/5f99221e1519d1552c8ef16a7ec8ee52/raw/147e9a171e538f9cd837008181272437b1c7ed37/ovs.log
>>>
>> Double checking, are you sure you upgraded ovn-northd to use the version
>> from my branch?
>>
>> I'm asking because this packet should not hit the mc_flood_l2 group
>> anymore:
>>
>> 2026-03-03T14:22:51.358Z|00012|ofproto_dpif_xlate(handler250)|WARN|Dropped
>> 2244 log messages in last 60 seconds (most recently, 0 seconds ago) due
>> to excessive rate
>> 2026-03-03T14:22:51.358Z|00013|ofproto_dpif_xlate(handler250)|WARN|over
>> 4096 resubmit actions on bridge br-int while processing
>> arp,in_port=2409,vlan_tci=0x0000,dl_src=fa:16:3e:ba:70:84,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=166.1.160.225,arp_tpa=166.1.160.1,arp_op=1,arp_sha=fa:16:3e:ba:70:84,arp_tha=00:00:00:00:00:00
>>
>> The mc_flood_l2 group has 2k ports (all the VM ports), but my change
>> should make it hit the mc_unknown group instead, which only has a
>> handful of ports, as you said in your previous email.
>>
>>> I will keep it running 24.03.8 for easier debugging.
>>>
>> Thanks,
>> Dumitru
>>
>>> Regards,
>>>
>>> Ilia Baikov
>>> [email protected]
>>>
>>> On 03.03.2026 15:55, Dumitru Ceara wrote:
>>>> On 3/2/26 7:24 PM, Ilia Baikov wrote:
>>>>> Hi Dumitru!
>>>>>
>>>> Hi Ilia,
>>>>
>>>>> ovn-nbctl --no-leader-only list Logical_switch_port | grep unknown |
>>>>> wc -l
>>>>> 10
>>>>>
>>>> OK, that's just a few (and I see the same on your other deployment
>>>> too), so that's great.
>>>>
>>>> Mind trying out this WIP patch for now and seeing if it works for you?
>>>>
>>>> https://github.com/dceara/ovn/commits/mc_flood_l2_to_unknown-26.03
>>>> https://github.com/dceara/ovn/commits/mc_flood_l2_to_unknown-25.09
>>>> https://github.com/dceara/ovn/commits/mc_flood_l2_to_unknown-25.03
>>>> https://github.com/dceara/ovn/commits/mc_flood_l2_to_unknown-24.09
>>>> https://github.com/dceara/ovn/commits/mc_flood_l2_to_unknown-24.03
>>>>
>>>> They're all the same, just based on different stable branches; I
>>>> wasn't sure which one you'd need.
>>>>
>>>> Looking forward to hearing your results.
>>>>
>>>> Thanks,
>>>> Dumitru
>>>>
>>>>> More output of the LSP list (no filtering by LS uuid):
>>>>>
>>>>> ovn-nbctl --no-leader-only list Logical_switch_port | grep unknown
>>>>> -A 5
>>>>> addresses : [unknown]
>>>>> dhcpv4_options : []
>>>>> dhcpv6_options : []
>>>>> dynamic_addresses : []
>>>>> enabled : true
>>>>> external_ids : {"neutron:cidrs"="10.10.3.243/24",
>>>>> "neutron:device_id"="ba1a43e2-5496-4ced-8b8c-9b42c5ddd6f1",
>>>>> "neutron:device_owner"="network:floatingip_agent_gateway",
>>>>> "neutron:host_id"=us-east-standard-2, "neutron:mtu"="",
>>>>> "neutron:network_name"=neutron-bb8d0ef6-9b45-4398-86f3-51323a0db2cd,
>>>>> "neutron:port_capabilities"="", "neutron:port_name"="",
>>>>> "neutron:project_id"="", "neutron:revision_number"="5",
>>>>> "neutron:security_group_ids"="", "neutron:subnet_pool_addr_scope4"="",
>>>>> "neutron:subnet_pool_addr_scope6"="", "neutron:vnic_type"=normal}
>>>>> --
>>>>> addresses : ["fa:16:3e:62:e5:5f 193.32.177.44", unknown]
>>>>> dhcpv4_options : []
>>>>> dhcpv6_options : []
>>>>> dynamic_addresses : []
>>>>> enabled : true
>>>>> external_ids : {"neutron:cidrs"="193.32.177.44/24",
>>>>> "neutron:device_id"="544cddd2-7a53-492a-8933-91fb97fd0546",
>>>>> "neutron:device_owner"="network:floatingip_agent_gateway",
>>>>> "neutron:host_id"=us-east-standard-1, "neutron:mtu"="",
>>>>> "neutron:network_name"=neutron-7dce255f-4824-4a21-a550-f8d03a25c285,
>>>>> "neutron:port_capabilities"="",
"neutron:port_name"="", >>>>> "neutron:project_id"="", "neutron:revision_number"="3", >>>>> "neutron:security_group_ids"="", "neutron:subnet_pool_addr_scope4"="", >>>>> "neutron:subnet_pool_addr_scope6"="", "neutron:vnic_type"=normal} >>>>> -- >>>>> addresses : [unknown] >>>>> dhcpv4_options : [] >>>>> dhcpv6_options : [] >>>>> dynamic_addresses : [] >>>>> enabled : true >>>>> external_ids : {"neutron:cidrs"="12.26.0.2/16", >>>>> "neutron:device_id"=dhcp8b62a377-0e4b-5497-b096-c08bf79b6c42- >>>>> c5db4fec-9c10-4022-835d-7281506d8a7e, >>>>> "neutron:device_owner"="network:dhcp", "neutron:host_id"=us-east- >>>>> standard-1, "neutron:mtu"="", "neutron:network_name"=neutron- >>>>> c5db4fec-9c10-4022-835d-7281506d8a7e, "neutron:port_capabilities"="", >>>>> "neutron:port_name"="", >>>>> "neutron:project_id"=a3b7099e62ac4fb9b3d548dfaff7aeaf, >>>>> "neutron:revision_number"="5", "neutron:security_group_ids"="", >>>>> "neutron:subnet_pool_addr_scope4"="", >>>>> "neutron:subnet_pool_addr_scope6"="", "neutron:vnic_type"=normal} >>>>> -- >>>>> addresses : [unknown] >>>>> dhcpv4_options : [] >>>>> dhcpv6_options : [] >>>>> dynamic_addresses : [] >>>>> enabled : true >>>>> external_ids : {"neutron:cidrs"="10.10.3.242/24", >>>>> "neutron:device_id"="544cddd2-7a53-492a-8933-91fb97fd0546", >>>>> "neutron:device_owner"="network:floatingip_agent_gateway", >>>>> "neutron:host_id"=us-east-standard-1, "neutron:mtu"="", >>>>> "neutron:network_name"=neutron-bb8d0ef6-9b45-4398-86f3-51323a0db2cd, >>>>> "neutron:port_capabilities"="", "neutron:port_name"="", >>>>> "neutron:project_id"="", "neutron:revision_number"="5", >>>>> "neutron:security_group_ids"="", "neutron:subnet_pool_addr_scope4"="", >>>>> "neutron:subnet_pool_addr_scope6"="", "neutron:vnic_type"=normal} >>>>> -- >>>>> addresses : ["fa:16:3e:6e:27:09 12.26.0.109", unknown] >>>>> dhcpv4_options : [] >>>>> dhcpv6_options : [] >>>>> dynamic_addresses : [] >>>>> enabled : true >>>>> external_ids : {"neutron:cidrs"="12.26.0.109/16", 
>>>>> "neutron:device_id"="6e2d75ce-1503-4e40-bc72-ef3adc59d45f", >>>>> "neutron:device_owner"="network:router_centralized_snat", >>>>> "neutron:host_id"=us-east-standard-1, "neutron:mtu"="", >>>>> "neutron:network_name"=neutron-c5db4fec-9c10-4022-835d-7281506d8a7e, >>>>> "neutron:port_capabilities"="", "neutron:port_name"="", >>>>> "neutron:project_id"="", "neutron:revision_number"="6", >>>>> "neutron:security_group_ids"="", "neutron:subnet_pool_addr_scope4"="", >>>>> "neutron:subnet_pool_addr_scope6"="", "neutron:vnic_type"=normal} >>>>> -- >>>>> addresses : ["fa:16:3e:0c:ac:01 12.26.1.76", unknown] >>>>> dhcpv4_options : [] >>>>> dhcpv6_options : [] >>>>> dynamic_addresses : [] >>>>> enabled : true >>>>> external_ids : {"neutron:cidrs"="12.26.1.76/16", >>>>> "neutron:device_id"="4d3f7d3d-a637-4e40-8bc3-fda4712a1ada", >>>>> "neutron:device_owner"="network:router_centralized_snat", >>>>> "neutron:host_id"=us-east-standard-1, "neutron:mtu"="", >>>>> "neutron:network_name"=neutron-c5db4fec-9c10-4022-835d-7281506d8a7e, >>>>> "neutron:port_capabilities"="", "neutron:port_name"="", >>>>> "neutron:project_id"="", "neutron:revision_number"="6", >>>>> "neutron:security_group_ids"="", "neutron:subnet_pool_addr_scope4"="", >>>>> "neutron:subnet_pool_addr_scope6"="", "neutron:vnic_type"=normal} >>>>> -- >>>>> addresses : [unknown] >>>>> dhcpv4_options : [] >>>>> dhcpv6_options : [] >>>>> dynamic_addresses : [] >>>>> enabled : true >>>>> external_ids : {"neutron:cidrs"="10.10.3.240/24", >>>>> "neutron:device_id"=dhcp8b62a377-0e4b-5497-b096-c08bf79b6c42- >>>>> bb8d0ef6-9b45-4398-86f3-51323a0db2cd, >>>>> "neutron:device_owner"="network:dhcp", "neutron:host_id"=us-east- >>>>> standard-1, "neutron:mtu"="", "neutron:network_name"=neutron- >>>>> bb8d0ef6-9b45-4398-86f3-51323a0db2cd, "neutron:port_capabilities"="", >>>>> "neutron:port_name"="", >>>>> "neutron:project_id"="03d31c9de2ec41c787add9b44aacd3a8", >>>>> "neutron:revision_number"="6", "neutron:security_group_ids"="", >>>>> 
>>>>> "neutron:subnet_pool_addr_scope4"="",
>>>>> "neutron:subnet_pool_addr_scope6"="", "neutron:vnic_type"=normal}
>>>>> --
>>>>> addresses : [unknown]
>>>>> dhcpv4_options : []
>>>>> dhcpv6_options : []
>>>>> dynamic_addresses : []
>>>>> enabled : []
>>>>> external_ids : {}
>>>>> --
>>>>> addresses : [unknown]
>>>>> dhcpv4_options : []
>>>>> dhcpv6_options : []
>>>>> dynamic_addresses : []
>>>>> enabled : true
>>>>> external_ids : {"neutron:cidrs"="193.32.177.174/24",
>>>>> "neutron:device_id"="ba1a43e2-5496-4ced-8b8c-9b42c5ddd6f1",
>>>>> "neutron:device_owner"="network:floatingip_agent_gateway",
>>>>> "neutron:host_id"=us-east-standard-2, "neutron:mtu"="",
>>>>> "neutron:network_name"=neutron-7dce255f-4824-4a21-a550-f8d03a25c285,
>>>>> "neutron:port_capabilities"="", "neutron:port_name"="",
>>>>> "neutron:project_id"="", "neutron:revision_number"="5",
>>>>> "neutron:security_group_ids"="", "neutron:subnet_pool_addr_scope4"="",
>>>>> "neutron:subnet_pool_addr_scope6"="", "neutron:vnic_type"=normal}
>>>>> --
>>>>> addresses : [unknown]
>>>>> dhcpv4_options : []
>>>>> dhcpv6_options : []
>>>>> dynamic_addresses : []
>>>>> enabled : []
>>>>> external_ids : {}
>>>>>
>>>>> Regards,
>>>>>
>>>>> Ilia Baikov
>>>>> [email protected]
>>>>>
>>>>> On 02.03.2026 18:23, Dumitru Ceara wrote:
>>>>>> On 3/2/26 1:18 PM, Ilia Baikov wrote:
>>>>>>> To keep the region stable, we decided to roll back to 25.09, which
>>>>>>> doesn't have the split buf change merged, and migrated to an L3
>>>>>>> topology with /32s advertised via BGP.
>>>>>>> Just a guess: reaching 2k ports (VMs) in a single logical_switch is
>>>>>>> the reason why ARP flows are being dropped/discarded because of the
>>>>>>> resubmit limit. What do you think?
>>>>>>>
>>>>>> Hi Ilia,
>>>>>>
>>>>>> Right, the very high number of logical switch ports that are part of
>>>>>> the MC_FLOOD_L2 OVN multicast group (in this case all your VM ports)
>>>>>> is what's causing issues with broadcast ARP requests:
>>>>>> a. generated by the logical router port
>>>>>> b. generated by VMs attached to the logical switch
>>>>>>
>>>>>> I'll try to prepare a test/rfc patch in the next few days to see if
>>>>>> changing the action for some logical flows from flooding on the
>>>>>> "MC_FLOOD_L2" group to flooding on the "MC_UNKNOWN" group makes
>>>>>> things work in your setup.
>>>>>>
>>>>>> Before that, can you please share how many of those 2k VM ports have
>>>>>> LSP.addresses configured to include "unknown"?
>>>>>>
>>>>>> Thanks,
>>>>>> Dumitru
>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Ilia Baikov
>>>>>>> [email protected]
>>>>>>>
>>>>>>> On 26.02.2026 16:51, Ilia Baikov wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> These patches seem to fix the DHCP issues, but there are cases
>>>>>>>> where an instance boots and receives its configuration from the
>>>>>>>> metadata service but has no public connectivity (done through L2
>>>>>>>> networking).
>>>>>>>>
>>>>>>>>> Which, if the logical switch has a reasonably high number of ports
>>>>>>>>> (maybe around 200) will probably cause the resubmit limit to be
>>>>>>>>> hit
>>>>>>>> This is the case: a public L2 network with around ~2000 running
>>>>>>>> instances (or ports, in terms of LSPs).
>>>>>>>>
>>>>>>>>> Are these OVN router port IPs? Or are they OVN workload IPs? Or
>>>>>>>>> are they just IPs owned by some fabric hosts, outside of OVN?
>>>>>>>> The .1 IP of each subnet is owned by the border gateway. So an
>>>>>>>> instance asks for .1 to learn the GW MAC address, but because the
>>>>>>>> limit is hit the instance receives no response, since the ARP flow
>>>>>>>> is dropped.
>>>>>>>>
>>>>>>>>> Also, aside from the logs, do you actually see any traffic being
>>>>>>>>> impacted? I.e., are your workloads able to come up and properly
>>>>>>>>> communicate?
>>>>>>>> Nope, there is connectivity loss, since some instances have no
>>>>>>>> public connectivity due to the ARP issues.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Ilia Baikov
>>>>>>>> [email protected]
>>>>>>>>
>>>>>>>> On 26.02.2026 14:31, Dumitru Ceara wrote:
>>>>>>>>> Hi Ilia,
>>>>>>>>>
>>>>>>>>> On 2/24/26 3:29 PM, Ilia Baikov wrote:
>>>>>>>>>> Just checked the openvswitch logs. The 4096-resubmit issue
>>>>>>>>>> actually occurs even on 25.09.2.
>>>>>>>>> v25.09.2 includes:
>>>>>>>>> https://github.com/ovn-org/ovn/commit/0bb60da
>>>>>>>>>
>>>>>>>>> Which should fix the "self-DoS" issues introduced by:
>>>>>>>>> https://github.com/ovn-org/ovn/commit/325c7b2
>>>>>>>>>
>>>>>>>>> But that means that in some cases, e.g., for real BUM traffic or
>>>>>>>>> for GARPs originated by OVN router ports, we will try to "flood"
>>>>>>>>> the packet in the L2 broadcast domain.
>>>>>>>>>
>>>>>>>>> Which, if the logical switch has a reasonably high number of ports
>>>>>>>>> (maybe around 200), will probably cause the resubmit limit to be
>>>>>>>>> hit.
>>>>>>>>>
>>>>>>>>> In the examples below, I see the packets that cause this are ARP
>>>>>>>>> requests requesting the MAC address of:
>>>>>>>>> - 138.124.72.1
>>>>>>>>> - 83.219.248.109
>>>>>>>>> - 138.124.72.1
>>>>>>>>> - 91.92.46.1
>>>>>>>>>
>>>>>>>>> Are these OVN router port IPs? Or are they OVN workload IPs? Or
>>>>>>>>> are they just IPs owned by some fabric hosts, outside of OVN?
>>>>>>>>>
>>>>>>>>> Also, aside from the logs, do you actually see any traffic being
>>>>>>>>> impacted? I.e., are your workloads able to come up and properly
>>>>>>>>> communicate?
>>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Dumitru >>>>>>>>> >>>>>>>>>> Final flow: unchanged >>>>>>>>>> Megaflow: >>>>>>>>>> recirc_id=0,eth,arp,in_port=346,dl_src=fa:16:3e:63:aa:d0 >>>>>>>>>> Datapath actions: drop >>>>>>>>>> 2026-02-24T14:23:17.457Z|04071|connmgr|INFO|br-int<->unix#4346: 1 >>>>>>>>>> flow_mods in the last 0 s (1 adds) >>>>>>>>>> 2026-02-24T14:23:34.821Z|00076|ofproto_dpif_xlate(handler24)| >>>>>>>>>> WARN| >>>>>>>>>> Dropped 854 log messages in last 60 seconds (most recently, 0 >>>>>>>>>> seconds >>>>>>>>>> ago) due to excessive rate >>>>>>>>>> 2026-02-24T14:23:34.821Z|00077|ofproto_dpif_xlate(handler24)| >>>>>>>>>> WARN| >>>>>>>>>> over >>>>>>>>>> 4096 resubmit actions on bridge br-int while processing >>>>>>>>>> arp,in_port=4715,vlan_tci=0x0000,dl_src=fa:16:3e:97:65:15,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=138.124.72.142,arp_tpa=138.124.72.1,arp_op=1,arp_sha=fa:16:3e:97:65:15,arp_tha=00:00:00:00:00:00 >>>>>>>>>> 2026-02-24T14:23:45.464Z|00091|dpif(handler28)|WARN|system@ovs- >>>>>>>>>> system: >>>>>>>>>> execute >>>>>>>>>> ct(commit,zone=163,mark=0/0x41,label=0/0xffff00000000000000000000,nat(src)),154 >>>>>>>>>> failed (Invalid argument) on packet >>>>>>>>>> tcp,vlan_tci=0x0000,dl_src=0c:86:10:b7:9e:e0,dl_dst=fa:16:3e:69:22:89,nw_src=31.44.82.94,nw_dst=31.169.126.149,nw_tos=32,nw_ecn=0,nw_ttl=57,nw_frag=no,tp_src=51064,tp_dst=443,tcp_flags=syn >>>>>>>>>> tcp_csum:d7b0 >>>>>>>>>> with metadata >>>>>>>>>> skb_priority(0),skb_mark(0),ct_state(0x21),ct_zone(0xa3),ct_tuple4(src=31.44.82.94,dst=31.169.126.149,proto=6,tp_src=51064,tp_dst=443),in_port(2) >>>>>>>>>> mtu 0 >>>>>>>>>> 2026-02-24T14:23:56.702Z|00072|ofproto_dpif_upcall(handler30)| >>>>>>>>>> WARN| >>>>>>>>>> Dropped 697 log messages in last 60 seconds (most recently, 0 >>>>>>>>>> seconds >>>>>>>>>> ago) due to excessive rate >>>>>>>>>> 2026-02-24T14:23:56.702Z|00073|ofproto_dpif_upcall(handler30)| >>>>>>>>>> WARN|Flow: >>>>>>>>>> 
arp,in_port=409,vlan_tci=0x0000,dl_src=fa:16:3e:22:f2:f7,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.145.28.207,arp_tpa=192.145.28.1,arp_op=1,arp_sha=fa:16:3e:22:f2:f7,arp_tha=00:00:00:00:00:00 >>>>>>>>>> >>>>>>>>>> bridge("br-int") >>>>>>>>>> ---------------- >>>>>>>>>> 0. priority 0 >>>>>>>>>> drop >>>>>>>>>> >>>>>>>>>> Final flow: unchanged >>>>>>>>>> Megaflow: >>>>>>>>>> recirc_id=0,eth,arp,in_port=409,dl_src=fa:16:3e:22:f2:f7 >>>>>>>>>> Datapath actions: drop >>>>>>>>>> 2026-02-24T14:24:34.891Z|02715|ofproto_dpif_xlate(handler2)|WARN| >>>>>>>>>> Dropped >>>>>>>>>> 1059 log messages in last 60 seconds (most recently, 1 seconds >>>>>>>>>> ago) due >>>>>>>>>> to excessive rate >>>>>>>>>> 2026-02-24T14:24:34.891Z|02716|ofproto_dpif_xlate(handler2)| >>>>>>>>>> WARN|over >>>>>>>>>> 4096 resubmit actions on bridge br-int while processing >>>>>>>>>> arp,in_port=1,vlan_tci=0x0000,dl_src=0c:86:10:b7:9e:e0,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=83.219.248.1,arp_tpa=83.219.248.109,arp_op=1,arp_sha=0c:86:10:b7:9e:e0,arp_tha=00:00:00:00:00:00 >>>>>>>>>> 2026-02-24T14:24:46.042Z|04072|connmgr|INFO|br-int<->unix#4353: 1 >>>>>>>>>> flow_mods in the last 0 s (1 adds) >>>>>>>>>> 2026-02-24T14:24:59.041Z|00066|ofproto_dpif_upcall(handler78)| >>>>>>>>>> WARN| >>>>>>>>>> Dropped 662 log messages in last 63 seconds (most recently, 3 >>>>>>>>>> seconds >>>>>>>>>> ago) due to excessive rate >>>>>>>>>> 2026-02-24T14:24:59.041Z|00067|ofproto_dpif_upcall(handler78)| >>>>>>>>>> WARN|Flow: >>>>>>>>>> arp,in_port=339,vlan_tci=0x0000,dl_src=fa:16:3e:39:60:bb,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=91.92.46.85,arp_tpa=91.92.46.1,arp_op=1,arp_sha=fa:16:3e:39:60:bb,arp_tha=00:00:00:00:00:00 >>>>>>>>>> >>>>>>>>>> bridge("br-int") >>>>>>>>>> ---------------- >>>>>>>>>> 0. 
priority 0 >>>>>>>>>> drop >>>>>>>>>> >>>>>>>>>> Final flow: unchanged >>>>>>>>>> Megaflow: >>>>>>>>>> recirc_id=0,eth,arp,in_port=339,dl_src=fa:16:3e:39:60:bb >>>>>>>>>> Datapath actions: drop >>>>>>>>>> 2026-02-24T14:25:34.783Z|00067|ofproto_dpif_xlate(handler7)|WARN| >>>>>>>>>> Dropped >>>>>>>>>> 952 log messages in last 60 seconds (most recently, 0 seconds >>>>>>>>>> ago) >>>>>>>>>> due >>>>>>>>>> to excessive rate >>>>>>>>>> 2026-02-24T14:25:34.783Z|00068|ofproto_dpif_xlate(handler7)| >>>>>>>>>> WARN|over >>>>>>>>>> 4096 resubmit actions on bridge br-int while processing >>>>>>>>>> arp,in_port=4812,vlan_tci=0x0000,dl_src=fa:16:3e:68:f7:1b,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=138.124.72.245,arp_tpa=138.124.72.1,arp_op=1,arp_sha=fa:16:3e:68:f7:1b,arp_tha=00:00:00:00:00:00 >>>>>>>>>> 2026-02-24T14:25:59.094Z|00067|ofproto_dpif_upcall(handler11)| >>>>>>>>>> WARN| >>>>>>>>>> Dropped 720 log messages in last 60 seconds (most recently, 0 >>>>>>>>>> seconds >>>>>>>>>> ago) due to excessive rate >>>>>>>>>> 2026-02-24T14:25:59.095Z|00068|ofproto_dpif_upcall(handler11)| >>>>>>>>>> WARN|Flow: >>>>>>>>>> arp,in_port=305,vlan_tci=0x0000,dl_src=fa:16:3e:d9:8d:f3,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=91.92.46.188,arp_tpa=91.92.46.1,arp_op=1,arp_sha=fa:16:3e:d9:8d:f3,arp_tha=00:00:00:00:00:00 >>>>>>>>>> >>>>>>>>>> bridge("br-int") >>>>>>>>>> ---------------- >>>>>>>>>> 0. 
priority 0 >>>>>>>>>> drop >>>>>>>>>> >>>>>>>>>> Final flow: unchanged >>>>>>>>>> Megaflow: >>>>>>>>>> recirc_id=0,eth,arp,in_port=305,dl_src=fa:16:3e:d9:8d:f3 >>>>>>>>>> Datapath actions: drop >>>>>>>>>> 2026-02-24T14:26:35.024Z|02717|ofproto_dpif_xlate(handler2)|WARN| >>>>>>>>>> Dropped >>>>>>>>>> 937 log messages in last 61 seconds (most recently, 1 seconds >>>>>>>>>> ago) >>>>>>>>>> due >>>>>>>>>> to excessive rate >>>>>>>>>> 2026-02-24T14:26:35.024Z|02718|ofproto_dpif_xlate(handler2)| >>>>>>>>>> WARN|over >>>>>>>>>> 4096 resubmit actions on bridge br-int while processing >>>>>>>>>> arp,in_port=1,vlan_tci=0x0000,dl_src=0c:86:10:b7:9e:e0,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=104.165.244.1,arp_tpa=104.165.244.146,arp_op=1,arp_sha=0c:86:10:b7:9e:e0,arp_tha=00:00:00:00:00:00 >>>>>>>>>> 2026-02-24T14:26:59.151Z|00067|ofproto_dpif_upcall(handler67)| >>>>>>>>>> WARN| >>>>>>>>>> Dropped 884 log messages in last 60 seconds (most recently, 0 >>>>>>>>>> seconds >>>>>>>>>> ago) due to excessive rate >>>>>>>>>> 2026-02-24T14:26:59.151Z|00068|ofproto_dpif_upcall(handler67)| >>>>>>>>>> WARN|Flow: >>>>>>>>>> arp,in_port=380,vlan_tci=0x0000,dl_src=fa:16:3e:f1:5b:e7,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=138.124.72.215,arp_tpa=138.124.72.1,arp_op=1,arp_sha=fa:16:3e:f1:5b:e7,arp_tha=00:00:00:00:00:00 >>>>>>>>>> >>>>>>>>>> bridge("br-int") >>>>>>>>>> ---------------- >>>>>>>>>> 0. in_port=380, priority 100, cookie 0x2cfc9def >>>>>>>>>> set_field:0x90/0xffff->reg13 >>>>>>>>>> set_field:0x3->reg11 >>>>>>>>>> set_field:0x1->reg12 >>>>>>>>>> set_field:0x1->metadata >>>>>>>>>> set_field:0x1d2->reg14 >>>>>>>>>> set_field:0/0xffff0000->reg13 >>>>>>>>>> resubmit(,8) >>>>>>>>>> 8. metadata=0x1, priority 50, cookie 0x43f4e129 >>>>>>>>>> set_field:0/0x1000->reg10 >>>>>>>>>> resubmit(,73) >>>>>>>>>> 73. arp,reg14=0x1d2,metadata=0x1, priority 95, cookie >>>>>>>>>> 0x2cfc9def >>>>>>>>>> resubmit(,74) >>>>>>>>>> 74. 
arp,reg14=0x1d2,metadata=0x1, priority 80, cookie 0x2cfc9def
>>>>>>>>>> set_field:0x1000/0x1000->reg10
>>>>>>>>>> move:NXM_NX_REG10[12]->NXM_NX_XXREG0[111]
>>>>>>>>>> -> NXM_NX_XXREG0[111] is now 0x1
>>>>>>>>>> resubmit(,9)
>>>>>>>>>> 9. reg0=0x8000/0x8000,metadata=0x1, priority 50, cookie 0xf4bfe3b3
>>>>>>>>>> drop
>>>>>>>>>>
>>>>>>>>>> Final flow:
>>>>>>>>>> arp,reg0=0x8000,reg10=0x1000,reg11=0x3,reg12=0x1,reg13=0x90,reg14=0x1d2,metadata=0x1,in_port=380,vlan_tci=0x0000,dl_src=fa:16:3e:f1:5b:e7,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=138.124.72.215,arp_tpa=138.124.72.1,arp_op=1,arp_sha=fa:16:3e:f1:5b:e7,arp_tha=00:00:00:00:00:00
>>>>>>>>>> Megaflow:
>>>>>>>>>> recirc_id=0,eth,arp,in_port=380,dl_src=fa:16:3e:f1:5b:e7
>>>>>>>>>> Datapath actions: drop
>>>>>>>>>>
>>>>>>>>>> Broadcast arps to all routers is set to false.
>>>>>>>>>> _uuid : 1841d88f-3fbf-427f-8d6c-c3edaba47a0a
>>>>>>>>>> acls : []
>>>>>>>>>> copp : []
>>>>>>>>>> dns_records : []
>>>>>>>>>> external_ids : {"neutron:availability_zone_hints"="",
>>>>>>>>>> "neutron:mtu"="1500", "neutron:network_name"=poland-public,
>>>>>>>>>> "neutron:provnet-network-type"=vlan,
>>>>>>>>>> "neutron:revision_number"="12"}
>>>>>>>>>> forwarding_groups : []
>>>>>>>>>> load_balancer : []
>>>>>>>>>> load_balancer_group : []
>>>>>>>>>> name : neutron-da85395e-c326-489d-b4e6-dfb62aad360d
>>>>>>>>>> other_config : {broadcast-arps-to-all-routers="false",
>>>>>>>>>> fdb_age_threshold="0", mcast_flood_unregistered="false",
>>>>>>>>>> mcast_snoop="false", vlan-passthru="false"}
>>>>>>>>>> ports : [00288a04-90a4-4e8e-bada-8213747c92e4,
>>>>>>>>>> 0047d609-ebff-4c43-8f1d-32d83d70c9e6,
>>>>>>>>>> 00b6c585-ae29-4e88-a52a-3a16e1d91112
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>>
>>>>>>>>>> Ilia Baikov
>>>>>>>>>> [email protected]
>>>>>>>>>>
>>>>>>>>>> On 24.02.2026 17:16, Ilia Baikov wrote:
>>>>>>>>>>> Hello,
>>>>>>>>>>> After upgrading to OpenStack 2025.2 with OVN 25.09.2 (which
>>>>>>>>>>> contains >>>>>>>>>>> split buf fix) seems like no issues with DHCP, but I see a >>>>>>>>>>> lot of >>>>>>>>>>> missed ARP, VM unable to reach GW and there is no ARP >>>>>>>>>>> broadcasted to >>>>>>>>>>> some of VMs. Debugging shows that ovn installs drop arp flows >>>>>>>>>>> for >>>>>>>>>>> some >>>>>>>>>>> reason. >>>>>>>>>>> >>>>>>>>>>> ovs-appctl ofproto/trace br-int \ >>>>>>>>>>> "in_port=2,dl_vlan=1000,dl_src=0c:86:10:b7:9e:e0,dl_dst=ff:ff:ff:ff:ff:ff,dl_type=0x0806,arp_op=1,arp_spa=192.145.28.1,arp_tpa=192.145.28.113" >>>>>>>>>>> 2>&1 | tail -80 >>>>>>>>>>> Flow: >>>>>>>>>>> arp,in_port=2,dl_vlan=1000,dl_vlan_pcp=0,vlan_tci1=0x0000,dl_src=0c:86:10:b7:9e:e0,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.145.28.1,arp_tpa=192.145.28.113,arp_op=1,arp_sha=00:00:00:00:00:00,arp_tha=00:00:00:00:00:00 >>>>>>>>>>> >>>>>>>>>>> bridge("br-int") >>>>>>>>>>> ---------------- >>>>>>>>>>> 0. in_port=2, priority 100 >>>>>>>>>>> move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23] >>>>>>>>>>> -> OXM_OF_METADATA[0..23] is now 0 >>>>>>>>>>> move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14] >>>>>>>>>>> -> NXM_NX_REG14[0..14] is now 0 >>>>>>>>>>> move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15] >>>>>>>>>>> -> NXM_NX_REG15[0..15] is now 0 >>>>>>>>>>> resubmit(,45) >>>>>>>>>>> 45. 
priority 0 >>>>>>>>>>> drop >>>>>>>>>>> >>>>>>>>>>> Final flow: unchanged >>>>>>>>>>> Megaflow: recirc_id=0,eth,arp,in_port=2,dl_src=0c:86:10:b7:9e:e0 >>>>>>>>>>> Datapath actions: drop >>>>>>>>>>> >>>>>>>>>>> docker exec ovn_controller ovn-controller --version >>>>>>>>>>> ovn-controller 25.09.2 >>>>>>>>>>> Open vSwitch Library 3.6.2 >>>>>>>>>>> OpenFlow versions 0x6:0x6 >>>>>>>>>>> SB DB Schema 21.5.0 >>>>>>>>>>> >>>>>>>>>>> ovn-controller logs shows no errors clearly: >>>>>>>>>>> 2026-02-24T14:06:39.403Z|00001|vlog|INFO|opened log file / >>>>>>>>>>> var/log/ >>>>>>>>>>> kolla/openvswitch/ovn-controller.log >>>>>>>>>>> 2026-02-24T14:06:39.406Z|00002|reconnect|INFO| >>>>>>>>>>> tcp:127.0.0.1:6640: >>>>>>>>>>> connecting... >>>>>>>>>>> 2026-02-24T14:06:39.406Z|00003|reconnect|INFO| >>>>>>>>>>> tcp:127.0.0.1:6640: >>>>>>>>>>> connected >>>>>>>>>>> 2026-02-24T14:06:39.463Z|00004|main|INFO|OVN internal version >>>>>>>>>>> is : >>>>>>>>>>> [25.09.2-21.5.0-81.10] >>>>>>>>>>> 2026-02-24T14:06:39.463Z|00005|main|INFO|OVS IDL reconnected, >>>>>>>>>>> force >>>>>>>>>>> recompute. >>>>>>>>>>> 2026-02-24T14:06:39.464Z|00006|reconnect|INFO| >>>>>>>>>>> tcp:10.11.0.4:16641: >>>>>>>>>>> connecting... >>>>>>>>>>> 2026-02-24T14:06:39.464Z|00007|main|INFO|OVNSB IDL reconnected, >>>>>>>>>>> force >>>>>>>>>>> recompute. >>>>>>>>>>> 2026-02-24T14:06:39.464Z|00008|reconnect|INFO| >>>>>>>>>>> tcp:10.11.0.4:16641: >>>>>>>>>>> connected >>>>>>>>>>> 2026-02-24T14:06:39.464Z|00001|rconn(ovn_statctrl3)|INFO|unix:/ >>>>>>>>>>> var/ >>>>>>>>>>> run/openvswitch/br-int.mgmt: connected >>>>>>>>>>> 2026-02-24T14:06:39.464Z|00001|rconn(ovn_pinctrl0)|INFO|unix:/ >>>>>>>>>>> var/run/ >>>>>>>>>>> openvswitch/br-int.mgmt: connected >>>>>>>>>>> 2026-02-24T14:06:39.529Z|00009|main|INFO|OVS feature set >>>>>>>>>>> changed, >>>>>>>>>>> force recompute. 
>>>>>>>>>>> 2026-02-24T14:06:39.532Z|00010|rconn|INFO|unix:/var/run/ >>>>>>>>>>> openvswitch/ >>>>>>>>>>> br-int.mgmt: connected >>>>>>>>>>> 2026-02-24T14:06:39.532Z|00011|main|INFO|OVS OpenFlow connection >>>>>>>>>>> reconnected,force recompute. >>>>>>>>>>> 2026-02-24T14:06:39.536Z|00012|main|INFO|OVS feature set >>>>>>>>>>> changed, >>>>>>>>>>> force recompute. >>>>>>>>>>> 2026-02-24T14:06:40.564Z|00013|main|INFO|OVS feature set >>>>>>>>>>> changed, >>>>>>>>>>> force recompute. >>>>>>>>>>> 2026-02-24T14:06:45.920Z|00014|binding|INFO|Releasing lport >>>>>>>>>>> bcd3ecfa- >>>>>>>>>>> f43c-4e72-8978-73bbad07ed75 from this chassis (sb_readonly=1) >>>>>>>>>>> 2026-02-24T14:06:45.924Z|00015|binding|INFO|Releasing lport >>>>>>>>>>> 4f1f45b0-726c-4fea-b462-06dcbf559c25 from this chassis >>>>>>>>>>> (sb_readonly=1) >>>>>>>>>>> 2026-02-24T14:06:46.927Z|00016|timeval|WARN|Unreasonably long >>>>>>>>>>> 1413ms >>>>>>>>>>> poll interval (1294ms user, 117ms system) >>>>>>>>>>> 2026-02-24T14:06:46.927Z|00017|timeval|WARN|faults: 38131 minor, >>>>>>>>>>> 0 major >>>>>>>>>>> 2026-02-24T14:06:46.927Z|00018|timeval|WARN|disk: 0 reads, 8 >>>>>>>>>>> writes >>>>>>>>>>> 2026-02-24T14:06:46.927Z|00019|timeval|WARN|context switches: 0 >>>>>>>>>>> voluntary, 65 involuntary >>>>>>>>>>> 2026-02-24T14:06:46.936Z|00020|coverage|INFO|Event coverage, avg >>>>>>>>>>> rate >>>>>>>>>>> over last: 5 seconds, last minute, last hour, hash=1a815819: >>>>>>>>>>> 2026-02-24T14:06:46.936Z|00021|coverage|INFO|physical_run >>>>>>>>>>> 0.2/sec >>>>>>>>>>> 0.017/sec 0.0003/sec total: 1 >>>>>>>>>>> 2026-02-24T14:06:46.936Z|00022|coverage|INFO|lflow_conj_alloc >>>>>>>>>>> 0.0/ >>>>>>>>>>> sec 0.000/sec 0.0000/sec total: 407 >>>>>>>>>>> 2026-02-24T14:06:46.936Z|00023|coverage|INFO|lflow_cache_miss >>>>>>>>>>> 0.0/ >>>>>>>>>>> sec 0.000/sec 0.0000/sec total: 13470 >>>>>>>>>>> 2026-02-24T14:06:46.936Z|00024|coverage|INFO|lflow_cache_hit >>>>>>>>>>> 0.0/sec >>>>>>>>>>> 0.000/sec 0.0000/sec total: 394 >>>>>>>>>>> 
>>>>>>>>>>> 2026-02-24T14:06:46.936Z|00025|coverage|INFO|lflow_cache_add 0.0/sec 0.000/sec 0.0000/sec total: 12956
>>>>>>>>>>> 2026-02-24T14:06:46.936Z|00026|coverage|INFO|lflow_cache_add_matches 0.0/sec 0.000/sec 0.0000/sec total: 2412
>>>>>>>>>>> 2026-02-24T14:06:46.936Z|00027|coverage|INFO|lflow_cache_add_expr 0.0/sec 0.000/sec 0.0000/sec total: 10544
>>>>>>>>>>> 2026-02-24T14:06:46.936Z|00028|coverage|INFO|consider_logical_flow 0.0/sec 0.000/sec 0.0000/sec total: 20680
>>>>>>>>>>> 2026-02-24T14:06:46.936Z|00029|coverage|INFO|lflow_run 0.2/sec 0.017/sec 0.0003/sec total: 1
>>>>>>>>>>> 2026-02-24T14:06:46.936Z|00030|coverage|INFO|miniflow_malloc 16.6/sec 1.383/sec 0.0231/sec total: 28561
>>>>>>>>>>> 2026-02-24T14:06:46.936Z|00031|coverage|INFO|hmap_pathological 11.2/sec 0.933/sec 0.0156/sec total: 257
>>>>>>>>>>> 2026-02-24T14:06:46.936Z|00032|coverage|INFO|hmap_expand 837.2/sec 69.767/sec 1.1628/sec total: 30358
>>>>>>>>>>> 2026-02-24T14:06:46.936Z|00033|coverage|INFO|hmap_reserve 0.4/sec 0.033/sec 0.0006/sec total: 21733
>>>>>>>>>>> 2026-02-24T14:06:46.936Z|00034|coverage|INFO|txn_unchanged 2.4/sec 0.200/sec 0.0033/sec total: 65
>>>>>>>>>>> 2026-02-24T14:06:46.936Z|00035|coverage|INFO|txn_incomplete 1.4/sec 0.117/sec 0.0019/sec total: 60
>>>>>>>>>>> 2026-02-24T14:06:46.936Z|00036|coverage|INFO|txn_success 0.6/sec 0.050/sec 0.0008/sec total: 3
>>>>>>>>>>> 2026-02-24T14:06:46.936Z|00037|coverage|INFO|poll_create_node 24.0/sec 2.000/sec 0.0333/sec total: 1304
>>>>>>>>>>> 2026-02-24T14:06:46.937Z|00038|coverage|INFO|poll_zero_timeout 0.0/sec 0.000/sec 0.0000/sec total: 1
>>>>>>>>>>> 2026-02-24T14:06:46.937Z|00039|coverage|INFO|rconn_queued 0.8/sec 0.067/sec 0.0011/sec total: 4
>>>>>>>>>>> 2026-02-24T14:06:46.937Z|00040|coverage|INFO|rconn_sent 0.8/sec 0.067/sec 0.0011/sec total: 4
>>>>>>>>>>> 2026-02-24T14:06:46.937Z|00041|coverage|INFO|seq_change 9.2/sec 0.767/sec 0.0128/sec total: 532
>>>>>>>>>>> 2026-02-24T14:06:46.937Z|00042|coverage|INFO|pstream_open 0.2/sec 0.017/sec 0.0003/sec total: 1
>>>>>>>>>>> 2026-02-24T14:06:46.937Z|00043|coverage|INFO|stream_open 1.2/sec 0.100/sec 0.0017/sec total: 6
>>>>>>>>>>> 2026-02-24T14:06:46.937Z|00044|coverage|INFO|util_xalloc 29035.4/sec 2419.617/sec 40.3269/sec total: 2277081
>>>>>>>>>>> 2026-02-24T14:06:46.937Z|00045|coverage|INFO|vconn_received 0.8/sec 0.067/sec 0.0011/sec total: 4
>>>>>>>>>>> 2026-02-24T14:06:46.937Z|00046|coverage|INFO|vconn_sent 1.2/sec 0.100/sec 0.0017/sec total: 6
>>>>>>>>>>> 2026-02-24T14:06:46.937Z|00047|coverage|INFO|jsonrpc_recv_incomplete 0.6/sec 0.050/sec 0.0008/sec total: 52
>>>>>>>>>>> 2026-02-24T14:06:46.937Z|00048|coverage|INFO|138 events never hit
>>>>>>>>>>> 2026-02-24T14:06:46.976Z|00049|binding|INFO|Releasing lport 4f1f45b0-726c-4fea-b462-06dcbf559c25 from this chassis (sb_readonly=0)
>>>>>>>>>>> 2026-02-24T14:06:46.977Z|00050|binding|INFO|Releasing lport bcd3ecfa-f43c-4e72-8978-73bbad07ed75 from this chassis (sb_readonly=0)
>>>>>>>>>>> 2026-02-24T14:06:48.054Z|00051|timeval|WARN|Unreasonably long 1117ms poll interval (1108ms user, 8ms system)
>>>>>>>>>>> 2026-02-24T14:06:48.054Z|00052|timeval|WARN|faults: 2581 minor, 0 major
>>>>>>>>>>> 2026-02-24T14:06:48.054Z|00053|timeval|WARN|context switches: 0 voluntary, 8 involuntary
>>>>>>>>>>> 2026-02-24T14:06:48.055Z|00054|coverage|INFO|Event coverage, avg rate over last: 5 seconds, last minute, last hour, hash=0878340f:
>>>>>>>>>>> 2026-02-24T14:06:48.055Z|00055|coverage|INFO|physical_run 0.2/sec 0.017/sec 0.0003/sec total: 2
>>>>>>>>>>> 2026-02-24T14:06:48.055Z|00056|coverage|INFO|lflow_conj_alloc 0.0/sec 0.000/sec 0.0000/sec total: 814
>>>>>>>>>>> 2026-02-24T14:06:48.055Z|00057|coverage|INFO|lflow_cache_miss 0.0/sec 0.000/sec 0.0000/sec total: 13979
>>>>>>>>>>> 2026-02-24T14:06:48.055Z|00058|coverage|INFO|lflow_cache_hit 0.0/sec 0.000/sec 0.0000/sec total: 13671
>>>>>>>>>>> 2026-02-24T14:06:48.055Z|00059|coverage|INFO|lflow_cache_add 0.0/sec 0.000/sec 0.0000/sec total: 12956
>>>>>>>>>>> 2026-02-24T14:06:48.055Z|00060|coverage|INFO|lflow_cache_add_matches 0.0/sec 0.000/sec 0.0000/sec total: 2412
>>>>>>>>>>> 2026-02-24T14:06:48.055Z|00061|coverage|INFO|lflow_cache_add_expr 0.0/sec 0.000/sec 0.0000/sec total: 10544
>>>>>>>>>>> 2026-02-24T14:06:48.055Z|00062|coverage|INFO|consider_logical_flow 0.0/sec 0.000/sec 0.0000/sec total: 41360
>>>>>>>>>>> 2026-02-24T14:06:48.055Z|00063|coverage|INFO|lflow_run 0.2/sec 0.017/sec 0.0003/sec total: 2
>>>>>>>>>>> 2026-02-24T14:06:48.055Z|00064|coverage|INFO|cmap_expand 0.0/sec 0.000/sec 0.0000/sec total: 7
>>>>>>>>>>> 2026-02-24T14:06:48.055Z|00065|coverage|INFO|miniflow_malloc 16.6/sec 1.383/sec 0.0231/sec total: 63156
>>>>>>>>>>> 2026-02-24T14:06:48.055Z|00066|coverage|INFO|hmap_pathological 11.2/sec 0.933/sec 0.0156/sec total: 311
>>>>>>>>>>> 2026-02-24T14:06:48.056Z|00067|coverage|INFO|hmap_expand 837.2/sec 69.767/sec 1.1628/sec total: 30539
>>>>>>>>>>> 2026-02-24T14:06:48.056Z|00068|coverage|INFO|hmap_reserve 0.4/sec 0.033/sec 0.0006/sec total: 22553
>>>>>>>>>>> 2026-02-24T14:06:48.056Z|00069|coverage|INFO|txn_unchanged 2.4/sec 0.200/sec 0.0033/sec total: 67
>>>>>>>>>>> 2026-02-24T14:06:48.056Z|00070|coverage|INFO|txn_incomplete 1.4/sec 0.117/sec 0.0019/sec total: 60
>>>>>>>>>>> 2026-02-24T14:06:48.056Z|00071|coverage|INFO|txn_success 0.6/sec 0.050/sec 0.0008/sec total: 4
>>>>>>>>>>> 2026-02-24T14:06:48.056Z|00072|coverage|INFO|poll_create_node 24.0/sec 2.000/sec 0.0333/sec total: 1335
>>>>>>>>>>> 2026-02-24T14:06:48.056Z|00073|coverage|INFO|poll_zero_timeout 0.0/sec 0.000/sec 0.0000/sec total: 1
>>>>>>>>>>> 2026-02-24T14:06:48.056Z|00074|coverage|INFO|rconn_queued 0.8/sec 0.067/sec 0.0011/sec total: 4
>>>>>>>>>>> 2026-02-24T14:06:48.056Z|00075|coverage|INFO|rconn_sent 0.8/sec 0.067/sec 0.0011/sec total: 4
>>>>>>>>>>> 2026-02-24T14:06:48.056Z|00076|coverage|INFO|seq_change 9.2/sec 0.767/sec 0.0128/sec total: 546
>>>>>>>>>>> 2026-02-24T14:06:48.056Z|00077|coverage|INFO|pstream_open 0.2/sec 0.017/sec 0.0003/sec total: 1
>>>>>>>>>>> 2026-02-24T14:06:48.056Z|00078|coverage|INFO|stream_open 1.2/sec 0.100/sec 0.0017/sec total: 6
>>>>>>>>>>> 2026-02-24T14:06:48.056Z|00079|coverage|INFO|long_poll_interval 0.0/sec 0.000/sec 0.0000/sec total: 1
>>>>>>>>>>> 2026-02-24T14:06:48.056Z|00080|coverage|INFO|util_xalloc 29035.4/sec 2419.617/sec 40.3269/sec total: 2477649
>>>>>>>>>>> 2026-02-24T14:06:48.056Z|00081|coverage|INFO|vconn_received 0.8/sec 0.067/sec 0.0011/sec total: 4
>>>>>>>>>>> 2026-02-24T14:06:48.056Z|00082|coverage|INFO|vconn_sent 1.2/sec 0.100/sec 0.0017/sec total: 6
>>>>>>>>>>> 2026-02-24T14:06:48.056Z|00083|coverage|INFO|jsonrpc_recv_incomplete 0.6/sec 0.050/sec 0.0008/sec total: 52
>>>>>>>>>>> 2026-02-24T14:06:48.056Z|00084|coverage|INFO|136 events never hit
>>>>>>>>>>> 2026-02-24T14:06:48.056Z|00085|poll_loop|INFO|wakeup due to [POLLIN] on fd 29 (10.11.0.2:40496<->10.11.0.4:16641) at lib/stream-fd.c:157 (82% CPU usage)
>>>>>>>>>>> 2026-02-24T14:06:48.097Z|00086|poll_loop|INFO|wakeup due to [POLLIN] on fd 29 (10.11.0.2:40496<->10.11.0.4:16641) at lib/stream-fd.c:157 (82% CPU usage)
>>>>>>>>>>> 2026-02-24T14:06:48.104Z|00087|poll_loop|INFO|wakeup due to 0-ms timeout at controller/ovn-controller.c:7558 (82% CPU usage)
>>>>>>>>>>> 2026-02-24T14:06:48.283Z|00088|poll_loop|INFO|wakeup due to 0-ms timeout at controller/ofctrl.c:692 (82% CPU usage)
>>>>>>>>>>> 2026-02-24T14:06:48.870Z|00089|poll_loop|INFO|wakeup due to [POLLIN] on fd 33 (<->/var/run/openvswitch/br-int.mgmt) at lib/stream-fd.c:153 (82% CPU usage)
>>>>>>>>>>> 2026-02-24T14:06:48.877Z|00090|poll_loop|INFO|wakeup due to [POLLOUT] on fd 33 (<->/var/run/openvswitch/br-int.mgmt) at lib/stream-fd.c:153 (82% CPU usage)
>>>>>>>>>>> 2026-02-24T14:06:48.884Z|00091|poll_loop|INFO|wakeup due to [POLLOUT] on fd 33 (<->/var/run/openvswitch/br-int.mgmt) at lib/stream-fd.c:153 (82% CPU usage)
>>>>>>>>>>> 2026-02-24T14:06:48.892Z|00092|poll_loop|INFO|wakeup due to [POLLOUT] on fd 33 (<->/var/run/openvswitch/br-int.mgmt) at lib/stream-fd.c:153 (82% CPU usage)
>>>>>>>>>>> 2026-02-24T14:06:48.900Z|00093|poll_loop|INFO|wakeup due to [POLLOUT] on fd 33 (<->/var/run/openvswitch/br-int.mgmt) at lib/stream-fd.c:153 (82% CPU usage)
>>>>>>>>>>> 2026-02-24T14:06:48.907Z|00094|poll_loop|INFO|wakeup due to [POLLOUT] on fd 33 (<->/var/run/openvswitch/br-int.mgmt) at lib/stream-fd.c:153 (82% CPU usage)
>>>>>>>>>>> 2026-02-24T14:06:49.875Z|00095|memory|INFO|143124 kB peak resident set size after 10.5 seconds
>>>>>>>>>>> 2026-02-24T14:06:49.875Z|00096|memory|INFO|idl-cells-OVN_Southbound:301305 idl-cells-Open_vSwitch:25815 lflow-cache-entries-cache-expr:10548 lflow-cache-entries-cache-matches:2413 lflow-cache-size-KB:32447 local_datapath_usage-KB:2 ofctrl_desired_flow_usage-KB:8528 ofctrl_installed_flow_usage-KB:6365 ofctrl_rconn_packet_counter-KB:5161 ofctrl_sb_flow_ref_usage-KB:3196 oflow_update_usage-KB:1
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>>
>>>>>>>>>>> Ilia Baikov
>>>>>>>>>>> [email protected]
>>>>>>>>>>>
>>>>>>>>>>> On 12.02.2026 21:22, Ilia Baikov wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>> Coming back to this issue after a while, as I'm migrating
>>>>>>>>>>>> ml2/ovs to ml2/ovn.
>>>>>>>>>>>> It seems the same issue from 2025 still persists.
>>>>>>>>>>>>
>>>>>>>>>>>> refs:
>>>>>>>>>>>> [0] https://mail.openvswitch.org/pipermail/ovs-discuss/2025-February/053456.html
>>>>>>>>>>>> [1] https://mail.openvswitch.org/pipermail/ovs-discuss/2025-March/053484.html
>>>>>>>>>>>>
>>>>>>>>>>>> case:
>>>>>>>>>>>> A big L2 domain with a border device that learns IPs from
>>>>>>>>>>>> flooded ARP. For some reason, when an L2 device (a NIC with
>>>>>>>>>>>> VLANs attached) is attached to the br-ex bridge, OVN stops
>>>>>>>>>>>> sending DHCP packets (OFFER/ACK/etc.) after a while.
>>>>>>>>>>>>
>>>>>>>>>>>> Has anybody else observed the same issue? The only way to
>>>>>>>>>>>> stabilize the region is to switch to L3 networking using
>>>>>>>>>>>> ovn-bgp-agent (eth0 is detached from br-ex, so no more ARPs
>>>>>>>>>>>> are delivered to ovn-controller), but kernel routing adds
>>>>>>>>>>>> monstrous overhead: IRQ load is 5-6x higher, around 10-12%,
>>>>>>>>>>>> whereas with L2 networking it is just 2%, which is fine.
>>>>>>>>>>>>
>>>>>>>>>>>> Meanwhile, there are no errors, warnings, or resubmit
>>>>>>>>>>>> messages in the logs.
>>>>>>>>>>>>
_______________________________________________
discuss mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
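[Editor's note: since the discussion hinges on whether the logical switches carry other_config:broadcast-arps-to-all-routers=false, a quick way to audit and set it cluster-wide is sketched below. This is an illustrative sketch, not from the thread: it assumes ovn-nbctl can reach the NB database from the node where it runs (e.g. inside the ovn_northd container in a kolla deployment), and the jq-less awk parsing is a simplification.]

```shell
#!/bin/sh
# Audit broadcast-arps-to-all-routers on every Logical_Switch.
# An empty value means the option is unset, i.e. the default
# (broadcast ARPs to all routers) applies.
for ls in $(ovn-nbctl --bare --columns=name list Logical_Switch); do
    val=$(ovn-nbctl --if-exists get Logical_Switch "$ls" \
          other_config:broadcast-arps-to-all-routers 2>/dev/null)
    echo "$ls -> ${val:-<unset, default>}"
done

# To set it everywhere (limits ARP broadcast fan-out to routers
# that actually own the target IP):
# for ls in $(ovn-nbctl --bare --columns=name list Logical_Switch); do
#     ovn-nbctl set Logical_Switch "$ls" \
#         other_config:broadcast-arps-to-all-routers=false
# done
```

The audit loop is read-only, so it is safe to run against a production NB database before deciding whether to apply the commented-out set loop.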
