On 3/3/26 3:54 PM, Ilia Baikov wrote:
> Hi,
> 

Hi,
> Done, OVN components are now built from your branch and deployed into
> the production region where the issue persists. To be more precise,
> let's focus on a cluster that runs an L2 network setup, as this is the
> best field for testing and reproducing this case without breaking the
> other, stabilized regions which run L3.
> 
> ovn-controller 24.03.8
> Open vSwitch Library 3.3.8
> OpenFlow versions 0x6:0x6
> SB DB Schema 20.33.0
> 
> OVN components were deployed at ~14:11, then the resubmit exceptions
> started appearing (24.03.2 just shows "unrecognized op code (27)" and
> so on).
> 

Just to be sure I understand: is traffic from your workloads still
affected?

> I've also enabled rconn/vconn dbg logging for OVN and OVS (only after
> 14:11, but it seems vconn/rconn shows something useful for debugging).
> 
> ovn logs starting from 14:11 - https://gist.githubusercontent.com/frct1/5f99221e1519d1552c8ef16a7ec8ee52/raw/147e9a171e538f9cd837008181272437b1c7ed37/ovn.log
> ovs logs starting from 14:11 - https://gist.githubusercontent.com/frct1/5f99221e1519d1552c8ef16a7ec8ee52/raw/147e9a171e538f9cd837008181272437b1c7ed37/ovs.log
> 

Double checking: are you sure you upgraded ovn-northd to use the version
from my branch? I'm asking because this packet should not hit the
mc_flood_l2 group anymore:

2026-03-03T14:22:51.358Z|00012|ofproto_dpif_xlate(handler250)|WARN|Dropped 2244 log messages in last 60 seconds (most recently, 0 seconds ago) due to excessive rate
2026-03-03T14:22:51.358Z|00013|ofproto_dpif_xlate(handler250)|WARN|over 4096 resubmit actions on bridge br-int while processing arp,in_port=2409,vlan_tci=0x0000,dl_src=fa:16:3e:ba:70:84,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=166.1.160.225,arp_tpa=166.1.160.1,arp_op=1,arp_sha=fa:16:3e:ba:70:84,arp_tha=00:00:00:00:00:00

The mc_flood_l2 group has 2k ports (all the VM ports), but my change
should make this packet hit the mc_unknown group instead, which only has
a handful of ports, as you said in your previous email.
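If you want to double check on your side: the corresponding Southbound
groups should be named "_MC_flood_l2" and "_MC_unknown" (one
Multicast_Group row per datapath), so something along these lines
(untested; filter for the affected switch's datapath) should show the
difference in fanout:

# Row(s) for the big flood group; "ports" should list ~2k entries:
ovn-sbctl --columns=datapath,name,ports find Multicast_Group name=_MC_flood_l2
# Rough port count for the unknown group:
ovn-sbctl --bare --columns=ports find Multicast_Group name=_MC_unknown | wc -w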
> 
> I will keep it running 24.03.8 for easier debugging.
> 

Thanks,
Dumitru

> Regards,
> 
> Ilia Baikov
> [email protected]
> 
> On 03.03.2026 15:55, Dumitru Ceara wrote:
>> On 3/2/26 7:24 PM, Ilia Baikov wrote:
>>> Hi Dumitru!
>>>
>> Hi Ilia,
>>
>>> ovn-nbctl --no-leader-only list Logical_switch_port | grep unknown | wc -l
>>> 10
>>>
>> OK, that's just a few (and I see the same on your other deployment
>> too), so that's great.
>>
>> Mind trying out this WIP patch for now and seeing if it works for you?
>>
>> https://github.com/dceara/ovn/commits/mc_flood_l2_to_unknown-26.03
>> https://github.com/dceara/ovn/commits/mc_flood_l2_to_unknown-25.09
>> https://github.com/dceara/ovn/commits/mc_flood_l2_to_unknown-25.03
>> https://github.com/dceara/ovn/commits/mc_flood_l2_to_unknown-24.09
>> https://github.com/dceara/ovn/commits/mc_flood_l2_to_unknown-24.03
>>
>> They're all the same, just based on different stable branches; I
>> wasn't sure which one you'd need.
>>
>> Looking forward to hearing your results.
>>
>> Thanks,
>> Dumitru
>>
>>> More output for the lsp list (no filtering by ls uuid):
>>>
>>> ovn-nbctl --no-leader-only list Logical_switch_port | grep unknown -A 5
>>> addresses           : [unknown]
>>> dhcpv4_options      : []
>>> dhcpv6_options      : []
>>> dynamic_addresses   : []
>>> enabled             : true
>>> external_ids        : {"neutron:cidrs"="10.10.3.243/24", "neutron:device_id"="ba1a43e2-5496-4ced-8b8c-9b42c5ddd6f1", "neutron:device_owner"="network:floatingip_agent_gateway", "neutron:host_id"=us-east-standard-2, "neutron:mtu"="", "neutron:network_name"=neutron-bb8d0ef6-9b45-4398-86f3-51323a0db2cd, "neutron:port_capabilities"="", "neutron:port_name"="", "neutron:project_id"="", "neutron:revision_number"="5", "neutron:security_group_ids"="", "neutron:subnet_pool_addr_scope4"="", "neutron:subnet_pool_addr_scope6"="", "neutron:vnic_type"=normal}
>>> --
>>> addresses           : ["fa:16:3e:62:e5:5f 193.32.177.44", unknown]
>>> dhcpv4_options      : []
>>> dhcpv6_options      : []
>>> dynamic_addresses   : []
>>> enabled             : true
>>> external_ids        : {"neutron:cidrs"="193.32.177.44/24", "neutron:device_id"="544cddd2-7a53-492a-8933-91fb97fd0546", "neutron:device_owner"="network:floatingip_agent_gateway", "neutron:host_id"=us-east-standard-1, "neutron:mtu"="", "neutron:network_name"=neutron-7dce255f-4824-4a21-a550-f8d03a25c285, "neutron:port_capabilities"="", "neutron:port_name"="", "neutron:project_id"="", "neutron:revision_number"="3", "neutron:security_group_ids"="", "neutron:subnet_pool_addr_scope4"="", "neutron:subnet_pool_addr_scope6"="", "neutron:vnic_type"=normal}
>>> --
>>> addresses           : [unknown]
>>> dhcpv4_options      : []
>>> dhcpv6_options      : []
>>> dynamic_addresses   : []
>>> enabled             : true
>>> external_ids        : {"neutron:cidrs"="12.26.0.2/16", "neutron:device_id"=dhcp8b62a377-0e4b-5497-b096-c08bf79b6c42-c5db4fec-9c10-4022-835d-7281506d8a7e, "neutron:device_owner"="network:dhcp", "neutron:host_id"=us-east-standard-1, "neutron:mtu"="", "neutron:network_name"=neutron-c5db4fec-9c10-4022-835d-7281506d8a7e, "neutron:port_capabilities"="", "neutron:port_name"="", "neutron:project_id"=a3b7099e62ac4fb9b3d548dfaff7aeaf, "neutron:revision_number"="5", "neutron:security_group_ids"="", "neutron:subnet_pool_addr_scope4"="", "neutron:subnet_pool_addr_scope6"="", "neutron:vnic_type"=normal}
>>> --
>>> addresses           : [unknown]
>>> dhcpv4_options      : []
>>> dhcpv6_options      : []
>>> dynamic_addresses   : []
>>> enabled             : true
>>> external_ids        : {"neutron:cidrs"="10.10.3.242/24", "neutron:device_id"="544cddd2-7a53-492a-8933-91fb97fd0546", "neutron:device_owner"="network:floatingip_agent_gateway", "neutron:host_id"=us-east-standard-1, "neutron:mtu"="", "neutron:network_name"=neutron-bb8d0ef6-9b45-4398-86f3-51323a0db2cd, "neutron:port_capabilities"="", "neutron:port_name"="", "neutron:project_id"="", "neutron:revision_number"="5", "neutron:security_group_ids"="", "neutron:subnet_pool_addr_scope4"="", "neutron:subnet_pool_addr_scope6"="", "neutron:vnic_type"=normal}
>>> --
>>> addresses           : ["fa:16:3e:6e:27:09 12.26.0.109", unknown]
>>> dhcpv4_options      : []
>>> dhcpv6_options      : []
>>> dynamic_addresses   : []
>>> enabled             : true
>>> external_ids        : {"neutron:cidrs"="12.26.0.109/16", "neutron:device_id"="6e2d75ce-1503-4e40-bc72-ef3adc59d45f", "neutron:device_owner"="network:router_centralized_snat", "neutron:host_id"=us-east-standard-1, "neutron:mtu"="", "neutron:network_name"=neutron-c5db4fec-9c10-4022-835d-7281506d8a7e, "neutron:port_capabilities"="", "neutron:port_name"="", "neutron:project_id"="", "neutron:revision_number"="6", "neutron:security_group_ids"="", "neutron:subnet_pool_addr_scope4"="", "neutron:subnet_pool_addr_scope6"="", "neutron:vnic_type"=normal}
>>> "neutron:port_capabilities"="", "neutron:port_name"="", >>> "neutron:project_id"="", "neutron:revision_number"="6", >>> "neutron:security_group_ids"="", "neutron:subnet_pool_addr_scope4"="", >>> "neutron:subnet_pool_addr_scope6"="", "neutron:vnic_type"=normal} >>> -- >>> addresses : ["fa:16:3e:0c:ac:01 12.26.1.76", unknown] >>> dhcpv4_options : [] >>> dhcpv6_options : [] >>> dynamic_addresses : [] >>> enabled : true >>> external_ids : {"neutron:cidrs"="12.26.1.76/16", >>> "neutron:device_id"="4d3f7d3d-a637-4e40-8bc3-fda4712a1ada", >>> "neutron:device_owner"="network:router_centralized_snat", >>> "neutron:host_id"=us-east-standard-1, "neutron:mtu"="", >>> "neutron:network_name"=neutron-c5db4fec-9c10-4022-835d-7281506d8a7e, >>> "neutron:port_capabilities"="", "neutron:port_name"="", >>> "neutron:project_id"="", "neutron:revision_number"="6", >>> "neutron:security_group_ids"="", "neutron:subnet_pool_addr_scope4"="", >>> "neutron:subnet_pool_addr_scope6"="", "neutron:vnic_type"=normal} >>> -- >>> addresses : [unknown] >>> dhcpv4_options : [] >>> dhcpv6_options : [] >>> dynamic_addresses : [] >>> enabled : true >>> external_ids : {"neutron:cidrs"="10.10.3.240/24", >>> "neutron:device_id"=dhcp8b62a377-0e4b-5497-b096-c08bf79b6c42- >>> bb8d0ef6-9b45-4398-86f3-51323a0db2cd, >>> "neutron:device_owner"="network:dhcp", "neutron:host_id"=us-east- >>> standard-1, "neutron:mtu"="", "neutron:network_name"=neutron- >>> bb8d0ef6-9b45-4398-86f3-51323a0db2cd, "neutron:port_capabilities"="", >>> "neutron:port_name"="", >>> "neutron:project_id"="03d31c9de2ec41c787add9b44aacd3a8", >>> "neutron:revision_number"="6", "neutron:security_group_ids"="", >>> "neutron:subnet_pool_addr_scope4"="", >>> "neutron:subnet_pool_addr_scope6"="", "neutron:vnic_type"=normal} >>> -- >>> addresses : [unknown] >>> dhcpv4_options : [] >>> dhcpv6_options : [] >>> dynamic_addresses : [] >>> enabled : [] >>> external_ids : {} >>> -- >>> addresses : [unknown] >>> dhcpv4_options : [] >>> dhcpv6_options : [] >>> dynamic_addresses : [] >>> enabled : true >>> external_ids : {"neutron:cidrs"="193.32.177.174/24", >>> "neutron:device_id"="ba1a43e2-5496-4ced-8b8c-9b42c5ddd6f1", >>> "neutron:device_owner"="network:floatingip_agent_gateway", >>> "neutron:host_id"=us-east-standard-2, "neutron:mtu"="", >>> "neutron:network_name"=neutron-7dce255f-4824-4a21-a550-f8d03a25c285, >>> "neutron:port_capabilities"="", "neutron:port_name"="", >>> "neutron:project_id"="", "neutron:revision_number"="5", >>> "neutron:security_group_ids"="", "neutron:subnet_pool_addr_scope4"="", >>> "neutron:subnet_pool_addr_scope6"="", "neutron:vnic_type"=normal} >>> -- >>> addresses : [unknown] >>> dhcpv4_options : [] >>> dhcpv6_options : [] >>> dynamic_addresses : [] >>> enabled : [] >>> external_ids : {}* >>> >>> Regards, >>> >>> Ilia Baikov >>> [email protected] >>> >>> 02.03.2026 18:23, Dumitru Ceara пишет: >>>> On 3/2/26 1:18 PM, Ilia Baikov wrote: >>>>> To keep region stable decided to rollback to 25.09 which has no split >>>>> buf merged and migrated to L3 topology with /32 advertise via BGP. >>>>> Just a guess: reaching 2k ports (VMs) in a single logical_switch is >>>>> the >>>>> reason why ARP flows are being dropped/discarded because of resubmit >>>>> limit. What do you think? >>>>> >>>> Hi Ilia, >>>> >>>> Right, the very high number of logical switch ports that are part of >>>> the >>>> MC_FLOOD_L2 OVN multicast group (in this case all your VM ports) is >>>> what's causing issues with broadcast ARP requests: >>>> a. generated by the logical router port >>>> b. 
>>>>
>>>> Before that, can you please share how many of those 2k VM ports have
>>>> LSP.addresses configured to include "unknown"?
>>>>
>>>> Thanks,
>>>> Dumitru
>>>>
>>>>> Regards,
>>>>>
>>>>> Ilia Baikov
>>>>> [email protected]
>>>>>
>>>>> On 26.02.2026 16:51, Ilia Baikov wrote:
>>>>>> Hi,
>>>>>>
>>>>>> These patches seem to fix the DHCP issues, but there are cases
>>>>>> where an instance boots and receives its configuration from the
>>>>>> metadata service but has no public connectivity (done through L2
>>>>>> networking).
>>>>>>
>>>>>>> Which, if the logical switch has a reasonably high number of ports
>>>>>>> (maybe around 200) will probably cause the resubmit limit to be hit
>>>>>> This is the case: a public L2 network with around ~2000 running
>>>>>> instances (or ports, in LSP terms).
>>>>>>
>>>>>>> Are these OVN router port IPs? Or are they OVN workload IPs? Or
>>>>>>> are they just IPs owned by some fabric hosts, outside of OVN?
>>>>>> The .1 IP of each subnet is owned by the border gateway. So an
>>>>>> instance asks for .1 to learn the GW MAC address, but due to
>>>>>> hitting the limit the instance receives no response because the
>>>>>> ARP flow is dropped.
>>>>>>
>>>>>>> Also, aside from the logs, do you actually see any traffic being
>>>>>>> impacted? I.e., are your workloads able to come up and properly
>>>>>>> communicate?
>>>>>> Nope, there is connectivity loss, since some instances have no
>>>>>> public connectivity due to the ARP issues.
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Ilia Baikov
>>>>>> [email protected]
>>>>>> On 26.02.2026 14:31, Dumitru Ceara wrote:
>>>>>>> Hi Ilia,
>>>>>>>
>>>>>>> On 2/24/26 3:29 PM, Ilia Baikov wrote:
>>>>>>>> Just checked the openvswitch logs. The 4096 resubmit limit is
>>>>>>>> actually hit even on 25.09.2.
>>>>>>> v25.09.2 includes:
>>>>>>> https://github.com/ovn-org/ovn/commit/0bb60da
>>>>>>>
>>>>>>> Which should fix the "self-DoS" issues introduced by:
>>>>>>> https://github.com/ovn-org/ovn/commit/325c7b2
>>>>>>>
>>>>>>> But that means that in some cases, e.g., for real BUM traffic or
>>>>>>> for GARPs originated by OVN router ports, we will try to "flood"
>>>>>>> the packet in the L2 broadcast domain.
>>>>>>>
>>>>>>> Which, if the logical switch has a reasonably high number of
>>>>>>> ports (maybe around 200), will probably cause the resubmit limit
>>>>>>> to be hit.
>>>>>>>
>>>>>>> In the examples below, I see the packets that cause this are ARP
>>>>>>> requests requesting the MAC address of:
>>>>>>> - 138.124.72.1
>>>>>>> - 83.219.248.109
>>>>>>> - 138.124.72.1
>>>>>>> - 91.92.46.1
>>>>>>>
>>>>>>> Are these OVN router port IPs? Or are they OVN workload IPs? Or
>>>>>>> are they just IPs owned by some fabric hosts, outside of OVN?
>>>>>>>
>>>>>>> Also, aside from the logs, do you actually see any traffic being
>>>>>>> impacted? I.e., are your workloads able to come up and properly
>>>>>>> communicate?
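>>>>>>>
>>>>>>> For the first question, something along these lines (untested)
>>>>>>> should quickly show whether any of those .1 addresses are
>>>>>>> configured on an OVN router port, e.g., for 138.124.72.1:
>>>>>>>
>>>>>>> ovn-nbctl --columns=name,networks list Logical_Router_Port | grep -B 1 '138.124.72.1'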
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Dumitru
>>>>>>>
>>>>>>>> Final flow: unchanged
>>>>>>>> Megaflow: recirc_id=0,eth,arp,in_port=346,dl_src=fa:16:3e:63:aa:d0
>>>>>>>> Datapath actions: drop
>>>>>>>> 2026-02-24T14:23:17.457Z|04071|connmgr|INFO|br-int<->unix#4346: 1 flow_mods in the last 0 s (1 adds)
>>>>>>>> 2026-02-24T14:23:34.821Z|00076|ofproto_dpif_xlate(handler24)|WARN|Dropped 854 log messages in last 60 seconds (most recently, 0 seconds ago) due to excessive rate
>>>>>>>> 2026-02-24T14:23:34.821Z|00077|ofproto_dpif_xlate(handler24)|WARN|over 4096 resubmit actions on bridge br-int while processing arp,in_port=4715,vlan_tci=0x0000,dl_src=fa:16:3e:97:65:15,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=138.124.72.142,arp_tpa=138.124.72.1,arp_op=1,arp_sha=fa:16:3e:97:65:15,arp_tha=00:00:00:00:00:00
>>>>>>>> 2026-02-24T14:23:45.464Z|00091|dpif(handler28)|WARN|system@ovs-system: execute ct(commit,zone=163,mark=0/0x41,label=0/0xffff00000000000000000000,nat(src)),154 failed (Invalid argument) on packet tcp,vlan_tci=0x0000,dl_src=0c:86:10:b7:9e:e0,dl_dst=fa:16:3e:69:22:89,nw_src=31.44.82.94,nw_dst=31.169.126.149,nw_tos=32,nw_ecn=0,nw_ttl=57,nw_frag=no,tp_src=51064,tp_dst=443,tcp_flags=syn tcp_csum:d7b0 with metadata skb_priority(0),skb_mark(0),ct_state(0x21),ct_zone(0xa3),ct_tuple4(src=31.44.82.94,dst=31.169.126.149,proto=6,tp_src=51064,tp_dst=443),in_port(2) mtu 0
>>>>>>>> 2026-02-24T14:23:56.702Z|00072|ofproto_dpif_upcall(handler30)|WARN|Dropped 697 log messages in last 60 seconds (most recently, 0 seconds ago) due to excessive rate
>>>>>>>> 2026-02-24T14:23:56.702Z|00073|ofproto_dpif_upcall(handler30)|WARN|Flow: arp,in_port=409,vlan_tci=0x0000,dl_src=fa:16:3e:22:f2:f7,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.145.28.207,arp_tpa=192.145.28.1,arp_op=1,arp_sha=fa:16:3e:22:f2:f7,arp_tha=00:00:00:00:00:00
>>>>>>>>
>>>>>>>> bridge("br-int")
>>>>>>>> ----------------
>>>>>>>>  0. priority 0
>>>>>>>>     drop
>>>>>>>>
>>>>>>>> Final flow: unchanged
>>>>>>>> Megaflow: recirc_id=0,eth,arp,in_port=409,dl_src=fa:16:3e:22:f2:f7
>>>>>>>> Datapath actions: drop
>>>>>>>> 2026-02-24T14:24:34.891Z|02715|ofproto_dpif_xlate(handler2)|WARN|Dropped 1059 log messages in last 60 seconds (most recently, 1 seconds ago) due to excessive rate
>>>>>>>> 2026-02-24T14:24:34.891Z|02716|ofproto_dpif_xlate(handler2)|WARN|over 4096 resubmit actions on bridge br-int while processing arp,in_port=1,vlan_tci=0x0000,dl_src=0c:86:10:b7:9e:e0,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=83.219.248.1,arp_tpa=83.219.248.109,arp_op=1,arp_sha=0c:86:10:b7:9e:e0,arp_tha=00:00:00:00:00:00
>>>>>>>> 2026-02-24T14:24:46.042Z|04072|connmgr|INFO|br-int<->unix#4353: 1 flow_mods in the last 0 s (1 adds)
>>>>>>>> 2026-02-24T14:24:59.041Z|00066|ofproto_dpif_upcall(handler78)|WARN|Dropped 662 log messages in last 63 seconds (most recently, 3 seconds ago) due to excessive rate
>>>>>>>> 2026-02-24T14:24:59.041Z|00067|ofproto_dpif_upcall(handler78)|WARN|Flow: arp,in_port=339,vlan_tci=0x0000,dl_src=fa:16:3e:39:60:bb,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=91.92.46.85,arp_tpa=91.92.46.1,arp_op=1,arp_sha=fa:16:3e:39:60:bb,arp_tha=00:00:00:00:00:00
>>>>>>>>
>>>>>>>> bridge("br-int")
>>>>>>>> ----------------
>>>>>>>>  0. priority 0
>>>>>>>>     drop
>>>>>>>>
>>>>>>>> Final flow: unchanged
>>>>>>>> Megaflow: recirc_id=0,eth,arp,in_port=339,dl_src=fa:16:3e:39:60:bb
>>>>>>>> Datapath actions: drop
>>>>>>>> 2026-02-24T14:25:34.783Z|00067|ofproto_dpif_xlate(handler7)|WARN|Dropped 952 log messages in last 60 seconds (most recently, 0 seconds ago) due to excessive rate
>>>>>>>> 2026-02-24T14:25:34.783Z|00068|ofproto_dpif_xlate(handler7)|WARN|over 4096 resubmit actions on bridge br-int while processing arp,in_port=4812,vlan_tci=0x0000,dl_src=fa:16:3e:68:f7:1b,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=138.124.72.245,arp_tpa=138.124.72.1,arp_op=1,arp_sha=fa:16:3e:68:f7:1b,arp_tha=00:00:00:00:00:00
>>>>>>>> 2026-02-24T14:25:59.094Z|00067|ofproto_dpif_upcall(handler11)|WARN|Dropped 720 log messages in last 60 seconds (most recently, 0 seconds ago) due to excessive rate
>>>>>>>> 2026-02-24T14:25:59.095Z|00068|ofproto_dpif_upcall(handler11)|WARN|Flow: arp,in_port=305,vlan_tci=0x0000,dl_src=fa:16:3e:d9:8d:f3,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=91.92.46.188,arp_tpa=91.92.46.1,arp_op=1,arp_sha=fa:16:3e:d9:8d:f3,arp_tha=00:00:00:00:00:00
>>>>>>>>
>>>>>>>> bridge("br-int")
>>>>>>>> ----------------
>>>>>>>>  0. priority 0
>>>>>>>>     drop
>>>>>>>>
>>>>>>>> Final flow: unchanged
>>>>>>>> Megaflow: recirc_id=0,eth,arp,in_port=305,dl_src=fa:16:3e:d9:8d:f3
>>>>>>>> Datapath actions: drop
>>>>>>>> 2026-02-24T14:26:35.024Z|02717|ofproto_dpif_xlate(handler2)|WARN|Dropped 937 log messages in last 61 seconds (most recently, 1 seconds ago) due to excessive rate
>>>>>>>> 2026-02-24T14:26:35.024Z|02718|ofproto_dpif_xlate(handler2)|WARN|over 4096 resubmit actions on bridge br-int while processing arp,in_port=1,vlan_tci=0x0000,dl_src=0c:86:10:b7:9e:e0,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=104.165.244.1,arp_tpa=104.165.244.146,arp_op=1,arp_sha=0c:86:10:b7:9e:e0,arp_tha=00:00:00:00:00:00
>>>>>>>> 2026-02-24T14:26:59.151Z|00067|ofproto_dpif_upcall(handler67)|WARN|Dropped 884 log messages in last 60 seconds (most recently, 0 seconds ago) due to excessive rate
>>>>>>>> 2026-02-24T14:26:59.151Z|00068|ofproto_dpif_upcall(handler67)|WARN|Flow: arp,in_port=380,vlan_tci=0x0000,dl_src=fa:16:3e:f1:5b:e7,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=138.124.72.215,arp_tpa=138.124.72.1,arp_op=1,arp_sha=fa:16:3e:f1:5b:e7,arp_tha=00:00:00:00:00:00
>>>>>>>>
>>>>>>>> bridge("br-int")
>>>>>>>> ----------------
>>>>>>>>  0. in_port=380, priority 100, cookie 0x2cfc9def
>>>>>>>>     set_field:0x90/0xffff->reg13
>>>>>>>>     set_field:0x3->reg11
>>>>>>>>     set_field:0x1->reg12
>>>>>>>>     set_field:0x1->metadata
>>>>>>>>     set_field:0x1d2->reg14
>>>>>>>>     set_field:0/0xffff0000->reg13
>>>>>>>>     resubmit(,8)
>>>>>>>>  8. metadata=0x1, priority 50, cookie 0x43f4e129
>>>>>>>>     set_field:0/0x1000->reg10
>>>>>>>>     resubmit(,73)
>>>>>>>> 73. arp,reg14=0x1d2,metadata=0x1, priority 95, cookie 0x2cfc9def
>>>>>>>>     resubmit(,74)
>>>>>>>> 74. arp,reg14=0x1d2,metadata=0x1, priority 80, cookie 0x2cfc9def
>>>>>>>>     set_field:0x1000/0x1000->reg10
>>>>>>>>     move:NXM_NX_REG10[12]->NXM_NX_XXREG0[111]
>>>>>>>>      -> NXM_NX_XXREG0[111] is now 0x1
>>>>>>>>     resubmit(,9)
>>>>>>>>  9. reg0=0x8000/0x8000,metadata=0x1, priority 50, cookie 0xf4bfe3b3
>>>>>>>>     drop
>>>>>>>>
>>>>>>>> Final flow: arp,reg0=0x8000,reg10=0x1000,reg11=0x3,reg12=0x1,reg13=0x90,reg14=0x1d2,metadata=0x1,in_port=380,vlan_tci=0x0000,dl_src=fa:16:3e:f1:5b:e7,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=138.124.72.215,arp_tpa=138.124.72.1,arp_op=1,arp_sha=fa:16:3e:f1:5b:e7,arp_tha=00:00:00:00:00:00
>>>>>>>> Megaflow: recirc_id=0,eth,arp,in_port=380,dl_src=fa:16:3e:f1:5b:e7
>>>>>>>> Datapath actions: drop
>>>>>>>>
>>>>>>>>
>>>>>>>> Broadcast ARPs to all routers is set to false:
>>>>>>>> _uuid               : 1841d88f-3fbf-427f-8d6c-c3edaba47a0a
>>>>>>>> acls                : []
>>>>>>>> copp                : []
>>>>>>>> dns_records         : []
>>>>>>>> external_ids        : {"neutron:availability_zone_hints"="", "neutron:mtu"="1500", "neutron:network_name"=poland-public, "neutron:provnet-network-type"=vlan, "neutron:revision_number"="12"}
>>>>>>>> forwarding_groups   : []
>>>>>>>> load_balancer       : []
>>>>>>>> load_balancer_group : []
>>>>>>>> name                : neutron-da85395e-c326-489d-b4e6-dfb62aad360d
>>>>>>>> other_config        : {broadcast-arps-to-all-routers="false", fdb_age_threshold="0", mcast_flood_unregistered="false", mcast_snoop="false", vlan-passthru="false"}
>>>>>>>> ports               : [00288a04-90a4-4e8e-bada-8213747c92e4, 0047d609-ebff-4c43-8f1d-32d83d70c9e6, 00b6c585-ae29-4e88-a52a-3a16e1d91112
>>>>>>>>
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Ilia Baikov
>>>>>>>> [email protected]
>>>>>>>>
>>>>>>>> On 24.02.2026 17:16, Ilia Baikov wrote:
>>>>>>>>> Hello,
>>>>>>>>> After upgrading to OpenStack 2025.2 with OVN 25.09.2 (which
>>>>>>>>> contains the split buf fix) there seem to be no issues with
>>>>>>>>> DHCP, but I see a lot of missed ARPs: VMs unable to reach the
>>>>>>>>> GW, and no ARPs broadcast to some of the VMs. Debugging shows
>>>>>>>>> that OVN installs drop flows for ARP for some reason.
>>>>>>>>>
>>>>>>>>> ovs-appctl ofproto/trace br-int \
>>>>>>>>> "in_port=2,dl_vlan=1000,dl_src=0c:86:10:b7:9e:e0,dl_dst=ff:ff:ff:ff:ff:ff,dl_type=0x0806,arp_op=1,arp_spa=192.145.28.1,arp_tpa=192.145.28.113" 2>&1 | tail -80
>>>>>>>>> Flow: arp,in_port=2,dl_vlan=1000,dl_vlan_pcp=0,vlan_tci1=0x0000,dl_src=0c:86:10:b7:9e:e0,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.145.28.1,arp_tpa=192.145.28.113,arp_op=1,arp_sha=00:00:00:00:00:00,arp_tha=00:00:00:00:00:00
>>>>>>>>>
>>>>>>>>> bridge("br-int")
>>>>>>>>> ----------------
>>>>>>>>>  0. in_port=2, priority 100
>>>>>>>>>     move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23]
>>>>>>>>>      -> OXM_OF_METADATA[0..23] is now 0
>>>>>>>>>     move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14]
>>>>>>>>>      -> NXM_NX_REG14[0..14] is now 0
>>>>>>>>>     move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15]
>>>>>>>>>      -> NXM_NX_REG15[0..15] is now 0
>>>>>>>>>     resubmit(,45)
>>>>>>>>> 45. priority 0
>>>>>>>>>     drop
>>>>>>>>>
>>>>>>>>> Final flow: unchanged
>>>>>>>>> Megaflow: recirc_id=0,eth,arp,in_port=2,dl_src=0c:86:10:b7:9e:e0
>>>>>>>>> Datapath actions: drop
>>>>>>>>>
>>>>>>>>> docker exec ovn_controller ovn-controller --version
>>>>>>>>> ovn-controller 25.09.2
>>>>>>>>> Open vSwitch Library 3.6.2
>>>>>>>>> OpenFlow versions 0x6:0x6
>>>>>>>>> SB DB Schema 21.5.0
>>>>>>>>>
>>>>>>>>> The ovn-controller logs show no clear errors:
>>>>>>>>> 2026-02-24T14:06:39.403Z|00001|vlog|INFO|opened log file /var/log/kolla/openvswitch/ovn-controller.log
>>>>>>>>> 2026-02-24T14:06:39.406Z|00002|reconnect|INFO|tcp:127.0.0.1:6640: connecting...
>>>>>>>>> 2026-02-24T14:06:39.406Z|00003|reconnect|INFO|tcp:127.0.0.1:6640: connected
>>>>>>>>> 2026-02-24T14:06:39.463Z|00004|main|INFO|OVN internal version is : [25.09.2-21.5.0-81.10]
>>>>>>>>> 2026-02-24T14:06:39.463Z|00005|main|INFO|OVS IDL reconnected, force recompute.
>>>>>>>>> 2026-02-24T14:06:39.464Z|00006|reconnect|INFO|tcp:10.11.0.4:16641: connecting...
>>>>>>>>> 2026-02-24T14:06:39.464Z|00007|main|INFO|OVNSB IDL reconnected, force recompute.
>>>>>>>>> 2026-02-24T14:06:39.464Z|00008|reconnect|INFO|tcp:10.11.0.4:16641: connected
>>>>>>>>> 2026-02-24T14:06:39.464Z|00001|rconn(ovn_statctrl3)|INFO|unix:/var/run/openvswitch/br-int.mgmt: connected
>>>>>>>>> 2026-02-24T14:06:39.464Z|00001|rconn(ovn_pinctrl0)|INFO|unix:/var/run/openvswitch/br-int.mgmt: connected
>>>>>>>>> 2026-02-24T14:06:39.529Z|00009|main|INFO|OVS feature set changed, force recompute.
>>>>>>>>> 2026-02-24T14:06:39.532Z|00010|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connected
>>>>>>>>> 2026-02-24T14:06:39.532Z|00011|main|INFO|OVS OpenFlow connection reconnected,force recompute.
>>>>>>>>> 2026-02-24T14:06:39.536Z|00012|main|INFO|OVS feature set changed, force recompute.
>>>>>>>>> 2026-02-24T14:06:40.564Z|00013|main|INFO|OVS feature set changed, force recompute.
>>>>>>>>> 2026-02-24T14:06:45.920Z|00014|binding|INFO|Releasing lport bcd3ecfa-f43c-4e72-8978-73bbad07ed75 from this chassis (sb_readonly=1)
>>>>>>>>> 2026-02-24T14:06:45.924Z|00015|binding|INFO|Releasing lport 4f1f45b0-726c-4fea-b462-06dcbf559c25 from this chassis (sb_readonly=1)
>>>>>>>>> 2026-02-24T14:06:46.927Z|00016|timeval|WARN|Unreasonably long 1413ms poll interval (1294ms user, 117ms system)
>>>>>>>>> 2026-02-24T14:06:46.927Z|00017|timeval|WARN|faults: 38131 minor, 0 major
>>>>>>>>> 2026-02-24T14:06:46.927Z|00018|timeval|WARN|disk: 0 reads, 8 writes
>>>>>>>>> 2026-02-24T14:06:46.927Z|00019|timeval|WARN|context switches: 0 voluntary, 65 involuntary
>>>>>>>>> 2026-02-24T14:06:46.936Z|00020|coverage|INFO|Event coverage, avg rate over last: 5 seconds, last minute, last hour, hash=1a815819:
>>>>>>>>> 2026-02-24T14:06:46.936Z|00021|coverage|INFO|physical_run 0.2/sec 0.017/sec 0.0003/sec total: 1
>>>>>>>>> 2026-02-24T14:06:46.936Z|00022|coverage|INFO|lflow_conj_alloc 0.0/sec 0.000/sec 0.0000/sec total: 407
>>>>>>>>> 2026-02-24T14:06:46.936Z|00023|coverage|INFO|lflow_cache_miss 0.0/sec 0.000/sec 0.0000/sec total: 13470
>>>>>>>>> 2026-02-24T14:06:46.936Z|00024|coverage|INFO|lflow_cache_hit 0.0/sec 0.000/sec 0.0000/sec total: 394
>>>>>>>>> 2026-02-24T14:06:46.936Z|00025|coverage|INFO|lflow_cache_add 0.0/sec 0.000/sec 0.0000/sec total: 12956
>>>>>>>>> 2026-02-24T14:06:46.936Z|00026|coverage|INFO|lflow_cache_add_matches 0.0/sec 0.000/sec 0.0000/sec total: 2412
>>>>>>>>> 2026-02-24T14:06:46.936Z|00027|coverage|INFO|lflow_cache_add_expr 0.0/sec 0.000/sec 0.0000/sec total: 10544
>>>>>>>>> 2026-02-24T14:06:46.936Z|00028|coverage|INFO|consider_logical_flow 0.0/sec 0.000/sec 0.0000/sec total: 20680
>>>>>>>>> 2026-02-24T14:06:46.936Z|00029|coverage|INFO|lflow_run 0.2/sec 0.017/sec 0.0003/sec total: 1
>>>>>>>>> 2026-02-24T14:06:46.936Z|00030|coverage|INFO|miniflow_malloc 16.6/sec 1.383/sec 0.0231/sec total: 28561
>>>>>>>>> 2026-02-24T14:06:46.936Z|00031|coverage|INFO|hmap_pathological 11.2/sec 0.933/sec 0.0156/sec total: 257
>>>>>>>>> 2026-02-24T14:06:46.936Z|00032|coverage|INFO|hmap_expand 837.2/sec 69.767/sec 1.1628/sec total: 30358
>>>>>>>>> 2026-02-24T14:06:46.936Z|00033|coverage|INFO|hmap_reserve 0.4/sec 0.033/sec 0.0006/sec total: 21733
>>>>>>>>> 2026-02-24T14:06:46.936Z|00034|coverage|INFO|txn_unchanged 2.4/sec 0.200/sec 0.0033/sec total: 65
>>>>>>>>> 2026-02-24T14:06:46.936Z|00035|coverage|INFO|txn_incomplete 1.4/sec 0.117/sec 0.0019/sec total: 60
>>>>>>>>> 2026-02-24T14:06:46.936Z|00036|coverage|INFO|txn_success 0.6/sec 0.050/sec 0.0008/sec total: 3
>>>>>>>>> 2026-02-24T14:06:46.936Z|00037|coverage|INFO|poll_create_node 24.0/sec 2.000/sec 0.0333/sec total: 1304
>>>>>>>>> 2026-02-24T14:06:46.937Z|00038|coverage|INFO|poll_zero_timeout 0.0/sec 0.000/sec 0.0000/sec total: 1
>>>>>>>>> 2026-02-24T14:06:46.937Z|00039|coverage|INFO|rconn_queued 0.8/sec 0.067/sec 0.0011/sec total: 4
>>>>>>>>> 2026-02-24T14:06:46.937Z|00040|coverage|INFO|rconn_sent 0.8/sec 0.067/sec 0.0011/sec total: 4
>>>>>>>>> 2026-02-24T14:06:46.937Z|00041|coverage|INFO|seq_change 9.2/sec 0.767/sec 0.0128/sec total: 532
>>>>>>>>> 2026-02-24T14:06:46.937Z|00042|coverage|INFO|pstream_open 0.2/sec 0.017/sec 0.0003/sec total: 1
>>>>>>>>> 2026-02-24T14:06:46.937Z|00043|coverage|INFO|stream_open 1.2/sec 0.100/sec 0.0017/sec total: 6
>>>>>>>>> 2026-02-24T14:06:46.937Z|00044|coverage|INFO|util_xalloc 29035.4/sec 2419.617/sec 40.3269/sec total: 2277081
>>>>>>>>> 2026-02-24T14:06:46.937Z|00045|coverage|INFO|vconn_received 0.8/sec 0.067/sec 0.0011/sec total: 4
>>>>>>>>> 2026-02-24T14:06:46.937Z|00046|coverage|INFO|vconn_sent 1.2/sec 0.100/sec 0.0017/sec total: 6
>>>>>>>>> 2026-02-24T14:06:46.937Z|00047|coverage|INFO|jsonrpc_recv_incomplete 0.6/sec 0.050/sec 0.0008/sec total: 52
>>>>>>>>> 2026-02-24T14:06:46.937Z|00048|coverage|INFO|138 events never hit
>>>>>>>>> 2026-02-24T14:06:46.976Z|00049|binding|INFO|Releasing lport 4f1f45b0-726c-4fea-b462-06dcbf559c25 from this chassis (sb_readonly=0)
>>>>>>>>> 2026-02-24T14:06:46.977Z|00050|binding|INFO|Releasing lport bcd3ecfa-f43c-4e72-8978-73bbad07ed75 from this chassis (sb_readonly=0)
>>>>>>>>> 2026-02-24T14:06:48.054Z|00051|timeval|WARN|Unreasonably long 1117ms poll interval (1108ms user, 8ms system)
>>>>>>>>> 2026-02-24T14:06:48.054Z|00052|timeval|WARN|faults: 2581 minor, 0 major
>>>>>>>>> 2026-02-24T14:06:48.054Z|00053|timeval|WARN|context switches: 0 voluntary, 8 involuntary
>>>>>>>>> 2026-02-24T14:06:48.055Z|00054|coverage|INFO|Event coverage, avg rate over last: 5 seconds, last minute, last hour, hash=0878340f:
>>>>>>>>> 2026-02-24T14:06:48.055Z|00055|coverage|INFO|physical_run 0.2/sec 0.017/sec 0.0003/sec total: 2
>>>>>>>>> 2026-02-24T14:06:48.055Z|00056|coverage|INFO|lflow_conj_alloc 0.0/sec 0.000/sec 0.0000/sec total: 814
>>>>>>>>> 2026-02-24T14:06:48.055Z|00057|coverage|INFO|lflow_cache_miss 0.0/sec 0.000/sec 0.0000/sec total: 13979
>>>>>>>>> 2026-02-24T14:06:48.055Z|00058|coverage|INFO|lflow_cache_hit 0.0/sec 0.000/sec 0.0000/sec total: 13671
>>>>>>>>> 2026-02-24T14:06:48.055Z|00059|coverage|INFO|lflow_cache_add 0.0/sec 0.000/sec 0.0000/sec total: 12956
>>>>>>>>> 2026-02-24T14:06:48.055Z|00060|coverage|INFO|lflow_cache_add_matches 0.0/sec 0.000/sec 0.0000/sec total: 2412
>>>>>>>>> 2026-02-24T14:06:48.055Z|00061|coverage|INFO|lflow_cache_add_expr 0.0/sec 0.000/sec 0.0000/sec total: 10544
>>>>>>>>> 2026-02-24T14:06:48.055Z|00062|coverage|INFO|consider_logical_flow 0.0/sec 0.000/sec 0.0000/sec total: 41360
>>>>>>>>> 2026-02-24T14:06:48.055Z|00063|coverage|INFO|lflow_run 0.2/sec 0.017/sec 0.0003/sec total: 2
>>>>>>>>> 2026-02-24T14:06:48.055Z|00064|coverage|INFO|cmap_expand 0.0/sec 0.000/sec 0.0000/sec total: 7
>>>>>>>>> 2026-02-24T14:06:48.055Z|00065|coverage|INFO|miniflow_malloc 16.6/sec 1.383/sec 0.0231/sec total: 63156
>>>>>>>>> 2026-02-24T14:06:48.055Z|00066|coverage|INFO|hmap_pathological 11.2/sec 0.933/sec 0.0156/sec total: 311
>>>>>>>>> 2026-02-24T14:06:48.056Z|00067|coverage|INFO|hmap_expand 837.2/sec 69.767/sec 1.1628/sec total: 30539
>>>>>>>>> 2026-02-24T14:06:48.056Z|00068|coverage|INFO|hmap_reserve 0.4/sec 0.033/sec 0.0006/sec total: 22553
>>>>>>>>> 2026-02-24T14:06:48.056Z|00069|coverage|INFO|txn_unchanged 2.4/sec 0.200/sec 0.0033/sec total: 67
>>>>>>>>> 2026-02-24T14:06:48.056Z|00070|coverage|INFO|txn_incomplete 1.4/sec 0.117/sec 0.0019/sec total: 60
>>>>>>>>> 2026-02-24T14:06:48.056Z|00071|coverage|INFO|txn_success 0.6/sec 0.050/sec 0.0008/sec total: 4
>>>>>>>>> 2026-02-24T14:06:48.056Z|00072|coverage|INFO|poll_create_node 24.0/sec 2.000/sec 0.0333/sec total: 1335
>>>>>>>>> 2026-02-24T14:06:48.056Z|00073|coverage|INFO|poll_zero_timeout 0.0/sec 0.000/sec 0.0000/sec total: 1
>>>>>>>>> 2026-02-24T14:06:48.056Z|00074|coverage|INFO|rconn_queued 0.8/sec 0.067/sec 0.0011/sec total: 4
>>>>>>>>> 2026-02-24T14:06:48.056Z|00075|coverage|INFO|rconn_sent 0.8/sec 0.067/sec 0.0011/sec total: 4
>>>>>>>>> 2026-02-24T14:06:48.056Z|00076|coverage|INFO|seq_change 9.2/sec 0.767/sec 0.0128/sec total: 546
>>>>>>>>> 2026-02-24T14:06:48.056Z|00077|coverage|INFO|pstream_open 0.2/sec 0.017/sec 0.0003/sec total: 1
>>>>>>>>> 2026-02-24T14:06:48.056Z|00078|coverage|INFO|stream_open 1.2/sec 0.100/sec 0.0017/sec total: 6
>>>>>>>>> 2026-02-24T14:06:48.056Z|00079|coverage|INFO|long_poll_interval 0.0/sec 0.000/sec 0.0000/sec total: 1
>>>>>>>>> 2026-02-24T14:06:48.056Z|00080|coverage|INFO|util_xalloc 29035.4/sec 2419.617/sec 40.3269/sec total: 2477649
>>>>>>>>> 2026-02-24T14:06:48.056Z|00081|coverage|INFO|vconn_received 0.8/sec 0.067/sec 0.0011/sec total: 4
>>>>>>>>> 2026-02-24T14:06:48.056Z|00082|coverage|INFO|vconn_sent 1.2/sec 0.100/sec 0.0017/sec total: 6
>>>>>>>>> 2026-02-24T14:06:48.056Z|00083|coverage|INFO|jsonrpc_recv_incomplete 0.6/sec 0.050/sec 0.0008/sec total: 52
>>>>>>>>> 2026-02-24T14:06:48.056Z|00084|coverage|INFO|136 events never hit
>>>>>>>>> 2026-02-24T14:06:48.056Z|00085|poll_loop|INFO|wakeup due to [POLLIN] on fd 29 (10.11.0.2:40496<->10.11.0.4:16641) at lib/stream-fd.c:157 (82% CPU usage)
>>>>>>>>> 2026-02-24T14:06:48.097Z|00086|poll_loop|INFO|wakeup due to [POLLIN] on fd 29 (10.11.0.2:40496<->10.11.0.4:16641) at lib/stream-fd.c:157 (82% CPU usage)
>>>>>>>>> 2026-02-24T14:06:48.104Z|00087|poll_loop|INFO|wakeup due to 0-ms timeout at controller/ovn-controller.c:7558 (82% CPU usage)
>>>>>>>>> 2026-02-24T14:06:48.283Z|00088|poll_loop|INFO|wakeup due to 0-ms timeout at controller/ofctrl.c:692 (82% CPU usage)
>>>>>>>>> 2026-02-24T14:06:48.870Z|00089|poll_loop|INFO|wakeup due to [POLLIN] on fd 33 (<->/var/run/openvswitch/br-int.mgmt) at lib/stream-fd.c:153 (82% CPU usage)
>>>>>>>>> 2026-02-24T14:06:48.877Z|00090|poll_loop|INFO|wakeup due to [POLLOUT] on fd 33 (<->/var/run/openvswitch/br-int.mgmt) at lib/stream-fd.c:153 (82% CPU usage)
>>>>>>>>> 2026-02-24T14:06:48.884Z|00091|poll_loop|INFO|wakeup due to [POLLOUT] on fd 33 (<->/var/run/openvswitch/br-int.mgmt) at lib/stream-fd.c:153 (82% CPU usage)
>>>>>>>>> 2026-02-24T14:06:48.892Z|00092|poll_loop|INFO|wakeup due to [POLLOUT] on fd 33 (<->/var/run/openvswitch/br-int.mgmt) at lib/stream-fd.c:153 (82% CPU usage)
>>>>>>>>> 2026-02-24T14:06:48.900Z|00093|poll_loop|INFO|wakeup due to [POLLOUT] on fd 33 (<->/var/run/openvswitch/br-int.mgmt) at lib/stream-fd.c:153 (82% CPU usage)
>>>>>>>>> 2026-02-24T14:06:48.907Z|00094|poll_loop|INFO|wakeup due to [POLLOUT] on fd 33 (<->/var/run/openvswitch/br-int.mgmt) at lib/stream-fd.c:153 (82% CPU usage)
>>>>>>>>> 2026-02-24T14:06:49.875Z|00095|memory|INFO|143124 kB peak resident set size after 10.5 seconds
>>>>>>>>> 2026-02-24T14:06:49.875Z|00096|memory|INFO|idl-cells-OVN_Southbound:301305 idl-cells-Open_vSwitch:25815 lflow-cache-entries-cache-expr:10548 lflow-cache-entries-cache-matches:2413 lflow-cache-size-KB:32447 local_datapath_usage-KB:2 ofctrl_desired_flow_usage-KB:8528 ofctrl_installed_flow_usage-KB:6365 ofctrl_rconn_packet_counter-KB:5161 ofctrl_sb_flow_ref_usage-KB:3196 oflow_update_usage-KB:1
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> Ilia Baikov
>>>>>>>>> [email protected]
>>>>>>>>>
>>>>>>>>> On 12.02.2026 21:22, Ilia Baikov wrote:
>>>>>>>>>> Hi,
>>>>>>>>>> I'm returning to this issue after a while, as I'm migrating
>>>>>>>>>> from ml2/ovs to ml2/ovn.
>>>>>>>>>> It seems the same issue from 2025 still persists.
>>>>>>>>>>
>>>>>>>>>> refs:
>>>>>>>>>> [0] https://mail.openvswitch.org/pipermail/ovs-discuss/2025-February/053456.html
>>>>>>>>>> [1] https://mail.openvswitch.org/pipermail/ovs-discuss/2025-March/053484.html
>>>>>>>>>>
>>>>>>>>>> case:
>>>>>>>>>> A big L2 domain with a border device learning IPs by flooding
>>>>>>>>>> ARP. For some reason, when an L2 device (a NIC with VLANs
>>>>>>>>>> attached) is attached to the br-ex bridge, after a while OVN
>>>>>>>>>> stops sending DHCP packets (OFFER/ACK/etc.).
>>>>>>>>>>
>>>>>>>>>> Has anybody else observed the same issue?
>>>>>>>>>> The only way to stabilize the region is to switch to L3
>>>>>>>>>> networking using ovn-bgp-agent (eth0 is detached from br-ex, so
>>>>>>>>>> no more ARPs are delivered to ovn-controller), but there is a
>>>>>>>>>> monstrous overhead using kernel routing: IRQ load is up to 5-6x
>>>>>>>>>> higher, around 10-12%, while with L2 networking it is just 2%,
>>>>>>>>>> which is fine.
>>>>>>>>>>
>>>>>>>>>> Meanwhile, there are no errors, warnings, or resubmit messages
>>>>>>>>>> in the logs.
>>>>>>>>>>
_______________________________________________
discuss mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
