Skip to content

Commit

Permalink
binding: Fix race condition when claiming vif.
Browse files Browse the repository at this point in the history
When a vif is claimed, chassis is written in sb Port Binding.
If some vif update was received by the controller before the sb
notification of the chassis update (i.e. while pb->chassis = NULL),
the port was considered as TRACKED_REMOVED and ct_zone was flushed.
When, later, the pb->chassis update was received, ct_zone was flushed
again.
This issue seems to only have as side effect a few extra ct_zone flush while
the port is getting added.
The fix avoid the extra bump in the claim process and hence avoids
the unnecessary additional ct_flush.
This issue was causing some flaky failures of test
"Migration of CT zone from UUID to name" as unexpected ct_zone flus happened.

Signed-off-by: Xavier Simonart <[email protected]>
Acked-by: Ales Musil <[email protected]>
Signed-off-by: Numan Siddique <[email protected]>
  • Loading branch information
simonartxavier authored and numansiddique committed Oct 16, 2024
1 parent f74b38f commit ab1cfb9
Show file tree
Hide file tree
Showing 2 changed files with 74 additions and 1 deletion.
2 changes: 1 addition & 1 deletion controller/binding.c
Original file line number Diff line number Diff line change
Expand Up @@ -1620,7 +1620,7 @@ consider_vif_lport_(const struct sbrec_port_binding *pb,
b_ctx_out->if_mgr);
}
}
if (pb->chassis != b_ctx_in->chassis_rec
if (pb->chassis && pb->chassis != b_ctx_in->chassis_rec
&& !is_requested_additional_chassis(pb, b_ctx_in->chassis_rec)
&& if_status_is_port_claimed(b_ctx_out->if_mgr,
pb->logical_port)) {
Expand Down
73 changes: 73 additions & 0 deletions tests/system-ovn.at
Original file line number Diff line number Diff line change
Expand Up @@ -13929,3 +13929,76 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
/.*terminating with signal 15.*/d"])
AT_CLEANUP
])

OVN_FOR_EACH_NORTHD([
AT_SETUP([NXT_CT_FLUSH_ZONE count])
ovn_start --use-tcp-to-sb
OVS_TRAFFIC_VSWITCHD_START()
ADD_BR([br-int])

dnl Set external-ids in br-int needed for ovn-controller
PARSE_LISTENING_PORT([$ovs_base/ovn-sb/ovsdb-server.log], [TCP_PORT])
check ovs-vsctl \
-- set Open_vSwitch . external-ids:system-id=hv1 \
-- set Open_vSwitch . external-ids:ovn-remote=tcp:127.0.0.1:$TCP_PORT \
-- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
-- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
-- set bridge br-int fail-mode=secure other-config:disable-in-band=true

dnl Start ovn-controller
start_daemon ovn-controller
check ovn-appctl -t ovn-controller vlog/set dbg:ct_zone

# sw0-port1 -- sw0

check ovn-nbctl ls-add sw0
check ovn-nbctl lsp-add sw0 sw0-port1

# Make sure address is set in a different transaction.
sleep_sb
stop_ovsdb_controller_updates $TCP_PORT
check ovn-nbctl lsp-set-addresses sw0-port1 "50:54:00:00:00:01 192.168.0.2"

ovs-vsctl add-port br-int p1 -- \
set Interface p1 external_ids:iface-id=sw0-port1 -- set Interface p1 type=internal

# Make sure ovn-controller runs and claims the port.
ensure_controller_run

# Wake up sb, so that it can handle lsp-set-address, but no the pb->chassis (as updates from controller stillblocked)
wake_up_sb
ensure_controller_run

# And now restarts ovn-controller.
restart_ovsdb_controller_updates $TCP_PORT

wait_for_ports_up
ovn-nbctl --wait=hv sync

AT_CHECK([ovn-appctl -t ovn-controller ct-zone-list | sed "s/ [[0-9]]*/ ??/" | sort], [0], [dnl
sw0-port1 ??
sw0_dnat ??
sw0_snat ??
])

# Check that we did just the initial zone flush
AT_CHECK([grep -c "NXT_CT_FLUSH_ZONE" ovs-vswitchd.log], [0], [dnl
3
])

OVS_APP_EXIT_AND_WAIT([ovn-controller])

as ovn-sb
OVS_APP_EXIT_AND_WAIT([ovsdb-server])

as ovn-nb
OVS_APP_EXIT_AND_WAIT([ovsdb-server])

as northd
OVS_APP_EXIT_AND_WAIT([ovn-northd])

as
OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
/.*terminating with signal 15.*/d"])
AT_CLEANUP
])

0 comments on commit ab1cfb9

Please sign in to comment.