OCPBUGS-44185: Race in configure-ovs.sh affects bonding interface configuration. #4609

Open
wants to merge 3 commits into
base: master
from

Conversation

djlwilder

@djlwilder djlwilder commented Sep 23, 2024

Bonded network configurations with mode=active-backup and fail_over_mac=follow do not work because of a race when activating network profiles. activate_nm_connections() attempts to activate every generated profile that is not currently in the "activated" state. Because autoconnect-slaves is set, activating br-ex automatically activates the bond and all its slaves, and their state is reported as "activating" until they become active. The "activating" state is not checked, so some of the subordinate profiles may be activated more than once, causing a race in the bonding driver and an incorrectly configured bond.

Link: #4605
Fixes: #4605

Please provide the following information:
Please see #4605
Testing: verify a 2 slave bond with mode=active-backup and fail_over_mac=follow converts to a working OVS config.
The slaves should not have the same MAC. Other configs should be tested as well.
pull request for inclusion in the changelog: Fix race in configure-ovs.sh that affects bonding interface configuration.
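
The state check described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the helper name `needs_explicit_activation` is hypothetical, and it assumes the state string is read with `nmcli -g GENERAL.STATE conn show "$conn"` and that an empty value means the profile is not active.

```shell
# Hypothetical helper: decide whether a profile still needs "nmcli conn up".
# $1 is the state string reported by NetworkManager (assumed to be empty
# when the profile is not active at all).
needs_explicit_activation() {
  local state="$1"
  case "$state" in
    # Already up, or NetworkManager is bringing it up implicitly: leave it alone.
    activated|activating) return 1 ;;
    # Anything else (including empty) still needs an explicit activation.
    *) return 0 ;;
  esac
}

# Example use inside an activation loop (requires nmcli, so left commented out):
#   state=$(nmcli -g GENERAL.STATE conn show "$conn")
#   if needs_explicit_activation "$state"; then nmcli conn up "$conn"; fi
```

Treating "activating" the same as "activated" avoids a second activation, but it depends on the state being read while the profile is still in that window.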

Bonded network configurations with mode=active-backup and
fail_over_mac=follow are not functioning due to a race when
activating network profiles. activate_nm_connections() attempts
to activate all its generated profiles that are not currently
in the "active" state. As autoconnect-slaves is set, once
br-ex is activated the bond and all its slaves are automatically
activated. Their state is set to "activating" until they become
active. The "activating" state is not tested for, therefore some of
the subordinate profiles may be activated multiple times, causing a
race in the bonding driver and incorrectly configuring the bond.

Link: openshift#4605
Signed-off-by: David Wilder <[email protected]>
Contributor

openshift-ci bot commented Sep 23, 2024

Hi @djlwilder. Thanks for your PR.

I'm waiting for a openshift member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci openshift-ci bot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Sep 23, 2024
@yuqi-zhang
Contributor

/ok-to-test

@yuqi-zhang yuqi-zhang changed the title Race in configure-ovs.sh affects bonding interface configuration. OCPBUGS-44185: Race in configure-ovs.sh affects bonding interface configuration. Nov 4, 2024
@openshift-ci openshift-ci bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 4, 2024
@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels Nov 4, 2024
@openshift-ci-robot
Contributor

@djlwilder: This pull request references Jira Issue OCPBUGS-44185, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.18.0) matches configured target version for branch (4.18.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @sergiordlr

The bug has been updated to refer to the pull request using the external bug tracker.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested a review from sergiordlr November 4, 2024 14:31
@yuqi-zhang
Contributor

/assign cybertron

@cybertron would you be able to take a look?

@djlwilder
Author

/test e2e-azure-ovn-upgrade-out-of-change

@cybertron
Member

The change seems fine to me, although I should note that I don't think we do much (if any) testing with the follow mode.

/cc @jcaamano @rbbratta

@openshift-ci openshift-ci bot requested review from jcaamano and rbbratta November 7, 2024 17:53
@djlwilder
Author

Hi all, thanks for reviewing my PR. Please let me know if you need further clarification on the issue or the fix. The debug traces I attached to issue 4605 should help explain the interaction between NetworkManager and the bonding driver when configure-ovs.sh is executed.

@prb112

prb112 commented Nov 8, 2024

/retest-required

@prb112

prb112 commented Nov 9, 2024

/retest-required

@prb112

prb112 commented Nov 12, 2024

Hey @jcaamano @rbbratta if you have feedback, that'd be great. Many thanks, Paul

@jcaamano
Contributor

@djlwilder

Can you help me understand? If the profiles that we want active are active, and the profiles we want inactive are inactive, how is this a race in configure-ovs? If the observable state from configure-ovs matches what it wants, isn't the race in NetworkManager or some other place?

There are two things being taken care of here that would be at risk with this change:

  • activation order: unless we follow a specific order, some unwanted profiles might be activated as well. We can only control the order via explicitly activating.
  • ensuring things are active before configure-ovs quits

Unwanted profiles generally refers to profiles generated from kargs, or to leftovers from a previous boot if configure-ovs is told to use /etc/NetworkManager/system-connections instead of /run/NetworkManager/system-connections. The latter is not the default and is improbable because it is also undocumented, but the former might be a more common issue. So have you tested two reboots with the bond originally defined through kargs?

There are also some other considerations:

  • what if the profile gets to activating state one instant after we check for it?
  • if ovs-if-phys0 has auto-connect slaves set, meaning the slaves can be re-activated when ovs-if-phys0 is activated, wouldn't the same thing happen?

Overall, this looks to me as not a race in configure-ovs so not something that we can absolutely fix in configure-ovs. If we are going to mitigate here, I would have two asks:

  • Learning more about what the actual race condition is. You mentioned in the issue `These interfaces are not activated a second time leaving the state of the bond in an unpredictable state`, talking about the slave profiles. What exactly does this mean? It doesn't make sense to me, because that is what you are actually preventing in this PR: the profiles being activated a second time. We are actually making sure the profiles we need active are active in the order we want, to avoid...well, you guessed it, other race conditions.
  • crosschecking at the end that all profiles are activated
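
A final cross-check of this kind could look roughly like the sketch below. It is only an illustration: `all_profiles_activated` is a hypothetical name, and the state query reuses the `nmcli -g GENERAL.STATE conn show` call that appears later in this thread.

```shell
# Hypothetical cross-check: report every profile that is not "activated".
# Returns 0 only if all profiles passed as arguments are activated.
all_profiles_activated() {
  local conn state rc=0
  for conn in "$@"; do
    state=$(nmcli -g GENERAL.STATE conn show "$conn" 2>/dev/null)
    if [ "$state" != "activated" ]; then
      # An empty state is printed as "None" purely for readability.
      echo "WARNING: profile $conn is in state '${state:-None}'"
      rc=1
    fi
  done
  return "$rc"
}

# Example (requires nmcli, so left commented out):
#   all_profiles_activated br-ex ovs-if-phys0 ovs-if-br-ex || echo "some profiles are not active"
```

What to do when the check fails (retry, warn, or abort) is a separate decision that this sketch does not make.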

@djlwilder
Author

@jcaamano

Can you help me understand? If the profiles that we want active are active, and the profiles we want inactive are inactive, how is this a race in configure-ovs? If the observable state from configure-ovs matches what it wants, isn't the race in NetworkManager or some other place?

Let me describe what I found in more detail so we can get on the same page.

In my setup activate_nm_connections is called with the following list of profiles:
br-ex ovs-if-phys0 enP32807p1s0-slave-ovs-clone enP49154p1s0-slave-ovs-clone ovs-if-br-ex

I have added some debug to show the state that NetworkManager reports for each profile as we process the list.

Here is a log showing the issue:
DEBUG-ShowNMStates: DebugLoc=top-of activate_nm_connections conn=br-ex State=None (1)
DEBUG-ShowNMStates: DebugLoc=top-of activate_nm_connections conn=ovs-if-phys0 State=None (1)
DEBUG-ShowNMStates: DebugLoc=top-of activate_nm_connections conn=enP32807p1s0-slave-ovs-clone State=None (1)
DEBUG-ShowNMStates: DebugLoc=top-of activate_nm_connections conn=enP49154p1s0-slave-ovs-clone State=None (1)
DEBUG-ShowNMStates: DebugLoc=top-of activate_nm_connections conn=ovs-if-br-ex State=None (1)
DEBUG-ShowNMStates: Attempt 1 to bring up connection br-ex
DEBUG-ShowNMStates: DebugLoc=Brought up connection br-ex successfully conn=br-ex State=activated
DEBUG-ShowNMStates: DebugLoc=Brought up connection br-ex successfully conn=ovs-if-phys0 State=activating (2)
DEBUG-ShowNMStates: DebugLoc=Brought up connection br-ex successfully conn=enP32807p1s0-slave-ovs-clone State=activating (2)
DEBUG-ShowNMStates: DebugLoc=Brought up connection br-ex successfully conn=enP49154p1s0-slave-ovs-clone State=activating (2)
DEBUG-ShowNMStates: DebugLoc=Brought up connection br-ex successfully conn=ovs-if-br-ex State=None
DEBUG-ShowNMStates: Attempt 1 to bring up connection ovs-if-phys0 (3)
DEBUG-ShowNMStates: DebugLoc=Brought up connection ovs-if-phys0 successfully conn=br-ex State=activated
DEBUG-ShowNMStates: DebugLoc=Brought up connection ovs-if-phys0 successfully conn=ovs-if-phys0 State=activating
DEBUG-ShowNMStates: DebugLoc=Brought up connection ovs-if-phys0 successfully conn=enP32807p1s0-slave-ovs-clone State=activating
DEBUG-ShowNMStates: DebugLoc=Brought up connection ovs-if-phys0 successfully conn=enP49154p1s0-slave-ovs-clone State=activating
DEBUG-ShowNMStates: DebugLoc=Brought up connection ovs-if-phys0 successfully conn=ovs-if-br-ex State=None
DEBUG-ShowNMStates: Attempt 1 to bring up connection enP32807p1s0-slave-ovs-clone (3)
DEBUG-ShowNMStates: DebugLoc=Brought up connection enP32807p1s0-slave-ovs-clone successfully conn=br-ex State=activated
DEBUG-ShowNMStates: DebugLoc=Brought up connection enP32807p1s0-slave-ovs-clone successfully conn=ovs-if-phys0 State=activated
DEBUG-ShowNMStates: DebugLoc=Brought up connection enP32807p1s0-slave-ovs-clone successfully conn=enP32807p1s0-slave-ovs-clone State=activated
DEBUG-ShowNMStates: DebugLoc=Brought up connection enP32807p1s0-slave-ovs-clone successfully conn=enP49154p1s0-slave-ovs-clone State=activated
DEBUG-ShowNMStates: DebugLoc=Brought up connection enP32807p1s0-slave-ovs-clone successfully conn=ovs-if-br-ex State=None
DEBUG-ShowNMStates: DebugLoc=Connection enP49154p1s0-slave-ovs-clone already activated. conn=br-ex State=activated (4)
DEBUG-ShowNMStates: DebugLoc=Connection enP49154p1s0-slave-ovs-clone already activated. conn=ovs-if-phys0 State=activated
DEBUG-ShowNMStates: DebugLoc=Connection enP49154p1s0-slave-ovs-clone already activated. conn=enP32807p1s0-slave-ovs-clone State=activated
DEBUG-ShowNMStates: DebugLoc=Connection enP49154p1s0-slave-ovs-clone already activated. conn=enP49154p1s0-slave-ovs-clone State=activated
DEBUG-ShowNMStates: DebugLoc=Connection enP49154p1s0-slave-ovs-clone already activated. conn=ovs-if-br-ex State=None
DEBUG-ShowNMStates: Attempt 1 to bring up connection ovs-if-br-ex
DEBUG-ShowNMStates: DebugLoc=Brought up connection ovs-if-br-ex successfully conn=br-ex State=activated
DEBUG-ShowNMStates: DebugLoc=Brought up connection ovs-if-br-ex successfully conn=ovs-if-phys0 State=activated
DEBUG-ShowNMStates: DebugLoc=Brought up connection ovs-if-br-ex successfully conn=enP32807p1s0-slave-ovs-clone State=activated
DEBUG-ShowNMStates: DebugLoc=Brought up connection ovs-if-br-ex successfully conn=enP49154p1s0-slave-ovs-clone State=activated
DEBUG-ShowNMStates: DebugLoc=Brought up connection ovs-if-br-ex successfully conn=ovs-if-br-ex State=activated

(1) At the start of activate_nm_connections() profiles have no state (shown as None)

(2) After br-ex is activated (nmcli conn up), ovs-if-phys0, enP32807p1s0-slave-ovs-clone and enP49154p1s0-slave-ovs-clone have a new state of "activating". Presumably NM is working to move these profiles to "active".

(3) As the "activating" state is not checked, ovs-if-phys0 and enP32807p1s0-slave-ovs-clone are explicitly activated as well (nmcli conn up).

(4) However, by the time we check the state of enP49154p1s0-slave-ovs-clone, it is already activated, so no action is taken.
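
The thread does not include the debug code that produced this log. A helper along the following lines could emit the same format; `show_nm_states` is a hypothetical name, and printing `None` for an empty state is an assumption based on the log output above.

```shell
# Hypothetical debug helper: print the NetworkManager state of each profile.
# $1 is a free-form location tag; the remaining arguments are profile names.
show_nm_states() {
  local loc="$1" conn state
  shift
  for conn in "$@"; do
    state=$(nmcli -g GENERAL.STATE conn show "$conn" 2>/dev/null)
    # Assumption: a profile with no reported state is shown as "None".
    echo "DEBUG-ShowNMStates: DebugLoc=$loc conn=$conn State=${state:-None}"
  done
}

# Example (requires nmcli, so left commented out):
#   show_nm_states "top-of activate_nm_connections" br-ex ovs-if-phys0 ovs-if-br-ex
```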

The timing here is critical. In this next test I added a "sleep 1" after br-ex is reported as active.
Given more time, NM was able to finish activating ovs-if-phys0, enP32807p1s0-slave-ovs-clone, and enP49154p1s0-slave-ovs-clone, so no additional activations were needed. This is the race I spoke of.

DEBUG-ShowNMStates: DebugLoc=top-of activate_nm_connections conn=br-ex State=None
DEBUG-ShowNMStates: DebugLoc=top-of activate_nm_connections conn=ovs-if-phys0 State=None
DEBUG-ShowNMStates: DebugLoc=top-of activate_nm_connections conn=enP32807p1s0-slave-ovs-clone State=None
DEBUG-ShowNMStates: DebugLoc=top-of activate_nm_connections conn=enP49154p1s0-slave-ovs-clone State=None
DEBUG-ShowNMStates: DebugLoc=top-of activate_nm_connections conn=ovs-if-br-ex State=None
DEBUG-ShowNMStates: Attempt 1 to bring up connection br-ex
DEBUG-ShowNMStates: DebugLoc=Brought up connection br-ex successfully conn=br-ex State=activated
DEBUG-ShowNMStates: DebugLoc=Brought up connection br-ex successfully conn=ovs-if-phys0 State=activating
DEBUG-ShowNMStates: DebugLoc=Brought up connection br-ex successfully conn=enP32807p1s0-slave-ovs-clone State=activating
DEBUG-ShowNMStates: DebugLoc=Brought up connection br-ex successfully conn=enP49154p1s0-slave-ovs-clone State=activating
DEBUG-ShowNMStates: DebugLoc=Brought up connection br-ex successfully conn=ovs-if-br-ex State=None
DEBUG-ShowNMStates: Sleeping for 1 sec. **********
DEBUG-ShowNMStates: DebugLoc=Connection ovs-if-phys0 already activated. conn=br-ex State=activated
DEBUG-ShowNMStates: DebugLoc=Connection ovs-if-phys0 already activated. conn=ovs-if-phys0 State=activated
DEBUG-ShowNMStates: DebugLoc=Connection ovs-if-phys0 already activated. conn=enP32807p1s0-slave-ovs-clone State=activated
DEBUG-ShowNMStates: DebugLoc=Connection ovs-if-phys0 already activated. conn=enP49154p1s0-slave-ovs-clone State=activated
DEBUG-ShowNMStates: DebugLoc=Connection ovs-if-phys0 already activated. conn=ovs-if-br-ex State=None
DEBUG-ShowNMStates: DebugLoc=Connection enP32807p1s0-slave-ovs-clone already activated. conn=br-ex State=activated
DEBUG-ShowNMStates: DebugLoc=Connection enP32807p1s0-slave-ovs-clone already activated. conn=ovs-if-phys0 State=activated
DEBUG-ShowNMStates: DebugLoc=Connection enP32807p1s0-slave-ovs-clone already activated. conn=enP32807p1s0-slave-ovs-clone State=activated
DEBUG-ShowNMStates: DebugLoc=Connection enP32807p1s0-slave-ovs-clone already activated. conn=enP49154p1s0-slave-ovs-clone State=activated
DEBUG-ShowNMStates: DebugLoc=Connection enP32807p1s0-slave-ovs-clone already activated. conn=ovs-if-br-ex State=None
DEBUG-ShowNMStates: DebugLoc=Connection enP49154p1s0-slave-ovs-clone already activated. conn=br-ex State=activated
DEBUG-ShowNMStates: DebugLoc=Connection enP49154p1s0-slave-ovs-clone already activated. conn=ovs-if-phys0 State=activated
DEBUG-ShowNMStates: DebugLoc=Connection enP49154p1s0-slave-ovs-clone already activated. conn=enP32807p1s0-slave-ovs-clone State=activated
DEBUG-ShowNMStates: DebugLoc=Connection enP49154p1s0-slave-ovs-clone already activated. conn=enP49154p1s0-slave-ovs-clone State=activated
DEBUG-ShowNMStates: DebugLoc=Connection enP49154p1s0-slave-ovs-clone already activated. conn=ovs-if-br-ex State=None
DEBUG-ShowNMStates: Attempt 1 to bring up connection ovs-if-br-ex
DEBUG-ShowNMStates: DebugLoc=Brought up connection ovs-if-br-ex successfully conn=br-ex State=activated
DEBUG-ShowNMStates: DebugLoc=Brought up connection ovs-if-br-ex successfully conn=ovs-if-phys0 State=activated
DEBUG-ShowNMStates: DebugLoc=Brought up connection ovs-if-br-ex successfully conn=enP32807p1s0-slave-ovs-clone State=activated
DEBUG-ShowNMStates: DebugLoc=Brought up connection ovs-if-br-ex successfully conn=enP49154p1s0-slave-ovs-clone State=activated
DEBUG-ShowNMStates: DebugLoc=Brought up connection ovs-if-br-ex successfully conn=ovs-if-br-ex State=activated

Finally, here is debug output with my change to configure-ovs.sh (checking for state=activating).

DEBUG-ShowNMStates: DebugLoc=top-of activate_nm_connections conn=br-ex State=None
DEBUG-ShowNMStates: DebugLoc=top-of activate_nm_connections conn=ovs-if-phys0 State=None
DEBUG-ShowNMStates: DebugLoc=top-of activate_nm_connections conn=enP32807p1s0-slave-ovs-clone State=None
DEBUG-ShowNMStates: DebugLoc=top-of activate_nm_connections conn=enP49154p1s0-slave-ovs-clone State=None
DEBUG-ShowNMStates: DebugLoc=top-of activate_nm_connections conn=ovs-if-br-ex State=None
DEBUG-ShowNMStates: Attempt 1 to bring up connection br-ex
DEBUG-ShowNMStates: DebugLoc=Brought up connection br-ex successfully conn=br-ex State=activated
DEBUG-ShowNMStates: DebugLoc=Brought up connection br-ex successfully conn=ovs-if-phys0 State=activating
DEBUG-ShowNMStates: DebugLoc=Brought up connection br-ex successfully conn=enP32807p1s0-slave-ovs-clone State=activating
DEBUG-ShowNMStates: DebugLoc=Brought up connection br-ex successfully conn=enP49154p1s0-slave-ovs-clone State=activating
DEBUG-ShowNMStates: DebugLoc=Brought up connection br-ex successfully conn=ovs-if-br-ex State=None
DEBUG-ShowNMStates: DebugLoc=Connection ovs-if-phys0 already activated or activating. conn=br-ex State=activated
DEBUG-ShowNMStates: DebugLoc=Connection ovs-if-phys0 already activated or activating. conn=ovs-if-phys0 State=activating
DEBUG-ShowNMStates: DebugLoc=Connection ovs-if-phys0 already activated or activating. conn=enP32807p1s0-slave-ovs-clone State=activating
DEBUG-ShowNMStates: DebugLoc=Connection ovs-if-phys0 already activated or activating. conn=enP49154p1s0-slave-ovs-clone State=activating
DEBUG-ShowNMStates: DebugLoc=Connection ovs-if-phys0 already activated or activating. conn=ovs-if-br-ex State=None
DEBUG-ShowNMStates: DebugLoc=Connection enP32807p1s0-slave-ovs-clone already activated or activating. conn=br-ex State=activated
DEBUG-ShowNMStates: DebugLoc=Connection enP32807p1s0-slave-ovs-clone already activated or activating. conn=ovs-if-phys0 State=activated
DEBUG-ShowNMStates: DebugLoc=Connection enP32807p1s0-slave-ovs-clone already activated or activating. conn=enP32807p1s0-slave-ovs-clone State=activated
DEBUG-ShowNMStates: DebugLoc=Connection enP32807p1s0-slave-ovs-clone already activated or activating. conn=enP49154p1s0-slave-ovs-clone State=activated
DEBUG-ShowNMStates: DebugLoc=Connection enP32807p1s0-slave-ovs-clone already activated or activating. conn=ovs-if-br-ex State=None
DEBUG-ShowNMStates: DebugLoc=Connection enP49154p1s0-slave-ovs-clone already activated or activating. conn=br-ex State=activated
DEBUG-ShowNMStates: DebugLoc=Connection enP49154p1s0-slave-ovs-clone already activated or activating. conn=ovs-if-phys0 State=activated
DEBUG-ShowNMStates: DebugLoc=Connection enP49154p1s0-slave-ovs-clone already activated or activating. conn=enP32807p1s0-slave-ovs-clone State=activated
DEBUG-ShowNMStates: DebugLoc=Connection enP49154p1s0-slave-ovs-clone already activated or activating. conn=enP49154p1s0-slave-ovs-clone State=activated
DEBUG-ShowNMStates: DebugLoc=Connection enP49154p1s0-slave-ovs-clone already activated or activating. conn=ovs-if-br-ex State=None
DEBUG-ShowNMStates: Attempt 1 to bring up connection ovs-if-br-ex
DEBUG-ShowNMStates: DebugLoc=Brought up connection ovs-if-br-ex successfully conn=br-ex State=activated
DEBUG-ShowNMStates: DebugLoc=Brought up connection ovs-if-br-ex successfully conn=ovs-if-phys0 State=activated
DEBUG-ShowNMStates: DebugLoc=Brought up connection ovs-if-br-ex successfully conn=enP32807p1s0-slave-ovs-clone State=activated
DEBUG-ShowNMStates: DebugLoc=Brought up connection ovs-if-br-ex successfully conn=enP49154p1s0-slave-ovs-clone State=activated
DEBUG-ShowNMStates: DebugLoc=Brought up connection ovs-if-br-ex successfully conn=ovs-if-br-ex State=activated

So the problem is not whether the profiles are active or not; it's how you got there. I suspect that by setting autoconnect-slaves=1 we are allowing NM
to activate some profiles for us, thus removing our ability to manage them (I need to verify this).

The next question is why this is a problem. The debug traces I attached to issue 4605 should help explain how this affects the bonding driver.

Thank you for your help.

@djlwilder
Author

Can you help me understand? If the profiles that we want active are active, and the profiles we want inactive are inactive, how is this a race in configure-ovs? If the observable state from configure-ovs matches what it wants, isn't the race in NetworkManager or some other place?

There are two things being taken care of here that would be at risk with this change:

* activation order: unless we follow a specific order, some unwanted profiles might be activated as well. We can only control the order via explicitly activating.

Any time NM has set an "activating" state, we may already have lost control over the activation order.

* ensuring things are active before configure-ovs quits

I agree

Unwanted profiles generally refers to profiles generated from kargs, or to leftovers from a previous boot if configure-ovs is told to use /etc/NetworkManager/system-connections instead of /run/NetworkManager/system-connections. The latter is not the default and is improbable because it is also undocumented, but the former might be a more common issue. So have you tested two reboots with the bond originally defined through kargs?

I have not tested with kargs, will do.

There are also some other considerations:

* what if the profile gets to `activating` state one instant after we check for it?

I agree, we have no control over how long NM will keep a profile in the activating state, so that is going to be a problem. Finding a solution that does not depend on the activating state would be ideal.

* if ovs-if-phys0 has auto-connect slaves set, meaning the slaves can be re-activated when ovs-if-phys0 is activated, wouldn't the same thing happen?

Are you saying if "nmcli con up ovs-if-phys0" is run at some point after configure-ovs is run? I definitely need to test this, although I only see it as an issue if we follow up by activating the slaves one at a time, as the script will do (see below for more on this).

Overall, this looks to me as not a race in configure-ovs so not something that we can absolutely fix in configure-ovs. If we are going to mitigate here, I would have two asks:

* Learning more about what is the actual race condition. You mentioned in the issue `These interfaces are not activated a second time leaving the state of the bond in an unpredictable state` talking about the slave profiles. What does this exactly mean? It doesn't make sense to me because that is what you are actually preventing in this PR: for the profiles to be activated a second time. We are actually making sure the profiles we need active are active in the order we want, to avoid...well you guessed it, other race conditions.

I hope this explains the problem better. In my example the problem was triggered because one *-slave-ovs-clone was explicitly activated and the other was not; this is due to the unpredictable delay between activating and activated. If the timing were such that neither or both were explicitly activated, the outcome would be different. (See also the note above.)

Here is another approach that might address your concerns:

  • I am not checking for state="activating", to avoid the problem of reading the state too early.
  • I follow the order of the profiles; for profiles that can have a state of "activating", just wait until the state is "activated".
    # Activate all interfaces that are not yet active.
    # NetworkManager will activate slave interfaces for us once the slave's
    # parent has been activated. So don't activate slaves, just wait for them to become
    # active. If a slave does not become active after a sufficient time, give up and move on.
    # TBD: this works with slave_type=bond, what about slave_type=team?

    for i in {1..10}; do
      echo "Attempt $i to bring up connection $conn"
  
      if $is_slave; then
        active_state=$(nmcli -g GENERAL.STATE conn show "$conn")
        if [ "$active_state" == "activated" ]; then
           s=0
           break
        fi
      else
           nmcli conn up "$conn" && s=0 && break || s=$?
      fi
      sleep 5
    done
    if [ $s -eq 0 ]; then
      echo "Brought up connection $conn successfully"
      if $is_slave; then
        master_interfaces["$master_interface"]=true
      fi
    elif ! $is_slave; then
      echo "ERROR: Cannot bring up connection $conn after $i attempts"
      return $s
    fi
    mod_nm_conn "$conn" connection.autoconnect yes
* crosschecking at the end that all profiles are activated

Ok, what should I do if some profiles don't get activated?

@jcaamano
Contributor

jcaamano commented Nov 25, 2024

The next question is why this is a problem. The debug traces I attached to issue 4605 should help explain how this affects the bonding driver.

I guess this is my question as well and thus my statement that the race must be somewhere else as activating a profile multiple times should not be a problem. However this is not a question we can answer, rather the RHEL team should. Anyway, I recognize the need to avoid the problem here as much as we can.

Anytime the NM has set a "activating" state we may have lost control over activation order already.

We are following the same order as the implicit activation order, except for the slaves, which are technically activated at the same time. We have observed races with that implicit activation in the past as well, for example: t0: slave 1 activating, slave 2 activating. t1: slave 1 active, slave 2 activating. t2: slave 1 active, slave 2 becomes active and triggers slave 1 re-activation. t3: slave 1 and 2 active but bad state somewhere.

With this I am saying that you might be fixing one race and introducing another one we already dealt with in the past. And that is why root causing this to RHEL becomes relevant.

Are you saying if "nmcli con up ovs-if-phys0" is run some point after configure-ovs is run? I definitely need to test this. Although I only see it as an issue if we follow up by activating the slaves one at a time as the script will do. (see below for more on this).

Ignore, I thought the slave was already activating before we got into activate_nm_connections. I understand better what is going on now.

Here is another approach that might address your concerns:

  • I am not checking for state="activating", to avoid the problem of reading the state too early.
  • I follow the order of the profiles; for profiles that can have a state of "activating", just wait until the state is "activated".

Ok, what should I do if some profiles don't get activated?

Would this work for you:

...
        master_interfaces["$master_interface"]=false
      fi
fi

# slaves should implicitly activate, give them a chance to do so
if $is_slave; then
    timeout 5s bash -c "while isNotActivated "$conn"; do sleep 1; done" || {
        echo "WARNING: slave $conn did not implicitly activate in 5s, activating explicitly..."
    }
fi

# Do not activate interfaces that are already active
...
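
The snippet above is an outline: `isNotActivated` is not defined anywhere in this thread, and a shell function called through `bash -c` would also have to be exported with `export -f` to be visible there. A self-contained variant that avoids the subshell could look like the sketch below, where `is_not_activated` and `wait_for_implicit_activation` are hypothetical names and the 5 second default mirrors the suggestion.

```shell
# Hypothetical helper: succeeds while the profile is NOT yet "activated".
is_not_activated() {
  local conn="$1" state
  state=$(nmcli -g GENERAL.STATE conn show "$conn" 2>/dev/null)
  [ "$state" != "activated" ]
}

# Give a slave profile up to $2 seconds (default 5) to activate implicitly.
# Returns 0 if it became activated in time, 1 if the wait timed out.
wait_for_implicit_activation() {
  local conn="$1" timeout_s="${2:-5}" waited=0
  while is_not_activated "$conn"; do
    if [ "$waited" -ge "$timeout_s" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Example (requires nmcli, so left commented out):
#   if $is_slave && ! wait_for_implicit_activation "$conn" 5; then
#     echo "WARNING: slave $conn did not implicitly activate in 5s, activating explicitly..."
#   fi
```

Polling once per second keeps the wait bounded, and the caller can still fall back to an explicit activation on timeout, as in the suggestion above.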

@djlwilder
Author

I guess this is my question as well and thus my statement that the race must be somewhere else as activating a profile multiple times should not be a problem. However this is not a question we can answer, rather the RHEL team should. Anyway, I recognize the need to avoid the problem here as much as we can.

I am all for finding the root cause. In this case I think the root cause is a limitation in the bonding driver coupled with how NetworkManager interacts with it. I can open a Bugzilla against RHEL for this. If you agree, I would like to continue down the path of avoiding the issue as long as we don't introduce more problems. I like your latest suggested change; it should prevent the problem I am seeing. Let me experiment with this and I will post code and results soon.

With bonded network configurations, slave interfaces are
implicitly activated after br-ex is explicitly activated. This
implicit activation can take several seconds; during this
time, if one and only one slave is explicitly activated, the bonding
driver may set the same MAC address on both slaves. This
causes the bond to fail when option fail_over_mac=follow is set.
This change gives bond slaves up to 5 seconds to implicitly
activate, preventing the issue.

Link: openshift#4605
Signed-off-by: David Wilder <[email protected]>
Contributor

openshift-ci bot commented Dec 9, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: djlwilder
Once this PR has been reviewed and has the lgtm label, please assign umohnani8 for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@djlwilder
Author

I updated my fork with the changes as we discussed and ran unit tests with bonding configured and fail_over_mac=follow. Platform: PowerVM

# oc version
Client Version: 4.16.20
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: 4.16.20
Kubernetes Version: v1.29.9+5865c5b

Unit tests:

  1. Configure bond with kargs:
    kargs:append=bond=bond0:enP32771p1s0,enP49154p1s0:mode=active-backup,fail_over_mac=follow ip=192.168.42.6::192.168.42.1:255.255.255.0::bond0:none:192.168.42.1
    Booted node, verified that the bond configured correctly and node became ready.

  2. Rebooted the node twice using kargs:
    No issues seen with either boot. Verified that the bond configured correctly and node became ready.

  3. Configure bond by injecting files to /etc/NetworkManager/system-connections

  • Manually add files
    $ ls /etc/NetworkManager/system-connections/
    bond0.nmconnection enP32771p1s0.nmconnection enP49154p1s0.nmconnection
  • Remove files from /run/NetworkManager/system-connections/
  • reboot node
    No issues seen. Verified that the bond configured correctly and node became ready.
  • Verified generated configuration:
    $ ls /run/NetworkManager/system-connections/
    bond0.nmconnection enP49154p1s0-slave-ovs-clone.nmconnection ovs-if-phys0.nmconnection
    br-ex.nmconnection enP49154p1s0.nmconnection ovs-port-br-ex.nmconnection
    enP32771p1s0-slave-ovs-clone.nmconnection lo.nmconnection ovs-port-phys0.nmconnection
    enP32771p1s0.nmconnection ovs-if-br-ex.nmconnection
  4. Missing slave test.
  • Removed one slave from the node's configuration (hypervisor configuration).
  • Booted node.
    No issues seen. Verified that the bond configured correctly with a single slave and node became ready.
  5. Bond fail-over test.
    Booted node, verified bond is configured and working.
$ cat /sys/class/net/bond0/bonding/slaves
enP32771p1s0 enP49154p1s0

$ cat /sys/class/net/bond0/bonding/active_slave 
enP32771p1s0

$ echo enP49154p1s0 >/sys/class/net/bond0/bonding/active_slave

$ cat /sys/class/net/bond0/bonding/active_slave
enP49154p1s0

Verified bond0 is functional and the slave MACs have swapped.
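
Since the failure mode discussed in this PR is both slaves ending up with the same MAC address, a check like the one below could accompany the fail-over test. It is a sketch only: `slaves_have_distinct_macs` is a hypothetical name, and the optional second argument (a sysfs root other than `/sys/class/net`) exists only so the function can be exercised without a real bond.

```shell
# Hypothetical check: fail if any two slaves of a bond share a MAC address.
# $1 = bond name, $2 = sysfs network root (defaults to /sys/class/net).
slaves_have_distinct_macs() {
  local bond="$1" sysfs="${2:-/sys/class/net}" slave mac seen=" "
  # The slaves file holds a space-separated list of interface names.
  for slave in $(cat "$sysfs/$bond/bonding/slaves"); do
    mac=$(cat "$sysfs/$slave/address")
    case "$seen" in
      *" $mac "*)
        echo "duplicate MAC $mac on $slave"
        return 1
        ;;
    esac
    seen="$seen$mac "
  done
  return 0
}

# Example on a node with a real bond (left commented out):
#   slaves_have_distinct_macs bond0 && echo "slave MACs are distinct"
```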

@djlwilder
Author

/retest-required

@djlwilder
Author

/retest

@djlwilder
Author

/test e2e-azure-ovn-upgrade-out-of-change

@prb112

prb112 commented Jan 2, 2025

/test e2e-gcp-op-ocl

@djlwilder
Author

/test e2e-gcp-op-ocl

Contributor

openshift-ci bot commented Jan 6, 2025

@djlwilder: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-gcp-op-ocl | 7671280 | link | false | /test e2e-gcp-op-ocl |

Full PR test history. Your PR dashboard.


@djlwilder
Author

I guess this is my question as well and thus my statement that the race must be somewhere else as activating a profile multiple times should not be a problem. However this is not a question we can answer, rather the RHEL team should. Anyway, I recognize the need to avoid the problem here as much as we can.

@jcaamano
I reproduced the problem outside of OpenShift on RHEL 9.6 and with the upstream net-next kernel. I filed this bug against RHEL 9.6:
RHEL-73616 - RHEL9.6: Using nmcli to activate (up) or deactivate (down) the active slaves breaks the bond.

Have you had a chance to look at my updated commit?

Thank you.

Labels
jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. ok-to-test Indicates a non-member PR verified by an org member that is safe to test.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Network bonding configuration not working with fail_over_mac=follow
7 participants