Fix veth device move #368

Merged
3 commits merged on May 15, 2024

Conversation

jtluka
Collaborator

@jtluka jtluka commented May 9, 2024

Description

Resolves an issue when moving a veth device to a namespace. The device will
have _nl_link_update populated with data from the generic Device creation
(when the device appears in the namespace). This stale data may cause issues
for certain device operations, such as setting the link up.

To avoid this, clear _nl_link_update while remapping the device.

Fixes #367
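
The idea described above can be illustrated with a minimal sketch. It is not the LNST implementation: `SketchDevice`, its `remap` and `up` methods, and the dict shape of `_nl_link_update` are all assumptions made for illustration. Only the attribute name `_nl_link_update` comes from this PR.

```python
class SketchDevice:
    """Hypothetical stand-in for a device object; not the LNST Device class."""

    def __init__(self, ifindex, name):
        self.ifindex = ifindex
        self.name = name
        # Assumed shape: a dict of link attributes queued at creation time.
        self._nl_link_update = {"ifname": name, "state": "down"}

    def remap(self, new_ifindex):
        """Rebind the object to the interface index seen in the new namespace."""
        self.ifindex = new_ifindex
        # Drop whatever was queued during generic creation so that a later
        # operation does not replay stale data.
        self._nl_link_update = {}

    def up(self):
        """Return the attributes a link-up request would carry."""
        request = dict(self._nl_link_update)
        request["state"] = "up"
        return request


dev = SketchDevice(7, "lveth0")
dev.remap(3)
link_up_request = dev.up()
```

In this sketch, `link_up_request` carries only the `state` attribute after the remap. Without the reset in `remap`, the queued `ifname` entry would be sent along with the link-up request.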

Tests

Tested with the reproducer from the issue in both the container setup and the normal setup.

I scheduled a test run in RH labs that includes a single run of each recipe type (Beaker job ID J:9261245).

Reviews

@olichtne

jtluka added 2 commits May 9, 2024 15:45
For a veth device pair in particular, when the first device of the pair is
moved to a netns, the peer name is generated from the device name with an
additional index suffix, which is incorrect. For example, for the device pair
{lveth0, peer_lveth0}, when lveth0 is moved to a namespace, the peer is
set to peer_lveth00.

To fix this, we can pass the peer name directly in the remap_device
call.

Signed-off-by: Jan Tluka <[email protected]>
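
A small sketch of the difference between deriving the peer name and passing it explicitly. The functions `derive_peer_name` and `remap_device_sketch`, their parameters, and the exact derivation rule are hypothetical. The commit only states that an index suffix is appended and that the peer name is now passed in the remap_device call, so the real signature may differ.

```python
def derive_peer_name(dev_name, index):
    """Hypothetical derivation that appends an index suffix to the name."""
    return f"peer_{dev_name}{index}"


def remap_device_sketch(devices, dev_name, peer_name=None, index=0):
    """Record a moved device, preferring an explicitly supplied peer name."""
    if peer_name is None:
        # Fallback path: this is where the wrong name would come from.
        peer_name = derive_peer_name(dev_name, index)
    devices[dev_name] = {"peer_name": peer_name}
    return devices[dev_name]


derived = remap_device_sketch({}, "lveth0")
explicit = remap_device_sketch({}, "lveth0", peer_name="peer_lveth0")
```

Under these assumptions, `derived["peer_name"]` is `peer_lveth00`, while `explicit["peer_name"]` keeps the real peer name `peer_lveth0`.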

Resolves an issue when moving a veth device to a namespace. The device will
have _nl_link_update populated with data from the generic Device creation
(when the device appears in the namespace). This stale data may cause issues
for certain device operations, such as setting the link up.

To avoid this, clear _nl_link_update while remapping the device.

Fixes LNST-project#367

Signed-off-by: Jan Tluka <[email protected]>
@jtluka
Collaborator Author

jtluka commented May 9, 2024

I hit an issue when running SRIOVNetnsTcRecipe:

LNST Controller crashed with an exception:
Traceback (most recent call last):
  File "/mnt/tests/data.lnst.anl.eng.rdu2.dc.redhat.com/data-server-content/gitlab-tasks/beaker-lnst-tasks/master.tar.gz/lnst/test-runner/test-runner/./do-my-test", line 35, in main
    ctl.run(recipe, multimatch=bool(params.get("MULTIMATCH", False)))
  File "/root/virtualenvs/rhextensions-lnst-Xo1BSm3a-py3.9/lib/python3.9/site-packages/lnst/Controller/Controller.py", line 172, in run
    recipe.test()
  File "/root/rhextensions-lnst/lnst/RHExtensions/RHRecipeMixin.py", line 109, in test
    super(RHRecipeMixin, self).test()
  File "/root/virtualenvs/rhextensions-lnst-Xo1BSm3a-py3.9/lib/python3.9/site-packages/lnst/Recipes/ENRT/BaseEnrtRecipe.py", line 207, in test
    with self._test_wide_context() as main_config:
  File "/usr/lib64/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/root/virtualenvs/rhextensions-lnst-Xo1BSm3a-py3.9/lib/python3.9/site-packages/lnst/Recipes/ENRT/BaseEnrtRecipe.py", line 214, in _test_wide_context
    config = self.test_wide_configuration()
  File "/root/virtualenvs/rhextensions-lnst-Xo1BSm3a-py3.9/lib/python3.9/site-packages/lnst/Recipes/ENRT/BaseSRIOVNetnsTcRecipe.py", line 170, in test_wide_configuration
    host.newns.vf_eth0 = vf_dev
  File "/root/virtualenvs/rhextensions-lnst-Xo1BSm3a-py3.9/lib/python3.9/site-packages/lnst/Controller/Namespace.py", line 221, in __setattr__
    if not self._custom_setattr(name, value):
  File "/root/virtualenvs/rhextensions-lnst-Xo1BSm3a-py3.9/lib/python3.9/site-packages/lnst/Controller/Namespace.py", line 170, in _custom_setattr
    self._machine.remote_device_set_netns(value, self, old_ns)
  File "/root/virtualenvs/rhextensions-lnst-Xo1BSm3a-py3.9/lib/python3.9/site-packages/lnst/Controller/Machine.py", line 138, in remote_device_set_netns
    if dev.peer_name:
  File "/root/virtualenvs/rhextensions-lnst-Xo1BSm3a-py3.9/lib/python3.9/site-packages/lnst/Devices/RemoteDevice.py", line 129, in __getattr__
    attr = getattr(self._dev_cls, name)
AttributeError: type object 'Device' has no attribute 'peer_name'

I will add a try/except block to catch this.
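
A minimal sketch of such a guard, assuming the attribute lookup is delegated to the device class as the last traceback frame suggests. `GenericDevice`, `VethLikeDevice`, `RemoteProxy`, and `peer_name_or_none` are hypothetical names used only for illustration; they are not the LNST classes, and the merged fix may look different.

```python
class GenericDevice:
    """Hypothetical device class with no peer_name attribute."""


class VethLikeDevice:
    """Hypothetical device class that does define peer_name."""
    peer_name = "peer_lveth0"


class RemoteProxy:
    """Stand-in proxy: unknown attributes are looked up on the device class."""

    def __init__(self, dev_cls):
        self._dev_cls = dev_cls

    def __getattr__(self, name):
        # Called only when normal lookup fails. Raises AttributeError when
        # the device class lacks the attribute, as in the traceback.
        return getattr(self._dev_cls, name)


def peer_name_or_none(dev):
    """Return dev.peer_name, or None for devices that have no such attribute."""
    try:
        return dev.peer_name
    except AttributeError:
        return None


generic_peer = peer_name_or_none(RemoteProxy(GenericDevice))
veth_peer = peer_name_or_none(RemoteProxy(VethLikeDevice))
```

With this guard, a check like `if peer_name_or_none(dev):` simply skips devices that have no peer instead of crashing the controller.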

Collaborator

@olichtne olichtne left a comment

The exception you see for the SRIOV recipe is weird: it should still be a RemoteDevice, and peer_name is a normal class property that should always be present. At worst it should return None.

Otherwise this looks good, so ack to merge if the tests pass for Phil.

@SirPhuttel
Contributor

@jtluka, @olichtne, the fix works for me, thanks for the quick help!

@jtluka jtluka merged commit 3246484 into LNST-project:master May 15, 2024
3 checks passed
@jtluka jtluka deleted the fix-veth-device-move branch May 15, 2024 12:34
Development

Successfully merging this pull request may close these issues.

Exception after moving both veth peers into netns
3 participants