Add multipath-tools to openSUSE CI container #2353
Comments
Makes sense, yes.
@aafeijoo-suse @mwilck It seems quite a few tests are now failing after c08ae40, and only on the openSUSE container. The Test 03 failure is perhaps the best starting point for understanding why simply installing multipath-tools makes a test fail.
I would consider that a test artifact. I have strong evidence that booting openSUSE with multipath works reliably in practice.
Thanks for helping out, @mwilck. The failing test cases pass if the multipath dracut module is explicitly omitted (or if the systemd module is omitted). I think these failures are not about whether multipath works when it is expected to (the happy path); rather, multipath gets involved even when it is not expected to be in use, and that introduces some kind of side effect. Test 03 is not specifically meant to test the multipath dracut module.

Here is the link to the openSUSE run - https://github.com/dracutdevs/dracut/actions/runs/5098827739/jobs/9166201077

These tests are passing on Fedora (and on Arch under similar conditions) - https://github.com/dracutdevs/dracut/actions/runs/5098827739/jobs/9166200202

One difference I have spotted between a passing and a failing run is that on Fedora (the passing case) there does not seem to be an /etc/multipath.conf file. This is the line that is only in fedora but not in openSUSE

I also noticed that in the Fedora container the "/etc/multipath/" directory exists, but in openSUSE it does not. In addition, the openSUSE container seems to include 56-multipath.rules, but Fedora does not.

At this point my guess is that this is somehow about systemd service dependencies, as omitting systemd but including multipath on the openSUSE container also resolves the issue.

Just to state the obvious: it could very well be that Fedora (and Arch) and the tests have the bugs, and openSUSE is acting as expected.
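The container differences listed above could also be checked mechanically. A minimal Python sketch, assuming each container's file listing has already been collected into a set of paths; the helper name `diff_paths` and the example sets are hypothetical and contain only the two differences mentioned in this comment:

```python
def diff_paths(passing, failing):
    """Return the paths present in only one of the two listings, sorted."""
    return {
        "only_in_passing": sorted(passing - failing),
        "only_in_failing": sorted(failing - passing),
    }


# Illustrative data only: the differences reported in this thread,
# not a real listing of either container.
fedora = {"/etc/multipath/"}
opensuse = {"56-multipath.rules"}

result = diff_paths(fedora, opensuse)
print(result)
```

Running this over full listings from both containers might surface further multipath-related differences beyond the ones spotted by hand.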
As an additional data point: a PR to disable the multipath dracut module by default (only include it if it is explicitly needed) seems to make all CI tests pass on openSUSE (including tests 3, 13, 14) - #2382
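For local reproduction, the same idea (leave multipath out unless a test needs it) can be expressed when building the dracut command line. A rough sketch, assuming dracut's `--omit` option; the helper name `dracut_args`, the `needs_multipath` flag, and the image path are hypothetical and are not taken from #2382:

```python
def dracut_args(image, needs_multipath=False):
    """Build a dracut argv that omits the multipath module unless requested."""
    args = ["dracut"]
    if not needs_multipath:
        # --omit takes the names of dracut modules to leave out of the initramfs.
        args += ["--omit", "multipath"]
    args.append(image)
    return args


# Hypothetical image path, for illustration only.
print(dracut_args("/tmp/initramfs.test"))
```

Only a test that explicitly needs multipath would then pass `needs_multipath=True`.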
The problem may be related to the fact that this test features a separate file system for /usr.
I am not sure how the block device layout is supposed to be. But the above log shows that two dm devices holding the file systems
We can see that multipathd is being stopped before the
introduced by 3c244c7. Perhaps we should try removing this directive (for testing only) and see if it improves matters. The idea of 3c244c7 (stop multipath before cleaning up the udev db) is not entirely wrong, but note that multipathd never writes to the udev db and thus can't corrupt it. If anything, udevd itself should be terminated before the cleanup service runs.

OTOH, the message "Starting Cleaning Up and Shutting Down Daemons" occurs only after the timeout. So it's rather unlikely that this was causing multipathd to be stopped. But if it wasn't that, what else?
Test 13 and Test 14 do not have /usr on a separate filesystem, and yet a similar issue is observed with those tests.
Related PR: #2499 Maybe for now we can just disable the multipath module for these failing tests.
I noticed that the multipath module is not tested on the openSUSE CI container because multipath-tools is not installed.
Installing multipath-tools into the container seemed to actually expose some potential issues and dracut CI test failures (see #2258).
@mwilck @aafeijoo-suse Would you support adding multipath-tools to the openSUSE CI container?