-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
lib: wait for fw loaded before starting any test #1033
Conversation
typo in title, staring -> starting. |
case-lib/config.sh
Outdated
# Set MAX Polling time to check FW Loading. If FW is already loaded, it will | ||
# return immediately. Default value is set to 60 seconds, because i915 driver | ||
# timeout is 60 seconds. | ||
MAX_WAIT_FW_LOADING=${MAX_WAIT_FW_LOADING:-60} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If i915 driver is 60 seconds, shouldn't this be 65 seconds?
Also: can you please add at least one reference for more context? Maybe some of these:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If i915 driver is 60 seconds, shouldn't this be 65 seconds?
Good point, I think about this too. But there are some other delays already applied top of this. so it should have plenty of room
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But there are some other delays already applied top of this
Once we merge this then we should get rid of those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also: this is only a maximum. The test will start as soon as the firmware has booted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, I will update default value to 70. if you plan to remove all extra delays, I need some extra margin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@plbossart should we add a counter and print a warning when waiting more than 10s?
Note we already discussed this in test-specific #862. There we set a 10s maximum 1 year ago and it has never spuriously failed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
couple of comments below @fredoh9, nice work indeed.
case-lib/config.sh
Outdated
# Set MAX Polling time to check FW Loading. If FW is already loaded, it will | ||
# return immediately. Default value is set to 60 seconds, because i915 driver | ||
# timeout is 60 seconds. | ||
MAX_WAIT_FW_LOADING=${MAX_WAIT_FW_LOADING:-60} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the relationship between DSP firmware load and i915? is this that we first request the i915 modules before downloading the firmware?
and if yes, should we use the same routine as for the kmod remove/modprobe test where we have an iteration that stops on BOOT_COMPLETE.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@plbossart should we add a counter and print a warning when waiting more than 10s?
I think you missed this question
case-lib/config.sh
Outdated
# Set MAX Polling time to check FW Loading. If FW is already loaded, it will | ||
# return immediately. Default value is set to 60 seconds, because i915 driver | ||
# timeout is 60 seconds. | ||
MAX_WAIT_FW_LOADING=${MAX_WAIT_FW_LOADING:-60} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But there are some other delays already applied top of this
Once we merge this then we should get rid of those.
case-lib/config.sh
Outdated
# Set MAX Polling time to check FW Loading. If FW is already loaded, it will | ||
# return immediately. Default value is set to 60 seconds, because i915 driver | ||
# timeout is 60 seconds. | ||
MAX_WAIT_FW_LOADING=${MAX_WAIT_FW_LOADING:-60} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because it's the "last chance" TIMEOUT in our CI it's very long, how long Fred?. Moreover we power-cycle the device after such a CI TIMEOUT. When combined with another unrelated issue #979, the whole test run ends up in a very time-consuming tail spin on that device. |
TIMEOUT value is dependent on each test case. we used to set 2-4 times higher value of normal runtime. For an example, check-suspend-resume-50.sh has 6000 second. TIMEOUT has become an expensive way to waste resources. |
This is going to check whether the sof-test can be started after boot/reboot. If FW LOADING is not a prerequisite for the test, this can be skipped, by defining NO_POLL_FW_LOADING=true Another benefit with this change, if FW is not loaded it will be failed instead of TIMEOUT. Signed-off-by: Fred Oh <[email protected]>
00e538a
to
6a6b7a2
Compare
@plbossart @marc-hb I can print out warning every 10 sec. instead of zero output for MAX wait time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels good but incomplete. Before merging this I would like to see a good analysis -- do some tests actually need that variable right now or can it be done later?
@paulstelian97 this will help a lot in scenarios where the GPU driver is missing or failing and audio it timing out and where other issues make the situation much worse. I just unresolved the comment where I asked to provide some links for more context; you could start with #979. |
That would be great. I think we could merge this and then add these warnings later? |
yes, let's merge this and see. If we need anything improvement, I will address them together |
I will monitor upcoming the test results |
I think @lyakh found an issue with this new feature: kernel logs seem to be missing when the firmware does not boot. Example: https://sof-ci.01.org/sofpr/PR7584/build7585/devicetest/index.html In jenkins logs generic_test/13429/
This is very likely because the new code runs "too early". |
This is going to check whether the sof-test can be started after boot/reboot.
If FW LOADING is not a prerequisite for the test, this can be skipped, by defining NO_POLL_FW_LOADING=true
Another benefit with this change, if FW is not loaded it will be failed instead of TIMEOUT.