[BUG] arecord TGLU_SKU0A32_SDCA ########+| MAX #3766
Comments
nor clear to me what the issue is. The PARITY error happens from time to time, it's not fatal. |
Please always share test run ID for reference: 13893?model=TGLU_SKU0A32_SDCA&testcase=check-pause-resume-capture-100 Very similar failure in 13892?model=TGLU_SKU0A32_SDCA&testcase=check-pause-resume-capture-10. I'm not sure the "parity error" is related to the failure, the user space logs at this second link look the same but without any "parity error". |
I was trying to figure out the failure from yesterday also. PARITY error is NOT the root cause of the failure. |
The failure is in the expect script, which is why the error is hard to pinpoint. The expect script triggers the pause/resume command by sending a spacebar key. I suspect pause/resume fails, but I don't see an obvious FW log either. Trying to debug more. |
The problem is easy to reproduce with the script: when capturing on hw:0,4, the first pause works but resume fails. I cannot reproduce the problem manually, even though I tried the same order. |
Timings then? Near impossible to reproduce |
thanks @marc-hb.
Don't know how to fix this yet. Will move this to sof-test repo. |
While the script should handle MAX better, why is MAX suddenly OK? This test never failed like this before so this really looks like a real bug and recent regression to me. It's not even a short duration MAX, it lasts for a while. Did you listen at the capture? cc: @plbossart PS: @aiChaoSONG found an |
There is a huge pop at the beginning of hw:0,1. For 3 seconds, some 'MAX' output was detected.
100% DC only with hw:0,4. Due to 100% DC, arecord output is always MAX. |
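A minimal sketch of how the "100% DC" and MAX observations could be checked offline on a recorded file, assuming a 16-bit PCM WAV such as the one produced by `arecord -f S16_LE`. The function name `analyze_capture` and the report keys are hypothetical, not part of sof-test.

```python
import array
import sys
import wave


def analyze_capture(path):
    """Return, per channel, the normalized DC offset and the ratio of full-scale samples.

    Hypothetical helper: expects a 16-bit PCM WAV file.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        channels = wav.getnchannels()
        raw = wav.readframes(wav.getnframes())

    samples = array.array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian

    report = []
    for ch in range(channels):
        data = samples[ch::channels]  # de-interleave one channel
        if len(data) == 0:
            report.append({"dc_offset": 0.0, "clipped_ratio": 0.0})
            continue
        mean = sum(data) / len(data)
        clipped = sum(1 for s in data if s >= 32767 or s <= -32768)
        report.append({
            "dc_offset": mean / 32768,             # close to +/-1.0 means stuck at a rail
            "clipped_ratio": clipped / len(data),  # 1.0 means every sample is full scale
        })
    return report
```

A channel with `clipped_ratio` near 1.0 and `dc_offset` near 1.0 would match the "always MAX" behavior described for hw:0,4, while a short pop would only show up as a small `clipped_ratio`.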
UPDATE: the git versions below are IRRELEVANT. The issue has been narrowed down to ALSA configuration differences. Last known good daily test run ID 13859 Start Time: 2022-07-10 21:30:24 UTC
First bad daily test run ID 13983 Start Time: 2022-07-12 01:34:48 UTC
Only 4 SOF commits difference:
About 60 Linux commits difference:
|
This also started happening with the SOF v2.2 release branch so this is very likely a recent kernel regression @plbossart Same failure in internal daily v2.2 run 13929?model=TGLU_SKU0A32_SDCA&testcase=check-pause-resume-capture-10 and other v2.2 |
I tried with daily build 2022-07-10 (below has HASHes), I have same issues with MAX volume.
Hard to tell the connection, but I replaced the NVMe SSD 7/8 (Fri), It may be something to do with the SSD upgrade and leave the laptop without the back cover? |
jf-tglu-sku0a32-sdca-01 works fine; both jf-tglu-sku0a32-sdca-02 and jf-tglu-sku0a32-sdca-03 have the MAX volume issue. I see some differences. The first one has Ubuntu 20.04 and alsa version 1.2.2. The latter two have Ubuntu 22.04 and alsa version 1.2.6. I see that some of the alsa contents are different. I can't store/restore alsa settings from Ubuntu 20.04 to Ubuntu 22.04. Trying to match them as closely as possible. |
For the failing devices, I updated the alsa settings with this, using jf-tglu-sku0a32-sdca-01 as a reference. Capture then works fine on both devices. |
It seems that this issue is related to
But the question is that why does pausing |
This question is already answered. When playback starts, the volume is 100% due to high gain. The expect script doesn't know what to do with this output. With thesofproject/sof-test#931, the error message is more readable. |
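For illustration, a minimal sketch of how a script could recognize the MAX readings instead of failing obscurely. It assumes the VU-meter text looks like the line in the issue title (`########+| MAX`) with a percentage in place of `MAX` otherwise. The function names and the `min_ratio` threshold are hypothetical, and this is not how the sof-test expect script is actually written.

```python
import re

# Matches the peak field after the meter bar, e.g. "| 07%" or "| MAX".
VU_PEAK_RE = re.compile(r"\|\s*(MAX|\d{1,3}%)")


def peak_levels(output):
    """Extract peak readings from VU-meter text; 'MAX' is mapped to 100."""
    levels = []
    for match in VU_PEAK_RE.finditer(output):
        token = match.group(1)
        levels.append(100 if token == "MAX" else int(token.rstrip("%")))
    return levels


def is_saturated(output, min_ratio=0.5):
    """Report saturation when at least min_ratio of the readings are MAX."""
    levels = peak_levels(output)
    if not levels:
        return False
    maxed = sum(1 for level in levels if level >= 100)
    return maxed / len(levels) >= min_ratio
```

Requiring a ratio rather than a single MAX reading is one way to tell a brief pop apart from a capture that stays at MAX "for a while".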
I will repeat my question: Why on earth do we care about volume settings when doing the pause_push/pause_release transitions? This has nothing to do with mixer changes, which can happen both when the stream is running or paused. ALSA provides different 'streaming' and 'control' interfaces. This test is only about the former. |
I compared the alsa contents and double-checked the capture output from jf-tglu-sku0a32-sdca-01 (working, Ubuntu 20.04). This is the wave capture from hw:0,4. Note that the left and right channel gains differ, but there is no saturation. $ arecord -D hw:0,4 -r 48000 -c 2 -f S16_LE -d 10 sdw4_dut01_capture.wav
For hw:0,1, the Capture Switch was off (false), so only silence was recorded.
|
@fredoh9 so now that we've established that it's an arecord problem (pause is irrelevant), can you dump the rt714 control settings in the Ubuntu 20.04 and Ubuntu 22.04 cases? One possible change is that UCM provides default boot values, which may be the difference. See https://github.com/alsa-project/alsa-ucm-conf/blob/master/ucm2/codecs/rt715-sdca/init.conf This sets the FU02 values while you look at FU06. Could be a red herring or not... |
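One way to compare the two dumps is sketched below. It assumes each dump is the text printed by `amixer contents` (an identifier line starting with `numid=`, followed by a `: values=` line) and ignores `numid`, which may differ between installs. The helper names are hypothetical.

```python
def parse_amixer_contents(text):
    """Map each control identifier (without numid) to its 'values=' string."""
    controls = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("numid="):
            # Drop "numid=N," so the same control matches across systems.
            current = stripped.split(",", 1)[1] if "," in stripped else stripped
        elif stripped.startswith(": values=") and current is not None:
            controls[current] = stripped[len(": values="):]
    return controls


def diff_controls(good_text, bad_text):
    """Return {control: (good_value, bad_value)} for controls that differ."""
    good = parse_amixer_contents(good_text)
    bad = parse_amixer_contents(bad_text)
    return {
        name: (good.get(name), bad.get(name))
        for name in sorted(set(good) | set(bad))
        if good.get(name) != bad.get(name)
    }
```

Feeding it the dump from the working Ubuntu 20.04 device and the dump from a failing Ubuntu 22.04 device would list only the controls whose values differ, or that exist on one side only (reported as `None`).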
The value of 124 used for FU02 is way too much, it lead to instant saturation. The 0dB value (47) is a much better initialization value. BugLink: thesofproject/linux#3766 Signed-off-by: Pierre-Louis Bossart <[email protected]>
The value of 124 used for FU02 is way too much, with saturation even in low-volume cases. The 0dB value (47) is a much better initialization value. BugLink: thesofproject/linux#3766 Signed-off-by: Pierre-Louis Bossart <[email protected]>
I tested with 47 in a relatively quiet office, at a normal distance for a laptop user. The captured waveform looks good. |
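To get a feel for how far 124 sits above the 0 dB point, here is a rough sketch. The 0 dB raw value of 47 comes from the commit message above. The step size of 0.375 dB per raw unit is an ASSUMPTION that this thread does not state, so check the control's dB scale before relying on the numbers.

```python
ZERO_DB_RAW = 47   # raw value for 0 dB, as stated in the commit message
STEP_DB = 0.375    # ASSUMPTION: dB per raw step, not stated in this thread


def raw_to_db(raw, zero_db_raw=ZERO_DB_RAW, step_db=STEP_DB):
    """Convert a raw control value to dB relative to the 0 dB point."""
    return (raw - zero_db_raw) * step_db


def db_to_linear(db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10 ** (db / 20)
```

Under that assumed step, `raw_to_db(124)` is 28.875 dB above the 0 dB setting, which would be consistent with saturation even at low input volume.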
CI detected this issue on TGLU_SKU0A3E_SDW. I think this is because @fredoh9 replaced the NVMe to solve https://github.com/intel-innersource/drivers.audio.ci.sof-framework/issues/233 and installed Ubuntu 22.04 LTS on jf-tglu-sku0a3e-sdw-01, so some default amixer values may have changed as a result.
I checked the UCM2, the volume is set to 63 in init.conf:
|
The value of 124 used for FU02 is way too much, with saturation even in low-volume cases. The 0dB value (47) is a much better initialization value. BugLink: thesofproject/linux#3766 Resolves: #193 Signed-off-by: Pierre-Louis Bossart <[email protected]> Signed-off-by: Jaroslav Kysela <[email protected]>
@keqiaozhang You should not manually change anything. Run 'alsactl init', reboot and see if the problem still exists with default settings. If we still have an issue, report it in a different issue and we will change the settings. We should not fix things quickly without a trace of what the issue was. |
In addition to my previous remark, the TGLU_SKU0A3E_SDW device has nothing to do with TGLU_SKU0A32_SDCA @keqiaozhang. Let's track issues separately please. |
Let's track the rt711 issue in #3804 |
@keqiaozhang can we close this issue for RT715 now that the proper gain was set? |
All TGLU_SKU0A32_SDCA devices have proper gain now. We haven't seen the same issue for a long time. Closing the issue. |
@marc-hb is this with IPC3/stable2.2 or IPC4/main? We have a known topology issue where the IIR is not included in some paths, the implementation is inconsistent between topology1 and topology2. |
Isn't TGLU_SKU0A32_SDCA a real product? I mean, not a development platform. |
yeah, but it's TGL and no one complained in the last 3 years.... We can't fix everything. |
ok, will close as "wontfix" as soon as thesofproject/sof-test#1222 is merged. Just for the record, some commands useful for interactively testing this:
|
linux #3750 -> sof-test -> #3766
UPDATED SUMMARY by @marc-hb: this is a plain `arecord` issue. Reproduction does not require any test code and it does not require pause/resume. It was found by sheer luck when an upgrade from Ubuntu 20.04 to 22.04 turned on a capture setting. It was found in pause/resume testing because no other test looks at what `arecord` captures.
Original description below.
Describe the bug
We observed this issue in CI. This codec error happens when testing pause/resume related cases. At first, I suspected an upstream regression: this issue can be reproduced on 2 TGLU_SKU0A32_SDCA devices (`jf-tglu-sku0a32-sdca-02` and `jf-tglu-sku0a32-sdca-03`) with a 100% reproduction rate when testing pause/resume on `hw:0,4`. But I didn't see this issue on `jf-tglu-sku0a32-sdca-01`. Since we recently replaced the NVMe for `jf-tglu-sku0a32-sdca-02` and `jf-tglu-sku0a32-sdca-03`, I'm not sure whether it's a hardware-specific issue. I will do further debugging.
To Reproduce
~/sof-test/test-case/check-pause-resume.sh -c 100 -m capture
Reproduction Rate
100%
Environment
Screenshots or console output
dmesg
dmesg.txt
logger.txt