[BUG][LNL] alsabat capture test sometimes fails when headset capture silent -> flaky MIC jack detect #9018

Closed
fredoh9 opened this issue Apr 10, 2024 · 35 comments
Assignees
Labels
bug Something isn't working as expected LNL Applies to Lunar Lake platform P1 Blocker bugs or important features
Milestone

Comments

@fredoh9
Contributor

fredoh9 commented Apr 10, 2024

CLOSED as DUPLICATE of thesofproject/linux#4681


Describe the bug
LNL L0 RT711 headset capture is sometimes silent, and the alsabat capture test then fails.
The platform is LNLM_SDW_AIOC, but the alsabat test exercises the RT711 codec only.
The alsabat playback test passes consistently, but the alsabat capture test shows a mix of passes and failures.

mtrace shows no significant difference between the good and fail cases.

Good case dmesg:

[ 4505.939899] kernel: snd_soc_rt711_sdca:rt711_sdca_headset_detect: rt711-sdca sdw:0:0:025d:0711:01: rt711_sdca_headset_detect, detected_mode=0x5
[ 4505.939930] kernel: snd_soc_rt711_sdca:rt711_sdca_jack_detect_handler: rt711-sdca sdw:0:0:025d:0711:01: in rt711_sdca_jack_detect_handler, jack_type=0x3

Fail case dmesg:

[ 4550.810855] kernel: snd_soc_rt711_sdca:rt711_sdca_headset_detect: rt711-sdca sdw:0:0:025d:0711:01: rt711_sdca_headset_detect, detected_mode=0x3
[ 4550.810886] kernel: snd_soc_rt711_sdca:rt711_sdca_jack_detect_handler: rt711-sdca sdw:0:0:025d:0711:01: in rt711_sdca_jack_detect_handler, jack_type=0x1
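
A minimal sketch of how the two dmesg cases above could be told apart automatically. It assumes the jack_type value is a bitmask using the kernel's SND_JACK_HEADPHONE (0x1) and SND_JACK_MICROPHONE (0x2) bits, which matches the 0x3 (good) and 0x1 (fail) values logged above. The function name mic_detected is hypothetical.

```python
import re

# Bit values as defined in the kernel's include/sound/jack.h.
SND_JACK_HEADPHONE = 0x1
SND_JACK_MICROPHONE = 0x2

# Matches the message part of the rt711_sdca_jack_detect_handler log line.
JACK_RE = re.compile(
    r"rt711_sdca_jack_detect_handler, jack_type=(0x[0-9a-fA-F]+)"
)


def mic_detected(dmesg_text):
    """Return one bool per jack_detect_handler line: True if the mic bit is set."""
    return [
        bool(int(m.group(1), 16) & SND_JACK_MICROPHONE)
        for m in JACK_RE.finditer(dmesg_text)
    ]
```

Feeding it the good line gives [True] and the fail line gives [False], so a test wrapper could skip or flag the capture check before looking at the recorded audio.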

To Reproduce
The test tone frequency doesn't matter.
TPLG=/lib/firmware/intel/sof-ipc4-tplg/sof-lnl-rt711-l0-rt1316-l23-rt714-l1.tplg MODEL=LNLM_SDW_AIOC SOF_TEST_INTERVAL=5 ~/sof-test/test-case/check-alsabat.sh -c hw:sofsoundwire,1 -p hw:CODEC,0 -C 2 -F 997

Reproduction Rate
Very easy to reproduce: one run fails and the next passes, or vice versa.

Expected behavior
Capture output should always contain the sine wave.

Screenshots or console output
captured wave files:
bat_wav_pass_and_fail.zip

dmesg snapshot:
lnl_dmesg_bad.txt
lnl_dmesg_good.txt

cc:

@fredoh9 fredoh9 added bug Something isn't working as expected LNL Applies to Lunar Lake platform labels Apr 10, 2024
@marc-hb marc-hb added the P1 Blocker bugs or important features label Apr 11, 2024
@lgirdwood lgirdwood added this to the v2.10 milestone Apr 16, 2024
@wszypelt

@fredoh9 Can I ask you for logs with the IPC Payloads?

@lgirdwood
Member

@fredoh9 what about other codecs ? Does it pass or fail i.e. is it codec specific ?

@pjdobrowolski
Contributor

Please check latest sof_fw.

@kv2019i
Collaborator

kv2019i commented Apr 23, 2024

@fredoh9 When is the last sighting (w.r.t. @pjdobrowolski 's question above)?

@kv2019i
Collaborator

kv2019i commented Apr 30, 2024

@pjdobrowolski @fredoh9 This is still seen, most recently in the 29th Apr daily test plan. It was also seen in PR CI runs on LNL today.

@fredoh9
Contributor Author

fredoh9 commented Apr 30, 2024

> @fredoh9 Can I ask you for logs with the IPC Payloads?

I will capture the log with IPC payloads.

> @fredoh9 what about other codecs ? Does it pass or fail i.e. is it codec specific ?

We don't have the problem with HDA codec.

@plbossart
Member

detected_mode=0x3 means the codec did not detect a capture mic so it's not really a surprise that the capture is silent...
We should log the 'Headset Mic Jack' control value in sof-tests, that would tell us if the test can work or not...
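
A minimal sketch of the logging idea above, assuming the control is read with something like `amixer -c sofsoundwire cget name='Headset Mic Jack'` and that the output has the usual amixer shape with a `: values=on` or `: values=off` line. The card name is taken from the reproduction command; the exact output layout and the function name jack_control_state are assumptions, so treat this as illustrative only.

```python
import re


def jack_control_state(amixer_output):
    """Parse `amixer cget` text for a boolean jack control.

    Returns True for 'on', False for 'off', and None when no value
    line is found (for example if the control does not exist).
    """
    m = re.search(r"^\s*:\s*values=(on|off)\s*$", amixer_output, re.MULTILINE)
    if m is None:
        return None
    return m.group(1) == "on"
```

A test script could log this value before starting the capture and report "no mic detected" instead of a generic alsabat failure when it is False.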

@fredoh9
Contributor Author

fredoh9 commented Apr 30, 2024

@wszypelt please find attached the dmesg with IPC payloads and the mtrace. These were captured when the test failed; the output wave contains only silence.

dmesg_ipc_payloads.txt
mtrace.txt

@marc-hb
Collaborator

marc-hb commented May 1, 2024

Another failure today (with logs and .wav file) in public https://sof-ci.01.org/sofpr/PR9013/build4396/devicetest/index.html

Many more failures and logs in daily tests, see a long list in internal issue 561. This is failing on at least one device (out of 3) in more than 50% of test runs.

Interestingly, when one device fails then it tends to fail ALL headset capture tests (or none)

@lgirdwood
Member

@fredoh9 @marc-hb what is the test capture device here - the fail capture WAV is all 0s. I would expect to see analog noise (i.e. LSBs toggling) if we capture silence rather than all 0s. i.e.

hexdump -C ~/Downloads/bat_wav_pass_and_fail\ \(1\)/bat.wav.fail 
00000000  52 49 46 46 44 62 05 00  57 41 56 45 66 6d 74 20  |RIFFDb..WAVEfmt |
00000010  10 00 00 00 01 00 02 00  80 bb 00 00 00 ee 02 00  |................|
00000020  04 00 10 00 64 61 74 61  20 62 05 00 00 00 00 00  |....data b......|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00056240  00 00 00 00 00 00 00 00  00 00 00 00              |............|
0005624c
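
A minimal sketch of the check described above: distinguishing "digital" silence (every sample exactly 0, as in the hexdump) from analog silence (LSBs toggling). It uses Python's standard wave module; the function name is_digital_silence is hypothetical.

```python
import wave


def is_digital_silence(path):
    """Return True if every PCM sample byte in the WAV file is zero."""
    with wave.open(path, "rb") as wav:
        while True:
            chunk = wav.readframes(4096)
            if not chunk:
                # Reached end of data without seeing a non-zero byte.
                return True
            if any(chunk):
                # At least one non-zero byte: noise or signal is present.
                return False
```

A True result on a failing capture would suggest that the ADC path was never delivering analog data, rather than that the input was merely quiet.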

@marc-hb
Collaborator

marc-hb commented May 1, 2024

Based on discussions in the older, longer and internal issue 561, the main suspect is jack detection. Does this help?

@lgirdwood
Member

> Based on discussions in the older, longer and internal issue 561, the main suspect is jack detection. Does this help?

Yep, jack detect could power off the ADCs and give 0s.

@plbossart
Member

@fredoh9 Just to double-check, are we using the same add-in cards for MTL and LNL RVPs?

I am not sure why jack detection seems to be flaky only on LNL, this is a codec functionality which doesn't really have any correlation with the SOC-side of things.

@fredoh9
Contributor Author

fredoh9 commented May 1, 2024

@plbossart yes, we are using same AIOC4.1 for MTL/LNL

I just remembered: the AIOC4.1 model is the same, but we were running out of cards for LNL, so I found a box of unused AIOC4.1 boards in the storage room and shared those. I can also try swapping the codec board between MTL and LNL.

@plbossart
Member

I think it's worth exploring if we are dealing with board-specific issues. If I am not mistaken we have zero issues with MTL+AIOC, so if we swap the boards and see a problem appearing on MTL then the board is the problem.

Also are we using the same dongles or plugs for the analog loopback?

@fredoh9
Contributor Author

fredoh9 commented May 1, 2024

same brand USB soundcard is used but other cables and Y splitter are varying, brand, length, quality etc.

@marc-hb
Collaborator

marc-hb commented May 1, 2024

> but other cables and Y splitter are varying, brand, length, quality etc.

If these are "randomly" and relatively evenly distributed across MTL and LNL then we can ignore them because the reproduction rate is very high on LNL (~ 30%) and absolutely 0% on MTL

@fredoh9
Contributor Author

fredoh9 commented May 1, 2024

I swapped the AIOC between jf-mtlp-rvp-sdw-7 and jf-lnlm-rvp-sdw-1

  • Before changing anything, I double-checked the alsabat test: only LNL alternates between pass and fail.
  • First, I swapped only the AIOC board => same failure, on LNL only.
  • Then I swapped the USB sound card and cables => same failure, on LNL only.

MTL passes very solidly but LNL does not. The jack connected is in RVP not the codec board. Not sure if there is a difference in the schematic; there should not be.

@plbossart
Member

@fredoh9 what do you mean by "The jack connected is in RVP not the codec board."

Does this mean we have analog wires going from the connector on RVP all the way to the codec board?
That's a recipe for failure indeed.

I've never seen this done, usually when there's an add-on board with a jack codec, it comes with its own connector.
In other words, the ONLY wires between the RVP and the add-on board are the 'digital' parts for HDaudio/SoundWire/I2C/I2S. Analog is off-limits on flex cables/connectors...

@bardliao what do you think?

@bardliao
Collaborator

bardliao commented May 3, 2024

@fredoh9 Can you try with options snd_soc_sof_sdw quirk=0x2? Use RT711_JD2 as the jack detection source. As I remember, AIOC4.1's JD source is RT711_JD2.

@plbossart
Member

Apparently we are NOT using the RT711 on the AIOC board, but the one on the RVP motherboard @bardliao

@bardliao
Collaborator

bardliao commented May 3, 2024

> Apparently we are NOT using the RT711 on the AIOC board, but the one on the RVP motherboard @bardliao

Wait, I think all LNL SDW in CI pool use external AIOC, no? If we use the on board RT711, why did @fredoh9 swap the AIOC for experiment?

@plbossart
Member

plbossart commented May 3, 2024

> > Apparently we are NOT using the RT711 on the AIOC board, but the one on the RVP motherboard @bardliao
>
> Wait, I think all LNL SDW in CI pool use external AIOC, no? If we use the on board RT711, why did @fredoh9 swap the AIOC for experiment?

Because we thought the RVP/AIOC connection could be a problem, but later we realized that the AIOC is irrelevant. It's the RVP that's the problem apparently. We could double-check this by removing the AIOC completely and see if we still have alsa-bat/jack detection issues.

@fredoh9
Contributor Author

fredoh9 commented May 3, 2024

I tried only the onboard rt711 with no AIOC connected at all and had the same problem. LNL fails consistently with or without the AIOC, which is good.

@plbossart
Member

Thanks @fredoh9 for this, now I think it makes sense. We can point to the LNL RVP as having jack detection issues. I guess we really need the selected_mode override that @shumingfan started in PR thesofproject/linux#4969

@marc-hb
Collaborator

marc-hb commented May 3, 2024

Because we're looking at jack detection issues, I had this outlandish validation idea of... walking to the lab and simply trying to manually plug and unplug various jacks into the jf-lnlm-rvp-sdw-1 board itself (NOT the AIOC) and see what happens. It turned out to be a good idea.

Long story short:

  • SW_HEADPHONE_INSERT detection seems 100% reliable
  • SW_MICROPHONE_INSERT detection is flaky

SW_MICROPHONE_INSERT is even LESS reliable with the splitter we use (see picture in thesofproject/linux#4681) but it's still not predictable even with the basic 3-rings and 4-rings headphone and headset that I tried.

In theory, with headset:

evtest

Event: time 1714768171.329763, type 5 (EV_SW), code 2 (SW_HEADPHONE_INSERT), value 1
Event: time 1714768171.329763, type 5 (EV_SW), code 4 (SW_MICROPHONE_INSERT), value 1
Event: time 1714768171.329763, -------------- SYN_REPORT ------------

with headphones:

Event: time 1714768266.928076, type 5 (EV_SW), code 2 (SW_HEADPHONE_INSERT), value 1
Event: time 1714768266.928076, -------------- SYN_REPORT ------------

In practice I get one or the other RANDOMLY with everything I tried to plug in.

Interestingly enough, in rare cases I also get a burst of these despite me not actually pressing any button:

Event: time 1714768134.485890, type 1 (EV_KEY), code 114 (KEY_VOLUMEDOWN), value 0
Event: time 1714768134.485890, -------------- SYN_REPORT ------------
Event: time 1714768134.693872, type 1 (EV_KEY), code 114 (KEY_VOLUMEDOWN), value 1
Event: time 1714768134.693872, -------------- SYN_REPORT ------------
Event: time 1714768134.693941, type 1 (EV_KEY), code 114 (KEY_VOLUMEDOWN), value 0
Event: time 1714768134.693941, -------------- SYN_REPORT ------------
Event: time 1714768134.901633, type 1 (EV_KEY), code 114 (KEY_VOLUMEDOWN), value 1
Event: time 1714768134.901633, -------------- SYN_REPORT ------------
Event: time 1714768134.901692, type 1 (EV_KEY), code 114 (KEY_VOLUMEDOWN), value 0
Event: time 1714768134.901692, -------------- SYN_REPORT ------------
Event: time 1714768135.109444, type 1 (EV_KEY), code 114 (KEY_VOLUMEDOWN), value 1
Event: time 1714768135.109444, -------------- SYN_REPORT ------------
Event: time 1714768135.110116, type 1 (EV_KEY), code 114 (KEY_VOLUMEDOWN), value 0
Event: time 1714768135.110116, -------------- SYN_REPORT ------------
Event: time 1714768135.314664, type 1 (EV_KEY), code 114 (KEY_VOLUMEDOWN), value 1
Event: time 1714768135.314664, -------------- SYN_REPORT ------------
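
A minimal sketch of how evtest output like the above could be summarized per plug event, so that repeated plug/unplug trials can be counted instead of read by eye. It groups EV_SW insert events between SYN_REPORT lines and ignores EV_KEY events. The function names classify and plug_reports are hypothetical.

```python
import re

# Captures the event type name, code name and value from an evtest line.
EVENT_RE = re.compile(r"type \d+ \((\w+)\), code \d+ \((\w+)\), value (\d+)")


def classify(inserted):
    """Name the detected jack type from a set of EV_SW code names."""
    hp = "SW_HEADPHONE_INSERT" in inserted
    mic = "SW_MICROPHONE_INSERT" in inserted
    if hp and mic:
        return "headset"
    if hp:
        return "headphones"
    if mic:
        return "microphone"
    return "other"


def plug_reports(evtest_text):
    """Return one label per SYN_REPORT that carried an insert (value 1) event."""
    reports, inserted = [], set()
    for line in evtest_text.splitlines():
        if "SYN_REPORT" in line:
            if inserted:
                reports.append(classify(inserted))
            inserted = set()
            continue
        m = EVENT_RE.search(line)
        if m and m.group(1) == "EV_SW" and m.group(3) == "1":
            inserted.add(m.group(2))
    return reports
```

With the "in theory" headset and headphones logs above, this yields "headset" and "headphones" respectively; a flaky setup would show a mix of both labels for the same physical plug.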

@marc-hb marc-hb changed the title [BUG][LNL] alsabat capture test sometimes fails when headset capture silent [BUG][LNL] alsabat capture test sometimes fails when headset capture silent -> flaky MIC jack detect May 3, 2024
@bardliao
Collaborator

bardliao commented May 6, 2024

@marc-hb @fredoh9 I thought the issue only happens with the splitter and not with a normal headset. Can you try different values of options snd_soc_sof_sdw quirk to see if any of them works? As far as I know, the JD source of the on-board rt711 on the MTL RVP is JD2_100K, and the AIOC4.1's JD source is JD2, but I don't know about the on-board rt711 on the LNL RVP.
In theory, detection should be reliable if you set options snd_soc_sof_sdw quirk=0x2 when you test with LNL RVP + AIOC.
And one of options snd_soc_sof_sdw quirk=0x1, 0x2, or 0x3 should work when you test with the LNL RVP on-board rt711.
Please test with a normal headset first, and switch to the splitter once the headset is detected reliably.
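
A minimal sketch for keeping track of which quirk value selects which jack detection source while trying them in turn. Only the 0x2 -> RT711_JD2 mapping is stated in this thread; the other names and the 4-bit mask are assumptions based on the kernel's rt711 and sof_sdw headers and should be checked against the kernel under test. The helper names are hypothetical.

```python
# Assumed mapping of the low quirk bits to RT711 jack detection sources.
# Only 0x2 -> RT711_JD2 is confirmed by the discussion above.
JD_SOURCES = {
    0x0: "RT711_JD_NULL",
    0x1: "RT711_JD1",
    0x2: "RT711_JD2",
    0x3: "RT711_JD2_100K",
}
JD_MASK = 0xF  # assumption: JD source lives in the lowest 4 bits


def jd_source(quirk):
    """Return the assumed JD source name selected by a quirk value."""
    return JD_SOURCES.get(quirk & JD_MASK, "unknown")


def modprobe_line(quirk):
    """Format the module option line as written in the comment above."""
    return f"options snd_soc_sof_sdw quirk={quirk:#x}"


for q in (0x1, 0x2, 0x3):
    print(modprobe_line(q), "->", jd_source(q))
```

Each printed option line would go into a modprobe configuration file, one value at a time, followed by a module reload or reboot before retesting.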

@fredoh9
Contributor Author

fredoh9 commented May 8, 2024

@bardliao @plbossart @marc-hb
I tried with 'options snd_soc_sof_sdw quirk=2'; in fact I tried all of 1, 2 and 3. I don't see any difference.
This doesn't need thesofproject/linux#4969, but just in case I tried with and without thesofproject/linux#4969: no difference.

@marc-hb

This comment was marked as off-topic.

@plbossart
Member

NOCODEC issues are tracked here @marc-hb #9123

This is a different setup since the loopback is at the SSP level, i.e. not dependent on any codec/jack

@kv2019i
Collaborator

kv2019i commented May 14, 2024

@fredoh9 @plbossart @marc-hb Agree there are two distinct issues here. Should we move this SDW one to kernel? Looking at the bat.wav captures of failing cases, this seems more like a codec issue.

The nocodec issue (#9123) seems more like a potential FW issue as the start of capture is clean, but then we have errors in captured PCM samples. This doesn't look like a codec issue at all. Should we move 4964 to FW and this SDW issue to kernel?
FYI @abonislawski @dnikodem on the potential FW issue above. See e.g. bat.wav in https://sof-ci.01.org/linuxpr/PR4899/build2589/devicetest/index.html?model=LNLM_RVP_NOCODEC&testcase=check-alsabat-headset-playback-821

@plbossart
Member

Agree with @kv2019i the two issues seem quite different and well partitioned between kernel/firmware.
I'll move #9123 to firmware.

@plbossart
Member

plbossart commented May 14, 2024

We already have a kernel issue for jack detection/AIOC, so I am closing this one. Let's use thesofproject/linux#4681, which was reported first.

@marc-hb
Collaborator

marc-hb commented May 17, 2024

> but it's still not predictable even with the basic 3-rings and 4-rings headphone and headset that I tried.

That was wrong, sorry: I was using some bad headphones! I plugged and unplugged various (good!) jacks hundreds of times again and the microphone detection is reliable EXCEPT when connecting a LINE OUT to the microphone. More at:
