[LNL] multiple cases failing to HDMI playback on SDW configurations with gpu_bind disabled #5098

Closed
kv2019i opened this issue Jul 8, 2024 · 22 comments
Assignees
Labels
bug: Something isn't working
display audio: Audio to HDMI/DisplayPort via HDA
LNL: Applies to Lunar Lake platform
P2: Critical bugs or normal features
SDW: Applies to SoundWire bus for codec connection

Comments

@kv2019i
Collaborator

kv2019i commented Jul 8, 2024

Some interaction with recently merged kernel and FW PRs has caused a high failure rate in PR testing:
https://sof-ci.01.org/sofpr/PR9116/build6350/devicetest/index.html

2024-07-08 13:40:30 UTC [REMOTE_INFO] ===== Testing: (Round: 1/1) (PCM: HDMI1 [hw:0,5]) (Loop: 1/1) =====
2024-07-08 13:40:30 UTC [REMOTE_COMMAND] aplay   -Dhw:0,5 -r 48000 -c 2 -f S16_LE -d 10 /dev/zero -v -q
aplay: set_params:1416: Unable to install hw params:

gpu_bind should be disabled in the sof-dev kernel, so I'm not sure why HDMI playback is attempted.

Related PRs merged recently:

As this is seen in PR testing, marking as P1.

@kv2019i kv2019i added bug Something isn't working P1 Blocker bugs or important features SDW Applies to SoundWire bus for codec connection display audio Audio to HDMI/DisplayPort via HDA LNL Applies to Lunar Lake platform labels Jul 8, 2024
@kv2019i
Collaborator Author

kv2019i commented Jul 8, 2024

FYI @lyakh

@marc-hb
Collaborator

marc-hb commented Jul 8, 2024

Observed in today's daily run https://sof-ci.ostc.intel.com/#/result/planresultdetail/43591?model=LNLM_SDW_AIOC&testcase=check-playback-all-formats on jf-lnlm-rvp-sdw-1

Other LNL configurations are indeed not affected.

[  161.355590] kernel: soundwire_cadence:cdns_init_clock_ctrl: soundwire_intel soundwire_intel.link.0: mclk 19200000 max 4800000 row 50 col 4
[  161.355635] kernel: soundwire_cadence:cdns_init_clock_ctrl: soundwire_intel soundwire_intel.link.3: mclk 19200000 max 4800000 row 50 col 4
[  161.355708] kernel: soundwire_bus:sdw_modify_slave_status: rt1316-sdca sdw:0:2:025d:1316:01: initializing enumeration and init completion for Slave 1
[  161.355718] kernel: soundwire_cadence:cdns_init_clock_ctrl: soundwire_intel soundwire_intel.link.2: mclk 19200000 max 4800000 row 50 col 4
[  161.356138] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: ASoC: error at snd_soc_dai_hw_params on iDisp1 Pin: -22
[  161.356276] kernel: snd_sof:sof_pcm_hw_free: sof-audio-pci-intel-lnl 0000:00:1f.3: pcm: free stream 5 dir 0
[  161.356561] kernel: snd_sof:sof_pcm_close: sof-audio-pci-intel-lnl 0000:00:1f.3: pcm: close stream 5 dir 0
[  161.357248] kernel: soundwire_cadence:cdns_update_slave_status_work: soundwire_intel soundwire_intel.link.0: Slave status change: 0x2
[  161.357268] kernel: soundwire_bus:sdw_handle_slave_status: soundwire sdw-master-0-0: Slave attached, programming device number
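For triage, the trailing `-22` in the `snd_soc_dai_hw_params` line is a negative errno; on Linux 22 is `EINVAL`. A minimal sketch of pulling that code out of such a log line follows. The helper names `parse_asoc_error` and `print_asoc_error` are hypothetical and not part of any kernel or SOF tooling.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: pull the trailing error code out of an ASoC
 * "error at ...: -NN" log line. Returns 0 and stores the (negative)
 * code in *code, or -1 if no number follows the last colon.
 */
int parse_asoc_error(const char *line, int *code)
{
	const char *colon = strrchr(line, ':');

	if (!colon || sscanf(colon + 1, "%d", code) != 1)
		return -1;
	return 0;
}

/* Print the code with its errno description, e.g. for triage notes. */
void print_asoc_error(const char *line)
{
	int code;

	if (parse_asoc_error(line, &code) == 0 && code < 0)
		printf("code %d: %s\n", code, strerror(-code));
	else
		printf("no error code found\n");
}
```

Calling `print_asoc_error()` on the `iDisp1 Pin: -22` line reports the code together with the C library's description of `EINVAL`.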

@marc-hb
Collaborator

marc-hb commented Jul 9, 2024

Could this be caused by some device-specific configuration? It did not happen on ba-lnlm-rvp-sdw-01 in the July 7th (planresultdetail/43565) and July 9th (planresultdetail/43642) daily tests. Also not in https://sof-ci.01.org/softestpr/PR1218/build625/devicetest/index.html

EDIT: failed on
ba-lnlm-rvp-sdw-03 in https://sof-ci.01.org/sofpr/PR9276/build6355/devicetest/index.html
https://sof-ci.01.org/softestpr/PR1218/build604/devicetest/index.html

jf-lnlm-rvp-sdw-1 in https://sof-ci.01.org/softestpr/PR1218/build600/devicetest/index.html

@ssavati

ssavati commented Jul 15, 2024

This issue is still reproducible on latest. Currently the workaround "NO_HDMI_MODE=true" is set in the device environment, so we are not seeing the issue in CI results.
cc: @kv2019i @plbossart @lgirdwood

@kv2019i
Collaborator Author

kv2019i commented Jul 16, 2024

I'll take a look at this, but FYI to @ujfalusi and @ranj063 in case we need to switch.

@kv2019i kv2019i assigned kv2019i and unassigned jsarha Jul 16, 2024
@ujfalusi
Collaborator

Only affecting LNL, TGL/MTL HDMI is working fine?

@ssavati

ssavati commented Jul 16, 2024

@ujfalusi this is not observed on MTL. I will check on TGL and update

@kv2019i
Collaborator Author

kv2019i commented Jul 16, 2024

I think @ujfalusi @bardliao @plbossart there's a problem in the sof_sdw mach driver handling the case where the display driver is not available and no HDMI PCMs are available:
Jul 08 13:41:39 kernel: snd_soc_sof_sdw:sof_card_dai_links_create: sof_sdw sof_sdw: sdw 5, ssp 0, dmic 0, hdmi 0, bt: 0

But topology has (as it should) the HDMI nodes:

Jul 08 13:41:39 kernel: snd_sof:sof_dai_load: sof-audio-pci-intel-lnl 0000:00:1f.3: tplg: load pcm HDMI1
Jul 08 13:41:39 kernel: snd_sof:sof_dai_load: sof-audio-pci-intel-lnl 0000:00:1f.3: tplg: load pcm HDMI2
Jul 08 13:41:39 kernel: snd_sof:sof_dai_load: sof-audio-pci-intel-lnl 0000:00:1f.3: tplg: load pcm HDMI3

@plbossart
Member

I don't understand how the display driver became unavailable?

@plbossart
Member

plbossart commented Jul 16, 2024

we have this in the configuration: https://sof-ci.ostc.intel.com/#/result/planresultdetail/43591?model=LNLM_SDW_AIOC&testcase=verify-kernel-boot-log

/sys/module/snd_hda_core/parameters/gpu_bind:0

why is this value cleared?

static int gpu_bind = -1;
module_param(gpu_bind, int, 0644);
MODULE_PARM_DESC(gpu_bind, "Whether to bind sound component to GPU "
			   "(1=always, 0=never, -1=on nomodeset(default))");

looks like a stale CI configuration to me, if we want to test HDMI this should not be cleared.
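To make the three `gpu_bind` values easier to check on a test device, here is a small user-space sketch that maps the value to the wording of the `MODULE_PARM_DESC` text above and reads it from the sysfs path quoted from the boot log. The function names `gpu_bind_describe` and `read_int_param` are made up for illustration; only the path and the value meanings come from this thread.

```c
#include <stdio.h>

/* Path taken from the boot-log check quoted above. */
#define GPU_BIND_PATH "/sys/module/snd_hda_core/parameters/gpu_bind"

/* Map a gpu_bind value to the wording used in MODULE_PARM_DESC. */
const char *gpu_bind_describe(int value)
{
	switch (value) {
	case 1:
		return "always";
	case 0:
		return "never";
	case -1:
		return "on nomodeset (default)";
	default:
		return "unknown";
	}
}

/*
 * Read an integer module parameter from a sysfs-style file.
 * Returns 0 on success, -1 if the file is missing or unparsable.
 */
int read_int_param(const char *path, int *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = fscanf(f, "%d", value) == 1;
	fclose(f);
	return ok ? 0 : -1;
}
```

On the affected device, `read_int_param(GPU_BIND_PATH, &v)` would be expected to yield 0, which `gpu_bind_describe()` maps to "never".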

@kv2019i
Collaborator Author

kv2019i commented Jul 16, 2024

@plbossart wrote:

I don't understand how the display driver became unavailable?

It wasn't available in sof-dev yet for this platform (not marked as stable yet in kernel --> this can be overridden in the device configuration -> let me go and check this particular device).

UPDATE: we still have commit 003bd60 in sof-dev, and we can't remove it until we pull in a stable version of the xe support or change the test device configurations to apply a force probe.

@ujfalusi
Collaborator

I think @ujfalusi @bardliao @plbossart there's a problem in the sof_sdw mach driver handling the case where the display driver is not available and no HDMI PCMs are available:
Jul 08 13:41:39 kernel: snd_soc_sof_sdw:sof_card_dai_links_create: sof_sdw sof_sdw: sdw 5, ssp 0, dmic 0, hdmi 0, bt: 0

But topology has (as it should) the HDMI nodes:

Jul 08 13:41:39 kernel: snd_sof:sof_dai_load: sof-audio-pci-intel-lnl 0000:00:1f.3: tplg: load pcm HDMI1
Jul 08 13:41:39 kernel: snd_sof:sof_dai_load: sof-audio-pci-intel-lnl 0000:00:1f.3: tplg: load pcm HDMI2
Jul 08 13:41:39 kernel: snd_sof:sof_dai_load: sof-audio-pci-intel-lnl 0000:00:1f.3: tplg: load pcm HDMI3

@kv2019i, we should have dummy links for the HDMI PCMs to probe. They will not work, but they need to be there to be able to load the topology.

@plbossart
Member

We do have dummy links, the problem is not the probe:

	for (i = 0; i < hdmi_num; i++) {
		char *name = devm_kasprintf(dev, GFP_KERNEL, "iDisp%d", i + 1);
		char *cpu_dai_name = devm_kasprintf(dev, GFP_KERNEL, "iDisp%d Pin", i + 1);
		char *codec_name, *codec_dai_name;

		if (intel_ctx->hdmi.idisp_codec) {
			codec_name = "ehdaudio0D2";
			codec_dai_name = devm_kasprintf(dev, GFP_KERNEL,
							"intel-hdmi-hifi%d", i + 1);
		} else {
			codec_name = "snd-soc-dummy";
			codec_dai_name = "snd-soc-dummy-dai";
		}

		ret = asoc_sdw_init_simple_dai_link(dev, *dai_links, be_id, name,
						    1, 0, // HDMI only supports playback
						    cpu_dai_name, platform_component->name,
						    ARRAY_SIZE(platform_component),
						    codec_name, codec_dai_name,
						    i == 0 ? sof_sdw_hdmi_init : NULL, NULL);
		if (ret)
			return ret;

		(*dai_links)++;
	}

It's the error on hw_params that needs to be root-caused.

@plbossart
Member

the debug log is misleading

	dev_dbg(dev, "sdw %d, ssp %d, dmic %d, hdmi %d, bt: %d\n",
		sdw_be_num, ssp_num, dmic_num,
		intel_ctx->hdmi.idisp_codec ? hdmi_num : 0, bt_num);

hdmi_num is 4 on TGL and 3 on all other devices, so we do create 3+ links.
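The mismatch between what the summary prints and what the loop creates can be modeled in a few lines of user-space C. This is only a sketch: `struct hdmi_ctx` and the three functions are hypothetical stand-ins, while the ternary and the codec names are copied from the driver snippets quoted in this thread.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant bit of the machine driver context. */
struct hdmi_ctx {
	bool idisp_codec;	/* true when the display codec was found */
};

/* Number of HDMI links actually created: independent of idisp_codec. */
int hdmi_links_created(const struct hdmi_ctx *ctx, int hdmi_num)
{
	(void)ctx;
	return hdmi_num;
}

/* Number printed by the dev_dbg() summary: 0 when the codec is absent. */
int hdmi_links_reported(const struct hdmi_ctx *ctx, int hdmi_num)
{
	return ctx->idisp_codec ? hdmi_num : 0;
}

/* Codec component name used for each link, as in the loop quoted earlier. */
const char *hdmi_codec_name(const struct hdmi_ctx *ctx)
{
	return ctx->idisp_codec ? "ehdaudio0D2" : "snd-soc-dummy";
}
```

With `idisp_codec` false and `hdmi_num` 3, the model reports 0 links while 3 are created against `snd-soc-dummy`, which matches the "hdmi 0" line in the log alongside the HDMI1..HDMI3 PCMs loaded from topology.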

@ujfalusi
Collaborator

The HDMI PCM never worked when there was no HDMI hardware; it has been like this with HDA devices as well. The hw_params fails because of the missing real DAI.

@kv2019i
Collaborator Author

kv2019i commented Jul 16, 2024

That's not true @ujfalusi, this has been working but was broken at some point. It seems some of the changes to the HDA DAI ops now return -EINVAL when the dummy codec driver is connected. This DID work in the past.

UPDATE: I can confirm this is broken on TGL as well if HDMI is disabled via codec_mask. This did work in the past, will bisect to see where this got broken.

@ujfalusi
Collaborator

@kv2019i, I'm not sure about the past, but now it is not working on TGL either:

[   32.290196] snd_soc_core:dpcm_be_dai_hw_params:  iDisp1: ASoC: hw_params BE iDisp1
[   32.290205] sof-audio-pci-intel-tgl 0000:00:1f.3: ASoC: error at snd_soc_dai_hw_params on iDisp1 Pin: -22
[   32.290212] snd_soc_core:dpcm_be_dai_hw_params:  HDMI1: ASoC: dpcm_be_dai_hw_params() failed at iDisp1 (-22)
[   32.290219] snd_soc_core:dpcm_fe_dai_hw_free:  HDMI1: ASoC: hw_free FE HDMI1

@kv2019i
Collaborator Author

kv2019i commented Jul 16, 2024

I'm sure about the past :) -- but this is not just for debug, this is an actual product config for HDA where the Intel GPU is disabled for one reason or another. Granted, most of these laptops use the non-SOF driver, but there are actual product configs with dmic (=SOF) and some other GPU, so this dummy codec construct must work!

@ujfalusi
Collaborator

@kv2019i, I trust your memory. It did not work on 18.09.2023: #4594 (comment)

Can this be the reason: #4659 ?
We don't register HDMI DAIs when there is no HDMI; before that PR we registered the DAIs multiple times (analog would also register the HDMI ones and HDMI would register the analog ones), causing warnings.

@kv2019i kv2019i changed the title from "[LNL] multiple cases failing to HDMI playback on SDW configurations" to "[LNL] multiple cases failing to HDMI playback on SDW configurations with gpu_bind disabled" Jul 16, 2024
@kv2019i kv2019i added P2 Critical bugs or normal features and removed P1 Blocker bugs or important features labels Jul 16, 2024
@kv2019i
Collaborator Author

kv2019i commented Jul 16, 2024

No, it's not #4659 -- this is probably older.

I'll lower the priority now, as this is not hit at card probe and normal applications will not open the HDMI PCM if no monitor is detected (and no monitor ever will be on these devices). So the remaining open question is the PulseAudio/PipeWire habit of opening the PCMs and doing a hw_params query. Maybe -EINVAL is OK for this case as well (and my memory really malfunctions here). If so, we can close this.

@kv2019i
Collaborator Author

kv2019i commented Jul 16, 2024

Tested with the upstream 6.8 kernel and pipewire 0.3.79 (versions used in 24.04 LTS): the -EINVAL errors at pipewire start are handled correctly and the rest of the audio functionality is OK. So I'll close this as works-as-expected, and we can track the test device configuration issues elsewhere.
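As a rough illustration of what "handled correctly" means for a client that probes every PCM, the sketch below counts usable PCMs and skips any that fail with -EINVAL. It is not PipeWire's actual logic: `probe_fn`, `count_usable_pcms` and `stub_probe` are hypothetical, and the stub only mimics the HDMI failure shown in the logs above.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: returns 0 or a negative errno, like a hw_params attempt. */
typedef int (*probe_fn)(const char *pcm_name);

/*
 * Probe each PCM and count the usable ones. -EINVAL marks a PCM as
 * unusable and is skipped; any other error aborts and is returned.
 */
int count_usable_pcms(const char *const *names, size_t n, probe_fn probe)
{
	int usable = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		int ret = probe(names[i]);

		if (ret == -EINVAL)
			continue;	/* e.g. HDMI PCM backed by the dummy codec */
		if (ret < 0)
			return ret;
		usable++;
	}
	return usable;
}

/* Stub for illustration: HDMI PCMs fail the way the logs above show. */
int stub_probe(const char *pcm_name)
{
	return strncmp(pcm_name, "HDMI", 4) == 0 ? -EINVAL : 0;
}
```

The design point is that a single -EINVAL on one PCM does not abort probing of the others, so the remaining audio endpoints stay available.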

@kv2019i kv2019i closed this as completed Jul 16, 2024

6 participants