-
Notifications
You must be signed in to change notification settings - Fork 132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ASoC: SOF: debug: show firmware/topology prefix/names #4902
ASoC: SOF: debug: show firmware/topology prefix/names #4902
Conversation
First step to remove use of kernel parameters Link: thesofproject/linux#4902 Link: thesofproject#1166 Signed-off-by: Pierre-Louis Bossart <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lovely
29b4652
to
58c830e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
an ask to add the information in debugfs to simplify CI scripts.
The TEST scripts, NOT the CI scripts.
= the scripts that EVERYONE should run, not the obscure Jenkins scripts or Github workflows no one knows about.
I mean this PR is not to please one particular and demanding user. It's useful for everyone interested in testing their code (= too few people?)
58c830e
to
c82788b
Compare
First step to remove use of kernel parameters Link: thesofproject/linux#4902 Link: thesofproject#1166 Signed-off-by: Pierre-Louis Bossart <[email protected]>
First step to remove use of kernel parameters Link: thesofproject/linux#4902 Link: thesofproject#1166 Signed-off-by: Pierre-Louis Bossart <[email protected]>
c82788b
to
39a5f1d
Compare
@ujfalusi This PR was at least useful to detect that we never had a default_lib_path for LNL, see second commit... |
@plbossart, one checkpatch issue should be addressed. |
The SOF driver has multiple profiles to select firmware/topology prefix/names depending on the platform and ipc_type, and each of those fields can be overridden with kernel parameters. This results in some cases in confusion on what configuration is actually used in a given test. We currently log the firmware and topology names in the kernel logs, but there's been an ask to add the information in debugfs to simplify test scripts used by developers and CI. This isn't meant to be a stable ABI used by apps, changes will be allowed as needed. Closes: thesofproject#3867 Signed-off-by: Pierre-Louis Bossart <[email protected]>
The commit cd6f2a2 ("ASoC: SOF: Intel: Set the default firmware library path for IPC4") added the default_lib_path field for all platforms, but this was missed when LunarLake was later introduced. Fixes: 64a63d9 ("ASoC: SOF: Intel: LNL: Add support for Lunarlake platform") Signed-off-by: Pierre-Louis Bossart <[email protected]>
5dd8b15
to
67e1b4f
Compare
Done. I didn't realize my post-commit hook was missing. |
First step to remove use of kernel parameters Link: thesofproject/linux#4902 Link: thesofproject#1166 Signed-off-by: Pierre-Louis Bossart <[email protected]>
First step to remove use of kernel parameters Link: thesofproject/linux#4902 Link: #1166 Signed-off-by: Pierre-Louis Bossart <[email protected]>
debugfs_create_str("fw_path", 0444, fw_profile, | ||
(char **)&plat_data->fw_filename_prefix); | ||
debugfs_create_str("fw_lib_path", 0444, fw_profile, | ||
(char **)&plat_data->fw_lib_prefix); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is page_fault_oops with fw_lib_path
in some configurations
Found by... testing!
root@jf-cml-rvp-sdw-3:~# cat /sys/kernel/debug/sof/fw_profile/fw_lib_path
Killed
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: #PF: supervisor read access in kernel mode
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: #PF: error_code(0x0000) - not-present page
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: PGD 0 P4D 0
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: Oops: 0000 [#5] PREEMPT SMP NOPTI
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: CPU: 8 PID: 2096 Comm: grep Tainted: G D 6.9.0-rc3-g0c12720aaa21 #dev
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: Hardware name: Intel Corporation CometLake Client Platform/CometLake U DDR4 HR RVP, BIOS CMLSFWR1.R00.1245.D00.1906>
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: RIP: 0010:strlen+0x4/0x30
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: Code: f7 75 ec 31 c0 c3 cc cc cc cc 48 89 f8 c3 cc cc cc cc 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 9>
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: RSP: 0018:ffffaa4b438c7e28 EFLAGS: 00010246
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: RAX: ffff8d2750c389b8 RBX: ffff8d2743c6b800 RCX: 0000000000000006
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: RDX: 0000000000000007 RSI: 0000558abe473000 RDI: 0000000000000000
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: RBP: ffff8d2743c6b800 R08: 0000000000000000 R09: 0000000000000000
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: R13: 0000000000018000 R14: ffffaa4b438c7f08 R15: ffff8d275f0ce8f8
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: FS: 00007f942669e740(0000) GS:ffff8d2893600000(0000) knlGS:0000000000000000
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: CR2: 0000000000000000 CR3: 000000013bdf6004 CR4: 00000000003706f0
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: Call Trace:
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: <TASK>
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: ? __die+0x24/0x70
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: ? page_fault_oops+0x15a/0x450
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: ? exc_page_fault+0x68/0x1e0
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: ? asm_exc_page_fault+0x26/0x30
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: ? strlen+0x4/0x30
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: debugfs_read_file_str+0x4d/0xf0
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: vfs_read+0xab/0x340
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: ? __do_sys_newfstatat+0x35/0x60
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: ksys_read+0x69/0xf0
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: do_syscall_64+0xad/0x1b0
Apr 22 23:33:23 jf-cml-rvp-sdw-3 kernel: entry_SYSCALL_64_after_hwframe+0x72/0x7a
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a CML platform....
I can't recall is fw_lib_path is relevant for IPC3 @ujfalusi, we could just skip the file create for IPC3 I guess.
At any rate the scripts only used this for IPC4 platforms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I confirm the oops happened only with the stable-v2.2 firmware branch. But it did happen with ALL 3 devices tested for that branch.
Because of various sudo issues (hopefully last fix in thesofproject/sof-test#1178) fw_lib_path
had never been tested yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, fw_lib_path is going to be NULL with IPC3, probably we can have have a static const "N/A" string here and give that to debugfs_create_str()
?
Or skip creating the file in case of IPC3, it is not applicable after all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marc-hb what's the preferred result for scripts? A N/A reply or the file not created at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either can be scripted easily but no file at all is safer because there's less risk of confusing it with something valid:
https://en.wikipedia.org/wiki/Null_pointer a.k.a. the "billion dollar mistake".
Note we have no script trying to read this brand new field yet (= how I just found this bug)
We currently log the firmware and topology names in the kernel logs, but there's been an ask to add the information in debugfs to simplify CI scripts.
This isn't meant to be a stable ABI used by apps, changes will be allowed as needed.
Closes: #3867