Devices failing to boot due to out-of-bound regression #428

Closed
crazoes opened this issue Aug 2, 2024 · 2 comments

@crazoes

crazoes commented Aug 2, 2024

[    4.004111] amdgpu 0000:04:00.0: amdgpu: Trusted Memory Zone (TMZ) feature enabled
[    4.011887] ================================================================================
[    4.020326] UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13
[    4.026856] shift exponent 32 is too large for 32-bit type 'long unsigned int'
[    4.034079] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 6.1.103-rc3 #1
[    4.035071] Hardware name: LENOVO Morphius/Morphius, BIOS Google_Morphius.13434.60.0 10/08/2020
[    4.035071] Call Trace:
[    4.035071]  ? show_stack+0x35/0x3b
[    4.035071]  dump_stack_lvl+0x4a/0x5c
[    4.035071]  dump_stack+0xd/0x10
[    4.035071]  ubsan_epilogue+0x8/0x2b
[    4.035071]  __ubsan_handle_shift_out_of_bounds.cold+0x59/0xf3
[    4.035071]  __roundup_pow_of_two+0x26/0x3a
[    4.035071]  amdgpu_vm_adjust_size.cold+0x29/0x21c
[    4.035071]  ? __lock_release.isra.0+0x5c/0x170
[    4.035071]  gmc_v9_0_sw_init+0x126/0x5c0
[    4.035071]  ? nbio_v7_0_vcn_doorbell_range+0x80/0x80
[    4.035071]  amdgpu_device_ip_init+0xa0/0x745
[    4.035071]  amdgpu_device_init.cold+0x4a5/0x908
[    4.035071]  amdgpu_driver_load_kms+0x13/0xf0
[    4.035071]  amdgpu_pci_probe+0xe4/0x320
[    4.035071]  pci_device_probe+0x96/0x110
[    4.035071]  ? sysfs_create_link+0x1d/0x40
[    4.035071]  really_probe+0xc6/0x250
[    4.035071]  ? _raw_spin_unlock_irq+0x1d/0x40
[    4.126667] tsc: Refined TSC clocksource calibration: 2595.125 MHz
[    4.035071]  ? pm_runtime_barrier+0x52/0x90
[    4.133388] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x25683efc55a, max_idle_ns: 440795294610 ns
[    4.035071]  __driver_probe_device+0x84/0xe0
[    4.035071]  ? lock_release+0x62/0x110
[    4.035071]  driver_probe_device+0x23/0x110
[    4.147575]  __driver_attach+0x9b/0x190
[    4.147575]  ? __device_attach_driver+0x120/0x120
[    4.147575]  bus_for_each_dev+0x66/0xa0
[    4.147575]  driver_attach+0x19/0x20
[    4.147575]  ? __device_attach_driver+0x120/0x120
[    4.147575]  bus_add_driver+0x16f/0x1d0
[    4.147575]  driver_register+0x7a/0xd0
[    4.147575]  ? drm_sched_fence_slab_init+0x2c/0x2c
[    4.147575]  __pci_register_driver+0x60/0x70
[    4.147575]  amdgpu_init+0x54/0x5e
[    4.147575]  do_one_initcall+0x65/0x260
[    4.147575]  do_initcalls+0xf4/0x112
[    4.147575]  kernel_init_freeable+0x162/0x193
[    4.147575]  ? rest_init+0x1d0/0x1d0
[    4.147575]  kernel_init+0x18/0x140
[    4.147575]  ? rest_init+0x1d0/0x1d0
[    4.147575]  ? schedule_tail_wrapper+0x9/0xc
[    4.147575]  ? rest_init+0x1d0/0x1d0
[    4.147575]  ret_from_fork+0x1c/0x28
[    4.234306] ================================================================================
[    4.242741] [drm] vm size is 262144 GB, 3 levels, block size is 9-bit, fragment size is 9-bit
[    4.251272] clocksource: Switched to clocksource tsc
[    4.251302] amdgpu 0000:04:00.0: amdgpu: VRAM: 64M 0x000000F400000000 - 0x000000F403FFFFFF (64M used)
[    4.265566] amdgpu 0000:04:00.0: amdgpu: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF
[    4.274009] amdgpu 0000:04:00.0: amdgpu: AGP: 267419648M 0x000000F800000000 - 0x0000FFFFFFFFFFFF
[    4.282841] [drm] Detected VRAM RAM=64M, BAR=256M
[    4.283212] kwatchdog (59) used greatest stack depth: 7180 bytes left
[    4.287559] [drm] RAM width 64bits DDR4
[    4.287967] [drm] amdgpu: 64M of VRAM memory ready

Many x86_64 and i386 devices are failing to boot on stable-rc kernels because of the regression shown above. This has been happening for a long time and needs to be investigated. Affected devices:

hp-14b-na0052xx-zork
lenovo-TPad-C13-Yoga-zork
asus-CM1400CXA-dalboz
hp-x360-14a-cb0001xx-zork
hp-11A-G6-EE-grunt
hp-14-db0003na-grunt
acer-R721T-grunt
lenovo-TPad-C13-Yoga-zork
hp-x360-14a-cb0001xx-zork
lenovo-TPad-C13-Yoga-zork
hp-x360-14a-cb0001xx-zork

More details are available on the Grafana dashboard under the stable-rc tree.
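
For anyone trying to read the UBSAN line in the log, here is a rough sketch of how a shift exponent of 32 can arise on a 32-bit `unsigned long`. It assumes that `__roundup_pow_of_two()` rounds up with an expression of the form `1UL << fls_long(n - 1)`. That is my reading of `include/linux/log2.h`, not something the log itself confirms. The helper names and the input value `2**36` are hypothetical and chosen only for illustration.

```python
# Sketch only: models the arithmetic, not the kernel code itself.
# Assumption: the round-up is computed as `1UL << fls_long(n - 1)`.

BITS_PER_LONG = 32  # the log reports a 32-bit 'long unsigned int'


def fls_long(x: int) -> int:
    """1-based position of the highest set bit; 0 when x is 0."""
    return x.bit_length()


def roundup_shift_exponent(n: int, bits: int = BITS_PER_LONG) -> int:
    """Exponent the assumed `1UL << fls_long(n - 1)` would shift by."""
    mask = (1 << bits) - 1
    n &= mask                        # truncation to an unsigned long of `bits` bits
    return fls_long((n - 1) & mask)  # n - 1 wraps around when n == 0


def shift_is_defined(n: int, bits: int = BITS_PER_LONG) -> bool:
    """A shift is only defined in C when the exponent is below the type width."""
    return roundup_shift_exponent(n, bits) < bits


# Hypothetical 64-bit value whose low 32 bits are all zero.
value = 2**36
exponent_32 = roundup_shift_exponent(value, bits=32)
exponent_64 = roundup_shift_exponent(value, bits=64)
```

With these assumptions, a value that truncates to 0 on a 32-bit build wraps to `0xFFFFFFFF` after the decrement, so the exponent equals the full width of the type, which is what UBSAN flags. The same value on a 64-bit `unsigned long` stays within range.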

@nuclearcat
Member

I guess this issue is resolved?

@musamaanjum

Closing, as I haven't seen this error for some time.
