This repository was archived by the owner on Aug 5, 2022. It is now read-only.

BXT: i915: GPU HANG followed by full crash after suspend #9

@jlaitine

On the Joule compute module, after suspending with
echo freeze > /sys/power/state
the device crashes shortly after it wakes up from suspend.

The kernel version is 4.9.27-intel-pk-standard
The xf86-video-intel version is xf86-video-intel/2_2.99.917

This is the crash dump:

[   84.844432] [drm] GPU HANG: ecode 9:0:0xfffffffe, reason: Hang on render ring, action: reset
[   84.853874] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[   84.864454] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[   84.874444] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[   84.885309] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[   84.895397] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[   84.902889] drm/i915: Resetting chip after gpu hang
[   84.910547] BUG: unable to handle kernel NULL pointer dereference at 0000000000000070
[   84.919385] IP: [<ffffffffa022e4c2>] reset_common_ring+0xa2/0x130 [i915]
[   84.926941] PGD 0 [   84.928998] 
[   84.930670] Oops: 0000 [#1] PREEMPT SMP
[   84.934971] Modules linked in: intel_ipu4_isys_mod_bxtB0(O) videobuf2_v4l2 videobuf2_core intel_ipu4_psys_mod_bxtB0(O) int
el_ipu4_mmu_bxtB0(O) intel_ipu4_mod_bxtB0(O) iova intel_ipu4_acpi(O) videobuf2_dma_contig videobuf2_memops videobuf_core dw97
14(O) crlmodule(O) v4l2_common videodev media rfcomm usb_f_mtp usb_f_ecm u_ether usb_f_acm u_serial libcomposite configfs snd
_soc_wm8998 extcon_arizona snd_soc_arizona arizona_micsupp extcon_core snd_soc_core snd_compress ac97_bus arizona_ldo1 gpio_a
rizona iptable_nat nf_nat_ipv4 nf_nat bnep iptable_mangle snd_hda_codec_hdmi mei_spd gpio_keys intel_rapl x86_pkg_temp_therma
l intel_powerclamp coretemp efivars clk_wcove typec_wcove arc4 gpio_wcove iwlmvm(O) mac80211(O) pwm_lpss_pci pwm_lpss btusb b
trtl btbcm iwlwifi(O) spi_pxa2xx_platform snd_hda_intel cfg80211(O) snd_hda_codec snd_hda_core compat(O) snd_pcm i915 fdp_i2c
 fdp i2c_designware_platform i2c_designware_core nci mei_me snd_timer processor_thermal_device dwc3_pci nfc mei intel_soc_dts
_iosf at24 bq25890_charger atmel_mxt_ts nvmem_core hci_uart btintel int3400_thermal acpi_thermal_rel video int3403_thermal in
t340x_thermal_zone soc_button_array nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables nf_conntrack_ipv4 nf_defrag_i
pv4 xt_tcpudp xt_conntrack nf_conntrack iptable_filter ip_tables x_tables uio arizona_i2c 5xx_comms_leds(O)
[   85.068404] CPU: 0 PID: 472 Comm: kworker/0:2 Tainted: G           O    4.9.27-intel-pk-standard #1
[   85.078561] Hardware name: Intel Corp. 570x DVT2/SDS, BIOS GTPP1H3A.X64.0143.B30.1706022158 06/02/2017
[   85.089030] Workqueue: events_long i915_hangcheck_elapsed [i915]
[   85.095779] task: ffff880179cba340 task.stack: ffffc900007f8000
[   85.102424] RIP: 0010:[<ffffffffa022e4c2>]  [<ffffffffa022e4c2>] reset_common_ring+0xa2/0x130 [i915]
[   85.112706] RSP: 0018:ffffc900007fbb30  EFLAGS: 00010246
[   85.118668] RAX: 0000000000000000 RBX: ffff88016aaa8500 RCX: 0000000080000006
[   85.126674] RDX: 0000000000003fd8 RSI: ffff880178d06000 RDI: ffff88017a3a0200
[   85.134681] RBP: ffffc900007fbb48 R08: 0000000000000017 R09: ffffc90010001000
[   85.142692] R10: 0000000000000000 R11: ffff88017b161800 R12: ffff880179c9a000
[   85.150704] R13: 0000000000000000 R14: ffffffff819130f0 R15: ffff880179c9a000
[   85.158718] FS:  0000000000000000(0000) GS:ffff88017fc00000(0000) knlGS:0000000000000000
[   85.167805] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   85.174274] CR2: 0000000000000070 CR3: 0000000002e07000 CR4: 00000000003406f0
[   85.182303] Stack:
[   85.184555]  ffff880178932800 ffff880179ac27d8 ffff88016aaa8500 ffffc900007fbbc0
[   85.192887]  ffffffffa021a9b8 ffff880179ac06e8 0000000000000286 0000000000000286
[   85.201219]  ffffffffa0238a78 ffff880179ac0000 0000000000000000 ffff880179ac0000
[   85.209551] Call Trace:
[   85.212311]  [<ffffffffa021a9b8>] i915_gem_reset+0x148/0x3b0 [i915]
[   85.219364]  [<ffffffffa0238a78>] ? intel_uncore_forcewake_put+0x48/0x60 [i915]
[   85.227575]  [<ffffffff819130f0>] ? bit_wait_io_timeout+0x70/0x70
[   85.234425]  [<ffffffffa01dd29c>] i915_reset+0xdc/0x170 [i915]
[   85.240978]  [<ffffffffa01e274d>] i915_reset_and_wakeup+0x13d/0x150 [i915]
[   85.248711]  [<ffffffffa01e63b6>] i915_handle_error+0x206/0x220 [i915]
[   85.256042]  [<ffffffff8140878d>] ? scnprintf+0x3d/0x70
[   85.261923]  [<ffffffffa022ca0c>] hangcheck_declare_hang+0xcc/0xe0 [i915]
[   85.269562]  [<ffffffffa022bf64>] ? intel_engine_get_active_head+0xb4/0xe0 [i915]
[   85.277982]  [<ffffffffa022cca9>] i915_hangcheck_elapsed+0x289/0x2b0 [i915]
[   85.285800]  [<ffffffff81094dce>] process_one_work+0x1de/0x4c0
[   85.292348]  [<ffffffff810950f8>] worker_thread+0x48/0x4e0
[   85.298506]  [<ffffffff810950b0>] ? process_one_work+0x4c0/0x4c0
[   85.305270]  [<ffffffff8109a257>] kthread+0xd7/0xf0
[   85.310745]  [<ffffffff8109a180>] ? kthread_park+0x60/0x60
[   85.316905]  [<ffffffff81917652>] ret_from_fork+0x22/0x30
[   85.322965] Code: 8b 83 80 00 00 00 c7 40 3c ff ff ff ff 48 8b bb 80 00 00 00 e8 80 36 00 00 8b 05 0a 55 0b 00 85 c0 75 71
 4d 8b ac 24 58 02 00 00 <49> 8b 45 70 48 39 43 70 74 50 4d 85 ed 74 13 48 c7 c0 a0 81 58 
[   85.344442] RIP  [<ffffffffa022e4c2>] reset_common_ring+0xa2/0x130 [i915]
[   85.352081]  RSP <ffffc900007fbb30>
[   85.355984] CR2: 0000000000000070
[   85.373716] ---[ end trace d4d4d62e81cbe6bc ]---
[   85.382625] BUG: unable to handle kernel paging request at ffffffffffffffd8
[   85.390467] IP: [<ffffffff8109acb1>] kthread_data+0x11/0x20
[   85.396736] PGD 2e08067 [   85.399378] PUD 2e0a067 
PMD 0 [   85.402813] 
[   85.404484] Oops: 0000 [#2] PREEMPT SMP
[   85.408786] Modules linked in: intel_ipu4_isys_mod_bxtB0(O) videobuf2_v4l2 videobuf2_core intel_ipu4_psys_mod_bxtB0(O) int
el_ipu4_mmu_bxtB0(O) intel_ipu4_mod_bxtB0(O) iova intel_ipu4_acpi(O) videobuf2_dma_contig videobuf2_memops videobuf_core dw97
14(O) crlmodule(O) v4l2_common videodev media rfcomm usb_f_mtp usb_f_ecm u_ether usb_f_acm u_serial libcomposite configfs snd
_soc_wm8998 extcon_arizona snd_soc_arizona arizona_micsupp extcon_core snd_soc_core snd_compress ac97_bus arizona_ldo1 gpio_a
rizona iptable_nat nf_nat_ipv4 nf_nat bnep iptable_mangle snd_hda_codec_hdmi mei_spd gpio_keys intel_rapl x86_pkg_temp_therma
l intel_powerclamp coretemp efivars clk_wcove typec_wcove arc4 gpio_wcove iwlmvm(O) mac80211(O) pwm_lpss_pci pwm_lpss btusb b
trtl btbcm iwlwifi(O) spi_pxa2xx_platform snd_hda_intel cfg80211(O) snd_hda_codec snd_hda_core compat(O) snd_pcm i915 fdp_i2c
 fdp i2c_designware_platform i2c_designware_core nci mei_me snd_timer processor_thermal_device dwc3_pci nfc mei intel_soc_dts
_iosf at24 bq25890_charger atmel_mxt_ts nvmem_core hci_uart btintel int3400_thermal acpi_thermal_rel video int3403_thermal in
t340x_thermal_zone soc_button_array nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables nf_conntrack_ipv4 nf_defrag_i
pv4 xt_tcpudp xt_conntrack nf_conntrack iptable_filter ip_tables x_tables uio arizona_i2c 5xx_comms_leds(O)
[   85.542261] CPU: 0 PID: 472 Comm: kworker/0:2 Tainted: G      D    O    4.9.27-intel-pk-standard #1
[   85.552419] Hardware name: Intel Corp. 570x DVT2/SDS, BIOS GTPP1H3A.X64.0143.B30.1706022158 06/02/2017
[   85.562877] task: ffff880179cba340 task.stack: ffffc900007f8000
[   85.569521] RIP: 0010:[<ffffffff8109acb1>]  [<ffffffff8109acb1>] kthread_data+0x11/0x20
[   85.578514] RSP: 0018:ffffc900007fbe68  EFLAGS: 00010002
[   85.584474] RAX: 0000000000000000 RBX: ffff88017fc17500 RCX: 0000000000000000
[   85.592483] RDX: ffff88017b005000 RSI: ffff880179cba3c0 RDI: ffff880179cba340
[   85.600492] RBP: ffffc900007fbe70 R08: 0000000000000000 R09: 0000000000000000
[   85.608501] R10: 0000000000000000 R11: ffff880179cba3c0 R12: ffff880179cba340
[   85.616501] R13: 0000000000000000 R14: ffff880179cba808 R15: 0000000000017500
[   85.624511] FS:  0000000000000000(0000) GS:ffff88017fc00000(0000) knlGS:0000000000000000
[   85.633595] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   85.640044] CR2: 0000000000000028 CR3: 000000017abd3000 CR4: 00000000003406f0
[   85.648054] Stack:
[   85.650306]  ffffffff8109602e ffffc900007fbec8 ffffffff819125be ffffc900007fbee0
[   85.658651]  ffffffff8107ead6 0000000000000000 0000000000000000 ffff880179cba340
[   85.666993]  ffffc900007fbf10 ffffc900007fbb20 0000000000000000 0000000000000009
[   85.675334] Call Trace:
[   85.678075]  [<ffffffff8109602e>] ? wq_worker_sleeping+0xe/0x80
[   85.684721]  [<ffffffff819125be>] __schedule+0x35e/0x5a0
[   85.690683]  [<ffffffff8107ead6>] ? release_task+0x2d6/0x3c0
[   85.697035]  [<ffffffff810a59e8>] do_task_dead+0x38/0x40
[   85.702996]  [<ffffffff8108033f>] do_exit+0x79f/0xb00
[   85.708657]  [<ffffffff819187d7>] rewind_stack_do_exit+0x17/0x20
[   85.715400] Code: 80 04 00 00 48 c7 c7 a8 7d bc 81 e8 7a 0f fe ff eb ca 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 87 68
 04 00 00 55 48 89 e5 5d <48> 8b 40 d8 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 
[   85.737001] RIP  [<ffffffff8109acb1>] kthread_data+0x11/0x20
[   85.743361]  RSP <ffffc900007fbe68>
[   85.747292] CR2: ffffffffffffffd8
[   85.751010] ---[ end trace d4d4d62e81cbe6bd ]---
[   85.756160] Fixing recursive fault but reboot is needed!
[   85.762094] BUG: scheduling while atomic: kworker/0:2/472/0x00000003
[   85.769194] Modules linked in: intel_ipu4_isys_mod_bxtB0(O) videobuf2_v4l2 videobuf2_core intel_ipu4_psys_mod_bxtB0(O) int
el_ipu4_mmu_bxtB0(O) intel_ipu4_mod_bxtB0(O) iova intel_ipu4_acpi(O) videobuf2_dma_contig videobuf2_memops videobuf_core dw97
14(O) crlmodule(O) v4l2_common videodev media rfcomm usb_f_mtp usb_f_ecm u_ether usb_f_acm u_serial libcomposite configfs snd
_soc_wm8998 extcon_arizona snd_soc_arizona arizona_micsupp extcon_core snd_soc_core snd_compress ac97_bus arizona_ldo1 gpio_a
rizona iptable_nat nf_nat_ipv4 nf_nat bnep iptable_mangle snd_hda_codec_hdmi mei_spd gpio_keys intel_rapl x86_pkg_temp_therma
l intel_powerclamp coretemp efivars clk_wcove typec_wcove arc4 gpio_wcove iwlmvm(O) mac80211(O) pwm_lpss_pci pwm_lpss btusb b
trtl btbcm iwlwifi(O) spi_pxa2xx_platform snd_hda_intel cfg80211(O) snd_hda_codec snd_hda_core compat(O) snd_pcm i915 fdp_i2c
 fdp i2c_designware_platform i2c_designware_core nci mei_me snd_timer processor_thermal_device dwc3_pci nfc mei intel_soc_dts
_iosf at24 bq25890_charger atmel_mxt_ts nvmem_core hci_uart btintel int3400_thermal acpi_thermal_rel video int3403_thermal in
t340x_thermal_zone soc_button_array nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables nf_conntrack_ipv4 nf_defrag_i
pv4 xt_tcpudp xt_conntrack nf_conntrack iptable_filter ip_tables x_tables uio arizona_i2c 5xx_comms_leds(O)
[   85.901747] CPU: 0 PID: 472 Comm: kworker/0:2 Tainted: G      D    O    4.9.27-intel-pk-standard #1
[   85.911860] Hardware name: Intel Corp. 570x DVT2/SDS, BIOS GTPP1H3A.X64.0143.B30.1706022158 06/02/2017
[   85.922271]  ffffc900007fbe60 ffffffff813fc6da ffff88017fc17500 ffff880179cba340
[   85.930552]  ffffc900007fbe70 ffffffff810a0cbf ffffc900007fbec8 ffffffff81912654
[   85.938830]  ffffc900007fbee0 ffffffff81149e01 0000000000000008 ffffc900007fbef0
[   85.947120] Call Trace:
[   85.949851]  [<ffffffff813fc6da>] dump_stack+0x4d/0x63
[   85.955592]  [<ffffffff810a0cbf>] __schedule_bug+0x4f/0x70
[   85.961721]  [<ffffffff81912654>] __schedule+0x3f4/0x5a0
[   85.967657]  [<ffffffff81149e01>] ? printk+0x48/0x50
[   85.973203]  [<ffffffff8191283d>] schedule+0x3d/0x90
[   85.978747]  [<ffffffff810804da>] do_exit+0x93a/0xb00
[   85.984388]  [<ffffffff819187d7>] rewind_stack_do_exit+0x17/0x20
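
For triage, the dump above boils down to a few "BUG:" lines and the "IP:" lines that follow them. A minimal sketch for pulling those pairs out of a saved log is shown here. It assumes the 4.9-style oops format seen in this report; the function name `summarize_oops` and the `"vmlinux"` label for symbols without a module suffix are illustrative choices, not part of any kernel tooling.

```python
import re

# Matches e.g. "IP: [<ffffffffa022e4c2>] reset_common_ring+0xa2/0x130 [i915]".
# "RIP: 0010:[<...>]" lines do not match because "0010:" follows "IP: ".
IP_RE = re.compile(
    r"IP: \[<([0-9a-f]+)>\] (\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?"
)
BUG_RE = re.compile(r"BUG: (.+)$")


def summarize_oops(log_text):
    """Return (bug message, faulting symbol, module) for each BUG/IP pair.

    A BUG line with no following IP line (such as "scheduling while
    atomic") is skipped in this sketch.
    """
    results = []
    pending_bug = None
    for line in log_text.splitlines():
        bug = BUG_RE.search(line)
        if bug:
            pending_bug = bug.group(1).strip()
            continue
        ip = IP_RE.search(line)
        if ip and pending_bug is not None:
            # "vmlinux" is a placeholder label for built-in kernel symbols.
            results.append((pending_bug, ip.group(2), ip.group(3) or "vmlinux"))
            pending_bug = None
    return results
```

Run against the log in this report, the first pair points at reset_common_ring in i915 and the second at kthread_data, which matches the order of the two oopses above.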
