[Linux Kernel] How to Detect User Space Crashes in the Kernel Log
Environment:
Background
During system bring-up, unexpected issues can arise, including crashes in user space. A common difficulty is that these crashes do not appear in the kernel log by default. On ARM, however, the Linux kernel can print a crash signature for the faulting user space process to the kernel log.
This feature is enabled with the CONFIG_DEBUG_USER option.
How to Enable CONFIG_DEBUG_USER
Step 1: Turn On CONFIG_DEBUG_USER
To enable this feature, set CONFIG_DEBUG_USER=y in the kernel configuration. Below is an example patch for the Raspbian kernel configuration file:
diff --git a/arch/arm/configs/bcm2711_defconfig b/arch/arm/configs/bcm2711_defconfig
index be7a837cc2d6..dc7052f49285 100644
--- a/arch/arm/configs/bcm2711_defconfig
+++ b/arch/arm/configs/bcm2711_defconfig
@@ -6,6 +6,7 @@ CONFIG_GENERIC_IRQ_DEBUGFS=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_BPF_SYSCALL=y
+CONFIG_DEBUG_USER=y
CONFIG_PREEMPT_VOLUNTARY=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
With this change, the kernel compiles the code blocks guarded by CONFIG_DEBUG_USER, such as this one in __do_user_fault() in arch/arm/mm/fault.c (excerpt):
static void
__do_user_fault(unsigned long addr, unsigned int fsr, unsigned int sig,
		int code, struct pt_regs *regs)
{
	struct task_struct *tsk = current;

	if (addr > TASK_SIZE)
		harden_branch_predictor();

#ifdef CONFIG_DEBUG_USER
	if (((user_debug & UDBG_SEGV) && (sig == SIGSEGV)) ||
	    ((user_debug & UDBG_BUS) && (sig == SIGBUS))) {
		pr_err("8<--- cut here ---\n");
		pr_err("%s: unhandled page fault (%d) at 0x%08lx, code 0x%03x\n",
		       tsk->comm, sig, addr, fsr);
		show_pte(KERN_ERR, tsk->mm, addr);
		show_regs(regs);
	}
#endif
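The guarded block prints only when the user_debug bit that matches the delivered signal is set. A rough user space model of that condition may make this easier to see. This is a sketch, not kernel code: the function name fault_is_logged is made up for illustration, and the two bit values are assumed to match the ARM definitions in arch/arm/include/asm/system_misc.h.

```c
#include <signal.h>
#include <stdbool.h>

/* Assumed to match the ARM definitions in asm/system_misc.h. */
#define UDBG_SEGV (1 << 3)
#define UDBG_BUS  (1 << 4)

/*
 * Hypothetical user space model of the check in __do_user_fault():
 * the report is printed only when the bit that corresponds to the
 * delivered signal is set in user_debug.
 */
static bool fault_is_logged(unsigned int user_debug, int sig)
{
	return ((user_debug & UDBG_SEGV) && sig == SIGSEGV) ||
	       ((user_debug & UDBG_BUS) && sig == SIGBUS);
}
```

With user_debug left at 0, neither branch is taken, which is why the kernel command line also has to be changed.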
Step 2: Update Kernel Command Line
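Add the user_debug=31 parameter to the kernel command line to enable detailed logging of user space crashes. The value is a bitmask; 31 sets all five reporting bits, including the SIGSEGV and SIGBUS bits checked in the code above. Below is an example patch for the device tree source file: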
diff --git a/arch/arm/boot/dts/bcm2711-rpi-4-b.dts b/arch/arm/boot/dts/bcm2711-rpi-4-b.dts
index 8c0ab39beea1..33d36b4f89fa 100644
--- a/arch/arm/boot/dts/bcm2711-rpi-4-b.dts
+++ b/arch/arm/boot/dts/bcm2711-rpi-4-b.dts
@@ -282,7 +282,7 @@
/ {
chosen {
- bootargs = "coherent_pool=1M 8250.nr_uarts=1 snd_bcm2835.enable_compat_alsa=0 snd_bcm2835.enable_hdmi=1";
+ bootargs = "coherent_pool=1M 8250.nr_uarts=1 snd_bcm2835.enable_compat_alsa=0 snd_bcm2835.enable_hdmi=1 user_debug=31";
};
aliases {
After applying these patches, rebuild the kernel and flash the new image onto your device, such as a Raspberry Pi.
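To see what a given user_debug value turns on, the mask can be broken down into its bits. The helper below is an illustrative sketch: describe_user_debug is a hypothetical name, and the bit values and meanings are assumed to follow the ARM definitions in arch/arm/include/asm/system_misc.h and the DEBUG_USER help text in arch/arm/Kconfig.debug.

```c
#include <string.h>

/* Assumed to match the ARM definitions in asm/system_misc.h. */
#define UDBG_UNDEFINED (1 << 0) /* undefined instruction events */
#define UDBG_SYSCALL   (1 << 1) /* system calls */
#define UDBG_BADABORT  (1 << 2) /* invalid data aborts */
#define UDBG_SEGV      (1 << 3) /* SIGSEGV faults */
#define UDBG_BUS       (1 << 4) /* SIGBUS faults */

/* Hypothetical helper: write the names of the set bits into buf. */
static void describe_user_debug(unsigned int mask, char *buf, size_t len)
{
	static const struct {
		unsigned int bit;
		const char *name;
	} flags[] = {
		{ UDBG_UNDEFINED, "UNDEFINED" },
		{ UDBG_SYSCALL,   "SYSCALL" },
		{ UDBG_BADABORT,  "BADABORT" },
		{ UDBG_SEGV,      "SEGV" },
		{ UDBG_BUS,       "BUS" },
	};
	size_t i;

	if (len == 0)
		return;
	buf[0] = '\0';
	for (i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
		if (!(mask & flags[i].bit))
			continue;
		/* Separate names with a single space. */
		if (buf[0] != '\0')
			strncat(buf, " ", len - strlen(buf) - 1);
		strncat(buf, flags[i].name, len - strlen(buf) - 1);
	}
}
```

Under these assumptions, 31 expands to all five names, while a narrower value such as 24 would keep only the SEGV and BUS reports.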
How to View User Space Crash Logs
Once the kernel with these patches is running, one more setting is needed. After the target device boots, enable the print-fatal-signals feature by running the following command as root:
echo 1 > /proc/sys/kernel/print-fatal-signals
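To check the setup end to end, it helps to have a program that faults on purpose. The sketch below is one possible way to do that. The function name run_faulting_child is made up for illustration; it forks a child that writes through a null pointer and reports the signal that terminated it.

```c
#include <signal.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical test helper: fork a child that writes to address 0 and
 * return the signal that killed it, or -1 if the child could not be
 * started or did not die from a signal.
 */
static int run_faulting_child(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		/*
		 * Read the address from a volatile object so the compiler
		 * cannot prove the pointer is null and drop the store.
		 */
		volatile uintptr_t addr = 0;

		*(volatile int *)addr = 1; /* page fault at address 0 */
		_exit(0);                  /* not reached */
	}
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Calling run_faulting_child() from a small main() on the target should return SIGSEGV, and with the configuration described here a matching "unhandled page fault" entry is expected in the kernel log.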
Once you have run the command above, the configuration is complete. If a user space crash occurs, the kernel log shows its signature, such as a segmentation fault. Below is an example of what you might see in the kernel log:
<7>[ 82.248793 / 05-21 22:27:31.925][2] rpi.ui: unhandled page fault (11) at 0x00000000, code 0x005
<1>[ 82.248810 / 05-21 22:27:31.925][2] pgd = da0fc000
<1>[ 82.252580 / 05-21 22:27:31.925][2] [00000000] *pgd=00000000
<6>[ 82.258764 / 05-21 22:27:31.935][2] CPU: 2 PID: 999 Comm: rpi.ui Tainted: G W 4.14.66
<6>[ 82.258781 / 05-21 22:27:31.935][2] task: d7e0d840 ti: d7f02000 task.ti: d7f02000
<6>[ 82.258793 / 05-21 22:27:31.935][2] PC is at 0x73a7c9c0
<6>[ 82.258803 / 05-21 22:27:31.935][2] LR is at 0x0
<6>[ 82.258814 / 05-21 22:27:31.935][2] pc : [<73a7c9c0>] lr : [<00000000>] psr: 00070030
<6>[ 82.258814 / 05-21 22:27:31.935][2] sp : 885bc480 ip : 00000000 fp : 131e3218
<6>[ 82.258830 / 05-21 22:27:31.935][2] r10: 71488688 r9 : 92778200 r8 : 00000000
<6>[ 82.258840 / 05-21 22:27:31.935][2] r7 : 131e2bf0 r6 : 00000000 r5 : 00000001 r4 : 00263c64
<6>[ 82.258851 / 05-21 22:27:31.935][2] r3 : 131e3218 r2 : 71488688 r1 : 00000000 r0 : 71488688
<6>[ 82.258863 / 05-21 22:27:31.935][2] Flags: nzcv IRQs on FIQs on Mode USER_32 ISA Thumb Segment user
<6>[ 82.258873 / 05-21 22:27:31.935][2] Control: 10c0383d Table: 9a0fc06a DAC: 00000051
<6>[ 82.258885 / 05-21 22:27:31.935][2] CPU: 2 PID: 999 Comm: pi.ui Tainted: G W 4.14.66-gbbd0bc4-dirty #3
<6>[ 82.258907 / 05-21 22:27:31.935][2] [<c010e400>] (unwind_backtrace) from [<c010b4a8>] (show_stack+0x10/0x14)
<6>[ 82.258924 / 05-21 22:27:31.935][2] [<c010b4a8>] (show_stack) from [<c0fd0740>] (dump_stack+0x78/0x98)
<6>[ 82.258940 / 05-21 22:27:31.935][2] [<c0fd0740>] (dump_stack) from [<c0115f8c>] (__do_user_fault+0x108/0x19c)
<6>[ 82.258957 / 05-21 22:27:31.935][2] [<c0115f8c>] (__do_user_fault) from [<c0fe05b8>] (do_page_fault+0x33c/0x3e8)
<6>[ 82.258972 / 05-21 22:27:31.935][2] [<c0fe05b8>] (do_page_fault) from [<c0100404>] (do_DataAbort+0x34/0x184)
<6>[ 82.258986 / 05-21 22:27:31.935][2] [<c0100404>] (do_DataAbort) from [<c0fdec3c>] (__dabt_usr+0x3c/0x40)
<6>[ 82.258997 / 05-21 22:27:31.935][2] Exception stack(0xd7f03fb0 to 0xd7f03ff8)
<6>[ 82.259009 / 05-21 22:27:31.935][2] 3fa0: 71488688 00000000 71488688 131e3218
<6>[ 82.259023 / 05-21 22:27:31.935][2] 3fc0: 00263c64 00000001 00000000 131e2bf0 00000000 92778200 71488688 131e3218
<6>[ 82.259036 / 05-21 22:27:31.935][2] 3fe0: 00000000 885bc480 00000000 73a7c9c0 00070030 ffffffff
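When many such reports have to be triaged, the first line of each one carries the key facts: task name, signal number, faulting address and fault status code. The following sketch extracts them from a log line. The names fault_report and parse_fault_line are hypothetical, and the code assumes that the task name is the single word directly before ": unhandled page fault", as in the sample above.

```c
#include <stdio.h>
#include <string.h>

struct fault_report {
	char comm[16];      /* task name; TASK_COMM_LEN is 16 in the kernel */
	int sig;            /* signal number, e.g. 11 for SIGSEGV */
	unsigned long addr; /* faulting address */
	unsigned int fsr;   /* fault status code printed after "code" */
};

/*
 * Hypothetical parser for a line containing
 *   "<comm>: unhandled page fault (<sig>) at 0x<addr>, code 0x<fsr>".
 * Returns 1 on success and 0 if the line does not match.
 */
static int parse_fault_line(const char *line, struct fault_report *out)
{
	static const char marker[] = ": unhandled page fault (";
	const char *m = strstr(line, marker);
	const char *start;
	size_t n;

	if (!m)
		return 0;

	/* Walk back from the colon to the start of the task name. */
	start = m;
	while (start > line && start[-1] != ' ' && start[-1] != ']')
		start--;
	n = (size_t)(m - start);
	if (n == 0 || n >= sizeof(out->comm))
		return 0;
	memcpy(out->comm, start, n);
	out->comm[n] = '\0';

	/* The remainder follows the pr_err() format string in fault.c. */
	return sscanf(m + strlen(marker), "%d) at 0x%lx, code 0x%x",
		      &out->sig, &out->addr, &out->fsr) == 3;
}
```

For the first line of the sample log this yields the task name rpi.ui, signal 11, address 0 and code 5. The timestamp prefix in front of the message is ignored, so a different log prefix format should not matter.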
Conclusion
Enabling CONFIG_DEBUG_USER and adding user_debug=31 to the kernel command line are simple but powerful steps for tracing user space crashes. Personally, I frequently use this feature during system bring-up to identify and resolve user space crashes efficiently. It’s a valuable tool for debugging and improving system reliability.