Kernel Panic Procedures
This article is a work in progress or otherwise under review and does not represent current policy.
Contents
- Synopsis
- Preliminary Notes
- Obtaining a Kernel Dump
- Finding which line caused the crash
- Backtrace through trap() in GDB
- Example Crash: Force Panic from WSCons via KVM: Dell DRAC4
- What Now
- Processing the core dump
Synopsis
Although a few official NetBSD.org documents cover advanced kernel debugging with KGDB (the kernelized GNU Debugger, GDB), few formalize a "Kernel Panic/Crash Reporting Procedure" that uses DDB (the minimalist in-kernel debugger) at the time of the crash together with GDB afterwards.
http://www.netbsd.org/docs/kernel/#ddb
Preliminary Notes
- If the problem is easily reproduced, try to obtain a kernel backtrace.
  - DDB is the minimalist kernel debugger, added to the kernel by options DDB.
  - Obtain a backtrace at the db{0}> prompt using the bt command.
- Search the mailing list archives and query the NetBSD.org PR database for reports of similar issues.
- Post the problem for discussion on the appropriate mailing list.
Obtaining a Kernel Dump
A kernel dump can be obtained from many kernel panics. At the DDB prompt, simply execute:
db{0}> sync
The memory dump will be written to the swap partition.
At boot time, the core dump held in the swap partition will be saved to /var/crash.
The default settings that control this behaviour are in /etc/defaults/rc.conf and can be overridden in /etc/rc.conf:
savecore=yes
savecore_flags="-N /netbsd -z"
savecore_dir="/var/crash"
A gzip(1)-compressed file will be available for analysis with gdb(1) or crash(8). To load the core dump into gdb, first uncompress it with gunzip(1), then use target kvm /path/to/netbsd.core.
Your swap partition must be at least the size of your physical RAM.
Your /var/crash partition must have sufficient space to hold the same file.
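The two space requirements above can be checked before a crash ever happens. The following is a minimal sketch, not part of the NetBSD tooling: the function name dump_space_problems is hypothetical, the sizes are supplied by the caller, and treating the size of physical RAM as the upper bound for the dump file is an assumption made for illustration.

```python
GIB = 2 ** 30  # bytes in one gibibyte, used only to keep the examples readable


def dump_space_problems(ram_bytes, swap_bytes, crash_free_bytes):
    """Return a list of unmet requirements; an empty list means both hold.

    Assumption: the dump is at most as large as physical RAM, so RAM size
    is used as the bound for both the swap partition and /var/crash.
    """
    problems = []
    if swap_bytes < ram_bytes:
        problems.append("swap partition is smaller than physical RAM")
    if crash_free_bytes < ram_bytes:
        problems.append("crash directory cannot hold a dump the size of RAM")
    return problems


# Example with made-up sizes: 8 GiB of RAM, 4 GiB of swap, 100 GiB free.
print(dump_space_problems(8 * GIB, 4 * GIB, 100 * GIB))
```

The real figures would have to be gathered on the machine itself, for instance with sysctl(8) and swapctl(8) for RAM and swap, and with Python's shutil.disk_usage("/var/crash").free for the crash directory.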
Finding which line caused the crash
With a backtrace, it's possible to translate an address to a line in the source code.
Stopped in pid 496.1 (gdb) at netbsd:breakpoint+0x5: leave
To find the address of the breakpoint function in the running kernel, use nm(1):
nm /netbsd | grep breakpoint
ffffffff8021df70 T breakpoint
ffffffff8079d944 T db_breakpoint_cmd
ffffffff81644b38 d db_breakpoint_list
ffffffff81644b30 d db_breakpoints_inserted
ffffffff8079d892 T db_clear_breakpoints
ffffffff8079d7d0 t db_find_breakpoint
ffffffff8079d824 T db_find_breakpoint_here
ffffffff81644b40 d db_free_breakpoints
ffffffff81644b48 d db_next_free_breakpoint
ffffffff8079d835 T db_set_breakpoints
Then add 0x5 to the address of breakpoint (the offset 0x5 comes from the message above and is not a fixed value for every crash) and pass the result to addr2line(1):
addr2line -e /netbsd ffffffff8021df75
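The address arithmetic can also be scripted. The sketch below assumes the location string has the symbol+0xoffset form shown in the DDB message and that the symbol table is plain nm(1) output with three columns; the helper name resolve_ddb_location is made up for this example. It matches the symbol name exactly, because a plain grep also returns names such as db_breakpoint_cmd.

```python
import re


def resolve_ddb_location(location, nm_output):
    """Turn 'netbsd:breakpoint+0x5' plus nm(1) output into a hex address."""
    match = re.fullmatch(r"(?:\w+:)?(\w+)\+0x([0-9a-fA-F]+)", location)
    if match is None:
        raise ValueError("unrecognised location: %r" % location)
    symbol = match.group(1)
    offset = int(match.group(2), 16)
    for line in nm_output.splitlines():
        fields = line.split()
        # nm prints "address type name"; require an exact name match.
        if len(fields) == 3 and fields[2] == symbol:
            return format(int(fields[0], 16) + offset, "x")
    raise KeyError(symbol)


NM_SAMPLE = """\
ffffffff8021df70 T breakpoint
ffffffff8079d944 T db_breakpoint_cmd
ffffffff8079d7d0 t db_find_breakpoint
"""

print(resolve_ddb_location("netbsd:breakpoint+0x5", NM_SAMPLE))
```

The printed value is the address to hand to addr2line -e /netbsd.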
Backtrace through trap() in GDB
In gdb(1), source the stack script and then run the stack command.
(gdb) source /usr/src/sys/arch/i386/gdbscripts/stack
See port-i386/10313 for more info.
Example Crash: Force Panic from WSCons via KVM: Dell DRAC4
You can invoke the kernel debugger from the console on amd64/i386 using the special key sequence: Control+Alt+Esc. See the "Entering the debugger" section of ddb(9) for the key sequence on other platforms.
Once in the debugger, use the bt command to get a preliminary backtrace and a general idea of what went wrong.
You can then force a sync of the file system and a dump of the kernel memory into the swap partition using the sync command.
On the subsequent boot, the /etc/rc.d/savecore script will perform the necessary tasks to archive and gzip(1) the dump.
You can then load the core dump into gdb(1) or crash(8).
What Now
You can submit the feedback as a PR to the NetBSD GNATS system.
Processing the core dump
Hubert Feyrer has a great guide to analyzing kernel panic core dumps
Additionally, the following commands can be used to create a relatively useful backtrace:
localhost# cd /var/crash
localhost# gunzip -d *gz
localhost# gdb --symbols=/netbsd.gdb --quiet --eval-command="file /netbsd.gdb" \
    --eval-command="target kvm netbsd.1.core" --eval-command "bt" \
    --eval-command "list" --eval-command "info all-registers" 2>&1
Load new symbol table from "/netbsd.gdb"? (y or n) y
Reading symbols from /netbsd.gdb...done.
#0 0xc047c9f8 in cpu_reboot (howto=256, bootstr=0x0) at /usr/src/sys/arch/i386/i386/machdep.c:927
927 dumpsys();
#0 0xc047c9f8 in cpu_reboot (howto=256, bootstr=0x0) at /usr/src/sys/arch/i386/i386/machdep.c:927
#1 0xc01c3f2a in db_sync_cmd (addr=-1065223264, have_addr=false, count=-1071881791, modif=0xcc883c04 "[BINARY]") at /usr/src/sys/ddb/db_command.c:1304
#2 0xc01c45fa in db_command (last_cmdp=0xc07dfe3c) at /usr/src/sys/ddb/db_command.c:926
#3 0xc01c4856 in db_command_loop () at /usr/src/sys/ddb/db_command.c:583
#4 0xc01c7320 in db_trap (type=1, code=0) at /usr/src/sys/ddb/db_trap.c:101
#5 0xc0478855 in kdb_trap (type=1, code=0, regs=0xcc883e3c) at /usr/src/sys/arch/i386/i386/db_interface.c:229
#6 0xc047efe2 in trap (frame=0xcc883e3c) at /usr/src/sys/arch/i386/i386/trap.c:350
#7 0xc010cb80 in calltrap ()
#8 0xc047717c in breakpoint ()
#9 0xc02e3676 in wskbd_translate (id=0xc0833ae0, type=2, value=<value optimized out>) at /usr/src/sys/dev/wscons/wskbd.c:1586
#10 0xc02e386e in wskbd_input (dev=0xcc888800, type=2, value=1) at /usr/src/sys/dev/wscons/wskbd.c:682
#11 0xc054c27a in pckbd_input (vsc=0xcc0cc6a8, data=1) at /usr/src/sys/dev/pckbport/pckbd.c:584
#12 0xc02ba80d in pckbcintr (vsc=0xcc0d6ebc) at /usr/src/sys/dev/ic/pckbc.c:607
#13 0xc0465798 in intr_biglock_wrapper (vp=0xc2e853c0) at /usr/src/sys/arch/x86/x86/intr.c:617
#14 0xc01036d9 in Xintr_ioapic_edge3 ()
#15 0xc0477234 in x86_mwait ()
Previous frame inner to this frame (corrupt stack?)
922     /* Disable interrupts. */
923     splhigh();
924
925     /* Do a dump if requested. */
926     if ((howto & (RB_DUMP | RB_HALT)) == RB_DUMP)
927         dumpsys();
928
929 haltsys:
930     doshutdownhooks();
931
eax 0x0 0
ecx 0x0 0
edx 0x0 0
ebx 0x100 256
esp 0xcc883bb8 0xcc883bb8
ebp 0xcc883bc0 0xcc883bc0
esi 0xc07dfe3c -1065484740
edi 0x0 0
eip 0xc047c9f8 0xc047c9f8 <cpu_reboot+368>
eflags 0x0 [ ]
cs 0x0 0
ss 0x0 0
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
st0 0 (raw 0x00000000000000000000)
st1 0 (raw 0x00000000000000000000)
st2 0 (raw 0x00000000000000000000)
st3 0 (raw 0x00000000000000000000)
st4 0 (raw 0x00000000000000000000)
st5 0 (raw 0x00000000000000000000)
st6 0 (raw 0x00000000000000000000)
st7 0 (raw 0x00000000000000000000)
fctrl 0x0 0
fstat 0x0 0
ftag 0x0 0
fiseg 0x0 0
fioff 0x0 0
foseg 0x0 0
fooff 0x0 0
fop 0x0 0
xmm0 xmm1 xmm2 xmm3 xmm4 xmm5 xmm6 xmm7
mm0 mm1 mm2 mm3 mm4 mm5 mm6 mm7
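In a backtrace like this one, the frames up to and including trap() and calltrap() belong to the debugger and the trap handling; the frames after them show what the kernel was doing when it trapped. As a rough sketch for trimming a saved backtrace before pasting it into a PR, the following assumes one frame per line in the usual gdb "#N address in function (args) at file:line" layout. The names parse_frames, frames_after_trap and TRAP_FUNCS are invented for this example, and the set of trap-handling function names is an assumption taken from the i386 trace above.

```python
import re

# "#6 0xc047efe2 in trap (frame=...) at file.c:350" or "#7 ... in calltrap ()"
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\S+)\s+\(")
WHERE_RE = re.compile(r"\sat\s+(\S+:\d+)\s*$")

# Assumed names of the trap entry points, as seen in the i386 backtrace.
TRAP_FUNCS = {"trap", "calltrap"}


def parse_frames(backtrace):
    """Return (frame number, function, 'file:line' or None) for each frame."""
    frames = []
    for line in backtrace.splitlines():
        line = line.strip()
        match = FRAME_RE.match(line)
        if match is None:
            continue  # skip source listings, register dumps, prompts
        where = WHERE_RE.search(line)
        frames.append((int(match.group(1)), match.group(2),
                       where.group(1) if where else None))
    return frames


def frames_after_trap(backtrace):
    """Keep only the frames that follow the last trap-handling frame."""
    frames = parse_frames(backtrace)
    last_trap = max((i for i, f in enumerate(frames) if f[1] in TRAP_FUNCS),
                    default=-1)
    return frames[last_trap + 1:]


SAMPLE = """\
#6 0xc047efe2 in trap (frame=0xcc883e3c) at /usr/src/sys/arch/i386/i386/trap.c:350
#7 0xc010cb80 in calltrap ()
#8 0xc047717c in breakpoint ()
#9 0xc02e3676 in wskbd_translate (id=0xc0833ae0, type=2, value=<value optimized out>) at /usr/src/sys/dev/wscons/wskbd.c:1586
"""

for number, function, where in frames_after_trap(SAMPLE):
    print(number, function, where or "(no source information)")
```

If no trap frame is found, the sketch returns every frame unchanged, so it never hides information from a trace it does not understand.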