Bug hunting
===========

Kernel bug reports often come with a stack dump like the one below::

    ------------[ cut here ]------------
    WARNING: CPU: 1 PID: 28102 at kernel/module.c:1108 module_put+0x57/0x70
    Modules linked in: dvb_usb_gp8psk(-) dvb_usb dvb_core nvidia_drm(PO) nvidia_modeset(PO) snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd soundcore nvidia(PO) [last unloaded: rc_core]
    CPU: 1 PID: 28102 Comm: rmmod Tainted: P WC O 4.8.4-build.1 #1
    Hardware name: MSI MS-7309/MS-7309, BIOS V1.12 02/23/2009
     00000000 c12ba080 00000000 00000000 c103ed6a c1616014 00000001 00006dc6
     c1615862 00000454 c109e8a7 c109e8a7 00000009 ffffffff 00000000 f13f6a10
     f5f5a600 c103ee33 00000009 00000000 00000000 c109e8a7 f80ca4d0 c109f617
    Call Trace:
     [<c12ba080>] ? dump_stack+0x44/0x64
     [<c103ed6a>] ? __warn+0xfa/0x120
     [<c109e8a7>] ? module_put+0x57/0x70
     [<c109e8a7>] ? module_put+0x57/0x70
     [<c103ee33>] ? warn_slowpath_null+0x23/0x30
     [<c109e8a7>] ? module_put+0x57/0x70
     [<f80ca4d0>] ? gp8psk_fe_set_frontend+0x460/0x460 [dvb_usb_gp8psk]
     [<c109f617>] ? symbol_put_addr+0x27/0x50
     [<f80bc9ca>] ? dvb_usb_adapter_frontend_exit+0x3a/0x70 [dvb_usb]
     [<f80bb3bf>] ? dvb_usb_exit+0x2f/0xd0 [dvb_usb]
     [<c13d03bc>] ? usb_disable_endpoint+0x7c/0xb0
     [<f80bb48a>] ? dvb_usb_device_exit+0x2a/0x50 [dvb_usb]
     [<c13d2882>] ? usb_unbind_interface+0x62/0x250
     [<c136b514>] ? __pm_runtime_idle+0x44/0x70
     [<c13620d8>] ? __device_release_driver+0x78/0x120
     [<c1362907>] ? driver_detach+0x87/0x90
     [<c1361c48>] ? bus_remove_driver+0x38/0x90
     [<c13d1c18>] ? usb_deregister+0x58/0xb0
     [<c109fbb0>] ? SyS_delete_module+0x130/0x1f0
     [<c1055654>] ? task_work_run+0x64/0x80
     [<c1000fa5>] ? exit_to_usermode_loop+0x85/0x90
     [<c10013f0>] ? do_fast_syscall_32+0x80/0x130
     [<c1549f43>] ? sysenter_past_esp+0x40/0x6a
    ---[ end trace 6ebc60ef3981792f ]---

Such stack traces provide enough information to identify the line inside the
kernel's source code where the bug happened. Depending on the severity of
the issue, it may also contain the word **Oops**, as in this one::

    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: [<c06969d4>] iret_exc+0x7d0/0xa59
    *pdpt = 000000002258a001 *pde = 0000000000000000
    Oops: 0002 [#1] PREEMPT SMP
    ...

Whether it is an **Oops** or some other sort of stack trace, the offending
line is usually required to identify and handle the bug. Throughout this
chapter, we'll refer to "Oops" for all kinds of stack traces that need to be
analyzed.

If the kernel is compiled with ``CONFIG_DEBUG_INFO``, you can enhance the
quality of the stack trace by using :file:`scripts/decode_stacktrace.sh`.
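
Each entry in a call trace carries an address, a symbol, an offset into that
symbol, the symbol's size and, for modular code, the module name in square
brackets. As a rough illustration of how those fields fit together, here is a
minimal Python sketch that splits one trace line into its parts and prints a
``gdb`` command for it. The helper name ``parse_trace_line`` and the regular
expression are assumptions made for this example, not part of any kernel
script, and the pattern only targets the bracketed x86 format shown above.

```python
import re

# Matches entries such as:
#   [<f80bc9ca>] ? dvb_usb_adapter_frontend_exit+0x3a/0x70 [dvb_usb]
# (hypothetical pattern, written only for the trace format shown above)
TRACE_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<guess>\?\s+)?"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?:\s+\[(?P<module>[\w-]+)\])?"
)


def parse_trace_line(line):
    """Return a dict describing one call-trace entry, or None if no match."""
    m = TRACE_RE.search(line)
    if m is None:
        return None
    return {
        "address": int(m.group("addr"), 16),
        "marked_with_question_mark": m.group("guess") is not None,
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset"), 16),
        "size": int(m.group("size"), 16),
        "module": m.group("module"),  # None when no module is listed
    }


entry = parse_trace_line(
    "[<f80bc9ca>] ? dvb_usb_adapter_frontend_exit+0x3a/0x70 [dvb_usb]"
)
# Build the gdb "list" command for this entry.
print("(gdb) l *{}+{:#x}".format(entry["symbol"], entry["offset"]))
```

The ``symbol+offset`` pair is what ``gdb`` needs to map an entry back to a
source line, and the module name tells you which object file to load.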

Modules linked in
-----------------

Modules that are tainted or are being loaded or unloaded are marked with
"(...)", where the taint flags are described in
:file:`Documentation/admin-guide/tainted-kernels.rst`, "being loaded" is
annotated with "+", and "being unloaded" is annotated with "-".


Where is the Oops message located?
----------------------------------

Normally the Oops text is read from the kernel buffers by ``klogd`` and
handed to ``syslogd`` which writes it to a syslog file, typically
``/var/log/messages`` (depends on ``/etc/syslog.conf``). On systems with
systemd, it may also be stored by the ``journald`` daemon, and accessed
by running the ``journalctl`` command.

Sometimes ``klogd`` dies, in which case you can run ``dmesg > file`` to
read the data from the kernel buffers and save it. Or you can
``cat /proc/kmsg > file``, however you have to break in to stop the transfer,
since ``kmsg`` is a "never ending file".

If the machine has crashed so badly that you cannot enter commands or
the disk is not available then you have three options:

(1) Hand copy the text from the screen and type it in after the machine
    has restarted. Messy but it is the only option if you have not
    planned for a crash. Alternatively, you can take a picture of
    the screen with a digital camera - not nice, but better than
    nothing. If the messages scroll off the top of the console, you
    may find that booting with a higher resolution (e.g., ``vga=791``)
    will allow you to read more of the text. (Caveat: This needs ``vesafb``,
    so won't help for 'early' oopses.)

(2) Boot with a serial console (see
    :ref:`Documentation/admin-guide/serial-console.rst <serial_console>`),
    run a null modem to a second machine and capture the output there
    using your favourite communication program. Minicom works well.

(3) Use Kdump (see Documentation/admin-guide/kdump/kdump.rst),
    extract the kernel ring buffer from old memory using the dmesg
    gdbmacro in Documentation/admin-guide/kdump/gdbmacros.txt.

Finding the bug's location
--------------------------

Reporting a bug works best if you point to the location of the bug in the
kernel source file. There are two methods for doing that. Usually, using
``gdb`` is easier, but the kernel should be pre-compiled with debug info.

gdb
^^^

The GNU debugger (``gdb``) is the best way to figure out the exact file and line
number of the Oops from the ``vmlinux`` file.

The usage of gdb works best on a kernel compiled with ``CONFIG_DEBUG_INFO``.
This can be set by running::

    $ ./scripts/config -d COMPILE_TEST -e DEBUG_KERNEL -e DEBUG_INFO

On a kernel compiled with ``CONFIG_DEBUG_INFO``, you can simply copy the
EIP value from the Oops::

    EIP: 0060:[<c021e50e>] Not tainted VLI

And use GDB to translate that to human-readable form::

    $ gdb vmlinux
    (gdb) l *0xc021e50e

If you don't have ``CONFIG_DEBUG_INFO`` enabled, you use the function
offset from the Oops::

    EIP is at vt_ioctl+0xda8/0x1482

And recompile the kernel with ``CONFIG_DEBUG_INFO`` enabled::

    $ ./scripts/config -d COMPILE_TEST -e DEBUG_KERNEL -e DEBUG_INFO
    $ make vmlinux
    $ gdb vmlinux
    (gdb) l *vt_ioctl+0xda8
    0x1888 is in vt_ioctl (drivers/tty/vt/vt_ioctl.c:293).
    288     {
    289             struct vc_data *vc = NULL;
    290             int ret = 0;
    291
    292             console_lock();
    293             if (VT_BUSY(vc_num))
    294                     ret = -EBUSY;
    295             else if (vc_num)
    296                     vc = vc_deallocate(vc_num);
    297             console_unlock();

or, if you want to be more verbose::

    (gdb) p vt_ioctl
    $1 = {int (struct tty_struct *, unsigned int, unsigned long)} 0xae0 <vt_ioctl>
    (gdb) l *0xae0+0xda8

You could, instead, use the object file::

    $ make drivers/tty/
    $ gdb drivers/tty/vt/vt_ioctl.o
    (gdb) l *vt_ioctl+0xda8

If you have a call trace, such as::

    Call Trace:
     [<ffffffff8802c8e9>] :jbd:log_wait_commit+0xa3/0xf5
     [<ffffffff810482d9>] autoremove_wake_function+0x0/0x2e
     [<ffffffff8802770b>] :jbd:journal_stop+0x1be/0x1ee
     ...

this shows the problem likely is in the :jbd: module. You can load that module
in gdb and list the relevant code::

    $ gdb fs/jbd/jbd.ko
    (gdb) l *log_wait_commit+0xa3

.. note::

   You can also do the same for any function call at the stack trace,
   like this one::

       [<f80bc9ca>] ? dvb_usb_adapter_frontend_exit+0x3a/0x70 [dvb_usb]

   The position where the above call happened can be seen with::

       $ gdb drivers/media/usb/dvb-usb/dvb-usb.o
       (gdb) l *dvb_usb_adapter_frontend_exit+0x3a

objdump
^^^^^^^

To debug a kernel, use objdump and look for the hex offset from the crash
output to find the valid line of code/assembler. Without debug symbols, you
will see the assembler code for the routine shown, but if your kernel has
debug symbols the C code will also be available. (Debug symbols can be enabled
in the kernel hacking menu of the menu configuration.) For example::

    $ objdump -r -S -l --disassemble net/dccp/ipv4.o

.. note::

   You need to be at the top level of the kernel tree for this to pick up
   your C files.

If you don't have access to the source code you can still debug some crash
dumps using the following method (example crash dump output as shown by
Dave Miller)::

    EIP is at +0x14/0x4c0
     ...
    Code: 44 24 04 e8 6f 05 00 00 e9 e8 fe ff ff 8d 76 00 8d bc 27 00 00
    00 00 55 57 56 53 81 ec bc 00 00 00 8b ac 24 d0 00 00 00 8b 5d 08
    <8b> 83 3c 01 00 00 89 44 24 14 8b 45 28 85 c0 89 44 24 18 0f 85

Put the bytes into a ``foo.s`` file like this::

           .text
           .globl foo
    foo:
           .byte  .... /* bytes from Code: part of OOPS dump */

Compile it with ``gcc -c -o foo.o foo.s`` then look at the output of
``objdump --disassemble foo.o``.

Output::

    ip_queue_xmit:
        push       %ebp
        push       %edi
        push       %esi
        push       %ebx
        sub        $0xbc, %esp
        mov        0xd0(%esp), %ebp        ! %ebp = arg0 (skb)
        mov        0x8(%ebp), %ebx         ! %ebx = skb->sk
        mov        0x13c(%ebx), %eax       ! %eax = inet_sk(sk)->opt

:file:`scripts/decodecode` can be used to automate most of this, depending
on what CPU architecture is being debugged.

Reporting the bug
-----------------

Once you find where the bug happened, by inspecting its location,
you could either try to fix it yourself or report it upstream.

In order to report it upstream, you should identify the mailing list
used for the development of the affected code. This can be done by using
the ``get_maintainer.pl`` script.

For example, if you find a bug in gspca's sonixj.c file, you can get
its maintainers with::

    $ ./scripts/get_maintainer.pl -f drivers/media/usb/gspca/sonixj.c
    Hans Verkuil <hverkuil@xs4all.nl> (odd fixer:GSPCA USB WEBCAM DRIVER,commit_signer:1/1=100%)
    Mauro Carvalho Chehab <mchehab@kernel.org> (maintainer:MEDIA INPUT INFRASTRUCTURE (V4L/DVB),commit_signer:1/1=100%)
    Tejun Heo <tj@kernel.org> (commit_signer:1/1=100%)
    Bhaktipriya Shridhar <bhaktipriya96@gmail.com> (commit_signer:1/1=100%,authored:1/1=100%,added_lines:4/4=100%,removed_lines:9/9=100%)
    linux-media@vger.kernel.org (open list:GSPCA USB WEBCAM DRIVER)
    linux-kernel@vger.kernel.org (open list)

Please notice that it will point to:

- The last developers that touched the source code (if this is done inside
  a git tree). In the above example, Tejun and Bhaktipriya (in this
  specific case, neither was really involved in the development of this file);
- The driver maintainer (Hans Verkuil);
- The subsystem maintainer (Mauro Carvalho Chehab);
- The driver and/or subsystem mailing list (linux-media@vger.kernel.org);
- The Linux Kernel mailing list (linux-kernel@vger.kernel.org).

Usually, the fastest way to have your bug fixed is to report it to the mailing
list used for the development of the code (linux-media ML) copying the
driver maintainer (Hans).
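
If you want to sort that output mechanically, the role text in parentheses is
enough to tell mailing lists, maintainers and recent committers apart. The
following Python sketch is only an illustration based on the sample output
above: the function name ``classify`` and the matching rules (``open list``,
``maintainer:``, ``odd fixer:``) are assumptions drawn from that sample, and
other role strings that the script may print are not handled.

```python
import re

# "<who> (<roles>)": the roles part may itself contain parentheses.
LINE_RE = re.compile(r"^(?P<who>.+?)\s+\((?P<roles>.*)\)$")

# Sample taken from the get_maintainer.pl output shown above.
SAMPLE = """\
Hans Verkuil <hverkuil@xs4all.nl> (odd fixer:GSPCA USB WEBCAM DRIVER,commit_signer:1/1=100%)
Mauro Carvalho Chehab <mchehab@kernel.org> (maintainer:MEDIA INPUT INFRASTRUCTURE (V4L/DVB),commit_signer:1/1=100%)
Tejun Heo <tj@kernel.org> (commit_signer:1/1=100%)
linux-media@vger.kernel.org (open list:GSPCA USB WEBCAM DRIVER)
linux-kernel@vger.kernel.org (open list)
"""


def classify(output):
    """Split get_maintainer.pl-style lines into lists, maintainers, others."""
    lists, maintainers, others = [], [], []
    for line in output.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # not a "<who> (<roles>)" line
        who, roles = m.group("who"), m.group("roles")
        if roles.startswith("open list"):
            lists.append(who)
        elif "maintainer:" in roles or "odd fixer:" in roles:
            maintainers.append(who)
        else:
            others.append(who)
    return lists, maintainers, others


lists, maintainers, others = classify(SAMPLE)
print("To:", lists[0])
print("Cc:", ", ".join(maintainers))
```

Here the first list is used as the main recipient and the maintainers are
copied, which mirrors the advice above; adjust the choice to the case at hand.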

If you are totally stumped as to whom to send the report, and
``get_maintainer.pl`` didn't provide you anything useful, send it to
linux-kernel@vger.kernel.org.

Thanks for your help in making Linux as stable as humanly possible.

Fixing the bug
--------------

If you know programming, you could help us by not only reporting the bug,
but also providing us with a solution. After all, open source is about
sharing what you do and don't you want to be recognised for your genius?

If you decide to go this way, once you have worked out a fix please submit
it upstream.

Please do read
:ref:`Documentation/process/submitting-patches.rst <submittingpatches>` though
to help your code get accepted.


---------------------------------------------------------------------------

Notes on Oops tracing with ``klogd``
------------------------------------

In order to help Linus and the other kernel developers there has been
substantial support incorporated into ``klogd`` for processing protection
faults. In order to have full support for address resolution at least
version 1.3-pl3 of the ``sysklogd`` package should be used.

When a protection fault occurs the ``klogd`` daemon automatically
translates important addresses in the kernel log messages to their
symbolic equivalents. This translated kernel message is then
forwarded through whatever reporting mechanism ``klogd`` is using. The
protection fault message can be simply cut out of the message files
and forwarded to the kernel developers.

Two types of address resolution are performed by ``klogd``. The first is
static translation and the second is dynamic translation.
Static translation uses the System.map file.
In order to do static translation the ``klogd`` daemon
must be able to find a system map file at daemon initialization time.
See the ``klogd`` man page for information on how ``klogd`` searches for map
files.

Dynamic address translation is important when kernel loadable modules
are being used. Since memory for kernel modules is allocated from the
kernel's dynamic memory pools there are no fixed locations for either
the start of the module or for functions and symbols in the module.

The kernel supports system calls which allow a program to determine
which modules are loaded and their location in memory. Using these
system calls the ``klogd`` daemon builds a symbol table which can be used
to debug a protection fault which occurs in a loadable kernel module.

At the very minimum ``klogd`` will provide the name of the module which
generated the protection fault. There may be additional symbolic
information available if the developer of the loadable module chose to
export symbol information from the module.

Since the kernel module environment can be dynamic there must be a
mechanism for notifying the ``klogd`` daemon when a change in module
environment occurs. There are command line options available which
allow ``klogd`` to signal the currently executing daemon that symbol
information should be refreshed. See the ``klogd`` manual page for more
information.

A patch is included with the ``sysklogd`` distribution which modifies the
``modules-2.0.0`` package to automatically signal ``klogd`` whenever a module
is loaded or unloaded. Applying this patch provides essentially
seamless support for debugging protection faults which occur with
kernel loadable modules.
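
To make the idea of static translation more concrete, here is a minimal Python
sketch of an address lookup against System.map-style data, where each line
holds an address, a type letter and a symbol name. It is not how ``klogd``
itself is implemented; the function names ``load_system_map`` and ``resolve``
are hypothetical, and the three sample symbols and their addresses are made up
for the example.

```python
import bisect


def load_system_map(text):
    """Parse 'address type name' lines into a sorted list of (address, name)."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return 'name+0xoffset' for the closest symbol at or below address."""
    starts = [start for start, _name in symbols]
    index = bisect.bisect_right(starts, address) - 1
    if index < 0:
        return None  # address lies below the first known symbol
    start, name = symbols[index]
    return "{}+{:#x}".format(name, address - start)


# Made-up sample data in System.map layout (not taken from a real kernel).
SAMPLE_MAP = """\
c0100000 T _text
c0109e00 T sys_ioctl
c010a7a8 T system_call
"""

symbols = load_system_map(SAMPLE_MAP)
print(resolve(symbols, 0xc0109efe))
```

Because the table is fixed when the kernel is built, this kind of lookup only
works for built-in code; addresses inside loadable modules need the dynamic
translation described above.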

The following is an example of a protection fault in a loadable module
processed by ``klogd``::

    Aug 29 09:51:01 blizard kernel: Unable to handle kernel paging request at virtual address f15e97cc
    Aug 29 09:51:01 blizard kernel: current->tss.cr3 = 0062d000, %cr3 = 0062d000
    Aug 29 09:51:01 blizard kernel: *pde = 00000000
    Aug 29 09:51:01 blizard kernel: Oops: 0002
    Aug 29 09:51:01 blizard kernel: CPU: 0
    Aug 29 09:51:01 blizard kernel: EIP: 0010:[oops:_oops+16/3868]
    Aug 29 09:51:01 blizard kernel: EFLAGS: 00010212
    Aug 29 09:51:01 blizard kernel: eax: 315e97cc ebx: 003a6f80 ecx: 001be77b edx: 00237c0c
    Aug 29 09:51:01 blizard kernel: esi: 00000000 edi: bffffdb3 ebp: 00589f90 esp: 00589f8c
    Aug 29 09:51:01 blizard kernel: ds: 0018 es: 0018 fs: 002b gs: 002b ss: 0018
    Aug 29 09:51:01 blizard kernel: Process oops_test (pid: 3374, process nr: 21, stackpage=00589000)
    Aug 29 09:51:01 blizard kernel: Stack: 315e97cc 00589f98 0100b0b4 bffffed4 0012e38e 00240c64 003a6f80 00000001
    Aug 29 09:51:01 blizard kernel: 00000000 00237810 bfffff00 0010a7fa 00000003 00000001 00000000 bfffff00
    Aug 29 09:51:01 blizard kernel: bffffdb3 bffffed4 ffffffda 0000002b 0007002b 0000002b 0000002b 00000036
    Aug 29 09:51:01 blizard kernel: Call Trace: [oops:_oops_ioctl+48/80] [_sys_ioctl+254/272] [_system_call+82/128]
    Aug 29 09:51:01 blizard kernel: Code: c7 00 05 00 00 00 eb 08 90 90 90 90 90 90 90 90 89 ec 5d c3

---------------------------------------------------------------------------

::

    Dr. G.W. Wettstein           Oncology Research Div. Computing Facility
    Roger Maris Cancer Center    INTERNET: greg@wind.rmcc.com
    820 4th St. N.
    Fargo, ND 58122
    Phone: 701-234-7556