Bug hunting
===========

Kernel bug reports often come with a stack dump like the one below::

	------------[ cut here ]------------
	WARNING: CPU: 1 PID: 28102 at kernel/module.c:1108 module_put+0x57/0x70
	Modules linked in: dvb_usb_gp8psk(-) dvb_usb dvb_core nvidia_drm(PO) nvidia_modeset(PO) snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd soundcore nvidia(PO) [last unloaded: rc_core]
	CPU: 1 PID: 28102 Comm: rmmod Tainted: P        WC O 4.8.4-build.1 #1
	Hardware name: MSI MS-7309/MS-7309, BIOS V1.12 02/23/2009
	 00000000 c12ba080 00000000 00000000 c103ed6a c1616014 00000001 00006dc6
	 c1615862 00000454 c109e8a7 c109e8a7 00000009 ffffffff 00000000 f13f6a10
	 f5f5a600 c103ee33 00000009 00000000 00000000 c109e8a7 f80ca4d0 c109f617
	Call Trace:
	 [<c12ba080>] ? dump_stack+0x44/0x64
	 [<c103ed6a>] ? __warn+0xfa/0x120
	 [<c109e8a7>] ? module_put+0x57/0x70
	 [<c109e8a7>] ? module_put+0x57/0x70
	 [<c103ee33>] ? warn_slowpath_null+0x23/0x30
	 [<c109e8a7>] ? module_put+0x57/0x70
	 [<f80ca4d0>] ? gp8psk_fe_set_frontend+0x460/0x460 [dvb_usb_gp8psk]
	 [<c109f617>] ? symbol_put_addr+0x27/0x50
	 [<f80bc9ca>] ? dvb_usb_adapter_frontend_exit+0x3a/0x70 [dvb_usb]
	 [<f80bb3bf>] ? dvb_usb_exit+0x2f/0xd0 [dvb_usb]
	 [<c13d03bc>] ? usb_disable_endpoint+0x7c/0xb0
	 [<f80bb48a>] ? dvb_usb_device_exit+0x2a/0x50 [dvb_usb]
	 [<c13d2882>] ? usb_unbind_interface+0x62/0x250
	 [<c136b514>] ? __pm_runtime_idle+0x44/0x70
	 [<c13620d8>] ? __device_release_driver+0x78/0x120
	 [<c1362907>] ? driver_detach+0x87/0x90
	 [<c1361c48>] ? bus_remove_driver+0x38/0x90
	 [<c13d1c18>] ? usb_deregister+0x58/0xb0
	 [<c109fbb0>] ? SyS_delete_module+0x130/0x1f0
	 [<c1055654>] ? task_work_run+0x64/0x80
	 [<c1000fa5>] ? exit_to_usermode_loop+0x85/0x90
	 [<c10013f0>] ? do_fast_syscall_32+0x80/0x130
	 [<c1549f43>] ? sysenter_past_esp+0x40/0x6a
	---[ end trace 6ebc60ef3981792f ]---

Such stack traces provide enough information to identify the line in the
kernel's source code where the bug happened. Depending on the severity of
the issue, the trace may also contain the word **Oops**, as in this one::

	BUG: unable to handle kernel NULL pointer dereference at   (null)
	IP: [<c06969d4>] iret_exc+0x7d0/0xa59
	*pdpt = 000000002258a001 *pde = 0000000000000000
	Oops: 0002 [#1] PREEMPT SMP
	...

Whether it is an **Oops** or some other sort of stack trace, the offending
line is usually required to identify and handle the bug. Throughout this
chapter, we use "Oops" to refer to all kinds of stack traces that need to be
analyzed.
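
When a trace has to be processed by a script, each frame can be split into
its address, symbol, offset, size and module. The following is only a
minimal sketch: ``FRAME_RE`` and ``parse_frame`` are names made up for this
illustration, and the pattern covers only the 32-bit frame layout shown in
the dump above.

```python
import re

# Matches frames laid out like the ones in the dump above, e.g.:
#   [<f80bb3bf>] ? dvb_usb_exit+0x2f/0xd0 [dvb_usb]
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<mark>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frame(line):
    """Return a dict describing one call-trace frame, or None if no match."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    return {
        "addr": int(m.group("addr"), 16),
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset"), 16),   # bytes into the function
        "size": int(m.group("size"), 16),       # total function size
        "module": m.group("module"),            # None for built-in code
        "uncertain": m.group("mark") is not None,  # frame printed with "?"
    }


frame = parse_frame(" [<f80bb3bf>] ? dvb_usb_exit+0x2f/0xd0 [dvb_usb]")
print(frame["symbol"], hex(frame["offset"]), frame["module"])
```

The ``symbol`` and ``offset`` fields are the two values that the gdb
commands later in this chapter take as input.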

If the kernel is compiled with ``CONFIG_DEBUG_INFO``, you can enhance the
quality of the stack trace by using :file:`scripts/decode_stacktrace.sh`.

Modules linked in
-----------------

Modules that are tainted or are being loaded or unloaded are marked with
"(...)". The taint flags are described in
:file:`Documentation/admin-guide/tainted-kernels.rst`, "being loaded" is
annotated with "+", and "being unloaded" is annotated with "-".
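
As an illustration of the notation, the sketch below splits a
``Modules linked in:`` line into module names and their annotations. The
helper names ``parse_modules`` and ``describe`` are hypothetical, and the
code does not interpret the individual taint letters; those are defined in
the tainted-kernels document.

```python
import re


def parse_modules(line):
    """Split a 'Modules linked in:' line into (name, flags) pairs."""
    body = line.split("Modules linked in:", 1)[1]
    # Drop the trailing "[last unloaded: ...]" note, if present.
    body = re.sub(r"\[last unloaded:[^\]]*\]", "", body)
    modules = []
    for token in body.split():
        m = re.fullmatch(r"([\w-]+)(?:\(([^)]*)\))?", token)
        if m:
            modules.append((m.group(1), m.group(2) or ""))
    return modules


def describe(flags):
    """Separate the load state ("+" or "-") from the taint letters."""
    if "+" in flags:
        state = "being loaded"
    elif "-" in flags:
        state = "being unloaded"
    else:
        state = "loaded"
    taint = flags.replace("+", "").replace("-", "")
    return state, taint


line = ("Modules linked in: dvb_usb_gp8psk(-) dvb_usb dvb_core "
        "nvidia_drm(PO) [last unloaded: rc_core]")
for name, flags in parse_modules(line):
    print(name, *describe(flags))
```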


Where is the Oops message located?
----------------------------------

Normally the Oops text is read from the kernel buffers by ``klogd`` and
handed to ``syslogd``, which writes it to a syslog file, typically
``/var/log/messages`` (depending on ``/etc/syslog.conf``). On systems with
systemd, it may also be stored by the ``journald`` daemon and accessed
by running the ``journalctl`` command.

Sometimes ``klogd`` dies, in which case you can run ``dmesg > file`` to
read the data from the kernel buffers and save it. Alternatively, you can
run ``cat /proc/kmsg > file``; however, you have to interrupt it to stop the
transfer, since ``kmsg`` is a "never ending file".
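
Once the log has been saved to a file, the trace still has to be separated
from the surrounding messages. The sketch below assumes the trace is
delimited by the ``cut here`` and ``end trace`` markers seen in the first
example of this chapter; not every Oops carries both markers, so treat it
as a starting point only. ``extract_traces`` is a name chosen for this
example.

```python
def extract_traces(log_text):
    """Yield each block between a 'cut here' and an 'end trace' marker."""
    block = None
    for line in log_text.splitlines():
        if "[ cut here ]" in line:
            block = [line]                 # start a new trace
        elif block is not None:
            block.append(line)
            if "[ end trace" in line:
                yield "\n".join(block)     # trace is complete
                block = None


saved_log = "\n".join([
    "usb 1-1: new high-speed USB device",  # unrelated message (made up)
    "------------[ cut here ]------------",
    "WARNING: CPU: 1 PID: 28102 at kernel/module.c:1108 module_put+0x57/0x70",
    "---[ end trace 6ebc60ef3981792f ]---",
])
for trace in extract_traces(saved_log):
    print(trace)
```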

If the machine has crashed so badly that you cannot enter commands or
the disk is not available, then you have three options:

(1) Hand copy the text from the screen and type it in after the machine
    has restarted.  Messy, but it is the only option if you have not
    planned for a crash. Alternatively, you can take a picture of
    the screen with a digital camera - not nice, but better than
    nothing.  If the messages scroll off the top of the console, you
    may find that booting with a higher resolution (e.g., ``vga=791``)
    will allow you to read more of the text. (Caveat: This needs ``vesafb``,
    so won't help for 'early' oopses.)

(2) Boot with a serial console (see
    :ref:`Documentation/admin-guide/serial-console.rst <serial_console>`),
    run a null modem to a second machine and capture the output there
    using your favourite communication program.  Minicom works well.

(3) Use Kdump (see Documentation/admin-guide/kdump/kdump.rst) and
    extract the kernel ring buffer from old memory using the dmesg
    gdbmacro in Documentation/admin-guide/kdump/gdbmacros.txt.

Finding the bug's location
--------------------------

Reporting a bug works best if you point to the location of the bug in the
kernel source file. There are two methods for doing that. Usually, using
``gdb`` is easier, but the kernel must be compiled with debug info.

gdb
^^^

The GNU debugger (``gdb``) is the best way to figure out the exact file and line
number of the OOPS from the ``vmlinux`` file.

``gdb`` works best on a kernel compiled with ``CONFIG_DEBUG_INFO``.
This can be set by running::

  $ ./scripts/config -d COMPILE_TEST -e DEBUG_KERNEL -e DEBUG_INFO

On a kernel compiled with ``CONFIG_DEBUG_INFO``, you can simply copy the
EIP value from the OOPS::

 EIP:    0060:[<c021e50e>]    Not tainted VLI

And use GDB to translate that to human-readable form::

  $ gdb vmlinux
  (gdb) l *0xc021e50e

If you don't have ``CONFIG_DEBUG_INFO`` enabled, you can use the function
offset from the OOPS::

 EIP is at vt_ioctl+0xda8/0x1482

And recompile the kernel with ``CONFIG_DEBUG_INFO`` enabled::

  $ ./scripts/config -d COMPILE_TEST -e DEBUG_KERNEL -e DEBUG_INFO
  $ make vmlinux
  $ gdb vmlinux
  (gdb) l *vt_ioctl+0xda8
  0x1888 is in vt_ioctl (drivers/tty/vt/vt_ioctl.c:293).
  288	{
  289		struct vc_data *vc = NULL;
  290		int ret = 0;
  291
  292		console_lock();
  293		if (VT_BUSY(vc_num))
  294			ret = -EBUSY;
  295		else if (vc_num)
  296			vc = vc_deallocate(vc_num);
  297		console_unlock();

or, if you want to be more verbose::

  (gdb) p vt_ioctl
  $1 = {int (struct tty_struct *, unsigned int, unsigned long)} 0xae0 <vt_ioctl>
  (gdb) l *0xae0+0xda8
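
The two forms are equivalent because ``symbol+offset`` is plain address
arithmetic: the function's start address plus the offset reported in the
OOPS. The short sketch below redoes that sum with the values from the gdb
session above; ``resolve`` is a hypothetical helper, not part of any kernel
script.

```python
def resolve(symbol_offset, symbol_addresses):
    """Turn 'symbol+0xoff/0xsize' into an absolute address."""
    symbol, _, rest = symbol_offset.partition("+")
    offset = int(rest.split("/")[0], 16)   # ignore the "/0xsize" part
    return symbol_addresses[symbol] + offset


# 0xae0 is the address gdb printed for vt_ioctl in the session above.
addr = resolve("vt_ioctl+0xda8/0x1482", {"vt_ioctl": 0xae0})
print(hex(addr))  # 0x1888, the address gdb reported for line 293
```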

You could, instead, use the object file::

  $ make drivers/tty/
  $ gdb drivers/tty/vt/vt_ioctl.o
  (gdb) l *vt_ioctl+0xda8

If you have a call trace, such as::

     Call Trace:
      [<ffffffff8802c8e9>] :jbd:log_wait_commit+0xa3/0xf5
      [<ffffffff810482d9>] autoremove_wake_function+0x0/0x2e
      [<ffffffff8802770b>] :jbd:journal_stop+0x1be/0x1ee
      ...

this shows that the problem is likely in the :jbd: module. You can load that
module in gdb and list the relevant code::

  $ gdb fs/jbd/jbd.ko
  (gdb) l *log_wait_commit+0xa3
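
For longer traces it can help to generate one gdb invocation per frame. The
sketch below does that for the trace format shown above. It relies on gdb's
``-batch`` and ``-ex`` command-line options; the ``module_paths`` mapping
from module name to object file is an assumption that you would have to
fill in for your own build tree, and frames without a module prefix are
assumed to live in ``vmlinux``.

```python
import re

# Matches "[<addr>] :module:symbol+0xoff/0xsize"; the ":module:" part is optional.
TRACE_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?::(\w+):)?(\w+)\+(0x[0-9a-f]+)/0x[0-9a-f]+"
)


def gdb_commands(trace, module_paths):
    """Return one 'gdb -batch' command line per recognised frame."""
    commands = []
    for line in trace.splitlines():
        m = TRACE_RE.search(line)
        if m is None:
            continue                       # header, "...", or other text
        module, symbol, offset = m.groups()
        target = module_paths.get(module, "vmlinux")
        commands.append(f'gdb -batch -ex "l *{symbol}+{offset}" {target}')
    return commands


trace = """Call Trace:
 [<ffffffff8802c8e9>] :jbd:log_wait_commit+0xa3/0xf5
 [<ffffffff810482d9>] autoremove_wake_function+0x0/0x2e
 [<ffffffff8802770b>] :jbd:journal_stop+0x1be/0x1ee
"""
for command in gdb_commands(trace, {"jbd": "fs/jbd/jbd.ko"}):
    print(command)
```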

.. note::

     You can also do the same for any function call in the stack trace,
     like this one::

	 [<f80bc9ca>] ? dvb_usb_adapter_frontend_exit+0x3a/0x70 [dvb_usb]

     The position where the above call happened can be seen with::

	$ gdb drivers/media/usb/dvb-usb/dvb-usb.o
	(gdb) l *dvb_usb_adapter_frontend_exit+0x3a

objdump
^^^^^^^

To debug a kernel, use ``objdump`` and look for the hex offset from the crash
output to find the valid line of code/assembler. Without debug symbols, you
will see only the assembler code for the routine shown, but if your kernel has
debug symbols the C code will also be available. (Debug symbols can be enabled
in the kernel hacking menu of the menu configuration.) For example::

    $ objdump -r -S -l --disassemble net/dccp/ipv4.o

.. note::

   You need to be at the top level of the kernel tree for this to pick up
   your C files.

If you don't have access to the source code, you can still debug some crash
dumps using the following method (example crash dump output as shown by
Dave Miller)::

     EIP is at 	+0x14/0x4c0
      ...
     Code: 44 24 04 e8 6f 05 00 00 e9 e8 fe ff ff 8d 76 00 8d bc 27 00 00
     00 00 55 57  56 53 81 ec bc 00 00 00 8b ac 24 d0 00 00 00 8b 5d 08
     <8b> 83 3c 01 00 00 89 44  24 14 8b 45 28 85 c0 89 44 24 18 0f 85

     Put the bytes into a "foo.s" file like this:

            .text
            .globl foo
     foo:
            .byte  .... /* bytes from Code: part of OOPS dump */

     Compile it with "gcc -c -o foo.o foo.s" then look at the output of
     "objdump --disassemble foo.o".

     Output:

     ip_queue_xmit:
         push       %ebp
         push       %edi
         push       %esi
         push       %ebx
         sub        $0xbc, %esp
         mov        0xd0(%esp), %ebp        ! %ebp = arg0 (skb)
         mov        0x8(%ebp), %ebx         ! %ebx = skb->sk
         mov        0x13c(%ebx), %eax       ! %eax = inet_sk(sk)->opt

:file:`scripts/decodecode` can be used to automate most of this, depending
on what CPU architecture is being debugged.
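
If you want to build the "foo.s" file by hand rather than with
:file:`scripts/decodecode`, the byte list can be generated mechanically.
The sketch below is an illustration only: ``code_to_asm`` is a hypothetical
helper, and it assumes that the byte wrapped in ``<>`` marks where the
faulting instruction starts, which matches the last ``mov`` in the
disassembly above.

```python
def code_to_asm(code_line, label="foo"):
    """Build the text of a foo.s file from an Oops 'Code:' line.

    Returns the assembler text and the index of the byte that was
    wrapped in <> (or None if no byte was marked).
    """
    tokens = code_line.replace("Code:", "").split()
    fault_index = next(
        (i for i, tok in enumerate(tokens) if tok.startswith("<")), None
    )
    data = [tok.strip("<>") for tok in tokens]
    byte_list = ", ".join("0x" + b for b in data)
    asm = f"\t.text\n\t.globl {label}\n{label}:\n\t.byte {byte_list}\n"
    return asm, fault_index


# Tail of the Code: line from the example above.
asm, fault_index = code_to_asm("Code: 8b 5d 08 <8b> 83 3c 01 00 00")
print(asm)
print("marked byte starts at offset", fault_index)
```

Write the returned text to ``foo.s`` and continue with the ``gcc`` and
``objdump`` steps described above.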

Reporting the bug
-----------------

Once you have found where the bug happened by inspecting its location,
you can either try to fix it yourself or report it upstream.

In order to report it upstream, you should identify the mailing list
used for the development of the affected code. This can be done by using
the ``get_maintainer.pl`` script.

For example, if you find a bug in the gspca's sonixj.c file, you can get
its maintainers with::

	$ ./scripts/get_maintainer.pl -f drivers/media/usb/gspca/sonixj.c
	Hans Verkuil <hverkuil@xs4all.nl> (odd fixer:GSPCA USB WEBCAM DRIVER,commit_signer:1/1=100%)
	Mauro Carvalho Chehab <mchehab@kernel.org> (maintainer:MEDIA INPUT INFRASTRUCTURE (V4L/DVB),commit_signer:1/1=100%)
	Tejun Heo <tj@kernel.org> (commit_signer:1/1=100%)
	Bhaktipriya Shridhar <bhaktipriya96@gmail.com> (commit_signer:1/1=100%,authored:1/1=100%,added_lines:4/4=100%,removed_lines:9/9=100%)
	linux-media@vger.kernel.org (open list:GSPCA USB WEBCAM DRIVER)
	linux-kernel@vger.kernel.org (open list)

Note that the output points to:

- The last developers that touched the source code (if this is done inside
  a git tree). In the above example, Tejun and Bhaktipriya (in this
  specific case, neither is really involved in the development of this file);
- The driver maintainer (Hans Verkuil);
- The subsystem maintainer (Mauro Carvalho Chehab);
- The driver and/or subsystem mailing list (linux-media@vger.kernel.org);
- The Linux Kernel mailing list (linux-kernel@vger.kernel.org).

Usually, the fastest way to have your bug fixed is to report it to the mailing
list used for the development of the code (linux-media ML), copying the
driver maintainer (Hans).
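
To make the role annotations easier to act on, the output can be sorted into
mailing lists and maintainer-type entries. The following is a rough sketch
based only on the role strings visible in the example output above
(``open list:``, ``maintainer:``, ``odd fixer:``); the function names are
made up, and other role strings that the script may print are not handled.

```python
import re

SAMPLE = """\
Hans Verkuil <hverkuil@xs4all.nl> (odd fixer:GSPCA USB WEBCAM DRIVER,commit_signer:1/1=100%)
Mauro Carvalho Chehab <mchehab@kernel.org> (maintainer:MEDIA INPUT INFRASTRUCTURE (V4L/DVB),commit_signer:1/1=100%)
Tejun Heo <tj@kernel.org> (commit_signer:1/1=100%)
linux-media@vger.kernel.org (open list:GSPCA USB WEBCAM DRIVER)
linux-kernel@vger.kernel.org (open list)
"""


def parse_get_maintainer(output):
    """Return (address, roles) tuples from get_maintainer.pl output."""
    entries = []
    for line in output.strip().splitlines():
        head, _, roles = line.strip().partition(" (")
        email = re.search(r"[\w.+-]+@[\w.-]+", head)
        if email:
            entries.append((email.group(0), roles.rstrip(")")))
    return entries


def pick_recipients(entries):
    """Separate subsystem/driver lists from maintainer-type entries."""
    lists = [a for a, roles in entries if roles.startswith("open list:")]
    people = [a for a, roles in entries
              if "maintainer:" in roles or "odd fixer:" in roles]
    return lists, people


lists, people = pick_recipients(parse_get_maintainer(SAMPLE))
print("lists: ", lists)
print("people:", people)
```

Which of the listed people to copy remains a judgment call; in the example
above the driver maintainer is the relevant one.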

If you are totally stumped as to whom to send the report, and
``get_maintainer.pl`` didn't provide you anything useful, send it to
linux-kernel@vger.kernel.org.

Thanks for your help in making Linux as stable as humanly possible.

Fixing the bug
--------------

If you know programming, you could help us by not only reporting the bug,
but also providing us with a solution. After all, open source is about
sharing what you do, and don't you want to be recognised for your genius?

If you decide to go this way, once you have worked out a fix please submit
it upstream.

Please do read
:ref:`Documentation/process/submitting-patches.rst <submittingpatches>` though
to help your code get accepted.


---------------------------------------------------------------------------

Notes on Oops tracing with ``klogd``
------------------------------------

In order to help Linus and the other kernel developers, there has been
substantial support incorporated into ``klogd`` for processing protection
faults.  In order to have full support for address resolution, at least
version 1.3-pl3 of the ``sysklogd`` package should be used.

When a protection fault occurs, the ``klogd`` daemon automatically
translates important addresses in the kernel log messages to their
symbolic equivalents.  This translated kernel message is then
forwarded through whatever reporting mechanism ``klogd`` is using.  The
protection fault message can be simply cut out of the message files
and forwarded to the kernel developers.

Two types of address resolution are performed by ``klogd``.  The first is
static translation and the second is dynamic translation.
Static translation uses the System.map file.
In order to do static translation, the ``klogd`` daemon
must be able to find a system map file at daemon initialization time.
See the ``klogd`` man page for information on how ``klogd`` searches for map
files.
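
The idea behind static translation can be sketched in a few lines: find the
last symbol in the map whose address is not greater than the address being
resolved, and report the difference as the offset. This is an illustration
of the principle only, not ``klogd``'s actual code. The two map entries
below are not taken from a real System.map; they were derived from the
trace at the top of this chapter by subtracting each frame's offset from
its address.

```python
import bisect

SAMPLE_MAP = """\
c109e850 T module_put
c109f5f0 T symbol_put_addr
"""


def load_system_map(text):
    """Parse 'address type name' lines into a sorted list of (addr, name)."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def lookup(symbols, address):
    """Return (name, offset) for the symbol containing 'address'."""
    starts = [addr for addr, _ in symbols]
    i = bisect.bisect_right(starts, address) - 1
    if i < 0:
        return None                      # address is below the first symbol
    base, name = symbols[i]
    return name, address - base


symbols = load_system_map(SAMPLE_MAP)
name, offset = lookup(symbols, 0xc109e8a7)
print(f"{name}+{offset:#x}")
```

Because the map only records where each symbol starts, an address past the
end of the last known function still resolves to that function; a real
translator needs the symbol sizes or the next symbol's address to reject
such cases.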

Dynamic address translation is important when kernel loadable modules
are being used.  Since memory for kernel modules is allocated from the
kernel's dynamic memory pools, there are no fixed locations for either
the start of the module or for functions and symbols in the module.

The kernel supports system calls which allow a program to determine
which modules are loaded and their location in memory.  Using these
system calls, the ``klogd`` daemon builds a symbol table which can be used
to debug a protection fault which occurs in a loadable kernel module.

At the very minimum, ``klogd`` will provide the name of the module which
generated the protection fault.  There may be additional symbolic
information available if the developer of the loadable module chose to
export symbol information from the module.

Since the kernel module environment can be dynamic, there must be a
mechanism for notifying the ``klogd`` daemon when a change in module
environment occurs.  There are command line options available which
allow ``klogd`` to signal the currently executing daemon that symbol
information should be refreshed.  See the ``klogd`` manual page for more
information.

A patch is included with the sysklogd distribution which modifies the
``modules-2.0.0`` package to automatically signal ``klogd`` whenever a module
is loaded or unloaded.  Applying this patch provides essentially
seamless support for debugging protection faults which occur with
kernel loadable modules.

The following is an example of a protection fault in a loadable module
processed by ``klogd``::

	Aug 29 09:51:01 blizard kernel: Unable to handle kernel paging request at virtual address f15e97cc
	Aug 29 09:51:01 blizard kernel: current->tss.cr3 = 0062d000, %cr3 = 0062d000
	Aug 29 09:51:01 blizard kernel: *pde = 00000000
	Aug 29 09:51:01 blizard kernel: Oops: 0002
	Aug 29 09:51:01 blizard kernel: CPU:    0
	Aug 29 09:51:01 blizard kernel: EIP:    0010:[oops:_oops+16/3868]
	Aug 29 09:51:01 blizard kernel: EFLAGS: 00010212
	Aug 29 09:51:01 blizard kernel: eax: 315e97cc   ebx: 003a6f80   ecx: 001be77b   edx: 00237c0c
	Aug 29 09:51:01 blizard kernel: esi: 00000000   edi: bffffdb3   ebp: 00589f90   esp: 00589f8c
	Aug 29 09:51:01 blizard kernel: ds: 0018   es: 0018   fs: 002b   gs: 002b   ss: 0018
	Aug 29 09:51:01 blizard kernel: Process oops_test (pid: 3374, process nr: 21, stackpage=00589000)
	Aug 29 09:51:01 blizard kernel: Stack: 315e97cc 00589f98 0100b0b4 bffffed4 0012e38e 00240c64 003a6f80 00000001
	Aug 29 09:51:01 blizard kernel:        00000000 00237810 bfffff00 0010a7fa 00000003 00000001 00000000 bfffff00
	Aug 29 09:51:01 blizard kernel:        bffffdb3 bffffed4 ffffffda 0000002b 0007002b 0000002b 0000002b 00000036
	Aug 29 09:51:01 blizard kernel: Call Trace: [oops:_oops_ioctl+48/80] [_sys_ioctl+254/272] [_system_call+82/128]
	Aug 29 09:51:01 blizard kernel: Code: c7 00 05 00 00 00 eb 08 90 90 90 90 90 90 90 90 89 ec 5d c3

---------------------------------------------------------------------------

::

  Dr. G.W. Wettstein           Oncology Research Div. Computing Facility
  Roger Maris Cancer Center    INTERNET: greg@wind.rmcc.com
  820 4th St. N.
  Fargo, ND  58122
  Phone: 701-234-7556