What:		/dev/kmsg
Date:		May 2012
KernelVersion:	3.5
Contact:	Kay Sievers <kay@vrfy.org>
Description:	The /dev/kmsg character device node provides userspace access
		to the kernel's printk buffer.

		Injecting messages:

		Every write() to the opened device node places a log entry in
		the kernel's printk buffer.

		The logged line can be prefixed with a <N> syslog prefix, which
		carries the syslog priority and facility. The single decimal
		prefix number is composed of the 3 lowest bits being the syslog
		priority and the next 8 bits the syslog facility number.

		If no prefix is given, the priority number is the default kernel
		log priority and the facility number is set to LOG_USER (1). It
		is not possible to inject messages from userspace with the
		facility number LOG_KERN (0), to make sure that the origin of
		the messages can always be reliably determined.

		Accessing the buffer:

		Every read() from the opened device node receives one record
		of the kernel's printk buffer.

		The first read() directly following an open() always returns
		the first message in the buffer; there is no kernel-internal
		persistent state; many readers can concurrently open the device
		and read from it, without affecting other readers.

		Every read() will receive the next available record.
		If no more records are available, read() will block, or, if
		O_NONBLOCK is used, -EAGAIN is returned.

		Messages in the record ring buffer get overwritten as a whole;
		partial messages are never received by read().

		In case messages get overwritten in the circular buffer while
		the device is kept open, the next read() will return -EPIPE,
		and the seek position will be updated to the next available
		record. Subsequent read() calls will return available records
		again.

		Unlike the classic syslog() interface, the 64 bit record
		sequence numbers allow calculating the number of lost
		messages in case the buffer gets overwritten. They also allow
		reconnecting to the buffer and reconstructing the read position
		if needed, without limiting the interface to a single reader.

		The device supports seek with the following parameters:

		SEEK_SET, 0
		  seek to the first entry in the buffer
		SEEK_END, 0
		  seek after the last entry in the buffer
		SEEK_DATA, 0
		  seek after the last record available at the time
		  the last SYSLOG_ACTION_CLEAR was issued.

		Other seek operations or offsets are not supported because of
		the special behavior this device has. The device allows reading
		or writing only whole variable-length messages (records) that
		are stored in a ring buffer.
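
		The read semantics above can be sketched in userspace as
		follows. This is a minimal illustration, not a reference
		implementation: the helper names drain_records() and
		read_kmsg_snapshot() are hypothetical, and the 8192 byte read
		buffer size is an assumption rather than something this
		document specifies.

```python
import os


def drain_records(read_one):
    """Collect records until no more are available.

    read_one is any zero-argument callable that returns one record as
    bytes and raises the exceptions os.read() would raise on the device.
    Returns (records, overruns); overruns counts -EPIPE events.
    """
    records, overruns = [], 0
    while True:
        try:
            records.append(read_one())
        except BrokenPipeError:
            # -EPIPE: records were overwritten while the device was open.
            # The position has already been moved to the next available
            # record, so the next read simply continues from there.
            overruns += 1
        except BlockingIOError:
            # -EAGAIN: O_NONBLOCK is set and no more records are available.
            return records, overruns


def read_kmsg_snapshot(path="/dev/kmsg", bufsize=8192):
    """Read every record currently in the buffer without blocking.

    bufsize is an assumed value; it must be large enough for one record.
    """
    fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    try:
        # SEEK_SET, 0: start at the first entry in the buffer.
        os.lseek(fd, 0, os.SEEK_SET)
        return drain_records(lambda: os.read(fd, bufsize))
    finally:
        os.close(fd)
```

		Calling read_kmsg_snapshot() requires permission to open the
		device. Passing the read function in as a callable keeps the
		loop itself testable without access to /dev/kmsg.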

		Because of the non-standard behavior, the error values are also
		non-standard. -ESPIPE is returned for a non-zero offset. -EINVAL
		is returned for other operations, e.g. SEEK_CUR. This behavior
		and these values are historical and could not be modified
		without the risk of breaking userspace.

		The output format consists of a prefix carrying the syslog
		prefix including priority and facility, the 64 bit message
		sequence number and the monotonic timestamp in microseconds,
		and a flag field. All fields are separated by a ','.

		Future extensions might add more comma separated values before
		the terminating ';'. Unknown fields and values should be
		gracefully ignored.

		The human readable text string starts directly after the ';'
		and is terminated by a '\n'. Untrusted values derived from
		hardware or other facilities are printed, therefore
		all non-printable characters and '\' itself in the log message
		are escaped by "\x00" C-style hex encoding.

		A line starting with ' ' is a continuation line, adding
		key/value pairs to the log message, which provide the machine
		readable context of the message, for reliable processing in
		userspace.
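
		A consumer might decode one record along the following lines.
		This is a sketch under stated assumptions: parse_record() and
		unescape() are hypothetical helper names, the record is assumed
		to be already decoded to text, and each "\xNN" escape is
		decoded to a single character for simplicity.

```python
import re

_HEX_ESCAPE = re.compile(r"\\x([0-9a-fA-F]{2})")


def unescape(text):
    """Undo the "\\xNN" hex escaping (simplified: one character per escape)."""
    return _HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


def parse_record(record):
    """Split one /dev/kmsg record into its fields.

    Fields after the flags and before ';' are ignored, as the format
    allows future extensions there.
    """
    lines = record.rstrip("\n").split("\n")
    header, _, message = lines[0].partition(";")
    fields = header.split(",")
    prefix = int(fields[0])

    # Continuation lines start with a space and carry KEY=value pairs.
    # Values are kept verbatim in this sketch.
    context = {}
    for line in lines[1:]:
        if line.startswith(" "):
            key, _, value = line[1:].partition("=")
            context[key] = value

    return {
        "priority": prefix & 7,   # 3 lowest bits of the syslog prefix
        "facility": prefix >> 3,  # remaining bits
        "sequence": int(fields[1]),
        "timestamp_us": int(fields[2]),
        "flags": fields[3],
        "message": unescape(message),
        "context": context,
    }
```

		For a record whose prefix is 30, this yields priority 6 and
		facility 3. Comparing the sequence values of consecutive
		records gives the number of messages lost to an overrun.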

		Example::

		  7,160,424069,-;pci_root PNP0A03:00: host bridge window [io 0x0000-0x0cf7] (ignored)
		   SUBSYSTEM=acpi
		   DEVICE=+acpi:PNP0A03:00
		  6,339,5140900,-;NET: Registered protocol family 10
		  30,340,5690716,-;udevd[80]: starting version 181

		The DEVICE= key uniquely identifies devices the following way:

		  ============    =================
		  b12:8           block dev_t
		  c127:3          char dev_t
		  n8              netdev ifindex
		  +sound:card0    subsystem:devname
		  ============    =================

		The flags field carries '-' by default. A 'c' indicates a
		fragment of a line. Note that these hints about continuation
		lines are not necessarily correct, and the stream could be
		interleaved with unrelated messages, but merging the lines in
		the output usually produces better human readable results. A
		similar logic is used internally when messages are printed to
		the console, /proc/kmsg or the syslog() syscall.

		By default, the kernel tries to avoid fragments by
		concatenating when it can, and fragments are rare; however,
		when extended console support is enabled, the in-kernel
		concatenation is disabled and /dev/kmsg output will contain
		more fragments. If the log consumer performs concatenation,
		the end result should be the same.
		In the future, the in-kernel concatenation may be removed
		entirely and /dev/kmsg users are recommended to implement
		fragment handling.

Users:		dmesg(1), userspace kernel log consumers