.. SPDX-License-Identifier: GPL-2.0

==================================
relay interface (formerly relayfs)
==================================

The relay interface provides a means for kernel applications to
efficiently log and transfer large quantities of data from the kernel
to userspace via user-defined 'relay channels'.

A 'relay channel' is a kernel->user data relay mechanism implemented
as a set of per-cpu kernel buffers ('channel buffers'), each
represented as a regular file ('relay file') in user space.  Kernel
clients write into the channel buffers using efficient write
functions; these automatically log into the current cpu's channel
buffer.  User space applications mmap() or read() from the relay files
and retrieve the data as it becomes available.  The relay files
themselves are files created in a host filesystem, e.g. debugfs, and
are associated with the channel buffers using the API described below.

The format of the data logged into the channel buffers is completely
up to the kernel client; the relay interface does however provide
hooks which allow kernel clients to impose some structure on the
buffer data.  The relay interface doesn't implement any form of data
filtering - this also is left to the kernel client.  The purpose is to
keep things as simple as possible.

This document provides an overview of the relay interface API.  The
details of the function parameters are documented along with the
functions in the relay interface code - please see that for details.

Semantics
=========

Each relay channel has one buffer per CPU, and each buffer has one or
more sub-buffers. Messages are written to the first sub-buffer until it
is too full to contain a new message, in which case the message is
written to the next sub-buffer (if available). Messages are never split
across sub-buffers. At this point, userspace can be notified so it
empties the first sub-buffer, while the kernel continues writing to the
next.

When notified that a sub-buffer is full, the kernel knows how many
bytes of it are padding, i.e. unused space occurring because a complete
message couldn't fit into a sub-buffer. Userspace can use this
knowledge to copy only valid data.

After copying it, userspace can notify the kernel that a sub-buffer
has been consumed.

A relay channel can operate in a mode where it will overwrite data not
yet collected by userspace, and not wait for it to be consumed.

The relay channel itself does not provide for communication of such
data between userspace and kernel, allowing the kernel side to remain
simple and not impose a single interface on userspace. It does
provide a set of examples and a separate helper though, described
below.
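
The sub-buffer and padding rules described in this section can be
illustrated with a small user-space model. This is only a sketch and not
the kernel implementation: ``struct toy_buf``, ``toy_write()`` and the
tiny sizes are hypothetical names chosen for illustration, and the model
is linear (it does not wrap around or track consumption).

```c
#include <stddef.h>
#include <string.h>

#define SUBBUF_SIZE 16  /* hypothetical sub-buffer size, tiny on purpose */
#define N_SUBBUFS   4   /* hypothetical number of sub-buffers */

/* Toy user-space model of one channel buffer; not the kernel's rchan_buf. */
struct toy_buf {
    char   data[N_SUBBUFS][SUBBUF_SIZE];
    size_t padding[N_SUBBUFS]; /* unused bytes at the end of each sub-buffer */
    size_t cur;                /* index of the sub-buffer being written */
    size_t offset;             /* write offset inside that sub-buffer */
};

/*
 * Append one message without ever splitting it across sub-buffers.
 * Returns 0 on success, -1 if the message is larger than a sub-buffer
 * or if no further sub-buffer is available.
 */
static int toy_write(struct toy_buf *b, const void *msg, size_t len)
{
    if (len > SUBBUF_SIZE)
        return -1;

    if (b->offset + len > SUBBUF_SIZE) {
        if (b->cur + 1 == N_SUBBUFS)
            return -1;
        /* The rest of the current sub-buffer becomes padding. */
        b->padding[b->cur] = SUBBUF_SIZE - b->offset;
        b->cur++;
        b->offset = 0;
    }

    memcpy(&b->data[b->cur][b->offset], msg, len);
    b->offset += len;
    return 0;
}
```

In this model, writing a 10-byte message followed by an 8-byte message
leaves 6 bytes of padding in the first sub-buffer, and the second
message starts at the beginning of the next one.
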

The read() interface both removes padding and internally consumes the
read sub-buffers; thus in cases where read(2) is being used to drain
the channel buffers, special-purpose communication between kernel and
user isn't necessary for basic operation.

One of the major goals of the relay interface is to provide a low
overhead mechanism for conveying kernel data to userspace.  While the
read() interface is easy to use, it's not as efficient as the mmap()
approach; the example code attempts to make the tradeoff between the
two approaches as small as possible.

klog and relay-apps example code
================================

The relay interface itself is ready to use, but to make things easier,
a couple simple utility functions and a set of examples are provided.

The relay-apps example tarball, available on the relay sourceforge
site, contains a set of self-contained examples, each consisting of a
pair of .c files containing boilerplate code for each of the user and
kernel sides of a relay application.  When combined these two sets of
boilerplate code provide glue to easily stream data to disk, without
having to bother with mundane housekeeping chores.

The 'klog debugging functions' patch (klog.patch in the relay-apps
tarball) provides a couple of high-level logging functions to the
kernel which allow writing formatted text or raw data to a channel,
regardless of whether a channel to write into exists or not, or even
whether the relay interface is compiled into the kernel or not.  These
functions allow you to put unconditional 'trace' statements anywhere
in the kernel or kernel modules; only when there is a 'klog handler'
registered will data actually be logged (see the klog and kleak
examples for details).

It is of course possible to use the relay interface from scratch,
i.e. without using any of the relay-apps example code or klog, but
you'll have to implement communication between userspace and kernel,
allowing both to convey the state of buffers (full, empty, amount of
padding).  The read() interface both removes padding and internally
consumes the read sub-buffers; thus in cases where read(2) is being
used to drain the channel buffers, special-purpose communication
between kernel and user isn't necessary for basic operation.  Things
such as buffer-full conditions would still need to be communicated via
some channel though.
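
As a rough illustration of the read(2) approach just described, a
user-space consumer might drain a relay file with a plain read loop.
This is a minimal sketch, not code from the relay-apps tarball:
``drain_relay_file()`` is a hypothetical helper, and the path passed to
it (for instance a per-cpu file under a mounted debugfs) is an
assumption that depends on how the kernel client created its channel.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Copy everything currently readable from the file at `path` to `out_fd`.
 * Returns the number of bytes copied, or -1 on error.
 */
static long drain_relay_file(const char *path, int out_fd)
{
    char chunk[4096];
    long total = 0;
    ssize_t n;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    /* read() returning 0 means nothing more is available right now. */
    while ((n = read(fd, chunk, sizeof(chunk))) != 0) {
        ssize_t off = 0;

        if (n < 0) {
            if (errno == EINTR)
                continue;
            close(fd);
            return -1;
        }

        /* write() may be partial, so loop until the chunk is flushed. */
        while (off < n) {
            ssize_t w = write(out_fd, chunk + off, (size_t)(n - off));

            if (w < 0) {
                if (errno == EINTR)
                    continue;
                close(fd);
                return -1;
            }
            off += w;
        }
        total += n;
    }

    close(fd);
    return total;
}
```

A long-running consumer would typically wait with poll() between calls
rather than reopening the file each time; the helper above only shows
the draining step.
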

klog and the relay-apps examples can be found in the relay-apps
tarball on http://relayfs.sourceforge.net

The relay interface user space API
==================================

The relay interface implements basic file operations for user space
access to relay channel buffer data.  Here are the file operations
that are available and some comments regarding their behavior:

=========== ============================================================
open()      enables user to open an _existing_ channel buffer.

mmap()      results in channel buffer being mapped into the caller's
            memory space. Note that you can't do a partial mmap - you
            must map the entire file, which is NRBUF * SUBBUFSIZE.

read()      read the contents of a channel buffer.  The bytes read are
            'consumed' by the reader, i.e. they won't be available
            again to subsequent reads.  If the channel is being used
            in no-overwrite mode (the default), it can be read at any
            time even if there's an active kernel writer.  If the
            channel is being used in overwrite mode and there are
            active channel writers, results may be unpredictable -
            users should make sure that all logging to the channel has
            ended before using read() with overwrite mode.  Sub-buffer
            padding is automatically removed and will not be seen by
            the reader.

sendfile()  transfer data from a channel buffer to an output file
            descriptor. Sub-buffer padding is automatically removed
            and will not be seen by the reader.

poll()      POLLIN/POLLRDNORM/POLLERR supported.  User applications are
            notified when sub-buffer boundaries are crossed.

close()     decrements the channel buffer's refcount.  When the refcount
            reaches 0, i.e. when no process or kernel client has the
            buffer open, the channel buffer is freed.
=========== ============================================================

In order for a user application to make use of relay files, the
host filesystem must be mounted.  For example::

    mount -t debugfs debugfs /sys/kernel/debug

.. Note::

    the host filesystem doesn't need to be mounted for kernel
    clients to create or use channels - it only needs to be
    mounted when user space applications need access to the buffer
    data.


The relay interface kernel API
==============================

Here's a summary of the API the relay interface provides to in-kernel clients:

TBD(curr. line MT:/API/)
  channel management functions::

    relay_open(base_filename, parent, subbuf_size, n_subbufs,
               callbacks, private_data)
    relay_close(chan)
    relay_flush(chan)
    relay_reset(chan)

  channel management typically called on instigation of userspace::

    relay_subbufs_consumed(chan, cpu, subbufs_consumed)

  write functions::

    relay_write(chan, data, length)
    __relay_write(chan, data, length)
    relay_reserve(chan, length)

  callbacks::

    subbuf_start(buf, subbuf, prev_subbuf, prev_padding)
    buf_mapped(buf, filp)
    buf_unmapped(buf, filp)
    create_buf_file(filename, parent, mode, buf, is_global)
    remove_buf_file(dentry)

  helper functions::

    relay_buf_full(buf)
    subbuf_start_reserve(buf, length)


Creating a channel
------------------

relay_open() is used to create a channel, along with its per-cpu
channel buffers.  Each channel buffer will have an associated file
created for it in the host filesystem, which can be mmapped or
read from in user space.
The files are named basename0...basenameN-1
where N is the number of online cpus, and by default will be created
in the root of the filesystem (if the parent param is NULL).  If you
want a directory structure to contain your relay files, you should
create it using the host filesystem's directory creation function,
e.g. debugfs_create_dir(), and pass the parent directory to
relay_open().  Users are responsible for cleaning up any directory
structure they create when the channel is closed - again the host
filesystem's directory removal functions should be used for that,
e.g. debugfs_remove().

In order for a channel to be created and the host filesystem's files
associated with its channel buffers, the user must provide definitions
for two callback functions, create_buf_file() and remove_buf_file().
create_buf_file() is called once for each per-cpu buffer from
relay_open() and allows the user to create the file which will be used
to represent the corresponding channel buffer.  The callback should
return the dentry of the file created to represent the channel buffer.
remove_buf_file() must also be defined; it's responsible for deleting
the file(s) created in create_buf_file() and is called during
relay_close().

Here are some typical definitions for these callbacks, in this case
using debugfs::

    #include <linux/debugfs.h>
    #include <linux/relay.h>

    /*
     * create_buf_file() callback.  Creates relay file in debugfs.
     */
    static struct dentry *create_buf_file_handler(const char *filename,
                                                  struct dentry *parent,
                                                  umode_t mode,
                                                  struct rchan_buf *buf,
                                                  int *is_global)
    {
            return debugfs_create_file(filename, mode, parent, buf,
                                       &relay_file_operations);
    }

    /*
     * remove_buf_file() callback.  Removes relay file from debugfs.
     */
    static int remove_buf_file_handler(struct dentry *dentry)
    {
            debugfs_remove(dentry);

            return 0;
    }

    /*
     * relay interface callbacks
     */
    static struct rchan_callbacks relay_callbacks =
    {
            .create_buf_file = create_buf_file_handler,
            .remove_buf_file = remove_buf_file_handler,
    };

And an example relay_open() invocation using them::

    chan = relay_open("cpu", NULL, SUBBUF_SIZE, N_SUBBUFS, &relay_callbacks, NULL);

If the create_buf_file() callback fails, or isn't defined, channel
creation and thus relay_open() will fail.

The total size of each per-cpu buffer is calculated by multiplying the
number of sub-buffers by the sub-buffer size passed into relay_open().
The idea behind sub-buffers is that they're basically an extension of
double-buffering to N buffers, and they also allow applications to
easily implement random-access-on-buffer-boundary schemes, which can
be important for some high-volume applications.  The number and size
of sub-buffers is completely dependent on the application and even for
the same application, different conditions will warrant different
values for these parameters at different times.  Typically, the right
values to use are best decided after some experimentation; in general,
though, it's safe to assume that having only 1 sub-buffer is a bad
idea - you're guaranteed to either overwrite data or lose events
depending on the channel mode being used.

The create_buf_file() implementation can also be defined in such a way
as to allow the creation of a single 'global' buffer instead of the
default per-cpu set.  This can be useful for applications interested
mainly in seeing the relative ordering of system-wide events without
the need to bother with saving explicit timestamps for the purpose of
merging/sorting per-cpu files in a postprocessing step.

To have relay_open() create a global buffer, the create_buf_file()
implementation should set the value of the is_global outparam to a
non-zero value in addition to creating the file that will be used to
represent the single buffer.  In the case of a global buffer,
create_buf_file() and remove_buf_file() will be called only once.
The normal channel-writing functions, e.g. relay_write(), can still be
used - writes from any cpu will transparently end up in the global
buffer - but since it is a global buffer, callers should make sure
they use the proper locking for such a buffer, either by wrapping
writes in a spinlock, or by copying a write function from relay.h and
creating a local version that internally does the proper locking.

The private_data passed into relay_open() allows clients to associate
user-defined data with a channel, and is immediately available
(including in create_buf_file()) via chan->private_data or
buf->chan->private_data.

Buffer-only channels
--------------------

These channels have no files associated and can be created with
relay_open(NULL, NULL, ...).  Such channels are useful in scenarios such
as early tracing in the kernel, before the VFS is up.  In these
cases, one may open a buffer-only channel and then call
relay_late_setup_files() when the kernel is ready to handle files,
to expose the buffered data to userspace.

Channel 'modes'
---------------

Relay channels can be used in either of two modes - 'overwrite' or
'no-overwrite'.  The mode is entirely determined by the implementation
of the subbuf_start() callback, as described below.  The default if no
subbuf_start() callback is defined is 'no-overwrite' mode.
If the default mode suits your needs, and you plan to use the read()
interface to retrieve channel data, you can ignore the details of this
section, as it pertains mainly to mmap() implementations.

In 'overwrite' mode, also known as 'flight recorder' mode, writes
continuously cycle around the buffer and will never fail, but will
unconditionally overwrite old data regardless of whether it's actually
been consumed.  In no-overwrite mode, writes will fail, i.e. data will
be lost, if the number of unconsumed sub-buffers equals the total
number of sub-buffers in the channel.  It should be clear that if
there is no consumer or if the consumer can't consume sub-buffers fast
enough, data will be lost in either case; the only difference is
whether data is lost from the beginning or the end of a buffer.

As explained above, a relay channel is made up of one or more
per-cpu channel buffers, each implemented as a circular buffer
subdivided into one or more sub-buffers.  Messages are written into
the current sub-buffer of the channel's current per-cpu buffer via the
write functions described below.  Whenever a message can't fit into
the current sub-buffer, because there's no room left for it, the
client is notified via the subbuf_start() callback that a switch to a
new sub-buffer is about to occur.
The client uses this callback to 1) initialize the next sub-buffer if
appropriate, 2) finalize the previous sub-buffer if appropriate, and
3) return a boolean value indicating whether or not to actually move
on to the next sub-buffer.

To implement 'no-overwrite' mode, the kernel client would provide
an implementation of the subbuf_start() callback something like the
following::

    static int subbuf_start(struct rchan_buf *buf,
                            void *subbuf,
                            void *prev_subbuf,
                            size_t prev_padding)
    {
            /* Store the finished sub-buffer's padding count in its header. */
            if (prev_subbuf)
                    *((unsigned *)prev_subbuf) = prev_padding;

            /* Every sub-buffer is still unconsumed: refuse the switch. */
            if (relay_buf_full(buf))
                    return 0;

            /* Reserve header room in the new sub-buffer for its padding count. */
            subbuf_start_reserve(buf, sizeof(unsigned int));

            return 1;
    }

If the current buffer is full, i.e. all sub-buffers remain unconsumed,
the callback returns 0 to indicate that the buffer switch should not
occur yet, i.e. until the consumer has had a chance to read the
current set of ready sub-buffers.  For the relay_buf_full() function
to make sense, the consumer is responsible for notifying the relay
interface when sub-buffers have been consumed via
relay_subbufs_consumed().
Any subsequent attempts to write into the
buffer will again invoke the subbuf_start() callback with the same
parameters; only when the consumer has consumed one or more of the
ready sub-buffers will relay_buf_full() return 0, in which case the
buffer switch can continue.

The implementation of the subbuf_start() callback for 'overwrite' mode
would be very similar::

    static int subbuf_start(struct rchan_buf *buf,
                            void *subbuf,
                            void *prev_subbuf,
                            size_t prev_padding)
    {
            /* Store the finished sub-buffer's padding count in its header. */
            if (prev_subbuf)
                    *((unsigned *)prev_subbuf) = prev_padding;

            /* Reserve header room in the new sub-buffer for its padding count. */
            subbuf_start_reserve(buf, sizeof(unsigned int));

            /* Always switch, even if that overwrites unconsumed data. */
            return 1;
    }

In this case, the relay_buf_full() check is meaningless and the
callback always returns 1, causing the buffer switch to occur
unconditionally.  It's also meaningless for the client to use the
relay_subbufs_consumed() function in this mode, as it's never
consulted.

The default subbuf_start() implementation, used if the client doesn't
define any callbacks, or doesn't define the subbuf_start() callback,
implements the simplest possible 'no-overwrite' mode, i.e. it does
nothing but return 0.
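
On the consumer side, an mmap()-based reader that relies on the
subbuf_start() examples above could recover the valid bytes of a
completed sub-buffer from the padding count stored in its first bytes.
The following is only a sketch under that assumption:
``subbuf_payload()`` is a hypothetical user-space helper, and the layout
(an unsigned int padding count, then payload, then padding) holds only
if the kernel client actually writes such a header.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the number of valid payload bytes in one completed sub-buffer
 * and set *payload to where they start.  Assumed layout: an unsigned int
 * padding count at offset 0, then the payload, then `padding` unused
 * bytes.  Returns 0 with *payload == NULL if the header is implausible.
 */
static size_t subbuf_payload(const void *subbuf, size_t subbuf_size,
                             const void **payload)
{
    unsigned int padding;

    *payload = NULL;
    if (subbuf_size < sizeof(padding))
        return 0;

    /* memcpy avoids alignment assumptions about the mapped address. */
    memcpy(&padding, subbuf, sizeof(padding));
    if (padding > subbuf_size - sizeof(padding))
        return 0;

    *payload = (const char *)subbuf + sizeof(padding);
    return subbuf_size - sizeof(padding) - padding;
}
```

The header of a sub-buffer is only filled in when the following
sub-buffer is started, so this applies to completed sub-buffers only.
The reader still has to learn which sub-buffers are ready and report
consumption back to the kernel client through whatever channel that
client provides.
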

Header information can be reserved at the beginning of each sub-buffer
by calling the subbuf_start_reserve() helper function from within the
subbuf_start() callback.  This reserved area can be used to store
whatever information the client wants.  In the example above, room is
reserved in each sub-buffer to store the padding count for that
sub-buffer.  This is filled in for the previous sub-buffer in the
subbuf_start() implementation; the padding value for the previous
sub-buffer is passed into the subbuf_start() callback along with a
pointer to the previous sub-buffer, since the padding value isn't
known until a sub-buffer is filled.  The subbuf_start() callback is
also called for the first sub-buffer when the channel is opened, to
give the client a chance to reserve space in it.  In this case the
previous sub-buffer pointer passed into the callback will be NULL, so
the client should check the value of the prev_subbuf pointer before
writing into the previous sub-buffer.

Writing to a channel
--------------------

Kernel clients write data into the current cpu's channel buffer using
relay_write() or __relay_write().  relay_write() is the main logging
function - it uses local_irq_save() to protect the buffer and should be
used if you might be logging from interrupt context.  If you know
you'll never be logging from interrupt context, you can use
__relay_write(), which only disables preemption.
These functions
don't return a value, so you can't determine whether or not they
failed - the assumption is that you wouldn't want to check a return
value in the fast logging path anyway, and that they'll always succeed
unless the buffer is full and no-overwrite mode is being used, in
which case you can detect a failed write in the subbuf_start()
callback by calling the relay_buf_full() helper function.

relay_reserve() is used to reserve a slot in a channel buffer which
can be written to later.  This would typically be used in applications
that need to write directly into a channel buffer without having to
stage data in a temporary buffer beforehand.  Because the actual write
may not happen immediately after the slot is reserved, applications
using relay_reserve() can keep a count of the number of bytes actually
written, either in space reserved in the sub-buffers themselves or as
a separate array.  See the 'reserve' example in the relay-apps tarball
at http://relayfs.sourceforge.net for an example of how this can be
done.  Because the write is under control of the client and is
separated from the reserve, relay_reserve() doesn't protect the buffer
at all - it's up to the client to provide the appropriate
synchronization when using relay_reserve().

Closing a channel
-----------------

The client calls relay_close() when it's finished using the channel.
The channel and its associated buffers are destroyed when there are no
longer any references to any of the channel buffers.  relay_flush()
forces a sub-buffer switch on all the channel buffers, and can be used
to finalize and process the last sub-buffers before the channel is
closed.

Misc
----

Some applications may want to keep a channel around and re-use it
rather than open and close a new channel for each use.  relay_reset()
can be used for this purpose - it resets a channel to its initial
state without reallocating channel buffer memory or destroying
existing mappings.  It should however only be called when it's safe to
do so, i.e. when the channel isn't currently being written to.

Finally, there are a couple of utility callbacks that can be used for
different purposes.  buf_mapped() is called whenever a channel buffer
is mmapped from user space and buf_unmapped() is called when it's
unmapped.  The client can use this notification to trigger actions
within the kernel application, such as enabling/disabling logging to
the channel.


Resources
=========

For news, example code, mailing list, etc. see the relay interface homepage:

    http://relayfs.sourceforge.net


Credits
=======

The ideas and specs for the relay interface came about as a result of
discussions on tracing involving the following:

Michel Dagenais		<michel.dagenais@polymtl.ca>
Richard Moore		<richardj_moore@uk.ibm.com>
Bob Wisniewski		<bob@watson.ibm.com>
Karim Yaghmour		<karim@opersys.com>
Tom Zanussi		<zanussi@us.ibm.com>

Also thanks to Hubertus Franke for a lot of useful suggestions and bug
reports.