====================
TCM Userspace Design
====================


.. Contents:

   1) Design
     a) Background
     b) Benefits
     c) Design constraints
     d) Implementation overview
        i. Mailbox
        ii. Command ring
        iii. Data Area
     e) Device discovery
     f) Device events
     g) Other contingencies
   2) Writing a user pass-through handler
     a) Discovering and configuring TCMU uio devices
     b) Waiting for events on the device(s)
     c) Managing the command ring
   3) A final note


Design
======

TCM is another name for LIO, an in-kernel iSCSI target (server).
Existing TCM targets run in the kernel.  TCMU (TCM in Userspace)
allows userspace programs to be written which act as iSCSI targets.
This document describes the design.

The existing kernel provides modules for different SCSI transport
protocols.  TCM also modularizes the data storage.  There are existing
modules for file, block device, RAM or using another SCSI device as
storage.  These are called "backstores" or "storage engines".  These
built-in modules are implemented entirely as kernel code.

Background
----------

In addition to modularizing the transport protocol used for carrying
SCSI commands ("fabrics"), the Linux kernel target, LIO, also modularizes
the actual data storage. These are referred to as "backstores"
or "storage engines". The target comes with backstores that allow a
file, a block device, RAM, or another SCSI device to be used for the
local storage needed for the exported SCSI LUN. Like the rest of LIO,
these are implemented entirely as kernel code.

These backstores cover the most common use cases, but not all. One new
use case that other non-kernel target solutions, such as tgt, are able
to support is using Gluster's GLFS or Ceph's RBD as a backstore. The
target then serves as a translator, allowing initiators to store data
in these non-traditional networked storage systems, while still only
using standard protocols themselves.

If the target is a userspace process, supporting these is easy. tgt,
for example, needs only a small adapter module for each, because the
modules just use the available userspace libraries for RBD and GLFS.

Adding support for these backstores in LIO is considerably more
difficult, because LIO is entirely kernel code. Instead of undertaking
the significant work to port the GLFS or RBD APIs and protocols to the
kernel, another approach is to create a userspace pass-through
backstore for LIO, "TCMU".


Benefits
--------

In addition to allowing relatively easy support for RBD and GLFS, TCMU
will also allow easier development of new backstores. TCMU combines
with the LIO loopback fabric to become something similar to FUSE
(Filesystem in Userspace), but at the SCSI layer instead of the
filesystem layer. A SUSE, if you will.

The disadvantage is there are more distinct components to configure, and
potentially to malfunction. This is unavoidable, but hopefully not
fatal if we're careful to keep things as simple as possible.

Design constraints
------------------

- Good performance: high throughput, low latency
- Cleanly handle if userspace:

   1) never attaches
   2) hangs
   3) dies
   4) misbehaves

- Allow future flexibility in user & kernel implementations
- Be reasonably memory-efficient
- Simple to configure & run
- Simple to write a userspace backend


Implementation overview
-----------------------

The core of the TCMU interface is a memory region that is shared
between kernel and userspace. Within this region are: a control area
(mailbox); a lockless producer/consumer circular buffer for commands
to be passed up, and status returned; and an in/out data buffer area.

TCMU uses the pre-existing UIO subsystem. UIO allows device driver
development in userspace, and this is conceptually very close to the
TCMU use case, except instead of a physical device, TCMU implements a
memory-mapped layout designed for SCSI commands. Using UIO also
benefits TCMU by handling device introspection (e.g. a way for
userspace to determine how large the shared region is) and signaling
mechanisms in both directions.

There are no embedded pointers in the memory region. Everything is
expressed as an offset from the region's starting address. This allows
the ring to still work if the user process dies and is restarted with
the region mapped at a different virtual address.
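
As an illustration of the offset convention, a userspace handler might
resolve an offset against its own mapping roughly as sketched here. The
helper name ``region_ptr`` and its bounds check are assumptions made for
this example; they are not part of the TCMU interface.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: turn an offset stored in the shared region into a
 * pointer that is valid in this process's mapping. Returns NULL when the
 * range [off, off + len) does not fit inside the mapping.
 */
static void *region_ptr(void *map_base, size_t map_len, uint64_t off, size_t len)
{
    if (off > map_len || len > map_len - off)
        return NULL;
    return (uint8_t *)map_base + off;
}
```

Because only offsets are stored, the same call keeps working after a
restart even if mmap() returns a different base address.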

See target_core_user.h for the struct definitions.

The Mailbox
-----------

The mailbox is always at the start of the shared memory region, and
contains a version, details about the starting offset and size of the
command ring, and head and tail pointers to be used by the kernel and
userspace (respectively) to put commands on the ring, and indicate
when the commands are completed.

version - 1 (userspace should abort if otherwise)

flags:
    - TCMU_MAILBOX_FLAG_CAP_OOOC:
	indicates out-of-order completion is supported.
	See "The Command Ring" for details.

cmdr_off
	The offset of the start of the command ring from the start
	of the memory region, to account for the mailbox size.
cmdr_size
	The size of the command ring. This does *not* need to be a
	power of two.
cmd_head
	Modified by the kernel to indicate when a command has been
	placed on the ring.
cmd_tail
	Modified by userspace to indicate when it has completed
	processing of a command.

The Command Ring
----------------

Commands are placed on the ring by the kernel incrementing
mailbox.cmd_head by the size of the command, modulo cmdr_size, and
then signaling userspace via uio_event_notify(). Once the command is
completed, userspace updates mailbox.cmd_tail in the same way and
signals the kernel via a 4-byte write(). When cmd_head equals
cmd_tail, the ring is empty -- no commands are currently waiting to be
processed by userspace.
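
The index arithmetic in this paragraph can be sketched as follows. The
helper names ``ring_advance`` and ``ring_is_empty`` are hypothetical and
exist only for illustration; a real handler operates on the cmd_head,
cmd_tail and cmdr_size fields of the mailbox directly.

```c
#include <stdint.h>

/* Advance a ring index by one entry's length, wrapping at ring_size.
 * A plain modulo is used because ring_size need not be a power of two. */
static uint32_t ring_advance(uint32_t idx, uint32_t entry_len, uint32_t ring_size)
{
    return (idx + entry_len) % ring_size;
}

/* The ring is empty when the producer and consumer indices coincide. */
static int ring_is_empty(uint32_t head, uint32_t tail)
{
    return head == tail;
}
```

For example, with an arbitrary ring size of 1000 bytes, advancing index
936 by a 64-byte entry wraps back to 0.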

TCMU commands are 8-byte aligned. They start with a common header
containing "len_op", a 32-bit value that stores the length, as well as
the opcode in the lowest unused bits. It also contains cmd_id and
flags fields for setting by the kernel (kflags) and userspace
(uflags).
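
Because entry lengths are multiples of 8, the low three bits of a length
are always zero, which is what leaves room for the opcode. The sketch
below shows that packing with stand-in names. ``EX_OP_MASK`` and the
``ex_*`` helpers are assumptions for this example; real code should use
the mask and accessors (such as tcmu_hdr_get_op()) from
target_core_user.h rather than these.

```c
#include <stdint.h>

/* Assumption for this sketch: the opcode occupies the low 3 bits. */
#define EX_OP_MASK 0x7u

/* Combine an 8-byte-aligned length and an opcode into one 32-bit word. */
static uint32_t ex_pack_len_op(uint32_t len, uint32_t op)
{
    return (len & ~EX_OP_MASK) | (op & EX_OP_MASK);
}

/* Extract the opcode from the low bits. */
static uint32_t ex_get_op(uint32_t len_op)
{
    return len_op & EX_OP_MASK;
}

/* Extract the entry length by masking the opcode bits away. */
static uint32_t ex_get_len(uint32_t len_op)
{
    return len_op & ~EX_OP_MASK;
}
```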

Currently only two opcodes are defined, TCMU_OP_CMD and TCMU_OP_PAD.

When the opcode is CMD, the entry in the command ring is a struct
tcmu_cmd_entry. Userspace finds the SCSI CDB (Command Data Block) via
tcmu_cmd_entry.req.cdb_off. This is an offset from the start of the
overall shared memory region, not the entry. The data in/out buffers
are accessible via the req.iov[] array. iov_cnt contains the number of
entries in iov[] needed to describe either the Data-In or Data-Out
buffers. For bidirectional commands, iov_cnt specifies how many iovec
entries cover the Data-Out area, and iov_bidi_cnt specifies how many
iovec entries immediately after that in iov[] cover the Data-In
area. Just like other fields, iov.iov_base is an offset from the start
of the region.
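
To make the "iov_base is an offset" rule concrete, the sketch below
gathers the bytes described by an iovec array into a flat buffer. The
function name ``gather_from_region`` is hypothetical, and the sketch
assumes the offsets have already been checked against the mapping size;
a production handler would validate them first.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Copy the payload described by iov[0..iov_cnt) into dst.
 * Each iov_base is interpreted as an offset from map_base, not as a
 * pointer. Returns the number of bytes copied; stops when dst is full.
 */
static size_t gather_from_region(const void *map_base, const struct iovec *iov,
                                 size_t iov_cnt, void *dst, size_t dst_len)
{
    size_t copied = 0;

    for (size_t i = 0; i < iov_cnt && copied < dst_len; i++) {
        size_t off = (size_t)(uintptr_t)iov[i].iov_base;
        size_t n = iov[i].iov_len;

        if (n > dst_len - copied)
            n = dst_len - copied;
        memcpy((uint8_t *)dst + copied, (const uint8_t *)map_base + off, n);
        copied += n;
    }
    return copied;
}
```

For a bidirectional command, the same routine could be called once on
the first iov_cnt entries and once on the iov_bidi_cnt entries that
follow them.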

When completing a command, userspace sets rsp.scsi_status, and
rsp.sense_buffer if necessary. Userspace then increments
mailbox.cmd_tail by entry.hdr.length (mod cmdr_size) and signals the
kernel via the UIO method, a 4-byte write to the file descriptor.

If TCMU_MAILBOX_FLAG_CAP_OOOC is set in mailbox->flags, the kernel is
capable of handling out-of-order completions. In this case, userspace can
handle commands in an order different from the original one. Since the
kernel still processes the commands in the same order they appear in the
command ring, userspace needs to update cmd->id when completing the
command (in effect, stealing the original command's entry).

When the opcode is PAD, userspace only updates cmd_tail as above --
it's a no-op. (The kernel inserts PAD entries to ensure each CMD entry
is contiguous within the command ring.)

More opcodes may be added in the future. If userspace encounters an
opcode it does not handle, it must set the UNKNOWN_OP bit (bit 0) in
hdr.uflags, update cmd_tail, and proceed with processing additional
commands, if any.

The Data Area
-------------

This is shared-memory space after the command ring. The organization
of this area is not defined in the TCMU interface, and userspace
should access only the parts referenced by pending iovs.


Device Discovery
----------------

Other devices may be using UIO besides TCMU. Unrelated user processes
may also be handling different sets of TCMU devices. TCMU userspace
processes must find their devices by scanning sysfs
class/uio/uio*/name. For TCMU devices, these names will be of the
format::

	tcm-user/<hba_num>/<device_name>/<subtype>/<path>

where "tcm-user" is common for all TCMU-backed UIO devices. <hba_num>
and <device_name> allow userspace to find the device's path in the
kernel target's configfs tree. Assuming the usual mount point, it is
found at::

	/sys/kernel/config/target/core/user_<hba_num>/<device_name>

This location contains attributes, such as "hw_block_size", that
userspace needs to know for correct operation.

<subtype> will be a userspace-process-unique string to identify the
TCMU device as expecting to be backed by a certain handler, and <path>
will be an additional handler-specific string for the user process to
configure the device, if needed. The name cannot contain ':', due to
LIO limitations.
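
A handler could split a UIO name of this format into its components
along the following lines. The struct ``ex_uio_name``, the function
``ex_parse_uio_name`` and the buffer sizes are all assumptions chosen for
this sketch; the trailing <path> component is treated as optional.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the parsed name components. */
struct ex_uio_name {
    unsigned int hba_num;
    char device[64];
    char subtype[64];
    char path[128];
};

/*
 * Parse "tcm-user/<hba_num>/<device_name>/<subtype>/<path>".
 * Returns 0 on success, -1 if the name is not a TCMU-style name.
 */
static int ex_parse_uio_name(const char *name, struct ex_uio_name *out)
{
    int n;

    memset(out, 0, sizeof(*out));
    n = sscanf(name, "tcm-user/%u/%63[^/]/%63[^/]/%127[^\n]",
               &out->hba_num, out->device, out->subtype, out->path);
    return n >= 3 ? 0 : -1;
}
```

After a successful parse, comparing ``subtype`` against the handler's
own subtype string tells it whether the device is one it should open.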

For all devices so discovered, the user handler opens /dev/uioX and
calls mmap()::

	mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0)

where size must be equal to the value read from
/sys/class/uio/uioX/maps/map0/size.


Device Events
-------------

If a new device is added or removed, a notification will be broadcast
over netlink, using a generic netlink family name of "TCM-USER" and a
multicast group named "config". This will include the UIO name as
described in the previous section, as well as the UIO minor
number. This should allow userspace to identify both the UIO device and
the LIO device, so that after determining the device is supported
(based on subtype) it can take the appropriate action.


Other contingencies
-------------------

Userspace handler process never attaches:

- TCMU will post commands, and then abort them after a timeout period
  (30 seconds).

Userspace handler process is killed:

- It is still possible to restart and re-connect to TCMU
  devices. The command ring is preserved. However, after the timeout
  period, the kernel will abort pending tasks.

Userspace handler process hangs:

- The kernel will abort pending tasks after a timeout period.

Userspace handler process is malicious:

- The process can trivially break the handling of devices it controls,
  but should not be able to access kernel memory outside its shared
  memory areas.


Writing a user pass-through handler (with example code)
=======================================================

A user process handling a TCMU device must support the following:

a) Discovering and configuring TCMU uio devices
b) Waiting for events on the device(s)
c) Managing the command ring: Parsing operations and commands,
   performing work as needed, setting response fields (scsi_status and
   possibly sense_buffer), updating cmd_tail, and notifying the kernel
   that work has been finished

First, consider instead writing a plugin for tcmu-runner. tcmu-runner
implements all of this, and provides a higher-level API for plugin
authors.

TCMU is designed so that multiple unrelated processes can manage TCMU
devices separately. All handlers should make sure to only open their
devices, based upon a known subtype string.

a) Discovering and configuring TCMU UIO devices::

      /* error checking omitted for brevity */
      /* needs <fcntl.h>, <stdlib.h>, <string.h>, <unistd.h>, <sys/mman.h> */

      int fd, dev_fd, ret;
      char buf[256];
      unsigned long long map_len;
      void *map;

      fd = open("/sys/class/uio/uio0/name", O_RDONLY);
      ret = read(fd, buf, sizeof(buf));
      close(fd);
      buf[ret-1] = '\0'; /* null-terminate and chop off the \n */

      /* we only want uio devices whose name is a format we expect */
      if (strncmp(buf, "tcm-user", 8))
        exit(-1);

      /* Further checking for subtype also needed here */

      fd = open("/sys/class/uio/uio0/maps/map0/size", O_RDONLY);
      ret = read(fd, buf, sizeof(buf));
      close(fd);
      buf[ret-1] = '\0'; /* null-terminate and chop off the \n */

      map_len = strtoull(buf, NULL, 0);

      dev_fd = open("/dev/uio0", O_RDWR);
      map = mmap(NULL, map_len, PROT_READ|PROT_WRITE, MAP_SHARED, dev_fd, 0);


b) Waiting for events on the device(s)::

      while (1) {
        char buf[4];

        int ret = read(dev_fd, buf, 4); /* will block */

        handle_device_events(dev_fd, map);
      }


c) Managing the command ring::

      #include <stdbool.h>
      #include <stdint.h>
      #include <stdio.h>
      #include <unistd.h>
      #include <linux/target_core_user.h>

      int handle_device_events(int fd, void *map)
      {
        struct tcmu_mailbox *mb = map;
        struct tcmu_cmd_entry *ent = (void *) mb + mb->cmdr_off + mb->cmd_tail;
        int did_some_work = 0;

        /* Process events from cmd ring until we catch up with cmd_head */
        while (ent != (void *)mb + mb->cmdr_off + mb->cmd_head) {

          if (tcmu_hdr_get_op(ent->hdr.len_op) == TCMU_OP_CMD) {
            uint8_t *cdb = (void *)mb + ent->req.cdb_off;
            bool success = true;

            /* Handle command here. */
            printf("SCSI opcode: 0x%x\n", cdb[0]);

            /* Set response fields */
            if (success)
              ent->rsp.scsi_status = SCSI_NO_SENSE;
            else {
              /* Also fill in rsp->sense_buffer here */
              ent->rsp.scsi_status = SCSI_CHECK_CONDITION;
            }
          }
          else if (tcmu_hdr_get_op(ent->hdr.len_op) != TCMU_OP_PAD) {
            /* Tell the kernel we didn't handle unknown opcodes */
            ent->hdr.uflags |= TCMU_UFLAG_UNKNOWN_OP;
          }
          else {
            /* Do nothing for PAD entries except update cmd_tail */
          }

          /* update cmd_tail; the entry length is stored in hdr.len_op */
          mb->cmd_tail = (mb->cmd_tail + tcmu_hdr_get_len(ent->hdr.len_op)) % mb->cmdr_size;
          ent = (void *) mb + mb->cmdr_off + mb->cmd_tail;
          did_some_work = 1;
        }

        /* Notify the kernel that work has been finished */
        if (did_some_work) {
          uint32_t buf = 0;

          write(fd, &buf, 4);
        }

        return 0;
      }


A final note
============

Please be careful to return codes as defined by the SCSI
specifications. These are different from some values defined in the
scsi/scsi.h include file. For example, CHECK CONDITION's status code
is 2, not 1.
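
As a hedged illustration, the sketch below spells out the two status
values a simple handler is most likely to return and builds minimal
fixed-format sense data to accompany CHECK CONDITION. The ``EX_*`` names
and ``ex_fill_fixed_sense`` are invented for this example, and the byte
layout reflects the fixed sense format as commonly described for SPC;
verify it against the SCSI specifications before relying on it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Status byte values as defined by the SCSI specifications. */
enum {
    EX_STATUS_GOOD            = 0x00,
    EX_STATUS_CHECK_CONDITION = 0x02,
};

#define EX_FIXED_SENSE_LEN 18

/*
 * Fill buf with fixed-format sense data for a current error.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int ex_fill_fixed_sense(uint8_t *buf, size_t buf_len,
                               uint8_t key, uint8_t asc, uint8_t ascq)
{
    if (buf_len < EX_FIXED_SENSE_LEN)
        return -1;
    memset(buf, 0, buf_len);
    buf[0]  = 0x70;                   /* response code: current, fixed format */
    buf[2]  = key & 0x0f;             /* sense key */
    buf[7]  = EX_FIXED_SENSE_LEN - 8; /* additional sense length */
    buf[12] = asc;                    /* additional sense code */
    buf[13] = ascq;                   /* additional sense code qualifier */
    return 0;
}
```

For instance, an unsupported CDB opcode is conventionally reported with
sense key 0x05 (ILLEGAL REQUEST) and additional sense code 0x20, with
the status byte set to 2.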