.. SPDX-License-Identifier: GPL-2.0

=======
SCSI EH
=======

This document describes SCSI midlayer error handling infrastructure.
Please refer to Documentation/scsi/scsi_mid_low_api.rst for more
information regarding SCSI midlayer.

.. TABLE OF CONTENTS

   [1] How SCSI commands travel through the midlayer and to EH
       [1-1] struct scsi_cmnd
       [1-2] How do scmd's get completed?
           [1-2-1] Completing a scmd w/ scsi_done
           [1-2-2] Completing a scmd w/ timeout
       [1-3] Asynchronous command aborts
       [1-4] How EH takes over
   [2] How SCSI EH works
       [2-1] EH through fine-grained callbacks
           [2-1-1] Overview
           [2-1-2] Flow of scmds through EH
           [2-1-3] Flow of control
       [2-2] EH through transportt->eh_strategy_handler()
           [2-2-1] Pre transportt->eh_strategy_handler() SCSI midlayer conditions
           [2-2-2] Post transportt->eh_strategy_handler() SCSI midlayer conditions
           [2-2-3] Things to consider


1. How SCSI commands travel through the midlayer and to EH
==========================================================

1.1 struct scsi_cmnd
--------------------

Each SCSI command is represented with struct scsi_cmnd (== scmd).  A
scmd has two list_head's to link itself into lists.  The two are
scmd->list and scmd->eh_entry.
The former is used for free list or
per-device allocated scmd list and not of much interest to this EH
discussion.  The latter is used for completion and EH lists and unless
otherwise stated scmds are always linked using scmd->eh_entry in this
discussion.


1.2 How do scmd's get completed?
--------------------------------

Once an LLDD gets hold of a scmd, either the LLDD will complete the
command by calling the scsi_done callback passed from the midlayer when
invoking hostt->queuecommand() or the block layer will time it out.


1.2.1 Completing a scmd w/ scsi_done
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For all non-EH commands, scsi_done() is the completion callback.  It
just calls blk_complete_request() to delete the block layer timer and
raise SCSI_SOFTIRQ.

The SCSI_SOFTIRQ handler, scsi_softirq, calls scsi_decide_disposition(),
which looks at the scmd->result value and sense data to determine what
to do with the command.

 - SUCCESS

        scsi_finish_command() is invoked for the command.  The
        function does some maintenance chores and then calls
        scsi_io_completion() to finish the I/O.
        scsi_io_completion() then notifies the block layer of
        the completed request by calling blk_end_request and
        friends or figures out what to do with the remainder
        of the data in case of an error.

 - NEEDS_RETRY

 - ADD_TO_MLQUEUE

        scmd is requeued to blk queue.

 - otherwise

        scsi_eh_scmd_add(scmd) is invoked for the command.  See
        [1-4] for details of this function.


1.2.2 Completing a scmd w/ timeout
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The timeout handler is scsi_times_out().  When a timeout occurs, this
function

 1. invokes optional hostt->eh_timed_out() callback.  Return value can
    be one of

    - BLK_EH_RESET_TIMER
        This indicates that more time is required to finish the
        command.  Timer is restarted.  This action is counted as a
        retry and only allowed scmd->allowed + 1(!) times.  Once the
        limit is reached, action for BLK_EH_DONE is taken instead.

    - BLK_EH_DONE
        eh_timed_out() callback did not handle the command.
        Step #2 is taken.

 2. scsi_abort_command() is invoked to schedule an asynchronous abort.
    Asynchronous aborts are not scheduled for commands that have the
    SCSI_EH_ABORT_SCHEDULED flag set (this indicates that the command
    had already been aborted once, and this is a retry which failed),
    or when the EH deadline has expired.  In these cases Step #3 is
    taken.

 3. scsi_eh_scmd_add(scmd, SCSI_EH_CANCEL_CMD) is invoked for the
    command.  See [1-4] for more information.

1.3 Asynchronous command aborts
-------------------------------

 After a timeout occurs, a command abort is scheduled from
 scsi_abort_command().  If the abort is successful, the command
 will either be retried (if the number of retries is not exhausted)
 or terminated with DID_TIME_OUT.

 Otherwise, scsi_eh_scmd_add() is invoked for the command.
 See [1-4] for more information.

1.4 How EH takes over
---------------------

scmds enter EH via scsi_eh_scmd_add(), which does the following.

 1. Links scmd->eh_entry to shost->eh_cmd_q

 2. Sets SHOST_RECOVERY bit in shost->shost_state

 3. Increments shost->host_failed

 4. Wakes up SCSI EH thread if shost->host_busy == shost->host_failed

As can be seen above, once any scmd is added to shost->eh_cmd_q,
SHOST_RECOVERY shost_state bit is turned on.
This prevents any new
scmd from being issued from blk queue to the host; eventually, all
scmds on the host either complete normally, fail and get added to
eh_cmd_q, or time out and get added to shost->eh_cmd_q.

If all scmds either complete or fail, the number of in-flight scmds
becomes equal to the number of failed scmds - i.e. shost->host_busy ==
shost->host_failed.  This wakes up SCSI EH thread.  So, once woken up,
SCSI EH thread can expect that all in-flight commands have failed and
are linked on shost->eh_cmd_q.

Note that this does not mean lower layers are quiescent.  If a LLDD
completed a scmd with error status, the LLDD and lower layers are
assumed to forget about the scmd at that point.  However, if a scmd
has timed out, unless hostt->eh_timed_out() made lower layers forget
about the scmd, which currently no LLDD does, the command is still
active as far as lower layers are concerned and completion could
occur at any time.  Of course, all such completions are ignored as the
timer has already expired.

We'll talk about how SCSI EH takes actions to abort - make LLDD
forget about - timed out scmds later.


2. How SCSI EH works
====================

LLDD's can implement SCSI EH actions in one of the following two
ways.

 - Fine-grained EH callbacks
        LLDD can implement fine-grained EH callbacks and let SCSI
        midlayer drive error handling and call appropriate callbacks.
        This will be discussed further in [2-1].

 - eh_strategy_handler() callback
        This is one big callback which should perform the whole error
        handling.  As such, it should do all chores the SCSI midlayer
        performs during recovery.  This will be discussed in [2-2].

Once recovery is complete, SCSI EH resumes normal operation by
calling scsi_restart_operations(), which

 1. Checks if door locking is needed and locks door.

 2. Clears SHOST_RECOVERY shost_state bit

 3. Wakes up waiters on shost->host_wait.  This occurs if someone
    calls scsi_block_when_processing_errors() on the host.
    (*QUESTION* why is it needed?  All operations will be blocked
    anyway after it reaches blk queue.)

 4. Kicks the queues of all devices on the host


2.1 EH through fine-grained callbacks
-------------------------------------

2.1.1 Overview
^^^^^^^^^^^^^^

If eh_strategy_handler() is not present, SCSI midlayer takes charge
of driving error handling.  EH has two goals - make LLDD, host and
device forget about timed out scmds and make them ready for new
commands.
A scmd is said to be recovered if the scmd is forgotten by
lower layers and lower layers are ready to process or fail the scmd
again.

To achieve these goals, EH performs recovery actions with increasing
severity.  Some actions are performed by issuing SCSI commands and
others are performed by invoking one of the following fine-grained
hostt EH callbacks.  Callbacks may be omitted and omitted ones are
considered to fail always.

::

    int (* eh_abort_handler)(struct scsi_cmnd *);
    int (* eh_device_reset_handler)(struct scsi_cmnd *);
    int (* eh_bus_reset_handler)(struct scsi_cmnd *);
    int (* eh_host_reset_handler)(struct scsi_cmnd *);

Higher-severity actions are taken only when lower-severity actions
cannot recover some of the failed scmds.  Also, note that failure of
the highest-severity action means EH failure and results in offlining
of all unrecovered devices.

During recovery, the following rules are followed:

 - Recovery actions are performed on failed scmds on the to-do list,
   eh_work_q.  If a recovery action succeeds for a scmd, recovered
   scmds are removed from eh_work_q.

   Note that a single recovery action on a scmd can recover multiple
   scmds, e.g. resetting a device recovers all failed scmds on the
   device.

 - Higher severity actions are taken iff eh_work_q is not empty after
   lower severity actions are complete.

 - EH reuses failed scmds to issue commands for recovery.  For
   timed-out scmds, SCSI EH ensures that LLDD forgets about a scmd
   before reusing it for EH commands.

When a scmd is recovered, the scmd is moved from eh_work_q to EH
local eh_done_q using scsi_eh_finish_cmd().  After all scmds are
recovered (eh_work_q is empty), scsi_eh_flush_done_q() is invoked to
either retry or error-finish (notify upper layer of failure) recovered
scmds.

A scmd is retried iff its sdev is still online (not offlined during
EH), REQ_FAILFAST is not set and ++scmd->retries is less than
scmd->allowed.


2.1.2 Flow of scmds through EH
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 1. Error completion / time out

    :ACTION: scsi_eh_scmd_add() is invoked for scmd

        - add scmd to shost->eh_cmd_q
        - set SHOST_RECOVERY
        - shost->host_failed++

    :LOCKING: shost->host_lock

 2. EH starts

    :ACTION: move all scmds to EH's local eh_work_q.  shost->eh_cmd_q
             is cleared.

    :LOCKING: shost->host_lock (not strictly necessary, just for
             consistency)

 3. scmd recovered

    :ACTION: scsi_eh_finish_cmd() is invoked to EH-finish scmd

        - scsi_setup_cmd_retry()
        - move from local eh_work_q to local eh_done_q

    :LOCKING: none

    :CONCURRENCY: at most one thread per separate eh_work_q to
                  keep queue manipulation lockless

 4. EH completes

    :ACTION: scsi_eh_flush_done_q() retries scmds or notifies upper
             layer of failure.  May be called concurrently but must have
             no more than one thread per separate eh_work_q to
             manipulate the queue locklessly

             - scmd is removed from eh_done_q and scmd->eh_entry is cleared
             - if retry is necessary, scmd is requeued using
               scsi_queue_insert()
             - otherwise, scsi_finish_command() is invoked for scmd
             - zero shost->host_failed

    :LOCKING: queue or finish function performs appropriate locking


2.1.3 Flow of control
^^^^^^^^^^^^^^^^^^^^^^

 EH through fine-grained callbacks starts from scsi_unjam_host().

``scsi_unjam_host``

    1. Lock shost->host_lock, splice_init shost->eh_cmd_q into local
       eh_work_q and unlock host_lock.  Note that shost->eh_cmd_q is
       cleared by this action.

    2. Invoke scsi_eh_get_sense.

    ``scsi_eh_get_sense``

        This action is taken for each error-completed
        (!SCSI_EH_CANCEL_CMD) command without valid sense data.  Most
        SCSI transports/LLDDs automatically acquire sense data on
        command failures (autosense).  Autosense is recommended for
        performance reasons and as sense information could get out of
        sync between occurrence of CHECK CONDITION and this action.

        Note that if autosense is not supported, scmd->sense_buffer
        contains invalid sense data when error-completing the scmd
        with scsi_done().  scsi_decide_disposition() always returns
        FAILED in such cases thus invoking SCSI EH.  When the scmd
        reaches here, sense data is acquired and
        scsi_decide_disposition() is called again.

        1. Invoke scsi_request_sense() which issues REQUEST_SENSE
           command.  If it fails, no action is taken.  Note that
           taking no action causes higher-severity recovery to be
           taken for the scmd.

        2. Invoke scsi_decide_disposition() on the scmd

           - SUCCESS
                scmd->retries is set to scmd->allowed preventing
                scsi_eh_flush_done_q() from retrying the scmd and
                scsi_eh_finish_cmd() is invoked.

           - NEEDS_RETRY
                scsi_eh_finish_cmd() is invoked.

           - otherwise
                No action.

    3. If !list_empty(&eh_work_q), invoke scsi_eh_abort_cmds().

    ``scsi_eh_abort_cmds``

        This action is taken for each timed out command when
        no_async_abort is enabled in the host template.
        hostt->eh_abort_handler() is invoked for each scmd.  The
        handler returns SUCCESS if it has succeeded in making LLDD and
        all related hardware forget about the scmd.

        If a timed-out scmd is successfully aborted and the sdev is
        either offline or ready, scsi_eh_finish_cmd() is invoked for
        the scmd.  Otherwise, the scmd is left in eh_work_q for
        higher-severity actions.

        Note that both offline and ready status mean that the sdev is
        ready to process new scmds, where processing also implies
        immediate failing; thus, if a sdev is in one of the two
        states, no further recovery action is needed.

        Device readiness is tested using scsi_eh_tur() which issues
        TEST_UNIT_READY command.  Note that the scmd must have been
        aborted successfully before reusing it for TEST_UNIT_READY.

    4. If !list_empty(&eh_work_q), invoke scsi_eh_ready_devs()

    ``scsi_eh_ready_devs``

        This function takes four increasingly severe measures to
        make failed sdevs ready for new commands, then offlines
        whatever remains unrecovered.

        1. Invoke scsi_eh_stu()

        ``scsi_eh_stu``

            For each sdev which has failed scmds with valid sense data
            of which scsi_check_sense()'s verdict is FAILED,
            START_STOP_UNIT command is issued w/ start=1.  Note that
            as we explicitly choose error-completed scmds, it is known
            that lower layers have forgotten about the scmd and we can
            reuse it for STU.

            If STU succeeds and the sdev is either offline or ready,
            all failed scmds on the sdev are EH-finished with
            scsi_eh_finish_cmd().

            *NOTE* If hostt->eh_abort_handler() isn't implemented or
            failed, we may still have timed out scmds at this point
            and STU doesn't make lower layers forget about those
            scmds.  Yet, this function EH-finishes all scmds on the
            sdev if STU succeeds, leaving lower layers in an
            inconsistent state.  It seems that STU action should be
            taken only when a sdev has no timed out scmd.

        2. If !list_empty(&eh_work_q), invoke scsi_eh_bus_device_reset().

        ``scsi_eh_bus_device_reset``

            This action is very similar to scsi_eh_stu() except that,
            instead of issuing STU, hostt->eh_device_reset_handler()
            is used.  Also, as we're not issuing SCSI commands and
            resetting clears all scmds on the sdev, there is no need
            to choose error-completed scmds.

        3. If !list_empty(&eh_work_q), invoke scsi_eh_bus_reset()

        ``scsi_eh_bus_reset``

            hostt->eh_bus_reset_handler() is invoked for each channel
            with failed scmds.  If bus reset succeeds, all failed
            scmds on all ready or offline sdevs on the channel are
            EH-finished.

        4. If !list_empty(&eh_work_q), invoke scsi_eh_host_reset()

        ``scsi_eh_host_reset``

            This is the last resort.  hostt->eh_host_reset_handler()
            is invoked.  If host reset succeeds, all failed scmds on
            all ready or offline sdevs on the host are EH-finished.

        5. If !list_empty(&eh_work_q), invoke scsi_eh_offline_sdevs()

        ``scsi_eh_offline_sdevs``

            Take all sdevs which still have unrecovered scmds offline
            and EH-finish the scmds.

    5. Invoke scsi_eh_flush_done_q().

    ``scsi_eh_flush_done_q``

        At this point all scmds are recovered (or given up) and
        put on eh_done_q by scsi_eh_finish_cmd().  This function
        flushes eh_done_q by either retrying or notifying upper
        layer of failure of the scmds.


2.2 EH through transportt->eh_strategy_handler()
------------------------------------------------

transportt->eh_strategy_handler() is invoked in the place of
scsi_unjam_host() and it is responsible for the whole recovery process.
On completion, the handler should have made lower layers forget about
all failed scmds and left them either ready for new commands or
offline.  Also, it should perform SCSI EH maintenance chores to
maintain integrity of SCSI midlayer.  IOW, of the steps described in
[2-1-2], all steps except for #1 must be implemented by
eh_strategy_handler().


2.2.1 Pre transportt->eh_strategy_handler() SCSI midlayer conditions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 The following conditions are true on entry to the handler.

 - Each failed scmd's eh_flags field is set appropriately.

 - Each failed scmd is linked on shost->eh_cmd_q by scmd->eh_entry.

 - SHOST_RECOVERY is set.

 - shost->host_failed == shost->host_busy


2.2.2 Post transportt->eh_strategy_handler() SCSI midlayer conditions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 The following conditions must be true on exit from the handler.

 - shost->host_failed is zero.

 - Each scmd is in such a state that scsi_setup_cmd_retry() on the
   scmd doesn't make any difference.

 - shost->eh_cmd_q is cleared.

 - Each scmd->eh_entry is cleared.

 - Either scsi_queue_insert() or scsi_finish_command() is called on
   each scmd.  Note that the handler is free to use scmd->retries and
   ->allowed to limit the number of retries.


2.2.3 Things to consider
^^^^^^^^^^^^^^^^^^^^^^^^

 - Know that timed out scmds are still active on lower layers.  Make
   lower layers forget about them before doing anything else with
   those scmds.

 - For consistency, when accessing/modifying shost data structure,
   grab shost->host_lock.

 - On completion, each failed sdev must have forgotten about all
   active scmds.

 - On completion, each failed sdev must be ready for new commands or
   offline.


Tejun Heo
htejun@gmail.com

11th September 2005