Skip to content

Commit

Permalink
scsi: core: alua: I/O errors for ALUA state transitions
Browse files Browse the repository at this point in the history
[ Upstream commit 10157b1 ]

When a host is configured with a few LUNs and I/O is running, injecting FC
faults repeatedly leads to path recovery problems.  The LUNs have 4 paths
each and 3 of them come back active after say an FC fault which makes 2 of
the paths go down, instead of all 4. This happens after several iterations
of continuous FC faults.

Reason here is that we're returning an I/O error whenever we're
encountering sense code 06/04/0a (LOGICAL UNIT NOT ACCESSIBLE, ASYMMETRIC
ACCESS STATE TRANSITION) instead of retrying.

[mwilck: The original patch was developed by Rajashekhar M A and Hannes
Reinecke. I moved the code to alua_check_sense() as suggested by Mike
Christie [1]. Evan Milne had raised the question whether pg->state should
be set to transitioning in the UA case [2]. I believe that doing this is
correct. SCSI_ACCESS_STATE_TRANSITIONING by itself doesn't cause I/O
errors. Our handler schedules an RTPG, which will only result in an I/O
error condition if the transitioning timeout expires.]

[1] https://lore.kernel.org/all/[email protected]/
[2] https://lore.kernel.org/all/CAGtn9r=kicnTDE2o7Gt5Y=yoidHYD7tG8XdMHEBJTBraVEoOCw@mail.gmail.com/

Co-developed-by: Rajashekhar M A <[email protected]>
Co-developed-by: Hannes Reinecke <[email protected]>
Signed-off-by: Hannes Reinecke <[email protected]>
Signed-off-by: Martin Wilck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Damien Le Moal <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Mike Christie <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
  • Loading branch information
Martin Wilck authored and gregkh committed Jul 23, 2024
1 parent 712eb0a commit f668421
Showing 1 changed file with 22 additions and 9 deletions.
31 changes: 22 additions & 9 deletions drivers/scsi/device_handler/scsi_dh_alua.c
Original file line number Diff line number Diff line change
Expand Up @@ -414,28 +414,40 @@ static char print_alua_state(unsigned char state)
}
}

static enum scsi_disposition alua_check_sense(struct scsi_device *sdev,
struct scsi_sense_hdr *sense_hdr)
static void alua_handle_state_transition(struct scsi_device *sdev)
{
struct alua_dh_data *h = sdev->handler_data;
struct alua_port_group *pg;

rcu_read_lock();
pg = rcu_dereference(h->pg);
if (pg)
pg->state = SCSI_ACCESS_STATE_TRANSITIONING;
rcu_read_unlock();
alua_check(sdev, false);
}

static enum scsi_disposition alua_check_sense(struct scsi_device *sdev,
struct scsi_sense_hdr *sense_hdr)
{
switch (sense_hdr->sense_key) {
case NOT_READY:
if (sense_hdr->asc == 0x04 && sense_hdr->ascq == 0x0a) {
/*
* LUN Not Accessible - ALUA state transition
*/
rcu_read_lock();
pg = rcu_dereference(h->pg);
if (pg)
pg->state = SCSI_ACCESS_STATE_TRANSITIONING;
rcu_read_unlock();
alua_check(sdev, false);
alua_handle_state_transition(sdev);
return NEEDS_RETRY;
}
break;
case UNIT_ATTENTION:
if (sense_hdr->asc == 0x04 && sense_hdr->ascq == 0x0a) {
/*
* LUN Not Accessible - ALUA state transition
*/
alua_handle_state_transition(sdev);
return NEEDS_RETRY;
}
if (sense_hdr->asc == 0x29 && sense_hdr->ascq == 0x00) {
/*
* Power On, Reset, or Bus Device Reset.
Expand Down Expand Up @@ -502,7 +514,8 @@ static int alua_tur(struct scsi_device *sdev)

retval = scsi_test_unit_ready(sdev, ALUA_FAILOVER_TIMEOUT * HZ,
ALUA_FAILOVER_RETRIES, &sense_hdr);
if (sense_hdr.sense_key == NOT_READY &&
if ((sense_hdr.sense_key == NOT_READY ||
sense_hdr.sense_key == UNIT_ATTENTION) &&
sense_hdr.asc == 0x04 && sense_hdr.ascq == 0x0a)
return SCSI_DH_RETRY;
else if (retval)
Expand Down

0 comments on commit f668421

Please sign in to comment.