MMP: increase delay for busy pool before suspending
In mmp_thread(), if the mmp_delay is increasing beyond the fixed
mmp_interval, use the longer mmp_delay value when determining if
the pool should be suspended.  Otherwise, if the system is very
busy (but still getting MMP writes to disk) it may incorrectly
suspend the pool, even though a peer system would not import it.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
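
To make the effect of the change described above concrete, here is a minimal standalone C sketch of the threshold arithmetic. The helper names `fail_interval_old` and `fail_interval_new` are hypothetical and do not exist in mmp.c; plain parameters stand in for the kernel state, and the `assert` is an added guard that is not in the original code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: suspend threshold before this commit, based only
 * on the configured interval (all times in nanoseconds).
 */
static uint64_t
fail_interval_old(uint64_t mmp_interval_ns, uint64_t fail_intervals)
{
	return (mmp_interval_ns * fail_intervals);
}

/*
 * Hypothetical helper: suspend threshold after this commit. The base is
 * the larger of the configured interval and the observed per-write delay
 * scaled by the number of leaf vdevs.
 */
static uint64_t
fail_interval_new(uint64_t mmp_interval_ns, uint64_t mmp_delay_ns,
    int vdev_leaves, uint64_t fail_intervals)
{
	assert(vdev_leaves >= 1);	/* added guard, not in mmp.c */

	uint64_t observed = mmp_delay_ns * (uint64_t)vdev_leaves;
	uint64_t base = observed > mmp_interval_ns ?
	    observed : mmp_interval_ns;

	return (base * fail_intervals);
}
```

As an illustration with made-up numbers: for a 1 s interval, 5 fail intervals, 4 leaf vdevs, and an observed mmp_delay of 0.5 s, the old threshold is 5 s while the new one is 10 s, because 0.5 s * 4 exceeds the 1 s interval. When the observed delay is small (for example 0.1 s), both formulas give 5 s.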
Andreas Dilger authored and jhammond-intel committed Jan 19, 2018
1 parent 6bc4a23 commit 891b2e7
Showing 1 changed file with 14 additions and 15 deletions.
29 changes: 14 additions & 15 deletions module/zfs/mmp.c
@@ -249,7 +249,7 @@ mmp_write_done(zio_t *zio)
goto unlock;

/*
- * Mmp writes are queued on a fixed schedule, but under many
+ * MMP writes are queued on a fixed schedule, but under many
* circumstances, such as a busy device or faulty hardware,
* the writes will complete at variable, much longer,
* intervals. In these cases, another node checking for
@@ -269,8 +269,7 @@ mmp_write_done(zio_t *zio)
if (delay > mts->mmp_delay)
mts->mmp_delay = delay;
else
- mts->mmp_delay = (delay + mts->mmp_delay * 127) /
-     128;
+ mts->mmp_delay = (delay + mts->mmp_delay * 127) / 128;
} else {
mts->mmp_delay = 0;
}
@@ -381,17 +380,17 @@ mmp_thread(void *arg)
uint64_t mmp_fail_intervals = zfs_multihost_fail_intervals;
uint64_t mmp_interval = MSEC2NSEC(
MAX(zfs_multihost_interval, MMP_MIN_INTERVAL));
+ uint64_t fail_interval;
+ int vdev_leaves = MAX(vdev_count_leaves(spa), 1);
boolean_t suspended = spa_suspended(spa);
boolean_t multihost = spa_multihost(spa);
hrtime_t start, next_time;

start = gethrtime();
- if (multihost) {
- next_time = start + mmp_interval /
-     MAX(vdev_count_leaves(spa), 1);
- } else {
+ if (multihost)
+ next_time = start + mmp_interval / vdev_leaves;
+ else
  next_time = start + MSEC2NSEC(MMP_DEFAULT_INTERVAL);
- }

/*
* When MMP goes off => on, or spa goes suspended =>
@@ -417,16 +416,16 @@ mmp_thread(void *arg)
* immediately suspended before writes can occur at the new
* higher frequency.
*/
- if ((mmp_interval * mmp_fail_intervals) < max_fail_ns) {
- max_fail_ns = ((31 * max_fail_ns) + (mmp_interval *
-     mmp_fail_intervals)) / 32;
- } else {
- max_fail_ns = mmp_interval * mmp_fail_intervals;
- }
+ fail_interval = MAX(mmp_interval,
+     mmp->mmp_delay * vdev_leaves) * mmp_fail_intervals;
+ if (fail_interval < max_fail_ns)
+ max_fail_ns = ((31 * max_fail_ns) + fail_interval) / 32;
+ else
+ max_fail_ns = fail_interval;

/*
* Suspend the pool if no MMP write has succeeded in over
- * mmp_interval * mmp_fail_intervals nanoseconds.
+ * mmp_delay * vdev_leaves * mmp_fail_intervals nanoseconds.
*/
if (!suspended && mmp_fail_intervals && multihost &&
(start - mmp->mmp_last_write) > max_fail_ns) {
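
The two smoothing rules visible in the hunks above can be sketched in isolation as follows. This is only an illustration under stated assumptions: the function names `smooth_delay` and `update_max_fail_ns` are hypothetical, the surrounding conditions and locking in mmp.c are omitted, and the overflow `assert`s are additions that do not appear in the original.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the mmp_delay update in mmp_write_done():
 * a longer delay is adopted immediately, a shorter one only pulls the
 * stored value down with a weight of 1/128.
 */
static uint64_t
smooth_delay(uint64_t mmp_delay, uint64_t delay)
{
	assert(mmp_delay <= UINT64_MAX / 127);	/* added overflow guard */

	if (delay > mmp_delay)
		return (delay);
	return ((delay + mmp_delay * 127) / 128);
}

/*
 * Hypothetical helper mirroring the max_fail_ns update in mmp_thread():
 * a larger fail_interval is adopted immediately, a smaller one lowers
 * the threshold gradually with a weight of 1/32.
 */
static uint64_t
update_max_fail_ns(uint64_t max_fail_ns, uint64_t fail_interval)
{
	assert(max_fail_ns <= UINT64_MAX / 31);	/* added overflow guard */

	if (fail_interval < max_fail_ns)
		return ((31 * max_fail_ns + fail_interval) / 32);
	return (fail_interval);
}
```

Both rules are asymmetric in the same direction: the tracked value rises at once and falls slowly, so a brief return to fast writes does not immediately tighten the suspend threshold.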