mgr: relax osd ok-to-stop condition on degraded pgs
Right now, the "ok-to-stop" condition is relatively strict: it allows
stopping an OSD only if no PG on it is non-active or degraded. But there
are situations in which an OSD is part of a degraded PG and the PG would
still have at least min_size complete replicas after the OSD is stopped.

In 9750061, we changed from considering just acting to using
avail_no_missing (OSDs that have no missing objects). When the projected
pg_acting is constructed this way, we can safely compare its size to
min_size, even for a PG marked degraded.
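
To illustrate the relaxed check, here is a minimal, self-contained C++ sketch. It is not the code in DaemonServer.cc: `PgInfo` and `is_dangerous` are hypothetical names, the projected acting set is modeled as avail_no_missing minus the OSDs being stopped, and the missing-pool case is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical, simplified stand-in for the per-PG data the mgr inspects.
struct PgInfo {
  bool active;                     // PG_STATE_ACTIVE is set
  bool degraded;                   // PG_STATE_DEGRADED is set (no longer decisive)
  std::set<int> avail_no_missing;  // OSDs that have no missing objects
  std::size_t min_size;            // min_size of the PG's pool
};

// Returns true if stopping the OSDs in `to_stop` would leave this PG
// non-active or with fewer than min_size complete replicas.
bool is_dangerous(const PgInfo& pg, const std::set<int>& to_stop) {
  if (!pg.active) {
    return true;  // a non-active PG is always counted as dangerous
  }
  // Project the acting set: complete replicas that would remain up.
  std::size_t remaining = 0;
  for (int osd : pg.avail_no_missing) {
    if (to_stop.count(osd) == 0) {
      ++remaining;
    }
  }
  // The degraded flag is deliberately not consulted here.
  return remaining < pg.min_size;
}
```

For example, a degraded PG with min_size 2 whose complete copies are on OSDs 1 and 2 (OSD 3 still has missing objects) is not counted as dangerous when only OSD 3 is stopped, but it is when OSD 2 is stopped.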

Fixes: https://tracker.ceph.com/issues/49392
Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
xxhdx1985126 authored and liewegas committed Feb 20, 2021
1 parent df0b0b6 commit 2f28fc5
Showing 1 changed file with 11 additions and 10 deletions.
21 changes: 11 additions & 10 deletions src/mgr/DaemonServer.cc
@@ -1643,20 +1643,21 @@ bool DaemonServer::_handle_command(
       continue;
     }
     touched_pgs++;
-    if (!(q.second.state & PG_STATE_ACTIVE) ||
-        (q.second.state & PG_STATE_DEGRADED)) {
-      ++dangerous_pgs;
-      continue;
-    }
+
     const pg_pool_t *pi = osdmap.get_pg_pool(q.first.pool());
     if (!pi) {
       ++dangerous_pgs; // pool is creating or deleting
-    } else {
-      if (pg_acting.size() < pi->min_size) {
-        ++dangerous_pgs;
-      }
-    }
+      continue;
+    }
+
+    if (!(q.second.state & PG_STATE_ACTIVE)) {
+      ++dangerous_pgs;
+      continue;
+    }
+    if (pg_acting.size() < pi->min_size) {
+      ++dangerous_pgs;
+    }
   }
 });
 if (r) {
   cmdctx->reply(r, ss);