Commit
DLPX-81891 agent panicked in reclaim_frees_object(): blocks_size vs num_bytes mismatch (openzfs#605)

Whenever we do a PutObject, we use a timeout of `PER_REQUEST_TIMEOUT`, which is 2 seconds. The timeout is implemented via `tokio::time::timeout()`, which drops the Future if the timeout expires. See `object_access::retry()`.

When this happens, we have no visibility into the internal state of S3. In particular, we don't know whether the PutObject was ignored, actually completed (e.g. we timed out just as it was sending a response to us), or will complete at some point in the future.

The problem occurs when a later PutObject (e.g. due to reclaiming freed blocks) overwrites the same object with different data, and the first, timed-out PutObject is applied afterwards, overwriting the second PutObject's data. It looks as if the second PutObject had no effect, but in fact it did take effect and was later overwritten by the first, timed-out PutObject. If the overwritten data contains blocks from a different (consolidated) object, those blocks will be lost.

To mitigate the problem, this commit disables object consolidation. In other words, when reclaiming freed space, a given object may be overwritten only with a subset of its original blocks, with no additional blocks added. This way, if a timed-out PutObject takes effect later, the reverted state of the object only contains additional blocks (which will be leaked, at least until the entire object is deleted).
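
The race and the mitigation can be illustrated with a small model. This is only a sketch, not the agent's code: an object is reduced to a set of block IDs, and `Blocks`, `is_subset_rewrite()`, `late_put_wins()`, `lost_blocks()` and `leaked_blocks()` are hypothetical names invented for this example.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a stored object: the set of block IDs it holds.
type Blocks = BTreeSet<u64>;

/// The rule this commit enforces for reclaim: the rewritten object may
/// contain only blocks that were already in the original object.
fn is_subset_rewrite(original: &Blocks, rewritten: &Blocks) -> bool {
    rewritten.is_subset(original)
}

/// Models the race: the second PutObject is applied and acknowledged,
/// then the timed-out first PutObject lands and replaces it.
/// Returns what the store holds in the end.
fn late_put_wins(first: &Blocks, second: &Blocks) -> Blocks {
    let mut stored = second.clone(); // second PutObject takes effect
    stored.clone_from(first); // timed-out first PutObject is applied late
    stored
}

/// Blocks we expect the object to hold that the store no longer has.
fn lost_blocks(expected: &Blocks, stored: &Blocks) -> Blocks {
    expected.difference(stored).copied().collect()
}

/// Blocks the store still holds that we no longer track (wasted space only).
fn leaked_blocks(expected: &Blocks, stored: &Blocks) -> Blocks {
    stored.difference(expected).copied().collect()
}
```

In this model, take an original object holding blocks {1, 2, 3}. With consolidation, reclaim rewrites it as {1, 3, 7, 8} (block 2 freed, blocks 7 and 8 merged in from another object); if the old PutObject of {1, 2, 3} lands late, `lost_blocks()` reports {7, 8}. Without consolidation, reclaim rewrites it as {1, 3}; a late PutObject of {1, 2, 3} loses nothing and `leaked_blocks()` reports only {2}.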