Skip to content

Commit

Permalink
os/bluestore: Fix BlueFS::truncate()
Browse files Browse the repository at this point in the history
In `struct bluefs_fnode_t` there is a vector `extents` and
the vector `extents_index` that is a log2 seek cache.

Until modifications to truncate() we never removed extents from files.
Modified truncate() did not update extents_index.

For example 10 extents long files when truncated to 0 will have:
0 extents, 10 extents_index.
After writing some data to file:
1 extents, 11 extents_index.

Now, `bluefs_fnode_t::seek` will binary search extents_index,
lets say it located seek at item #3.
It will then jump up from #0 extent (that exists) to #3 extent which
does not exist at.
The worst part is that code is now broken, as #3 != extent.end().

There are 3 parts of the fix:
1) assert in `bluefs_fnode_t::seek` to protect against
   jumping outside extents
2) code in BlueFS::truncate to sync up `extents_index` with `extents`
3) dampening down assert in _replay to give a way out of cases
   where incorrect "offset 12345" (12345 is file size) instead of
   "offset 20000" (allocations occupied) was written to log.

Fixes: https://tracker.ceph.com/issues/69481

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
  • Loading branch information
aclamk committed Jan 13, 2025
1 parent f2b5e2f commit 7f36010
Show file tree
Hide file tree
Showing 3 changed files with 7 additions and 2 deletions.
4 changes: 3 additions & 1 deletion src/os/bluestore/BlueFS.cc
Original file line number Diff line number Diff line change
Expand Up @@ -1706,7 +1706,8 @@ int BlueFS::_replay(bool noop, bool to_stdout)
<< " fnode=" << fnode
<< " delta=" << delta
<< dendl;
ceph_assert(delta.offset == fnode.allocated);
// be leanient, if there is no extents just produce error message
ceph_assert(delta.offset == fnode.allocated || delta.extents.empty());
}
if (cct->_conf->bluefs_log_replay_check_allocations) {
int r = _check_allocations(fnode,
Expand Down Expand Up @@ -3830,6 +3831,7 @@ int BlueFS::truncate(FileWriter *h, uint64_t offset)/*_WF_L*/
fnode.size = offset;
fnode.allocated = new_allocated;
fnode.reset_delta();
fnode.recalc_allocated();
log.t.op_file_update(fnode);
// sad, but is_dirty must be set to signal flushing of the log
h->file->is_dirty = true;
Expand Down
4 changes: 3 additions & 1 deletion src/os/bluestore/bluefs_types.cc
Original file line number Diff line number Diff line change
Expand Up @@ -154,7 +154,9 @@ mempool::bluefs::vector<bluefs_extent_t>::iterator bluefs_fnode_t::seek(
assert(it != extents_index.begin());
--it;
assert(offset >= *it);
p += it - extents_index.begin();
uint32_t skip = it - extents_index.begin();
ceph_assert(skip <= extents.size());
p += skip;
offset -= *it;
}

Expand Down
1 change: 1 addition & 0 deletions src/os/bluestore/bluefs_types.h
Original file line number Diff line number Diff line change
Expand Up @@ -89,6 +89,7 @@ struct bluefs_fnode_t {
void recalc_allocated() {
allocated = 0;
extents_index.reserve(extents.size());
extents_index.clear();
for (auto& p : extents) {
extents_index.emplace_back(allocated);
allocated += p.length;
Expand Down

0 comments on commit 7f36010

Please sign in to comment.