btrfs-progs: docs: add an extra note to btrfs data checksum and directIO
In the v6.14 kernel release, btrfs will force a direct IO to fall back to
a buffered one if the inode requires a data checksum.

This causes a small performance drop, but it solves the false data
checksum mismatch problem caused by direct IOs.

Although the change is minor for most end users, it is a behavior change
for those who require zero-copy direct IO, so it needs a proper
documentation update.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
adam900710 committed Feb 17, 2025
1 parent 55137da commit 3a69833
Showing 1 changed file with 18 additions and 0 deletions.
18 changes: 18 additions & 0 deletions Documentation/ch-checksumming.rst
@@ -3,6 +3,24 @@ writing and verified after reading the blocks from devices. The whole metadata
block has an inline checksum stored in the b-tree node header. Each data block
has a detached checksum stored in the checksum tree.

.. note::
        A data checksum is calculated just before the block is submitted to the
        block device, so btrfs strictly requires that the corresponding data block
        not be modified until writeback has finished.

        Buffered writes meet this requirement because btrfs has full control over
        its page cache. A direct write (``O_DIRECT``) bypasses the page cache, and
        btrfs cannot control the direct IO buffer, which may live in user space
        memory. A user space program can therefore modify its direct write buffer
        before the buffer is fully written back, which can lead to a data checksum
        mismatch.

        To avoid such mismatches, since v6.14 btrfs forces a direct write to fall
        back to a buffered one if the inode requires a data checksum. This brings a
        small performance penalty. End users who require true zero-copy direct
        writes should set the ``NODATASUM`` flag for the inode and make sure the
        direct IO buffer is fully aligned to the btrfs block size.
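
The alignment requirement in the note can be illustrated with a minimal Python sketch. It is not part of the commit. The 4 KiB ``BLOCK_SIZE`` and the helper names ``is_block_aligned``, ``make_aligned_buffer`` and ``direct_write`` are assumptions made for illustration. The buffer comes from an anonymous ``mmap``, which is page-aligned, so it matches the block size only when the page size and the btrfs block size agree. Setting ``NODATASUM`` on the inode is not shown here.

```python
import mmap
import os

# Assumption: the btrfs block size is 4 KiB; check the real value for your filesystem.
BLOCK_SIZE = 4096


def is_block_aligned(offset: int, length: int, block_size: int = BLOCK_SIZE) -> bool:
    """Return True if both the file offset and the length are multiples of block_size."""
    return offset % block_size == 0 and length % block_size == 0


def make_aligned_buffer(length: int, block_size: int = BLOCK_SIZE) -> mmap.mmap:
    """Allocate a zero-filled anonymous mapping; its start address is page-aligned."""
    if length <= 0 or length % block_size != 0:
        raise ValueError("length must be a positive multiple of block_size")
    return mmap.mmap(-1, length)


def direct_write(path: str, buf: mmap.mmap, offset: int = 0) -> int:
    """Write buf to path at offset with O_DIRECT (Linux only); returns bytes written."""
    if not is_block_aligned(offset, len(buf)):
        raise ValueError("offset and buffer length must be block aligned")
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DIRECT, 0o644)
    try:
        # The call is synchronous, so this thread cannot modify buf while the
        # write is in flight. Other threads sharing buf must not touch it either.
        return os.pwrite(fd, buf, offset)
    finally:
        os.close(fd)
```

A caller would fill the buffer first, for example ``buf = make_aligned_buffer(2 * BLOCK_SIZE)`` followed by ``buf[:5] = b"hello"``, and only then call ``direct_write``. Whether the write is truly zero-copy still depends on the inode not requiring a data checksum, as described above.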


There are several checksum algorithms supported. The default and backward
compatible algorithm is *crc32c*. Since kernel 5.5 there are three more with different
characteristics and trade-offs regarding speed and strength. The following list