-
Notifications
You must be signed in to change notification settings - Fork 250
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
btrfs-progs: cmds/qgroup: use sysfs to detect inconsistent status early #952
Open
adam900710
wants to merge
27
commits into
kdave:devel
Choose a base branch
from
adam900710:qgroup_sysfs_detect
base: devel
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Unify the error message style a bit, print the relevant path if missing. Signed-off-by: David Sterba <dsterba@suse.com>
The preferred error message should have a prefix with problem description and then the errno description as we use the negative errno convention almost everywhere. - drop additional %d in the message if %m is present - replace %d with %m Signed-off-by: David Sterba <dsterba@suse.com>
The subvolumes created during mkfs are not printed in the summary because btrfs_mkfs_fill_dir() deletes them from the list as they get created. Signed-off-by: David Sterba <dsterba@suse.com>
Print indicators in the summary if the subvolume is read-write, read-only or default: $ mkfs.btrfs --subvol ro:subvolro --subvol rw:subvolrw --subvol default-ro:defaultro --rootdir /rootdir/path img ... Rootdir from: /rootdir/path Compress: no Subvolume (rw): subvolrw Subvolume (ro): subvolro Subvolume (dro): defaultro ... The path is relative to the rootdir path and may not be a subvolume in the source directory so drop the rootdir as this may be confusing. Signed-off-by: David Sterba <dsterba@suse.com>
Fix missing or wrong formatting, enhance descriptions or add more references. Signed-off-by: David Sterba <dsterba@suse.com>
Docker warns about CMD syntax, which is harmless in our case but let's fix the warning by using the json (quoted strings in array) syntax. JSONArgsRecommended: JSON arguments recommended for CMD to prevent unintended behavior related to OS signals (line 29) Explained in https://docs.docker.com/reference/build-checks/json-args-recommended . Signed-off-by: David Sterba <dsterba@suse.com>
Enable CodeQL code scanning checks. For that the automatically detected build steps don't work as the documentation is expected to build. Add the manual steps. Signed-off-by: David Sterba <dsterba@suse.com>
[ci skip] Signed-off-by: David Sterba <dsterba@suse.com>
The values of UNITS_ modes are supposed to be unique and non-overlapping. Due to wrong definition this was not true for UNITS_DECIMAL which set 2 bits. This breaks code that strictly depends on the uniqueness of the bits and not on the order of processing of the constants. Signed-off-by: David Sterba <dsterba@suse.com>
There's a special case when scrub rate is printed, the device sizes and rate use different units. The size can be in terabytes while the rate can be in hundreds of megabytes (contemporary 10-20T disks, 250MB/s). The sizes use what is set on command line (or human readable by default), while the rate is always human readable with exception to the option --raw to provide a way to print the exact numbers without any conversions. This got broken in commit ec3c842 ("btrfs-progs: scrub status: with --si, show rate in metric units") that forced the command line mode to the rate as well. Instead of that we need to detect the SI/IEC mode and set it to the human readable format of rate. Signed-off-by: David Sterba <dsterba@suse.com>
Add currently implemented types and update recommendations. Signed-off-by: David Sterba <dsterba@suse.com>
Thanks to hunspell. [ci skip] Signed-off-by: David Sterba <dsterba@suse.com>
Add sections for each directory so it's more visible where the sections start and end. Update formatting, enhance descriptions. Signed-off-by: David Sterba <dsterba@suse.com>
The references used in :doc: need to reference relative path of the document, otherwise this leads to a warning. btrfs-progs/Documentation/Send-receive.rst:19: WARNING: unknown document: 'dev-send-stream' [ref.doc] btrfs-progs/Documentation/dev/CmdLineConventions.rst:60: WARNING: unknown document: 'btrfstune' [ref.doc] Signed-off-by: David Sterba <dsterba@suse.com>
The kernel commit 08fe4db170b419 ("Btrfs: Fix uninitialized root flags for subvolumes") from 2011 sets the flag BTRFS_INODE_ROOT_ITEM_INIT on root items, to work around a bug where flags and byte_limit weren't being set. Copy this behaviour in mkfs, to prevent the kernel from having to do it on the first mount. We memset the btrfs_root_item, so there's no corruption issue as there once was. We already do this in btrfs_make_subvolume(), as otherwise the readonly flag of any subvolumes would get reset. Signed-off-by: Mark Harmstone <maharmstone@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
The kernel adds a zeroed btrfs_dev_stats_item for each device on the first mount. Preempt this by doing it at mkfs time. Signed-off-by: Mark Harmstone <maharmstone@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
Follow the kernel by setting the BIG_METADATA incompat flag if nodesize is greater than the page size. This flag was introduced with commit 727011e07cbdf8 ("Btrfs: allow metadata blocks larger than the page size") in 2010, as kernels before 2.6.36 would crash due to a buggy page cache implementation. The flag has no real meaning anymore but we can at least set it at mkfs time. Signed-off-by: Mark Harmstone <maharmstone@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
Just like all qgroup functions, if a qgroup is marked inconsistent, limit will not work as expected. In fact with recent kernels, limit and qgroup number updating will be fully skipped if qgroup is already inconsistent. Add one extra note on `btrfs qgroup limit` subcommand for it. Link: https://bugzilla.suse.com/show_bug.cgi?id=1235765 Reported-by: Vojtech Lacina <vlacina@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com>
…wer redundancy [BUG] There is a bug report that when deleting a device using sysfs /sys/block/<dev>/device/delete, the kernel module will still try to read and write the device. Normally it's fine as long as all chunks can tolerate that removed device (e.g. all RAID1). But the problem is when one is trying to lower the redundancy by converting to another profile: # mkfs.btrfs -f -m raid1 -d raid1 /dev/sdd /dev/sde # mount /dev/sdd /mnt # echo 1 > /sys/block/sde/device/delete # btrfs balance start --force -mdup -dsingle /mnt This will lead to the filesystem mounted RO, with the following error messages: sd 6:0:0:0: [sde] Synchronizing SCSI cache ata7.00: Entering standby power mode btrfs: attempt to access beyond end of device sde: rw=6145, sector=21696, nr_sectors = 32 limit=0 btrfs: attempt to access beyond end of device sde: rw=6145, sector=21728, nr_sectors = 32 limit=0 btrfs: attempt to access beyond end of device sde: rw=6145, sector=21760, nr_sectors = 32 limit=0 BTRFS error (device sdd): bdev /dev/sde errs: wr 1, rd 0, flush 0, corrupt 0, gen 0 BTRFS error (device sdd): bdev /dev/sde errs: wr 2, rd 0, flush 0, corrupt 0, gen 0 BTRFS error (device sdd): bdev /dev/sde errs: wr 3, rd 0, flush 0, corrupt 0, gen 0 BTRFS error (device sdd): bdev /dev/sde errs: wr 3, rd 0, flush 1, corrupt 0, gen 0 btrfs: attempt to access beyond end of device sde: rw=145409, sector=128, nr_sectors = 8 limit=0 BTRFS warning (device sdd): lost super block write due to IO error on /dev/sde (-5) BTRFS error (device sdd): bdev /dev/sde errs: wr 4, rd 0, flush 1, corrupt 0, gen 0 btrfs: attempt to access beyond end of device sde: rw=14337, sector=131072, nr_sectors = 8 limit=0 BTRFS warning (device sdd): lost super block write due to IO error on /dev/sde (-5) BTRFS error (device sdd): bdev /dev/sde errs: wr 5, rd 0, flush 1, corrupt 0, gen 0 BTRFS error (device sdd): error writing primary super block to device 2 BTRFS info (device sdd): balance: start -dconvert=single -mconvert=dup -sconvert=dup BTRFS info (device sdd): relocating block group 1372585984 flags data|raid1 BTRFS error (device sdd): bdev /dev/sde errs: wr 5, rd 0, flush 2, corrupt 0, gen 0 BTRFS warning (device sdd): chunk 2446327808 missing 1 devices, max tolerance is 0 for writable mount BTRFS: error (device sdd) in write_all_supers:4044: errno=-5 IO failure (errors while submitting device barriers.) BTRFS info (device sdd state E): forced readonly BTRFS warning (device sdd state E): Skipping commit of aborted transaction. BTRFS error (device sdd state EA): Transaction aborted (error -5) BTRFS: error (device sdd state EA) in cleanup_transaction:2017: errno=-5 IO failure BTRFS info (device sdd state EA): balance: ended with status: -5 [CAUSE] Btrfs doesn't have any runtime device error handling, it fully rely on the extra copy provided. For the sysfs block device removal, normally there is a device shutdown callback to the running fs, but unfortunately btrfs doesn't support this callback yet. Thus even with that device removed, btrfs will still access that removed device (both read and write, even if they will fail). Normally for a full RAID1 btrfs, it will still be fine reading/write the fs as usual. The proper action is to replace the removed/missing/failing device with a newer one using `btrfs device replace`. But when doing the convert, btrfs will allocate new metadata chunks on to the removed device (which will lose all writes). And since the new metadata profile is DUP, which can not handle any missing device of that metadata chunk, finally it triggers the final protection at transaction commit time, and flips the filesystem to RO, before causing any real data loss. [DOC ENHANCEMENT] Add a warning to the `convert` filter about the dangerous doing convert to a lower redundancy profile when there is a known failing/removed device. And mention the proper way to handle such failing/missing device. The root fix is to introduce a failing/removed device detection for btrfs, but that will be a pretty big feature and will take quite some time before landing it upstream. Link: https://lore.kernel.org/linux-btrfs/2cb1d81e-12a8-4fb1-b3fc-e7e83d31e059@siddall.name/ Reported-by: Jeff Siddall <news@siddall.name> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
There already is misc/065-csum-conversion-inject . Signed-off-by: David Sterba <dsterba@suse.com>
The 065 and 066 are taken. Signed-off-by: David Sterba <dsterba@suse.com>
There are two tests 064, and we're missing 061 so fill the gap. Signed-off-by: David Sterba <dsterba@suse.com>
Use 'symbolic' as the term. Signed-off-by: David Sterba <dsterba@suse.com>
The templates need to be renamed, so make it more suggestive by prepending the expected name of the script test.sh. Also fix the permissions and make it 755 so it's not missed later as this would not execute the script. Signed-off-by: David Sterba <dsterba@suse.com>
Minor fixes and updates that reflect current state. [ci skip] Signed-off-by: David Sterba <dsterba@suse.com>
For quick checks before a push of non-code changes we may want to do only the spellchecking workflow. Any branch pushed matching the prefix "codespell/" will be picked by this. Signed-off-by: David Sterba <dsterba@suse.com>
Currently if "btrfs qgroup show" detects the qgroup is in an inconsistent, it will output a warning: WARNING: qgroup data inconsistent, rescan recommended But the detection is based on the tree search ioctl, and qgroup tree is only updated at transaction commit time. This means if qgroup is marked inconsistent, and the transaction is not commit, there can be a huge window as long as 30s before "btrfs qgroup show" to give a proper warning about inconsistent qgroup numbers. To address this window, use the '/sys/fs/btrfs/<fsid>/qgroup/inconsistent' file to do extra check. That file is updated at real time, thus there is no delay, and can give an early warning about inconsistent qgroup numbers. Link: https://bugzilla.suse.com/show_bug.cgi?id=1235765 Signed-off-by: Qu Wenruo <wqu@suse.com>
1480f4a
to
103338d
Compare
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Currently if "btrfs qgroup show" detects the qgroup is in an inconsistent, it will output a warning:
WARNING: qgroup data inconsistent, rescan recommended
But the detection is based on the tree search ioctl, and qgroup tree is only updated at transaction commit time.
This means if qgroup is marked inconsistent, and the transaction is not commit, there can be a huge window as long as 30s before "btrfs qgroup show" to give a proper warning about inconsistent qgroup numbers.
To address this window, use the
'/sys/fs/btrfs//qgroup/inconsistent' file to do extra check.
That file is updated at real time, thus there is no delay, and can give an early warning about inconsistent qgroup numbers.
Link: https://bugzilla.suse.com/show_bug.cgi?id=1235765