Disk activity widget inactive - Tumbleweed #2844
Comments
When checking here: https://www.kernel.org/doc/Documentation/ABI/testing/procfs-diskstats it seems that additional fields have been added since Kernel 5.5 and depending on the way the parsing occurs that might throw it for a loop (pun intended).
Nice find, @Hooverdan96 ! Looks like we should update our unit tests there and then our parsing. The following should most likely be moved to a dedicated issue but before that, I wanted to briefly get your feedback on considering it further: should we consider and investigate whether the python |
using I think this is where the parsing occurs (only showing the upper code snippet) rockstor-core/src/rockstor/smart_manager/data_collector.py Lines 594 to 630 in 5cdc84a
Quick Research: here is the disk readout from 'psutil' https://psutil.readthedocs.io/en/latest/#psutil.disk_io_counters giving these fields:
in the Rockstor parsing it seems to map these fields: rockstor-core/src/rockstor/smart_manager/data_collector.py Lines 653 to 669 in 5cdc84a
sooo, unless we restructure the widget, I think |
OK, some more analysis. I attempted a mapping, and there are a few I couldn't really map (or find that they're derived/calculated). So, any refactoring might not be so bad ...
and finally, it would have been too easy, but there is this python package: https://pypi.org/project/proc/ but it doesn't seem to have been updated since 2020 so probably not an option, unless the reason is that it only is updated, when the Linux |
@Hooverdan96
We can presumably drop some of the dimensions monitored. It may end up making the widget less CPU heavy as it goes, especially given we may well incur more overhead in using psutil.
If we indeed "just" need to update our current parsing for TW, would you agree to implementing that fix in the PR targeting this issue, given we are in feature freeze and only accepting bug fixes for now?
I'm OK with that approach, too. It seems to me that somewhere the "number of columns available" needs to be evaluated before mapping.
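A minimal sketch of evaluating the available columns before mapping, assuming the field order given in the kernel procfs-diskstats document linked above (14 base fields, 4 discard fields from kernel 4.18, 2 flush fields from kernel 5.5). The function name `parse_diskstats_line` and the dictionary keys are hypothetical and are not taken from rockstor-core's data_collector.py.

```python
# Field order per the kernel procfs-diskstats ABI document.
# The key names are illustrative only (an assumption, not rockstor-core names).
DISKSTATS_FIELDS = [
    "major", "minor", "name",
    "reads_completed", "reads_merged", "sectors_read", "ms_reading",
    "writes_completed", "writes_merged", "sectors_written", "ms_writing",
    "ios_in_progress", "ms_doing_io", "weighted_ms_doing_io",
    # Added in kernel 4.18
    "discards_completed", "discards_merged", "sectors_discarded", "ms_discarding",
    # Added in kernel 5.5
    "flushes_completed", "ms_flushing",
]
MIN_FIELDS = 14  # anything shorter is not a full diskstats record


def parse_diskstats_line(line):
    """Map one /proc/diskstats line to a dict, whatever the kernel's column count.

    Returns None when the line has fewer than the 14 base columns.
    Columns beyond the known 20 are ignored because zip() stops at the
    shorter of its two inputs.
    """
    cols = line.split()
    if len(cols) < MIN_FIELDS:
        return None
    stats = {}
    for key, value in zip(DISKSTATS_FIELDS, cols):
        # The device name is the only non-numeric column.
        stats[key] = value if key == "name" else int(value)
    return stats
```

Because the mapping is driven by the number of columns actually present, a line from an older kernel yields 14 keys and a line from a newer kernel yields up to 20, so a caller can use `stats.get("flushes_completed")` without caring which kernel produced the line.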
@Hooverdan96
Although highly visible, this is not a show-stopper, more a nice-to-have. It also only presents in Tumbleweed, which we list on the downloads page as "Development/Advanced-user/Rescue use only", and that lowers the significance of this issue. On the other hand, it is likely only a matter of time before it presents in our current Leap 15.6 target. So I'm currently not planning to delay our next stable rpm version designation because of this issue (it remains off that Milestone). And if we take care to limit any related PR so that it can be back-ported from the future testing branch to the then-stable master, we can release a fix for master also. To fully assess the issue we need to know the exact root cause: it may not be the additional number of parameters per device, but the devices presented. Linking to a now very old issue that may, or may not, have the same cause, but from a 2019 kernel: "Disk activity plot showing nothing, all drives" #2049. I.e. it may be those loop devices showing in TW that are not shown in Leap (at least for now).
Linking to the last major change in this area: Given we seem to have more devices showing in TW than in 15.6, we may just have an issue with get_byid_name_map(): Looking into extending the existing test/tests we have for get_byid_name_map() (added in #1979) re:
to try and rule that out as a potential cause for this TW failure. |
OK, so hopefully some progress: Adding via @Hooverdan96 pointer to rockstor-core/src/rockstor/smart_manager/data_collector.py the following filter:
We have a functional disk activity widget in latest TW. This suggests the loop* devices are throwing us, and that we can avoid them entirely with this filter. Having more of a look, but reporting here by way of findings to date.
Moving to establishing a device blacklist via major device numbers (first column in
As cleaner/faster and more easily extensible. I.e. we also have potential noise re sr0 (scsi rom) devices:
We can then move, in time, to using block major numbers as a canonical reference to replace a number of device name matches elsewhere that are less robust: i.e. note the /dev/sr deprecation that we still see in Leap 15.6 that is to be replaced by /dev/scd as per the above kernel doc reference. |
Add filter to `/proc/diskstats` excluding block loopback major device 7.
This avoids a failed byid_name lookup via byid_disk_map() as loop* block devices do not have byid names, and are not devices of interest.
Includes
- Additional sanity check on diskstats re ignoring < 14 columns.
- Filter out SCSI CD-ROM and Floppy disks re /proc/diskstats.
- Associated comment & TODO updates/additions.
- Minor type hint improvement.
- Updated get_byid_name_map() test data re newer kernels.
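The filtering described in this commit message could look roughly like the sketch below. It is not the code from the PR: the names `EXCLUDED_MAJORS`, `MIN_COLUMNS` and `filter_diskstats` are hypothetical, and the major numbers for floppy (2) and SCSI CD-ROM (11) are taken from the kernel's device number list rather than from this thread, which only names loopback major 7 explicitly.

```python
# Block major numbers to skip (assumed from the kernel device number list):
#   2  = floppy disks
#   7  = loopback devices (loop*), which have no by-id names
#   11 = SCSI CD-ROM devices (sr* / scd*)
EXCLUDED_MAJORS = {2, 7, 11}
MIN_COLUMNS = 14  # sanity check: ignore records shorter than the base layout


def filter_diskstats(lines):
    """Return the split columns of each /proc/diskstats line worth monitoring."""
    kept = []
    for line in lines:
        cols = line.split()
        if len(cols) < MIN_COLUMNS:
            continue  # malformed or truncated record
        if not cols[0].isdigit():
            continue  # first column must be the numeric block major
        if int(cols[0]) in EXCLUDED_MAJORS:
            continue  # not a device of interest
        kept.append(cols)
    return kept
```

Filtering on the first column avoids pattern matching on device names, so a rename such as /dev/sr to /dev/scd would not affect the result. In use, the input could be the lines read from `/proc/diskstats`, with each kept record then passed on to the by-id name lookup.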
Closing as: |
On our pending 5.0.9-0 installers we have no activity shown in the Dashboard disk activity widget. This is only for a Tumbleweed base and affects all our installer machine/arch targets: i.e. TW on x86_64, ARM64EFI, Pi4.
Example command output from a failing TW.x86_64 installer instance:
Example command output from a working 15.6.x86_64 installer instance:
It is assumed currently that we have a parsing issue afoot, and that there is something in the TW output that is throwing our disk activity data retrieval in all TW targets.