S.M.A.R.T. Input Plugin: Some attributes not collected on NVMe drive #7824

Closed
split-n opened this issue Jul 12, 2020 · 1 comment

split-n commented Jul 12, 2020

Relevant telegraf.conf:

[[inputs.smart]]
    use_sudo = true
    attributes = true

System info:

  • Telegraf 1.14.5 (git: HEAD e77ce3d)
  • Proxmox VE 6.2
    • It's based on Debian 10.4
  • smartmontools 7.1-pve2

Steps to reproduce:

  1. Enable the SMART plugin in the config, with attribute collection turned on
  2. Start the telegraf service, or run sudo -u telegraf telegraf --input-filter=smart --test

Expected behavior:

Attributes such as "Data Units Written" should be collected, because the plugin defines them and its tests cover them.

"Data Units Written": {

"name": "Data_Units_Written",

Actual behavior:

Only a few attributes are collected. (The one I need is Data_Units_Written.)

Additional info:

It looks like #6032 is a similar issue.

Output from sudo -u telegraf telegraf --input-filter=smart --test (serial is masked)

> smart_attribute,device=nvme0,host=pve,model=INTEL\ SSDPE2KX020T8,name=Critical_Warning,serial_no=**MASKED**,user=ifdb raw_value=0i 1594517566000000000
> smart_attribute,device=nvme0,host=pve,id=194,model=INTEL\ SSDPE2KX020T8,name=Temperature_Celsius,serial_no=**MASKED**,user=ifdb raw_value=48i 1594517566000000000
> smart_attribute,device=nvme0,host=pve,model=INTEL\ SSDPE2KX020T8,name=Available_Spare,serial_no=**MASKED**,user=ifdb raw_value=100i 1594517566000000000
> smart_attribute,device=nvme0,host=pve,id=12,model=INTEL\ SSDPE2KX020T8,name=Power_Cycle_Count,serial_no=**MASKED**,user=ifdb raw_value=101i 1594517566000000000
> smart_attribute,device=nvme0,host=pve,id=9,model=INTEL\ SSDPE2KX020T8,name=Power_On_Hours,serial_no=**MASKED**,user=ifdb raw_value=3513i 1594517566000000000
> smart_attribute,device=nvme0,host=pve,model=INTEL\ SSDPE2KX020T8,name=Media_and_Data_Integrity_Errors,serial_no=**MASKED**,user=ifdb raw_value=0i 1594517566000000000
> smart_attribute,device=nvme0,host=pve,model=INTEL\ SSDPE2KX020T8,name=Error_Information_Log_Entries,serial_no=**MASKED**,user=ifdb raw_value=0i 1594517566000000000
> smart_device,device=nvme0,host=pve,model=INTEL\ SSDPE2KX020T8,serial_no=**MASKED**,user=ifdb exit_status=0i,health_ok=true,temp_c=48i 1594517566000000000
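
To check which attributes actually show up in a --test run, the name tag can be pulled out of each smart_attribute line. The following is only a rough sketch: the helper name collected_attribute_names and the shortened sample are made up for illustration, and it assumes attribute names never contain commas or spaces, which holds for the output above.

```python
import re

# Three lines copied from the --test output above (serial masked).
sample = r"""
> smart_attribute,device=nvme0,host=pve,model=INTEL\ SSDPE2KX020T8,name=Critical_Warning,serial_no=**MASKED**,user=ifdb raw_value=0i 1594517566000000000
> smart_attribute,device=nvme0,host=pve,id=194,model=INTEL\ SSDPE2KX020T8,name=Temperature_Celsius,serial_no=**MASKED**,user=ifdb raw_value=48i 1594517566000000000
> smart_device,device=nvme0,host=pve,model=INTEL\ SSDPE2KX020T8,serial_no=**MASKED**,user=ifdb exit_status=0i,health_ok=true,temp_c=48i 1594517566000000000
"""

# Matches the "name" tag; assumes the value has no commas or whitespace.
NAME_TAG = re.compile(r",name=([^,\s]+)")


def collected_attribute_names(test_output):
    """Return the set of name tags found on smart_attribute lines."""
    names = set()
    for line in test_output.splitlines():
        # Drop the "> " prefix that telegraf --test puts in front of each metric.
        line = line.lstrip("> ").strip()
        if not line.startswith("smart_attribute,"):
            continue  # skip smart_device and blank lines
        match = NAME_TAG.search(line)
        if match:
            names.add(match.group(1))
    return names


names = collected_attribute_names(sample)
print(sorted(names))
print("Data_Units_Written" in names)
```

Run against the full output above, this would list seven names, and Data_Units_Written would not be among them.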

Output from sudo smartctl --info --health --attributes --tolerance=verypermissive --format=brief /dev/nvme0n1

smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.44-1-pve] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       INTEL SSDPE2KX020T8
Serial Number:                      *MASKED*
Firmware Version:                   VDV10170
PCI Vendor/Subsystem ID:            0x8086
IEEE OUI Identifier:                0x5cd2e4
Total NVM Capacity:                 1,800,359,040,512 [1.80 TB]
Unallocated NVM Capacity:           0
Controller ID:                      0
Number of Namespaces:               128
Namespace 1 Size/Capacity:          1,800,359,040,512 [1.80 TB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            5cd2e4 *MASKED*
Local Time is:                      Sun Jul 12 10:39:20 2020 JST

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        49 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    1%
Data Units Read:                    89,827,188 [45.9 TB]
Data Units Written:                 170,448,319 [87.2 TB]
Host Read Commands:                 1,858,588,410
Host Write Commands:                3,292,385,011
Controller Busy Time:               441
Power Cycles:                       101
Power On Hours:                     3,513
Unsafe Shutdowns:                   51
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
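
For reference, the raw Data Units counters can be converted to bytes by hand. The sketch below assumes the NVMe definition of a data unit (1000 blocks of 512 bytes, so 512,000 bytes per unit); the function names parse_smart_health and data_units_to_bytes are hypothetical and not part of smartctl or Telegraf.

```python
# One NVMe "data unit" is 1000 blocks of 512 bytes (assumption taken from the NVMe spec).
NVME_DATA_UNIT_BYTES = 512 * 1000


def parse_smart_health(text):
    """Split 'Key:   value' lines of smartctl output into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def data_units_to_bytes(field_value):
    """Convert a value such as '170,448,319 [87.2 TB]' to a byte count."""
    count = int(field_value.split()[0].replace(",", ""))
    return count * NVME_DATA_UNIT_BYTES


# Lines copied from the smartctl output above.
sample = """\
Data Units Read:                    89,827,188 [45.9 TB]
Data Units Written:                 170,448,319 [87.2 TB]
Power On Hours:                     3,513
"""

fields = parse_smart_health(sample)
written_bytes = data_units_to_bytes(fields["Data Units Written"])
read_bytes = data_units_to_bytes(fields["Data Units Read"])
print(written_bytes)  # 87269539328000
print(read_bytes)     # 45991520256000
```

That works out to roughly 87.27 TB written and 45.99 TB read in decimal units, in line with the bracketed figures smartctl prints.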

split-n commented Jul 12, 2020

I realized that support for the missing attributes was added recently in #7575, which is not included in Telegraf 1.14.5. It works on Telegraf 1.15-RC2, so I'm closing this issue.

split-n closed this as completed Jul 12, 2020