-
Notifications
You must be signed in to change notification settings - Fork 684
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[show] Add show ssd CLI #502
Conversation
@sandycelestica Please follow this PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make 'ssd' a group and all ssd-related commands should be nested under that group. This will help consolidate the commands. For example,
show ssd all
show ssd badblock
show ssd capcity
...
Many examples of groups like this already exist in this file.
* [show] update ssd cli format * [show] fix invalid capacity text
click.echo(line.strip()) | ||
return | ||
else: | ||
for line in output: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this code block repeats 4 times. Function?
|
||
|
||
# | ||
# function name : print_test_title |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why printing firmware version or capacity is a "test"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll update this part later.If you have the detail request of the content.Please let me know.I plan to display like this:"Show SSD Firmware info"
|
||
checkin = 1 | ||
testname = "Vender ID and Devie ID Check Test" | ||
command = "sudo lspci -n | grep b960 " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is b960? vid? pid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
b960 is the switch chip(BCM56960) of Seastone. Here we just check switch chip info.
else: | ||
checkin = 0 | ||
click.echo(line.strip()) | ||
click.echo("Device Model Not Match 3ME3 3ME4 or 3IE3") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how to extend the code if disk vendor is not one from the list?
upd:
to make it extendable consider using plugins model
see: https://github.com/Azure/SONiC/wiki/SSD-Health-design-(Draft)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your suggest. Currently this SSD diag is just for Seastone. So not support other types of SSD right now. And we are planning to extend it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
show utility is mostly a wrapper
I suggest to have a separate utility (like sfputil, psuutil, intfutil, etc) and call it from show
@sandycelestica, Please check the above comment |
checkin = 1 | ||
|
||
if(checkin == 1): | ||
remainingtime = poweron * 100 / 7.62 - poweron |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remainingtime = poweron * 100 / 7.62 - poweron
remainingtime = poweron * (100 / 7.62 - 1)
remainingtime = poweron * 12.12
what is remaining time for SSD?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a formula to calculate this value:
ID Hex Attribute
09 (09) Power on Hours
To estimate remaining hours of use: ID# 9 (100/7.26) – 1 (x ID# 9)
In our code,
if "Power_On_Hours" in line:
rawval = line.split()[-1]
poweron = int(rawval)
checkin = 1
we get the poweron first and then calculate the remaining time follow the formula.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just don't understand why remaining time is calculated as poweron*12
why remaining is always greater?
@mudsut4ke, @sandycelestica: Are you aware of this in-progress design for SSD health: https://github.com/Azure/SONiC/wiki/SSD-Health-design-(Draft)? |
From the beginning we discussed with Guo Han. He suggested to modify the main.py in show. So we merged it into show. |
Add @grantshao Please help Sandy review this change |
@lguohan, Do you have a comment ? |
checkin = 1 | ||
|
||
if(checkin == 1): | ||
remainingtime = poweron * 100 / 7.62 - poweron |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just don't understand why remaining time is calculated as poweron*12
why remaining is always greater?
|
||
@ssd.command() | ||
@click.argument("device") | ||
def all(device): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all() can call other functions and not duplicate the code...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK,I'll update it as your suggestion.Thanks
|
||
|
||
# | ||
# check_pcie_speed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to have pcie in a separate PR
or at least fix the subject...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a general comment i think it is bad practice to add a CLI which can work on specific platforms.
The idea is that the code/utility provided can be extended to other vendors at any time and for that it should be designed to be like that.
Also, as this is only a utility I believe it is only show the current value which is good but it is partially. I think it should be added as a new API to the Platform API as a component and can be monitored as well. for example if some threshold expires we raise an error so the user/admin can handle.
In addition as it is relevant to multiple vendors we usually discuss the design before the PR in the community meeting or at least share a design doc. Can you please share?
We had an SSD HLD in the last community meeting. I think merging this CLI now will contradict the design held. The design is also referred earlier on this PR discussion. @andriymoroz-mlnx could you please share the design document in https://github.com/Azure/SONiC/tree/master/doc as PRfollowing the review comments? |
Created https://github.com/Azure/SONiC/pull/378/files |
- What I did
Add commands to display SSD information
- How I did it
Add function to display ssd information to 'show/main.py'
- How to verify it
- Previous command output (if the output of a command-line utility has changed)
- New command output (if the output of a command-line utility has changed)