[sfputil] Not able to read out values of voltage/temp/power on some cables #2720

Closed
keboliu opened this issue Mar 29, 2019 · 0 comments
keboliu commented Mar 29, 2019

Description

Steps to reproduce the issue:
Read the EEPROM DOM info with sfputil.

Describe the results you received:
With the 201807/201811 images, voltage, temperature, and power cannot be read out for some cables, but the 201803 image works. Has anyone encountered a similar issue?

Bad output with the 201807/201811 images:

~# sudo sfputil show eeprom --dom -p Ethernet0
Ethernet0: SFP EEPROM detected
CalibrationType: Internally Calibrated
Connector: Optical pigtail
EncodingCodes: NRZ
ExtIdentOfTypeOfTransceiver: GBIC/SFP defined by twowire interface ID
LengthCable(UnitsOfm): 3
NominalSignallingRate(UnitsOf100Mbd): 255
RateIdentifier: Unspecified
ReceivedPowerMeasurementType: Avg power
TransceiverCodes:
FibreChannelLinkLength: short distance (S)
SFP+CableTechnology: Active Cable
TypeOfTransceiver: SFP or SFP Plus
VendorDataCode(YYYY-MM-DD Lot): 2017-06-05
VendorName: Hisense
VendorOUI:
VendorPN: LTF8507-PC03-HW1
VendorRev: 1
VendorSN: S50763H002Y
AlarmFlagStatus:
...
MonitorData:
RXPower: -infdBm
TXBias: 0.0000mA
TXPower: -infdBm
Temperature: 0.0000C
Vcc: 0.0000Volts
StatusControl:
...
TempLowWarning: Off
VccHighWarning: Off
VccLowWarning: Off

Good output with the 201803 image:

Ethernet0: SFP EEPROM detected
CalibrationType: Internally Calibrated
Connector: Optical pigtail
EncodingCodes: NRZ
ExtIdentOfTypeOfTransceiver: GBIC/SFP defined by twowire interface ID
LengthCable(UnitsOfm): 3
NominalSignallingRate(UnitsOf100Mbd): 255
RateIdentifier: Unspecified
ReceivedPowerMeasurementType: Avg power
TransceiverCodes:
FibreChannelLinkLength: short distance (S)
SFP+CableTechnology: Active Cable
TypeOfTransceiver: SFP or SFP Plus
VendorDataCode(YYYY-MM-DD Lot): 2017-12-12
VendorName: Hisense
VendorOUI: His
VendorPN: LTF8507-PC03-HW1
VendorRev: 1
VendorSN: S507C3H002G
AlarmFlagStatus:
...
MonitorData:
RXPower: -3.3451dBm
TXBias: 6.2000mA
TXPower: -1.2703dBm
Temperature: 40.5508C
Vcc: 3.2823Volts

StatusControl:
...

lguohan pushed a commit that referenced this issue May 31, 2019
…/power on some cables (#2957)

* [device/mellanox/x86_64-mlnx_msn2700-r0/plugins/sfputil.py]
Purpose and restrictions
1. Read the EEPROM via ethtool.
2. Avoid changing the common code shared by all manufacturers (sonic-platform-common); constrain all modifications to Mellanox-specific code.
Current implementation
A new class based on SfpUtilBase and a new method, _read_eeprom_specific_bytes_via_ethtool, have been introduced to change the way the EEPROM DOM data is read. The usual best practice for this kind of change is to constrain the modification to the function that performs the read and leave everything else (especially the interface) untouched. That is hardly achievable here, because the original read function takes a file object as its input parameter to represent the port, and that file object points to files under /var/run/hwmanagement, which will not be maintained in the future. As a result, a new interface that takes a port number/name as its input parameter has to be introduced to remove the dependency on those files:
_read_eeprom_specific_bytes_via_ethtool
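
A minimal sketch of what a port-name-based read through ethtool could look like. This is not the plugin's actual code: the function names `parse_ethtool_hex` and `read_eeprom_via_ethtool` are hypothetical, and the parser assumes the `Offset / Values` hex table that `ethtool -m <dev> hex on` prints.

```python
import re
import subprocess

# Matches data rows such as "0x0060:    28 8d 80 37" (assumed output layout).
_HEX_ROW = re.compile(r"^0x[0-9a-fA-F]+:\s*(.*)$")


def parse_ethtool_hex(output):
    """Collect the byte values from an ethtool hex dump; other lines are skipped."""
    data = bytearray()
    for line in output.splitlines():
        match = _HEX_ROW.match(line.strip())
        if match:
            tokens = re.findall(r"\b[0-9a-fA-F]{2}\b", match.group(1))
            data.extend(int(tok, 16) for tok in tokens)
    return bytes(data)


def read_eeprom_via_ethtool(ifname, offset, length):
    """Return `length` EEPROM bytes starting at `offset`, or None on any failure."""
    cmd = ["ethtool", "-m", ifname, "hex", "on",
           "offset", str(offset), "length", str(length)]
    try:
        out = subprocess.check_output(cmd, stderr=subprocess.DEVNULL)
    except (OSError, subprocess.CalledProcessError):
        return None  # ethtool missing, or the port/module cannot be read
    data = parse_ethtool_hex(out.decode("ascii", "replace"))
    return data if len(data) == length else None
```

Taking an interface name rather than a file object is what removes the dependency on any particular sysfs path; callers only need to map a port to its interface name.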
Since the interface changed, all methods that call it must also be overridden to call the new interface, including:
_read_eeprom_devid
get_transceiver_info_dict
get_transceiver_dom_info_dict
Only the interface used to read the EEPROM DOM has been replaced; the main logic is unchanged except for the following:
1. Reading DOM data for SFP ports, which is implemented in get_transceiver_dom_info_dict. In this case the "calibration" must first be read from the EEPROM before other values such as temperature, voltage, and RX/TX power can be parsed. The original code ignored this, so the data could not be parsed.
2. In the original implementation, the areas containing the data were read from the DOM separately, to avoid reading unnecessary data and to achieve better performance. With ethtool used to read the DOM data, the performance gap between reading the whole area and reading the individual fields separately has narrowed to almost zero. To keep the code neat and readable, we changed the way this data is read.
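
To illustrate why the calibration type matters in item 1, here is a hedged sketch of decoding the monitor values for an internally calibrated SFP. It assumes the SFF-8472 layout (five big-endian 16-bit fields at A2h offsets 96 to 105, with the fixed scale factors that apply to internal calibration); the function names are hypothetical and this is not the plugin's code.

```python
import math
import struct


def mw_to_dbm(milliwatts):
    """Convert mW to dBm; zero power has no finite dBm value."""
    return float("-inf") if milliwatts <= 0 else 10.0 * math.log10(milliwatts)


def decode_dom_internal(raw):
    """Decode 10 raw bytes (assumed A2h offsets 96..105) using internal calibration."""
    temp, vcc, bias, tx_pwr, rx_pwr = struct.unpack(">hHHHH", raw[:10])
    return {
        "Temperature": temp / 256.0,          # signed, 1/256 degC per LSB
        "Vcc": vcc * 100e-6,                  # 100 uV per LSB, result in volts
        "TXBias": bias * 2e-3,                # 2 uA per LSB, result in mA
        "TXPower": mw_to_dbm(tx_pwr * 1e-4),  # 0.1 uW per LSB, mW then dBm
        "RXPower": mw_to_dbm(rx_pwr * 1e-4),
    }
```

With all-zero input this returns 0.0 for temperature, voltage, and bias and `-inf` for both power values, which is the same pattern as the bad output in the issue description. Externally calibrated modules need slope and offset constants from the EEPROM instead of these fixed factors, which is why the calibration type has to be known before parsing.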

* [sfputil] Return a dict with all data set to N/A for ports without DOM support
The way DOM data is read has been changed from sysfs to ethtool.
ethtool returns None for ports without DOM support, so None is returned. However, this prevents xcvrd from adding the TRANSCEIVER_DOM_SENSOR table entry of the associated port to CONFIG_DB, which in turn causes SNMP to fail.
To address this, a default dict is initialized with all data set to 'N/A' and is returned in the above case.
Note that in the original implementation, which used sysfs to read DOM data, non-None data was returned for ports without DOM support but did not contain valid data. This could result in wrong data in the TRANSCEIVER_DOM_SENSOR table.
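
The fallback described in this commit can be sketched roughly as follows. The key names in `DOM_KEYS` and the `read_dom` callable are illustrative assumptions, not the actual fields or functions used by the plugin or xcvrd.

```python
# Hypothetical field names, chosen only for illustration.
DOM_KEYS = ("temperature", "voltage", "rx_power", "tx_bias", "tx_power")


def default_dom_info(keys=DOM_KEYS):
    """Build a dict in which every DOM field is the placeholder 'N/A'."""
    return {key: "N/A" for key in keys}


def get_dom_info(read_dom, port):
    """Return DOM data for `port`, or the all-'N/A' default if none is available.

    `read_dom` is a hypothetical callable that returns a dict of decoded
    values, or None for ports without DOM support.
    """
    info = default_dom_info()
    data = read_dom(port)
    if data is not None:
        info.update(data)
    return info
```

Returning a fully populated dict in both cases means the consumer always sees the same set of keys, so it never has to special-case None.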

* [sfputil]
Remove unnecessary empty lines
Remove redundant code
Replace hard-coded strings/numbers with predefined constants
keboliu closed this as completed Jun 12, 2019
yxieca pushed a commit that referenced this issue Mar 17, 2023
src/sonic-platform-common

* 8460721 - (HEAD -> 202205, origin/202205) Fix issues in cmis.get_transceiver_bulk_status (#351) (10 hours ago) [Stephen Sun]
src/sonic-sairedis

* 10f37ef - (HEAD -> 202205, origin/202205) Ignore removing switch for mellanox platform due to known limitation (#1216) (10 hours ago) [Junchao-Mellanox]
src/sonic-swss

* 5f031af - (HEAD -> 202205, origin/202205) [flowcounterrouter] Fix the Route remove flow for non-bound prefixes (#2691) (10 hours ago) [Vivek]
src/sonic-utilities

* f88ca1c9 - (HEAD -> 202205, origin/202205) Improve show acl commands (#2667) (10 hours ago) [bingwang-ms]
* 738406b7 - Enhance the logic to wait for all buffer tables to be removed in _clear_qos (#2720) (10 hours ago) [Stephen Sun]
How I did it
StormLiangMS added a commit that referenced this issue Mar 30, 2023
Why I did it
832ef9c4 - Fix bug in GCU vlanintf_validator (#2765) (5 minutes ago) [jingwenxie]
53f611b7 - Revert "Convert IPv6 addresses to lowercase in apply-patch (#2299)" (#2758) (20 hours ago) [jingwenxie]
79a21cef - Revert frr route check (#2761) (8 minutes ago) [StormLiangMS]
824680ed - Resolved rc!=0 problem by replacing fgrep with awk. Added ipv4 filtering to get only v4 peers in case of show ip bgp neighbors (#2756) (30 hours ago) [saurabh17g]
10f31ea6 - Revert "Replace pickle by json (#2636)" (#2746) (7 days ago) [Mai Bui]
05fa7513 - Fix the show interface counters throwing exception on device with no external interfaces (#2703) (11 days ago) [abdosi]
f27dea0c - [route_check] remove check-frr_patch mock (#2732) (11 days ago) [Stepan Blyshchak]
2d95529d - Revert "Update load minigraph to load backend acl (#2236)" (#2735) (12 days ago) [Neetha John]
c869c970 - (master) Update the ref guide to reflect the vlan brief output (#2731) (2 weeks ago) [Vivek]
76457141 - Fix fast-reboot DB migration (#2734) (2 weeks ago) [Aryeh Feigin]
f7f783bc - Enhance the logic to wait for all buffer tables to be removed in _clear_qos (#2720) (2 weeks ago) [Stephen Sun]
e6179afa - Remove timer from FAST_REBOOT STATE_DB entry and use finalizer (#2621) (3 weeks ago) [Aryeh Feigin]
ff688323 - [route_check] fix IPv6 address handling (#2722) (3 weeks ago) [Stepan Blyshchak]
7a604c51 - update fast-reboot (#2728) (3 weeks ago) [jhli-cisco]
9f83ace9 - [GCU] Add vlanintf-validator (#2697) (3 weeks ago) [jingwenxie]
338d1c05 - Check SONiC dependencies before installation. (#2716) (3 weeks ago) [Liu Shilong]
64d2efd2 - Improve show acl commands (#2667) (3 weeks ago) [bingwang-ms]
2ef5b31e - [GCU] Add PFC_WD RDMA validator (#2619) (3 weeks ago) [isabelmsft]
c7aa8416 - [show][muxcable] increase timeout for displaying HW_STATUS (#2712) (3 weeks ago) [vdahiya12]
2fc2b826 - YANG validation for ConfigDB Updates: MIRROR_SESSION use case (#2430) (3 weeks ago) [isabelmsft]
e16bdaae - Fix non-zero status exit on non secure boot system (#2715) (3 weeks ago) [kellyyeh]
90d70152 - [route_check] implement a check for FRR routes not marked offloaded (#2531) (3 weeks ago) [Stepan Blyshchak]
c2bc150a - [warm/fast-reboot] Backup logs from tmpfs to disk during fast/warm shutdown (#2714) (3 weeks ago) [Vaibhav Hemant Dixit]
a015834d - [db_migrator] Add missing attribute 'weight' to route entries in APPL DB (#2691) (4 weeks ago) [Vaibhav Hemant Dixit]
cd519aac - [ci] Fix pipeline issue caused by sonic-slave-* change. (#2709) (4 weeks ago) [Liu Shilong]
2680e6f3 - [dhcp_relay] Fix dhcp_relay restart error while add/del vlan (#2688) (4 weeks ago) [Yaqiang Zhu]
How I did it
How to verify it