Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed PTP software timestamping. #55

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

peci1
Copy link

@peci1 peci1 commented Jun 25, 2024

The TX path ignored whether the SKB wanted a SW or HW stamp, and once HW timestamping was enabled for the whole NIC, it always returned HW stamps. With this fix, if the TX packet requests SW stamp, it skips the whole PTP path (unless both HW and SW stamps are requested).

On the RX path, the driver has started in an inconsistent state. By default, no timestamping should be done for RX packets. However, the driver automatically enabled RX timestamping on initialization. This fix correctly initializes the NIC with no timestamping and it needs to be set up via SIOCSHWTSTAMP.

There was also a problem with the RX path on init. The NIC did not put PTP packets on the PTP ring, but it did append timestamps to them. That resulted in wrong RX packets which were 12/14 bytes longer, because the timestamp length is only subtracted from PTP ring packets. This was worked around by adding a "flipping" enable_ptp() call that first sets the opposite value and then the actually wanted one.


I tested this on QNAP QXG-10G2T with two AQC107S and firmware 3.1.121 on an NVidia Jetson AGX Xavier with kernel 4.9.337.

The problem manifested when using with linuxptp and passing the -S flag to ptp4l. As a master, it told that there was no tx timestamp found for the sent SYNC packet (because it received a HW stamp which overrides the requested SW stamp). As a slave, it failed to read incoming SYNC messages with bad message error which was caused by the incorrect length of the RX packet.

I tested this fix both at cold boot (which brings the weird RX config mismatch) and on-the-fly switching between timestamping on and off (sudo hwstamp_ctl -i eth2 -r 0 -t 0 and sudo hwstamp_ctl -i eth2 -r 12 -t 1), link losses and reacquisitions etc. Everything works with this PR.

I also tested this with FW 3.0.33, which does not have PTP support. The driver does, however, provide full TX and RX software timestamping capabilities, and these also work with this PR.

I also tested that HW timestamping still works correctly with this PR (on 3.1.121 FW).

I have a few Thunderbolt cards with AQC107S, too, so I'll try to test with these on a normal notebook with newer kernel. I also have one card with (probably) AQC113C, so I'll also test with that one.

Update:

  • Tested on Sonnet Solo10G (FW 3.1.121) (AQC107S) on x86_64 kernel 5.15 without issues.
  • Tested on OWCTB3ADP10GBE (AQC107S) on x86_64 kernel 5.15 with stock FW 3.1.106 without issues. With 3.1.121, I could not get link.
  • Tested on Sanlink3 N1 (AQC107S) on x86_64 kernel 5.15 with stock FW 3.0.5 without issues (but this FW has only SW timestamping). With FW 3.1.121, no issues were observed.

I've added the NEEDS_PTP_FLIP quirk to all cards. I don't really know if it's needed for all of them. But I assume it can't hurt anything.

Also, I needed to remove check for !aq_ptp->a1_ptp from aq_ptp_clock_init() so that the initial weird RX state is fixed. I don't really know why this check was there.

The TX path ignored whether the SKB wanted a SW or HW stamp, and once HW timestamping was enabled for the whole NIC, it always returned HW stamps. With this fix, if the TX packet requests SW stamp, it correctly only gets SW stamps.

On the RX path, the driver has started in an inconsistent state. By default, no timestamping should be done for RX packets. However, the driver automatically enabled RX timestamping on initialization. This fix correctly initializes the NIC with no timestamping and it needs to be set up via SIOCSHWTSTAMP.

There was also a problem with the RX path on init. The NIC did not put PTP packets on the PTP ring, but it did append timestamps to them. That resulted in wrong RX packets which were 12/14 bytes longer, because the timestamp length is only subtracted from PTP ring packets. This was worked around by adding a "flipping" enable_ptp() call that first sets the opposite value and then the actually wanted one. This does not seem to be needed for ATL2 chips.

Signed-off-by: Martin Pecka <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant