T6630: ntp: support hardware timestamp offload and other mechanisms to improve accuracy #3966

Open: lucasec wants to merge 5 commits into base: current

lucasec
Contributor

@lucasec lucasec commented Aug 10, 2024

Change Summary

This PR exposes features of the chrony daemon that underlies the current VyOS NTP service implementation. These features leverage hardware capabilities of the NIC to achieve more accurate time synchronization.

Running NTP on the local network, we can often get system clocks within hundreds of microseconds of each other. With these additional features, we can get within tens of nanoseconds.

Precisely synchronized clocks may be of limited benefit to general networking and computing use cases (outside of accurate log timestamps), but can be immensely helpful for some HPC and A/V production use cases. If you value relative accuracy (systems synchronized with each other) over absolute accuracy (network synchronized with a stratum 0 GPS clock, etc.), using VyOS as a hub for precise time synchronization can be more convenient than deploying a dedicated Linux system for this purpose.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes)
  • Migration from an old Vyatta component to vyos-1x, please link to related PR inside obsoleted component
  • Other (please describe):

Related Task(s)

https://vyos.dev/T6630

Related PR(s)

Component(s) name

ntp

Proposed changes

This change exposes several options supported by the chrony daemon that can increase the accuracy of the NTP time synchronization.

Hardware Timestamp Offload

chrony can leverage NIC hardware capabilities to record the exact time packets are received on the interface, as well as when packets were actually transmitted.

In my testing, this is supported by a number of Intel NICs, including:

  • Intel X553 10Gb SFP+
  • Intel I350 gigabit ethernet
  • Intel I225-LM 2.5Gbe card

One newer NIC, oddly, could only timestamp received packets that it recognized as PTP protocol messages, although it could still timestamp transmitted packets:

  • Intel X710 10Gb SFP+

In general, you can just tell chrony to enable timestamping to the fullest extent supported on all available NICs:
set service ntp offload timestamp default-enable

If you find the functionality to be buggy on certain interfaces, you can instead enable it only on specific interfaces:
set service ntp offload timestamp interface eth0
set service ntp offload timestamp interface eth1

If the interface supports a receive filter that timestamps only NTP packets, chrony will use it by default. If you find this buggy on your NIC, you can customize the RX filter: set service ntp offload timestamp interface eth0 rx-filter all (timestamp all received packets) or set service ntp offload timestamp interface eth0 rx-filter none (turn off RX timestamping).
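
For orientation, these options correspond to chrony's hwtimestamp directive, which takes an interface name (or * for all interfaces) and an optional rxfilter of all, ntp, ptp, or none. The sketch below shows one way such a mapping could look. The function name render_hwtimestamp and the dictionary layout are assumptions made for illustration; this is not the template code from this PR.

```python
# Illustrative sketch only: maps a simplified view of the CLI options to
# chrony.conf "hwtimestamp" lines. The dict layout and function name are
# hypothetical and do not mirror the vyos-1x implementation.

VALID_RX_FILTERS = {"all", "ntp", "ptp", "none"}


def render_hwtimestamp(timestamp_cfg):
    """Return chrony.conf lines for a dict such as
    {"default_enable": True, "interface": {"eth0": {"rx_filter": "all"}}}."""
    lines = []
    # Explicitly configured interfaces are emitted first, in name order.
    for ifname, opts in sorted(timestamp_cfg.get("interface", {}).items()):
        line = f"hwtimestamp {ifname}"
        rx_filter = opts.get("rx_filter")
        if rx_filter is not None:
            if rx_filter not in VALID_RX_FILTERS:
                raise ValueError(f"unsupported rx filter: {rx_filter}")
            line += f" rxfilter {rx_filter}"
        lines.append(line)
    # "*" asks chrony to try hardware timestamping on all available interfaces.
    if timestamp_cfg.get("default_enable"):
        lines.append("hwtimestamp *")
    return lines


if __name__ == "__main__":
    example = {"interface": {"eth0": {"rx_filter": "all"}, "eth1": {}}}
    print("\n".join(render_hwtimestamp(example)))
```

Keeping the filter validation next to the rendering step means an invalid filter name fails early rather than producing a chrony.conf that the daemon rejects at startup.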

Interleaved mode

Interleaved mode is currently an internet draft (https://datatracker.ietf.org/doc/draft-ietf-ntp-interleaved-modes/07/), and is expected to be incorporated into the upcoming NTPv5. It relies on exchanging multiple packets where the second packet contains the true transmit time of the previous packet (which was not known when the packet was constructed in userspace).

Combined with hardware timestamping, NTP servers/clients can exchange extremely accurate timestamps that cut out any variance caused by kernel queuing delays or variable context switching overhead.
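
To make the benefit concrete, here is a small worked example using the standard NTP offset and delay formulas. All timestamps are invented for illustration (in microseconds), and the sketch only models the effect of substituting the later-delivered hardware transmit timestamp; it does not model the interleaved packet exchange itself.

```python
# Worked example with made-up numbers: how an inaccurate software transmit
# timestamp biases the computed offset, and how the true (hardware) transmit
# time removes that bias.


def ntp_offset_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation.

    t1: client transmit, t2: server receive,
    t3: server transmit, t4: client receive.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Hypothetical scenario: both clocks agree (true offset is 0) and the wire
# delay is 50 us in each direction.
t1 = 1000                # client sends request
t2 = 1050                # server receives request
t3_software = 1100       # transmit time written into the packet in userspace
t3_hardware = 1130       # packet actually leaves the NIC 30 us later
t4 = t3_hardware + 50    # client receives response

basic = ntp_offset_delay(t1, t2, t3_software, t4)
interleaved = ntp_offset_delay(t1, t2, t3_hardware, t4)

print("software transmit timestamp:", basic)        # offset -15.0, delay 130
print("hardware transmit timestamp:", interleaved)  # offset 0.0, delay 100
```

In this scenario the 30 us of queuing delay shows up as a 15 us offset error when the software timestamp is used, and disappears when the true transmit time is used instead.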

Experimental PTP transport for NTP packets

As mentioned, the Intel X710 series will only timestamp received packets on the PTP port/protocol. To work around this, chrony has a clever mechanism (somewhat non-standard, although there is an internet draft: https://datatracker.ietf.org/doc/draft-ietf-ntp-over-ptp/) that wraps an NTP packet inside a PTP protocol message. This is only usable on your local LAN, with known clients that you can ensure are running a compatible version of chrony.

Whether this should be included at all in VyOS is debatable. Alternatively we could explore implementing a proper ptp service (either to consume time from an existing PTP time infrastructure on the network, or supply time from ntp to the network over ptp).

How to test

Hardware Timestamp Offload

Enable timestamp offload for your existing NTP configuration:

set service ntp offload timestamp default-enable

To confirm hardware timestamping is enabled, run show log ntp and look for the following line: Enabled HW timestamping on <interface>. This confirms hardware timestamping is enabled in both directions on your NIC. No further configuration is needed, and you should see more stable NTP measurements with less jitter.

If the line instead reads Enabled HW timestamping (TX only) on <interface>, your NIC cannot be configured to add a hardware timestamp to all received packets. You can use ethtool -T <interface> to check which "Hardware Receive Filter Modes" your interface supports.
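
If you want to script the ethtool check, a minimal sketch follows. It assumes the usual layout of ethtool -T output, where each supported mode is listed on an indented line under the "Hardware Receive Filter Modes:" heading. The sample text is hand-written for illustration and was not captured from a real NIC, and the function names are hypothetical.

```python
import subprocess


def parse_rx_filter_modes(ethtool_output):
    """Return the mode names listed under 'Hardware Receive Filter Modes:'."""
    modes = []
    in_section = False
    for line in ethtool_output.splitlines():
        if line.strip() == "Hardware Receive Filter Modes:":
            in_section = True
            continue
        if in_section:
            # Entries are indented; the first non-indented line ends the list.
            if not line[:1].isspace():
                break
            parts = line.split()
            if parts:
                modes.append(parts[0])
    return modes


def read_rx_filter_modes(ifname):
    """Run 'ethtool -T <ifname>' and parse its output (requires ethtool)."""
    result = subprocess.run(
        ["ethtool", "-T", ifname], capture_output=True, text=True, check=True
    )
    return parse_rx_filter_modes(result.stdout)


# Hand-written sample in the assumed layout, not real device output.
SAMPLE = """\
Time stamping parameters for eth0:
Capabilities:
\thardware-transmit
\thardware-receive
PTP Hardware Clock: 0
Hardware Transmit Timestamp Modes:
\toff
\ton
Hardware Receive Filter Modes:
\tnone
\tptpv2-event
"""

if __name__ == "__main__":
    modes = parse_rx_filter_modes(SAMPLE)
    print("supports 'all' filter:", "all" in modes)
```

An interface whose list lacks all (as in the sample) is the case where the TX-only log line would be expected.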

Experimental PTP transport for NTP packets

If your interface supports the various ptp rx filter modes but not the all mode, you can try the experimental "PTP transport" option, provided your other clients are VyOS devices or other Linux systems running the chrony daemon.

On the server:

edit service ntp
set ptp-transport
set offload timestamp interface <interface> receive-filter ptp

On the client:

edit service ntp
set ptp-transport
set offload timestamp interface <interface> receive-filter ptp
set server <server IP> ptp-transport
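
For reference, chrony itself exposes this feature through the ptpport directive, with the server entry pointed at the same port. The sketch below shows roughly what the resulting chrony.conf lines might look like for the commands above. The function name, its parameters, and the exact lines emitted are assumptions for illustration, not the output of the template in this PR; 192.0.2.1 is a documentation address standing in for the server IP.

```python
# Illustrative sketch only: approximate chrony.conf lines for NTP-over-PTP.
# Whether the PR's template emits exactly these lines is an assumption.


def render_ptp_transport(port=319, interface=None, server=None):
    """Build chrony.conf lines for the PTP transport.

    port:      UDP port used for NTP-over-PTP (319 is the PTP event port)
    interface: optional NIC on which to request the 'ptp' receive filter
    server:    optional upstream server reached over the PTP transport
    """
    lines = [f"ptpport {port}"]
    if interface:
        lines.append(f"hwtimestamp {interface} rxfilter ptp")
    if server:
        # The client must send to the PTP port rather than the NTP port.
        lines.append(f"server {server} port {port}")
    return lines


if __name__ == "__main__":
    print("# server side")
    print("\n".join(render_ptp_transport(interface="eth0")))
    print("# client side")
    print("\n".join(render_ptp_transport(interface="eth0", server="192.0.2.1")))
```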

Interleave mode

If you have VyOS configured as a client to another NTP server that you know is running chrony or otherwise supports the NTP interleaved mode, simply add the interleave option to your server configuration:

edit service ntp
set server <server IP> interleave

Smoketest result

DEBUG - vyos@vyos:~$ /usr/bin/vyos-smoketest
DEBUG - /usr/bin/vyos-smoketest
DEBUG - Running Testcase: /usr/libexec/vyos/tests/smoke/cli/test_service_ntp.py
DEBUG - test_base_options (__main__.TestSystemNTP.test_base_options) ... ok
DEBUG - test_clients (__main__.TestSystemNTP.test_clients) ... ok
DEBUG - test_interface (__main__.TestSystemNTP.test_interface) ... ok
DEBUG - test_interleave_option (__main__.TestSystemNTP.test_interleave_option) ... ok
DEBUG - test_leap_seconds (__main__.TestSystemNTP.test_leap_seconds) ... ok
DEBUG - test_offload_timestamp_default (__main__.TestSystemNTP.test_offload_timestamp_default) ... ok
DEBUG - test_ptp_transport (__main__.TestSystemNTP.test_ptp_transport) ... ok
DEBUG - test_vrf (__main__.TestSystemNTP.test_vrf) ... ok

Note that the smoketests do not cover the rx-filter option, as this is not supported by the virtual NIC used in QEMU.

Checklist:

  • I have read the CONTRIBUTING document
  • I have linked this PR to one or more Phabricator Task(s)
  • I have run the components SMOKETESTS if applicable
  • My commit headlines contain a valid Task id
  • My change requires a change to the documentation
  • I have updated the documentation accordingly

github-actions bot commented Aug 10, 2024

👍
No issues in PR Title / Commit Title

github-actions bot commented Aug 11, 2024

✅ No issues found in unused-imports check.. Please refer the workflow run

interface-definitions/service_ntp.xml.in
@@ -13,6 +13,76 @@
#include <include/generic-interface.xml.i>
#include <include/listen-address.xml.i>
#include <include/interface/vrf.xml.i>
<node name="offload">
Member

I would de-nest these nodes:

Something like:

set service ntp ptp timestamp interface <name | all>

The all is borrowed from LLDP which does the same - individual interfaces or all

@c-po
Member

c-po commented Sep 20, 2024

vyos@vyos:~$ /usr/libexec/vyos/tests/smoke/cli/test_service_ntp.py
test_base_options (__main__.TestSystemNTP.test_base_options) ... ok
test_clients (__main__.TestSystemNTP.test_clients) ... ok
test_interface (__main__.TestSystemNTP.test_interface) ... ok
test_interleave_option (__main__.TestSystemNTP.test_interleave_option) ... ok
test_leap_seconds (__main__.TestSystemNTP.test_leap_seconds) ... ok
test_offload_timestamp_default (__main__.TestSystemNTP.test_offload_timestamp_default) ... ok
test_ptp_transport (__main__.TestSystemNTP.test_ptp_transport) ... ok
test_vrf (__main__.TestSystemNTP.test_vrf) ... ok

----------------------------------------------------------------------
Ran 8 tests in 25.548s

OK

@c-po
Member

c-po commented Sep 20, 2024

@lucasec could you send a PR for the documentation?

@lucasec
Contributor Author

lucasec commented Sep 21, 2024

@lucasec could you send a PR for the documentation?

Hey, yep, I can get one open. I also believe I owe a few doc PRs for my ipsec contributions over the past months.

<leafNode name="port">
<defaultValue>319</defaultValue>
</leafNode>
<node name="timestamp">
Contributor Author

@lucasec lucasec Sep 23, 2024

I'm not sure timestamp should be nested under ptp.

If the NIC supports timestamping of NTP packets or of all packets, the hardware timestamping features can be run over the normal NTP port, without involving the ptp transport functionality at all.

This was why I originally had this block at the top level. You should be able to enable timestamping without enabling PTP.

@lucasec
Contributor Author

lucasec commented Sep 23, 2024

Docs PR: vyos/vyos-documentation#1553
Note that the config syntax I wrote in the docs is not exactly in sync with the current state of this branch until we resolve this comment: vyos/vyos-documentation#1553

Labels
Development

3 participants