Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

T5873: ipsec remote access VPN: support VTI interfaces. #3221

Merged
merged 4 commits into from
Aug 1, 2024

Conversation

lucasec
Copy link
Contributor

@lucasec lucasec commented Apr 1, 2024

Change Summary

Route-based VPNs can be more convenient to configure and tie in nicely with existing routing protocols, zone-based firewalls, and other common network configurations. OpenVPN users are already quite familiar with this pattern. This PR extends the IPsec (IKEv2) Remote Access VPN to support "virtual tunnel interfaces" enabling similar usage patterns.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes)
  • Migration from an old Vyatta component to vyos-1x, please link to related PR inside obsoleted component
  • Other (please describe):

Related Task(s)

https://vyos.dev/T5873
also lays some groundwork needed to complete https://vyos.dev/T5874

Related PR(s)

Component(s) name

ipsec

Proposed changes

This PR includes several changes to enable VTI-based remote access VPNs.

remote-access pool range block

The remote-access pool configuration block has been extended to accept a range block, as an alternative to the current CIDR prefix attribute. This allows defining a more granular range for assigning VPN client IPs, which is helpful if you want to reserve one or more IPs at the start of a CIDR block for the router itself on the VTI interface.

remote-access bind attribute

The remote-access connection accepts a new bind attribute. This works identically to the peer <peer> vti bind attribute (for site-to-site peers you either define one or more tunnels, or a vti block—there is no equivalent of tunnel for remote-access connections hence the decision not to nest it under vti here). Once defined, all traffic to/from connected peers uses the specified VTI interface, as opposed to being routed by kernel policies. This change is enabled by the internal change in VyOS 1.4 that switched from the legacy vti interface type to the newer xfrm interface type, which happily multiple tunnels with different local/remote traffic selectors, e.g. for each connected VPN client.

vti-up-down hook revamp

To facilitate the use of the bind attribute with remote-access connections, a revamp of the vti-up-down hook was also required. The hook is called by the ipsec daemon to dynamically bring VTI interfaces up and down.

The existing logic was build for site-to-site connections with the assumption every VTI interface maps 1:1 to a single ipsec connection. These assumptions were already outdated for dual-stack site-to-site connections (the existing hook ignored events for the IPv6 side of the connection). Remote access configurations further expose this limitation, as each VPN client establishes its own connection.

This PR introduces a new hook mechanism that relies on a plaintext DB file (/tmp/ipsec_vti_interfaces). When ipsec connections come up and down, the DB file is updated with the list of connections that "want" a particular VTI interface to come up. Interfaces are brought up when referenced by at least one entry in the DB and taken down if there are no references. If a remote access connection is configured, a persistent reference is placed for the connection, since clients can connect at any time (I see little value in having the interface go down when no client are connected). Modifications to the interface disable config now take effect immediately—removing disable brings the interface up if and only if it is referenced by an entry in the DB.

Miscellaneous cleanup

The old vti-up-down hook had some code around deleting route table 220. This was also tied to warnings issued at config time if VTI is used without the disable-route-autoinstall option. These are leftover from the previous vti interface type implementation and not relevant to the xfrm implementation.

Naming of remote-access child ESP sessions is improved (making the output of show vpn ipsec sa more useful).

Certain vestiges of ipsec config (such as DHCP interface references) are still written even if the peer or connection is disabled.

How to test

Example configuration:

 interfaces {
     ethernet eth0 {
         ...
     }
     vti vti1 {
         address 10.23.58.1/24
         address fdcc:2200:a8ee:2358::1/64
         description "Client VPN"
         mtu 1436
     }
 }
 vpn {
     ipsec {
         esp-group ClientVPN-Client {
             lifetime 3600
             pfs enable
             proposal 1 {
                 encryption aes256gcm128
                 hash sha256
             }
         }
         ike-group ClientVPN-Client {
             key-exchange ikev2
             lifetime 7200
             proposal 1 {
                 dh-group 21
                 encryption aes256gcm128
                 hash sha256
             }
         }
         options {
             disable-route-autoinstall
         }
         remote-access {
             connection ClientVPN {
                 authentication {
                     client-mode x509
                     local-id <local id>
                     server-mode x509
                     x509 {
                         ca-certificate <ca cert name>
                         certificate <cert name>
                     }
                 }
                 bind vti1
                 dhcp-interface eth0
                 esp-group ClientVPN-Client
                 ike-group ClientVPN-Client
                 pool Client-Pool-v4
                 pool Client-Pool-v6
             }
             pool Client-Pool-v4 {
                 name-server 10.23.58.1
                 range {
                     start 10.23.58.2
                     stop 10.23.58.254
                 }
             }
             pool Client-Pool-v6 {
                 name-server fdcc:2200:a8ee:2358::1
                 range {
                     start fdcc:2200:a8ee:2358::2
                     stop fdcc:2200:a8ee:2358::ffff
                 }
             }
         }
     }
 }

Smoketest result

 INFO - Executing VyOS smoketests
DEBUG - vyos@vyos:~$ /usr/bin/vyos-smoketest
DEBUG - /usr/bin/vyos-smoketest
DEBUG - Running Testcase: /usr/libexec/vyos/tests/smoke/cli/test_vpn_ipsec.py
DEBUG - test_dhcp_fail_handling (__main__.TestVPNIPsec.test_dhcp_fail_handling) ... ok
DEBUG - test_dmvpn (__main__.TestVPNIPsec.test_dmvpn) ... ok
DEBUG - test_flex_vpn_vips (__main__.TestVPNIPsec.test_flex_vpn_vips) ... ok
DEBUG - test_remote_access (__main__.TestVPNIPsec.test_remote_access) ... ok
DEBUG - test_remote_access_dhcp_fail_handling (__main__.TestVPNIPsec.test_remote_access_dhcp_fail_handling) ... ok
DEBUG - test_remote_access_eap_tls (__main__.TestVPNIPsec.test_remote_access_eap_tls) ... ok
DEBUG - test_remote_access_pool_range (__main__.TestVPNIPsec.test_remote_access_pool_range) ... ok
DEBUG - test_remote_access_vti (__main__.TestVPNIPsec.test_remote_access_vti) ... ok
DEBUG - test_remote_access_x509 (__main__.TestVPNIPsec.test_remote_access_x509) ... ok
DEBUG - test_site_to_site (__main__.TestVPNIPsec.test_site_to_site) ... ok
DEBUG - test_site_to_site_vti (__main__.TestVPNIPsec.test_site_to_site_vti) ... ok
DEBUG - test_site_to_site_x509 (__main__.TestVPNIPsec.test_site_to_site_x509) ... ok

Checklist:

  • I have read the CONTRIBUTING document
  • I have linked this PR to one or more Phabricator Task(s)
  • I have run the components SMOKETESTS if applicable
  • My commit headlines contain a valid Task id
  • My change requires a change to the documentation
  • I have updated the documentation accordingly

@vyosbot vyosbot requested review from a team, dmbaturin, sarthurdev, zdc, jestabro, sever-sever and c-po and removed request for a team April 1, 2024 00:44
@lucasec lucasec marked this pull request as draft April 1, 2024 02:28
@lucasec
Copy link
Contributor Author

lucasec commented Apr 1, 2024

Changing this to a draft as it appears the up/down script may need some work for this to work properly. Seems to have been lost in the rebase.

@GurliGebis
Copy link
Contributor

@lucasec is the fix in #3302 in any way related to the issue you mention?

@lucasec
Copy link
Contributor Author

lucasec commented Apr 12, 2024

It’s not related to this PR. I also don’t think that fix is related to the instability I’ve been seeing in https://vyos.dev/T6177. I am planning to resume work on this shortly while I continue to debug that issue.

@GurliGebis
Copy link
Contributor

Sounds great, thank you 🙂

@c-po
Copy link
Member

c-po commented Apr 13, 2024

Very interesting approach. But why I need also pools if now ecerything cones from the VTI? Is it b/c of IP address assignment reasons (fake dhcp?)

@lucasec
Copy link
Contributor Author

lucasec commented Apr 15, 2024

Yeah, the IP addresses are still assigned to clients via the IKE protocol, so the pool configuration is needed to tell strongswan what range to create client IPs from.

Getting the up/down logic for the interface right is a little interesting. I would assume if a remote access is bound to the VTI, the VTI interface should be up all the time.

Cleanest implementation is probably to have set_admin_state (here:

def set_admin_state(self, state):
) check for dependencies in the ipsec config, then only no-op if the interface is unbound or bound to a single site-to-site config (so for remote access, the interface would come up immediately via the normal interface up/down code).

Btw the logic for T6085 for site-to-site configs (9eb018c) has some potentially unintended behavior that disabling/re-enabling a VTI interface in the config does not immediately take effect—the interface state will only update after the ipsec connection gets torn down or ipsec is restarted.

@sever-sever
Copy link
Member

Any updates?

@lucasec
Copy link
Contributor Author

lucasec commented May 14, 2024

I think I set this aside awaiting further feedback on the approach for the up/down script I shared in this comment: #3221 (comment).

Let me re-visit later this week and try to put together an implementation—that may be a faster way to move this forward.

Copy link

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@GurliGebis
Copy link
Contributor

This might be off topic, but how come there all of a sudden is a conflict, with no new commits?

@c-po
Copy link
Member

c-po commented May 23, 2024

This might be off topic, but how come there all of a sudden is a conflict, with no new commits?

The action was not run automatically for quiet some time

@GurliGebis
Copy link
Contributor

Okay, that explains it 🙂

@lucasec
Copy link
Contributor Author

lucasec commented May 30, 2024

Hi all, I haven't forgotten about this PR, but my work schedule has been rather unforgiving lately. Thanks for your patience and I'll hopefully have some improvements soon to get this closer to merging.

@lucasec
Copy link
Contributor Author

lucasec commented Jun 24, 2024

I am about half way through an implementation for the vti-up-down hook and just wanted to write out my approach in case anyone has early feedback on it.

There are three issues with the current vti-up-down hook:

  1. The hook does not work with ipsec remote-access configurations, since a single configuration can map to potentially many connections (e.g. one for each client). The current hook brings the entire interface up every time any connection connects, and down every time any connection disconnects.
  2. While technically a special case of 1, for the same reason as 1 the hook does not allow multiple ipsec connections to share a single VTI interface (see https://vyos.dev/T5874 for some motivation why we might want this).
  3. Also slightly out of scope of T5873, though I intend to address at the same time: setting or clearing the disable attribute on a VTI interface does not immediately take effect, as set_admin_state in the interface code is no-oped. The only way currently to disable a VTI interface is to configure it as disabled then restart the ipsec daemon.

My plan to introduce a new database file /tmp/ipsec_vti_interfaces that will store a space-separated list of entries. The vti-up-down hook will write entries to this DB of the following format: interface:connection-name:protocol, where interface is the VTI interface to bring up, connection-name is the ipsec connection calling for the interface to be brought up, and protocol is v4 for IPv4 or v6 for IPv6. The config logic can write an additional entry of the format interface if the VTI interface is called to always be up (in the case of remote-access connections).

For remote-access connections, the config logic will also bring the interface up at apply time if it is not already. The VTI class's set_admin_state method will also re-gain its ability to bring interfaces up and down, but before bringing an interface up it will check the DB file to see if a connection is calling for it to be up.

Obviously this has turned into a bigger change than I was expecting, but hopefully once I work the last few kinks out it will buy us a more robust foundation for future ipsec/VTI work.

@GurliGebis
Copy link
Contributor

@lucasec is there a way the database can end up inconsistent? (Like a connection not getting removed if something crashes)

@GurliGebis
Copy link
Contributor

Just got an idea - not sure if it would work - but what if the "down" parsed the output from swanctl --list-conns - wouldn't it be possible to detect if any more connections exist, and then only down the interface if the last one is gone?

@lucasec
Copy link
Contributor Author

lucasec commented Jul 5, 2024

Added additional comments and improved method naming.

New smoketests pass and this is largely ready for others to try out, but I am still refining/working on a few race conditions found via manual testing on my live system. Some discussion in Slack here: https://vyos-community.slack.com/archives/C01A6CJFW1F/p1720164511770299?thread_ts=1719111681.530719&cid=C01A6CJFW1F

@lucasec lucasec force-pushed the t5873 branch 2 times, most recently from 5eb85b1 to 934605b Compare July 7, 2024 09:25
@lucasec
Copy link
Contributor Author

lucasec commented Jul 7, 2024

From testing so far, the ipsec daemon seems to invoke the up-down hook synchronously, so there is not much need to worry about races on the VTI up-down DB (/tmp/ipsec-vti-interfaces file) outside of when the hook runs concurrently to a config commit. That latter case was already affected by a race (that happens to also affect the existing hook) where attempting to read the config during a commit can fail and return a blank interface config.

Both the DB race and config read race are largely solvable by calling wait_for_commit_lock before doing anything in the hook.
(Not 100% solvable as wait_for_commit_lock doesn't acquire any sort of lock itself, providing no guarantee that a commit won't start while the script is still running, but this possibility is both rare and not exclusive to this hook)

I am also removing some old logic around clearing route table 220 and warnings about the combination of VTI and the lack of disable-route-autoinstall. Needing to disable route installation was specific to the legacy vti interface type implementation and does not appear to be relevant to the xfrm implementation used now.

@lucasec lucasec marked this pull request as ready for review July 7, 2024 09:26
@lucasec lucasec requested a review from a team as a code owner July 7, 2024 09:26
@lucasec
Copy link
Contributor Author

lucasec commented Jul 8, 2024

FYI, the only smoke test failure in that latest execution is test_ospf_15_ldp_sync.

I still am not seeing the smoketest action including new tests I added in this PR, e.g. test_remote_access_pool_range and test_remote_access_vti. These are included in the run and passing on my local machine though.

I am reasonably confident in the implementation now and would say everything here is ready for review.

Copy link

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Copy link

Conflicts have been resolved. A maintainer will review the pull request shortly.

@lucasec
Copy link
Contributor Author

lucasec commented Jul 22, 2024

Rebased on top of changes merged in #3841.
FYI the smoketests CI job still doesn't seem to run the new tests added in this PR. I did confirm these to be passing on my local machine.

src/conf_mode/vpn_ipsec.py Outdated Show resolved Hide resolved
interface-definitions/vpn_ipsec.xml.in Show resolved Hide resolved
python/vyos/utils/vti_updown_db.py Outdated Show resolved Hide resolved
python/vyos/utils/vti_updown_db.py Outdated Show resolved Hide resolved
src/conf_mode/vpn_ipsec.py Outdated Show resolved Hide resolved
src/conf_mode/vpn_ipsec.py Outdated Show resolved Hide resolved
src/conf_mode/vpn_ipsec.py Show resolved Hide resolved
Copy link

CI integration 👍 passed!

Details

CI logs

  • CLI Smoketests 👍 passed
  • Config tests 👍 passed
  • RAID1 tests 👍 passed

@c-po c-po enabled auto-merge August 1, 2024 07:08
@c-po
Copy link
Member

c-po commented Aug 1, 2024

This should not be backported to 1.4 as it's a massive extension! Lets rather keep some nice features in 1.5

@lucasec
Copy link
Contributor Author

lucasec commented Aug 1, 2024

Great, thanks for reviewing!

re: backporting to 1.4. I am in favor of holding off, at least for now, as there is a definitely a lot of new functionality here that could benefit from some extended QA.

The re-write of the updown hook specifically might be worthy at some point as I believe the current hook in 1.4 has a few bugs in its current form (e.g. see https://vyos-community.slack.com/archives/C01A6CJFW1F/p1720165471164579). I am not convinced this PR solves 100% of those bugs... yet, so I'd be fine putting a pin in this and re-visiting after a while.

@c-po c-po merged commit 962ead6 into vyos:current Aug 1, 2024
13 of 14 checks passed
@lucasec lucasec deleted the t5873 branch August 3, 2024 07:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

4 participants