
tunnelling over stack with local and remote interfaces fails #3388

Closed

mwutzke opened this issue Dec 10, 2019 · 1 comment

mwutzke (Contributor) commented Dec 10, 2019

When the same tunnel ACL is applied to both a DP-local port and a DP-remote port (reached via a stack interface), tunnelled traffic from the remote port does not reach the tunnel egress interface.

Using a network topology like the following

host11 -- sw1 ==== sw3 -- host31
           |
           |
      host199 (tunnel egress)

with the following faucet.yaml running against two OVS bridge instances:

acls:
  tunnel-udp:
    - rule:
        dl_type: 0x0800         # IPv4
        nw_proto: 17            # UDP
        actions:
          output:
            tunnel:
              type: 'vlan'
              tunnel_id: 4010
              dp: sw1
              port: 99
          allow: true


vlans:
  home:
    vid: 100

dps:
  sw1:
    dp_id: 0x1
    hardware: "Open vSwitch"
    stack:
      priority: 1
    interfaces:
      1:
        native_vlan: home
        acls_in: [tunnel-udp]
      2:
        native_vlan: home
      10:
        stack:
          dp: sw3
          port: 10
      99:
        description: monitoring port
        output_only: true
  sw3:
    dp_id: 0x3
    hardware: "Open vSwitch"
    interfaces:
      1:
        native_vlan: home
        acls_in: [tunnel-udp]
      10:
        stack:
          dp: sw1
          port: 10

When L2 flooded UDP traffic is injected on sw1/1, it is tunnelled to sw1/99, as expected.

However, when the same UDP stream is injected on sw3/1, the traffic does not egress via sw1/99.
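
For reference, a minimal sketch of one way to generate such a flooded UDP stream with scapy (the interface name veth-host31 and the addresses are assumptions for the sw3/1 host, not taken from the setup above):

# Sketch only: inject L2 flooded (broadcast) UDP frames on the host-side veth,
# assuming veth-host31 is the interface attached to sw3/1.
from scapy.all import Ether, IP, UDP, Raw, sendp

frame = (Ether(dst="ff:ff:ff:ff:ff:ff")            # broadcast, so the frame is flooded
         / IP(src="10.0.0.31", dst="255.255.255.255")
         / UDP(sport=30000, dport=30000)           # any ports match nw_proto 17 in the ACL
         / Raw(b"x" * 18))

sendp(frame, iface="veth-host31", count=5)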

Removing the 'acls_in' stanza from sw1/1, however, allows the ingress UDP traffic on sw3/1 to be tunnelled to sw1/99 correctly, as shown in the excerpt below.
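
For clarity, the working variant only changes sw1 interface 1; the rest of the faucet.yaml above is unchanged:

  sw1:
    ...
    interfaces:
      1:
        native_vlan: home
        # acls_in: [tunnel-udp]   # removed; with this absent, sw3/1 traffic reaches sw1/99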

Looking at sw1 in the working case (with the tunnel ACL applied only to sw3/1), I observe the following rules:

 cookie=0x5adc15c0, duration=25.883s, table=0, n_packets=11, n_bytes=506, priority=16384,dl_vlan=4010 actions=resubmit(,1)
 cookie=0x5adc15c0, duration=29.510s, table=0, n_packets=5, n_bytes=300, priority=9099,in_port="l-br1_10-br3_10",dl_dst=01:80:c2:00:00:00/ff:ff:ff:ff:ff:f0,dl_type=0x88cc actions=CONTROLLER:128
 cookie=0x5adc15c0, duration=29.509s, table=0, n_packets=0, n_bytes=0, priority=9099,in_port="veth-host199" actions=drop
 cookie=0x5adc15c0, duration=29.507s, table=0, n_packets=0, n_bytes=0, priority=9001,in_port="l-br1_10-br3_10",vlan_tci=0x0000/0x1fff actions=drop
 cookie=0x5adc15c0, duration=29.507s, table=0, n_packets=0, n_bytes=0, priority=9000,in_port="l-br1_10-br3_10" actions=resubmit(,2)
 cookie=0x5adc15c0, duration=29.507s, table=0, n_packets=0, n_bytes=0, priority=9000,in_port="veth-host11",vlan_tci=0x0000/0x1fff actions=mod_vlan_vid:100,resubmit(,2)
 cookie=0x5adc15c0, duration=29.507s, table=0, n_packets=0, n_bytes=0, priority=9000,in_port="veth-host12",vlan_tci=0x0000/0x1fff actions=mod_vlan_vid:100,resubmit(,2)
 cookie=0x5adc15c0, duration=29.502s, table=0, n_packets=0, n_bytes=0, priority=0 actions=drop
 cookie=0x5adc15c0, duration=25.883s, table=1, n_packets=11, n_bytes=506, priority=20480,dl_vlan=4010 actions=strip_vlan,output:"veth-host199"
 cookie=0x5adc15c0, duration=29.502s, table=1, n_packets=0, n_bytes=0, priority=0 actions=drop

These rules select the tunnelled traffic (table 0) and then egress it out port sw1/99 after stripping the VLAN tag.
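
For reference, dumps like these can be captured with ovs-ofctl (assuming the sw1 bridge is named br1, as the stack port name l-br1_10-br3_10 suggests):

 ovs-ofctl dump-flows br1 -O OpenFlow13

The -O OpenFlow13 flag is needed because Faucet programs the bridge with OpenFlow 1.3.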

But in the failing case (both ports have acls_in), the rules in sw1 differ significantly (I've removed some rules from the output):

 cookie=0x5adc15c0, duration=128.550s, table=0, n_packets=0, n_bytes=0, priority=20480,in_port="veth-host12" actions=resubmit(,1)
 cookie=0x5adc15c0, duration=71.393s, table=0, n_packets=39, n_bytes=1962, priority=20480,in_port="l-br1_10-br3_10" actions=resubmit(,1)
 cookie=0x5adc15c0, duration=67.615s, table=0, n_packets=0, n_bytes=0, priority=20480,udp,in_port="veth-host11" actions=output:"veth-host199",resubmit(,1)
 cookie=0x5adc15c0, duration=128.547s, table=0, n_packets=0, n_bytes=0, priority=0 actions=drop
 cookie=0x5adc15c0, duration=128.549s, table=1, n_packets=0, n_bytes=0, priority=9099,in_port="veth-host199" actions=drop
 cookie=0x5adc15c0, duration=71.390s, table=1, n_packets=27, n_bytes=1242, priority=9000,in_port="l-br1_10-br3_10" actions=resubmit(,2)
 cookie=0x5adc15c0, duration=71.391s, table=1, n_packets=12, n_bytes=720, priority=9099,in_port="l-br1_10-br3_10",dl_dst=01:80:c2:00:00:00/ff:ff:ff:ff:ff:f0,dl_type=0x88cc actions=CONTROLLER:128
 cookie=0x5adc15c0, duration=128.549s, table=1, n_packets=0, n_bytes=0, priority=9000,in_port="veth-host11",vlan_tci=0x0000/0x1fff actions=mod_vlan_vid:100,resubmit(,2)
 cookie=0x5adc15c0, duration=128.549s, table=1, n_packets=0, n_bytes=0, priority=9000,in_port="veth-host12",vlan_tci=0x0000/0x1fff actions=mod_vlan_vid:100,resubmit(,2)
 cookie=0x5adc15c0, duration=71.391s, table=1, n_packets=0, n_bytes=0, priority=9001,in_port="l-br1_10-br3_10",vlan_tci=0x0000/0x1fff actions=drop
 cookie=0x5adc15c0, duration=128.547s, table=1, n_packets=0, n_bytes=0, priority=0 actions=drop
...
 cookie=0x5adc15c0, duration=67.614s, table=2, n_packets=27, n_bytes=1242, priority=4096,dl_vlan=4010 actions=CONTROLLER:96,resubmit(,3)
...
 cookie=0x5adc15c0, duration=128.547s, table=3, n_packets=27, n_bytes=1242, priority=0 actions=resubmit(,4)
...
 cookie=0x5adc15c0, duration=67.615s, table=4, n_packets=27, n_bytes=1242, priority=8241,in_port="l-br1_10-br3_10",dl_vlan=4010,dl_dst=ff:ff:ff:ff:ff:ff actions=drop

It appears that, in the failing case, the tunnel interception / decapsulation rules are not installed: the two dl_vlan=4010 rules from the working dump (the table 0 resubmit and the table 1 strip_vlan + output to veth-host199) have no counterpart here.

anarkiwi added the bug label Dec 10, 2019

anarkiwi (Member) commented:
Apologies for the delay.

We think this is fixed - there were a few underlying issues.

#3577 adds test coverage to check that this scenario works.

#3476 should have fixed the underlying problem, which was also related to #3389.

Please do let us know if it's still a problem (or if there is another problem!).
