Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Blackhole implementation for e2e tests #17938

Commits on Sep 25, 2024

  1. Fix Blackhole implementation for e2e tests

    Based on Fu Wei's idea discussed in the issue [1], we employ the network traffic blocking on L7, using a forward proxy, without the need to use external tools.
    
    [Background]
    
    A peer will
    (a) receive traffic from its peers
    (b) initiate connections to its peers (via stream and pipeline).
    
    Thus, the current mechanism of only blocking peer traffic via the peer's existing reverse proxy is insufficient, since only scenario (a) is handled, and scenario (b) is not blocked at all.
    
    [Proposed solution]
    
    We introduce a forward proxy for each peer, which will be proxying all the connections initiated from a peer to its peers.
    
    We will remove the current use of the reverse proxy, as the forward proxy holds the information of the destination, we can block all in and out traffic that is initiated from a peer to others, without having to resort to external tools, such as iptables.
    
    The modified architecture will look something like this:
    ```
    A --- A's forward proxy ----- B
       ^ newly introduced
    ```
    
    It's verified that the blocking of traffic is complete, compared to previous solutions attempted in PRs [2][3].
    
    [Implementation]
    
    The main subtasks are
    - redesigned as an L7 forward proxy
    - Unix socket support is dropped: e2e test supports unix sockets for peer communication, but only several e2e test cases use Unix sockets as majority of e2e test cases use HTTP/HTTPS
    - introduce a new environment variable `E2E_TEST_FORWARD_PROXY_IP`
    - implement L7 forward proxy by drastically reducing the existing proxy server code and design to use blocking traffic
    
    Known limitations are
    - Doesn't support unix socket (L7 HTTP transport proxy only supports HTTP/HTTPS/and socks5)
    - It's L7 so we need to send a perfectly crafted HTTP request
    -Doesn’t support reordering, dropping, etc. packets on-the-fly
    
    [Testing]
    - `make gofail-enable && make build && make gofail-disable && go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
    - `make gofail-enable && make build && make gofail-disable && go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
    - `go test -timeout 30s -run ^TestServer_ go.etcd.io/etcd/pkg/v3/proxy -v -failfast`
    
    [References]
    [1] issue etcd-io#17737
    [2] PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
    [3] PR (V2) etcd-io#17891
    [4] etcd-io#17938 (comment)
    [5] etcd-io#17985 (comment)
    
    Signed-off-by: Siyuan Zhang <[email protected]>
    Co-authored-by: Iván Valdés Castillo <[email protected]>
    Signed-off-by: Chun-Hung Tseng <[email protected]>
    2 people authored and henrybear327 committed Sep 25, 2024
    Configuration menu
    Copy the full SHA
    ac592a2 View commit details
    Browse the repository at this point in the history
  2. Remove pause

    Signed-off-by: Chun-Hung Tseng <[email protected]>
    henrybear327 committed Sep 25, 2024
    Configuration menu
    Copy the full SHA
    e7b77bb View commit details
    Browse the repository at this point in the history
  3. Remove packet modification

    Signed-off-by: Chun-Hung Tseng <[email protected]>
    henrybear327 committed Sep 25, 2024
    Configuration menu
    Copy the full SHA
    85f7a92 View commit details
    Browse the repository at this point in the history
  4. Fix lint

    Signed-off-by: Chun-Hung Tseng <[email protected]>
    henrybear327 committed Sep 25, 2024
    Configuration menu
    Copy the full SHA
    65c1213 View commit details
    Browse the repository at this point in the history
  5. Improve naming

    Signed-off-by: Chun-Hung Tseng <[email protected]>
    henrybear327 committed Sep 25, 2024
    Configuration menu
    Copy the full SHA
    79464a3 View commit details
    Browse the repository at this point in the history
  6. Fix test condition

    Signed-off-by: Chun-Hung Tseng <[email protected]>
    henrybear327 committed Sep 25, 2024
    Configuration menu
    Copy the full SHA
    8d627e2 View commit details
    Browse the repository at this point in the history
  7. Remove latency accept

    The reason for removing latency accept is that we are operating on L7
    now, so we are not going to hijack L4-level accept() function call.
    
    Also, the initial implementation has a bug where if connections are not
    created consecutively, the latency accept will not work, as what will
    happen is:
    a) set latency accept
    b) latency in effect
    c) accept waiting for new connections
    d) new connection comes in, establish it
    e) go to b -> so the time spent on waiting for a new connection is
    actually counted as the "latency accept" time, not just the time between
    the connection request is initialized by the client and the time the
    server actually accepts it.
    
    Signed-off-by: Chun-Hung Tseng <[email protected]>
    henrybear327 committed Sep 25, 2024
    Configuration menu
    Copy the full SHA
    26b6772 View commit details
    Browse the repository at this point in the history
  8. Simplify server_test.go

    Signed-off-by: Chun-Hung Tseng <[email protected]>
    henrybear327 committed Sep 25, 2024
    Configuration menu
    Copy the full SHA
    b97050b View commit details
    Browse the repository at this point in the history