Skip to content

Commit

Permalink
Fix Blackhole implemention for e2e tests
Browse files Browse the repository at this point in the history
Based on Fu Wei's idea discussed in the [issue](etcd-io#17737),
we employ the blocking on L7 but without using external tools.

[Background]

A peer will
(a) receive traffic from its peers
(b) initiate connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's
existing proxy is insufficient, since only scenario (a) is handled, and
scenario (b) is not blocked at all.

[Proposed solution]

We introduce an forward proxy for each peer, which will be proxying all
the connections initiated from a peer to its peers.

The modified architecture will look something like this:
```
A -- A's forward proxy ----- B's reverse proxy - B
     ^ newly introduced      ^ in the original codebase (renamed)
```

By adding this forward proxy, we can block all in and out traffic that
is initiated from a peer to others, without having to resort to external
tools, such as iptables.

[Testing]
- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`

[Issues]
- I run into `context deadline exceeded` sometimes
```
    etcd_mix_versions_test.go:175:
                Error Trace:    /Users/henrybear327/go/src/etcd/tests/e2e/etcd_mix_versions_test.go:175
                                                        /Users/henrybear327/go/src/etcd/tests/e2e/blackhole_test.go:75
                                                        /Users/henrybear327/go/src/etcd/tests/e2e/blackhole_test.go:31
                Error:          Received unexpected error:
                                [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0] match not found.  Set EXPECT_DEBUG for more info Errs: [unexpected exit code [1] after running [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0]], last lines:
                                {"level":"warn","ts":"2024-05-05T23:02:36.809726+0800","logger":"etcd-client","caller":"[email protected]/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x140001ee960/localhost:20006","method":"/etcdserverpb.KV/Put","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
                                Error: context deadline exceeded
                                 (expected "OK", got []). Try EXPECT_DEBUG=TRUE
                Test:           TestBlackholeByMockingPartitionLeader
                Messages:       failed to put "key-0", error: [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0] match not found.  Set EXPECT_DEBUG for more info Errs: [unexpected exit code [1] after running [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0]], last lines:
                                {"level":"warn","ts":"2024-05-05T23:02:36.809726+0800","logger":"etcd-client","caller":"[email protected]/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x140001ee960/localhost:20006","method":"/etcdserverpb.KV/Put","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
                                Error: context deadline exceeded
                                 (expected "OK", got []). Try EXPECT_DEBUG=TRUE
```

[References]
[1] issue etcd-io#17737
[2] PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] PR (V2) etcd-io#17891

Signed-off-by: Chun-Hung Tseng <[email protected]>

It's verified that the blocking of traffic is complete, compared to
previous solutions [2][3].

[Implementation]

The main subtasks are
- set up an environment variable `E2E_TEST_FORWARD_PROXY_IP`
- implement forward proxy by extending the existing proxy server code
- implement enable/disable of the forward proxy in the e2e test

The result is that for every peer, we will have the arch like this
```
A -- A's forward proxy
     (connections initiated from A will be forwarded from this proxy)
 |   ^ covers case (b)
 |
 --- A's (currently existing) reverse proxy
     (advertised to other peers where the connection should come in from)
     ^ covers case (a)
```

Implement determineHTTPTransportProxyParsingFunc

Signed-off-by: Chun-Hung Tseng <[email protected]>

Tidy up net.SplitHostPort

Signed-off-by: Chun-Hung Tseng <[email protected]>

Remove reverse proxy and keep only forward proxy for all peers

There is no need of reverse proxy after the introduction of forward
proxy, as the forward proxy holds the information of the destination,
we can filter connection traffic from there, and this can reduce 1 hop
when the proxy is turned on.

Clearly separate L4 and L7 connection handling logic

Due to forward proxy's need to parse the CONNECT header, which is a
L7 layer feature, thus we are splitting the proxy into 2 types, for
better maintainability.

Use a custom Listener wrapper to implement existing L4 features but switch to L7 proxy design

If the first request is not a CONNECT, we will need to receive perfectly
parsable HTTP packets, not just random byte streams, as Serve() will
leverage ReadRequest() under the hood.

Transition to forward proxy

- Remove To and From fields, as this is only used in reverse proxy
- Add Listen field, to indicate which URL that the forward is listening
for requests
- Update the tests as required

TODO
- Figure out why DelayTx() isn't working anymore

- Doesn't support unix socket (L7 http transport proxy only support http/https/and socks5)
- It's L7 so we need to send perfectly crafted http request
- Bug fix: Doing var httpTransportProxyParsingFunc = determineHTTPTransportProxyParsingFunc() will initialize the function once only (env var read on program init)

If the traffic to the destination is HTTPS, a CONNECT request will be sent
first (containing the intended destination HOST).
If the traffic to the destination is HTTP, no CONNECT request will be sent
first, but normal HTTP request, with the HOST set to the final destination,
will be sent.

Reference:
- etcd-io#17938 (comment)
- etcd-io#17985 (comment)

Signed-off-by: Chun-Hung Tseng <[email protected]>
Signed-off-by: Siyuan Zhang <[email protected]>
Co-authored-by: Iván Valdés Castillo <[email protected]>
  • Loading branch information
2 people authored and henrybear327 committed Sep 25, 2024
1 parent 5704c61 commit 640b55e
Show file tree
Hide file tree
Showing 8 changed files with 872 additions and 817 deletions.
21 changes: 19 additions & 2 deletions client/pkg/transport/transport.go
Original file line number Diff line number Diff line change
Expand Up @@ -18,12 +18,29 @@ import (
"context"
"net"
"net/http"
"net/url"
"os"
"strings"
"time"
)

type unixTransport struct{ *http.Transport }

var httpTransportProxyParsingFunc = determineHTTPTransportProxyParsingFunc

func determineHTTPTransportProxyParsingFunc() func(req *http.Request) (*url.URL, error) {
// according to the comment of http.ProxyFromEnvironment: if the proxy URL is "localhost"
// (with or without a port number), then a nil URL and nil error will be returned.
// Thus, we workaround this limitation by manually setting an ENV named E2E_TEST_FORWARD_PROXY_IP
// and parse the URL (which is a localhost in our case)
if forwardProxy, exists := os.LookupEnv("E2E_TEST_FORWARD_PROXY_IP"); exists {
return func(req *http.Request) (*url.URL, error) {
return url.Parse(forwardProxy)
}
}
return http.ProxyFromEnvironment
}

func NewTransport(info TLSInfo, dialtimeoutd time.Duration) (*http.Transport, error) {
cfg, err := info.ClientConfig()
if err != nil {
Expand All @@ -39,7 +56,7 @@ func NewTransport(info TLSInfo, dialtimeoutd time.Duration) (*http.Transport, er
}

t := &http.Transport{
Proxy: http.ProxyFromEnvironment,
Proxy: httpTransportProxyParsingFunc(),
DialContext: (&net.Dialer{
Timeout: dialtimeoutd,
LocalAddr: ipAddr,
Expand All @@ -60,7 +77,7 @@ func NewTransport(info TLSInfo, dialtimeoutd time.Duration) (*http.Transport, er
return dialer.DialContext(ctx, "unix", addr)
}
tu := &http.Transport{
Proxy: http.ProxyFromEnvironment,
Proxy: httpTransportProxyParsingFunc(),
DialContext: dialContext,
TLSHandshakeTimeout: 10 * time.Second,
TLSClientConfig: cfg,
Expand Down
Loading

0 comments on commit 640b55e

Please sign in to comment.