add agent for testing pod networking #1448

Merged
merged 2 commits into from
May 3, 2021
34 changes: 34 additions & 0 deletions test/agent/Dockerfile
@@ -0,0 +1,34 @@
FROM public.ecr.aws/bitnami/golang:1.13 as builder

WORKDIR /workspace
ENV GOPROXY direct

COPY go.mod go.mod
COPY go.sum go.sum

RUN go mod download

COPY cmd cmd
COPY pkg pkg

# Package all testing binaries into one Docker image
# which can be used for different test scenarios

RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 GO111MODULE=on go build \
    -a -o traffic-server cmd/traffic-server/main.go

RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 GO111MODULE=on go build \
    -a -o traffic-client cmd/traffic-client/main.go

RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 GO111MODULE=on go build \
    -a -o networking cmd/networking/main.go

RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 GO111MODULE=on go build \
    -a -o metric-server cmd/metric-server/main.go

FROM public.ecr.aws/amazonlinux/amazonlinux:2
RUN yum update -y && \
    yum clean all

WORKDIR /
COPY --from=builder /workspace/ .
41 changes: 41 additions & 0 deletions test/agent/Makefile
@@ -0,0 +1,41 @@
# Copyright Amazon.com Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"). You
# may not use this file except in compliance with the License. A copy of
# the License is located at
#
# http://aws.amazon.com/apache2.0/
#
# or in the "license" file accompanying this file. This file is
# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific
# language governing permissions and limitations under the License.

VERSION ?= $(shell git describe --tags --always || echo "unknown")
IMAGE_NAME = amazon/amazon-k8s-cni/test/agent
REPO=$(AWS_ACCOUNT).dkr.ecr.$(AWS_REGION).amazonaws.com/$(IMAGE_NAME)
IMAGE ?= $(REPO):$(VERSION)

# Run go fmt against code
fmt:
	go fmt ./...

# Run go vet against code
vet:
	go vet ./...

docker-build: check-env
	docker build . -t ${IMAGE}

docker-push: check-env
	docker push ${IMAGE}

check-env:
	@:$(call check_var, AWS_ACCOUNT, AWS account ID for publishing docker images)
	@:$(call check_var, AWS_REGION, AWS region for publishing docker images)

check_var = \
    $(strip $(foreach 1,$1, \
        $(call __check_var,$1,$(strip $(value 2)))))
__check_var = \
    $(if $(value $1),, \
        $(error Undefined variable $1$(if $2, ($2))))
71 changes: 71 additions & 0 deletions test/agent/README.md
@@ -0,0 +1,71 @@
### Test Agent
The test agent contains multiple binaries that are used by the Ginkgo automation tests.

### List of Go Binaries in the Agent

Currently the agent supports the following Go Binaries.

#### Traffic Testing

For traffic validation across multiple pods, we have the following three Go binaries.

**traffic-server**

The traffic server starts a TCP/UDP server that listens for incoming traffic from multiple clients.

**traffic-client**

The traffic client takes a list of servers as an input parameter, tests the connection to each server, prints the success/failure status to stdout, and optionally pushes the results to the metric server.

**metric-server**

The metric server collects metrics from all traffic clients so that the test suite can query the metric server instead of reading stdout from each client.

One way to run the traffic test is as follows:
- Deploy N Servers
- Deploy M Client pods and pass the IP Address of N Servers as input.
- Query the metric server to see the Success/Failure Rate.

#### Networking Testing
The networking testing binary must be invoked on a pod that runs in host networking mode. It can verify that Pod networking is set up correctly and, once the Pod has been deleted, that the CNI plugin has torn the networking down correctly.

### How to test the Agent locally
Besides running the tests in your local environment, some test cases need to run on a Pod (for instance, the Pod networking tests). For those, copy the binary to the Pod and execute it there. For e2e testing, push the Docker image to ECR and use that image in the automation test suite.

#### Running individual test component inside a Pod
- During development, build the binary for the component you want to test. For example, to test the pod networking setup on a host-network pod:
```
COMPONENT=networking #Example, when testing networking go binary

CGO_ENABLED=0 \
GOOS=linux \
GOARCH=amd64 \
GO111MODULE=on \
go build -o $COMPONENT cmd/$COMPONENT/main.go
```
- Copy the binary to your tester pod.
```
TESTER_POD=<pod-name>
kubectl cp $COMPONENT $TESTER_POD:/tmp/
```
- Execute the binary by doing an exec into the pod and see the desired results.
```
kubectl exec -ti $TESTER_POD -- sh
/tmp/<component> --<flags>
```

#### Running the docker Image

Run the following command to build the agent image and push it to ECR. This requires an existing repository named "amazon/amazon-k8s-cni/test/agent".
```
AWS_ACCOUNT=<account> AWS_REGION=<region> make docker-build docker-push
```
Change the agent image in the Go code and run the test case as usual.

#### Finalizing the changes
- Submit a PR with the change to the agent.
- One of the AWS maintainers will push the image to ECR (until we have a pipeline that does this for us).
- Use the updated image tag wherever the automation tests reference the agent image.

### Future Improvements
Currently the aws-vpc-cni maintainers have to manually push the image to ECR after any change to the agent. In the future, we would like a pipeline to push a new image on any update to the agent directory.
59 changes: 59 additions & 0 deletions test/agent/cmd/metric-server/main.go
@@ -0,0 +1,59 @@
// Copyright Amazon.com Inc. or its affiliates. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License"). You may
// not use this file except in compliance with the License. A copy of the
// License is located at
//
// http://aws.amazon.com/apache2.0/
//
// or in the "license" file accompanying this file. This file is distributed
// on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
// express or implied. See the License for the specific language governing
// permissions and limitations under the License.

package main

import (
	"encoding/json"
	"log"
	"net/http"
	"sync"

	"github.com/aws/amazon-vpc-cni-k8s/test/agent/pkg/input"
)

var (
	// metricLock guards connectivityMetric, because net/http serves each
	// request on its own goroutine.
	metricLock         sync.Mutex
	connectivityMetric []input.TestStatus
)

// The metric server stores metrics from the test clients and returns the aggregated metrics to the
// automation test.

Review comment: It would be nice to expose different types of metrics (failures, successes), add labels to each metric, and additionally expose them in Prometheus format. For long-running validation and some other use cases, that will come in handy.

func main() {
	http.HandleFunc("/submit/metric/connectivity", submitConnectivityMetric)
	http.HandleFunc("/get/metric/connectivity", getConnectivityMetric)
	log.Fatal(http.ListenAndServe(":8080", nil))
}

// submitConnectivityMetric decodes a TestStatus from the request body and
// adds it to the list of metrics.
func submitConnectivityMetric(w http.ResponseWriter, r *http.Request) {
	var status input.TestStatus
	if err := json.NewDecoder(r.Body).Decode(&status); err != nil {
		log.Printf("failed to decode the request body: %v", err)
		// Tell the client the submission was rejected instead of replying 200.
		http.Error(w, "invalid request body", http.StatusBadRequest)
		return
	}

	log.Printf("received metric %+v", status)
	metricLock.Lock()
	connectivityMetric = append(connectivityMetric, status)
	metricLock.Unlock()
}

// getConnectivityMetric returns the list of metrics as JSON.
func getConnectivityMetric(w http.ResponseWriter, _ *http.Request) {
	metricLock.Lock()
	metricByte, err := json.Marshal(connectivityMetric)
	metricLock.Unlock()
	if err != nil {
		log.Printf("failed to marshal metrics: %v", err)
		http.Error(w, "failed to marshal metrics", http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	if _, err := w.Write(metricByte); err != nil {
		log.Printf("failed to write response: %v", err)
	}
}
62 changes: 62 additions & 0 deletions test/agent/cmd/networking/main.go
@@ -0,0 +1,62 @@
// Copyright Amazon.com Inc. or its affiliates. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License"). You may
// not use this file except in compliance with the License. A copy of the
// License is located at
//
// http://aws.amazon.com/apache2.0/
//
// or in the "license" file accompanying this file. This file is distributed
// on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
// express or implied. See the License for the specific language governing
// permissions and limitations under the License.

package main

import (
	"encoding/json"
	"flag"
	"log"

	"github.com/aws/amazon-vpc-cni-k8s/test/agent/cmd/networking/tester"
)

// TODO: Instead of passing the list of pods, get the pods from API Server so this agent can run as DS
// TODO: Export metrics via Prometheus for debugging and analysis purposes
func main() {
	var podNetworkingValidationInput tester.PodNetworkingValidationInput
	var podNetworkingValidationInputString string
	var shouldTestSetup bool
	var shouldTestCleanup bool

	flag.StringVar(&podNetworkingValidationInputString, "pod-networking-validation-input", "", "JSON string containing the array of pods whose networking needs to be validated")
	flag.BoolVar(&shouldTestCleanup, "test-cleanup", false, "when true, tests that networking is torn down after the pod has been deleted")
	flag.BoolVar(&shouldTestSetup, "test-setup", false, "when true, tests that networking is set up correctly after the pod is running")

	flag.Parse()

	if shouldTestCleanup && shouldTestSetup {
		log.Fatal("can only test setup or cleanup at one time")
	}

	err := json.Unmarshal([]byte(podNetworkingValidationInputString), &podNetworkingValidationInput)
	if err != nil {
		log.Fatalf("failed to unmarshal json string %s: %v", podNetworkingValidationInputString, err)
	}

	log.Printf("list of pods against which the test will be run: %v", podNetworkingValidationInput.PodList)

	if shouldTestSetup {
		log.Print("testing that networking is set up for regular pods")
		err := tester.TestNetworkingSetupForRegularPod(podNetworkingValidationInput)
		if err != nil {
			log.Fatalf("found 1 or more pod setup validation failures: %v", err)
		}
	} else {
		// Runs when --test-cleanup is set, and also when neither flag is set.
		log.Print("testing that networking is torn down for regular pods")
		err := tester.TestNetworkTearedDownForRegularPods(podNetworkingValidationInput)
		if err != nil {
			log.Fatalf("found 1 or more pod teardown validation failures: %v", err)
		}
	}
}
35 changes: 35 additions & 0 deletions test/agent/cmd/networking/tester/input.go
@@ -0,0 +1,35 @@
// Copyright Amazon.com Inc. or its affiliates. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License"). You may
// not use this file except in compliance with the License. A copy of the
// License is located at
//
// http://aws.amazon.com/apache2.0/
//
// or in the "license" file accompanying this file. This file is distributed
// on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
// express or implied. See the License for the specific language governing
// permissions and limitations under the License.

package tester

// PodNetworkingValidationInput is the input to the networking tester.
type PodNetworkingValidationInput struct {
	// CIDR ranges associated with the VPC
	VPCCidrRange []string
	// Prefix for the veth pair on the host network namespace
	VethPrefix string
	// List of pods whose networking is validated
	PodList []Pod
}

// Pod describes a single pod whose networking is validated.
type Pod struct {
	// Name of the pod
	PodName string
	// Namespace of the pod, used to generate the Link
	PodNamespace string
	// IPv4 address of the pod
	PodIPv4Address string
	// Set to true when the pod's IP address comes
	// from a secondary ENI
	IsIPFromSecondaryENI bool
}