Adding rest to allocation endpoint #1902

Merged
71 changes: 46 additions & 25 deletions cmd/allocator/main.go
@@ -19,7 +19,6 @@ import (
"crypto/x509"
"fmt"
"io/ioutil"
-	"net"
"net/http"
"os"
"path/filepath"
@@ -37,13 +36,13 @@ import (
"agones.dev/agones/pkg/gameservers"
"agones.dev/agones/pkg/util/runtime"
"agones.dev/agones/pkg/util/signals"
+	gw_runtime "github.com/grpc-ecosystem/grpc-gateway/runtime"
"github.com/heptiolabs/healthcheck"
"github.com/pkg/errors"
"github.com/sirupsen/logrus"
"go.opencensus.io/plugin/ocgrpc"
"google.golang.org/grpc"
"google.golang.org/grpc/codes"
-	"google.golang.org/grpc/credentials"
"google.golang.org/grpc/keepalive"
"google.golang.org/grpc/status"
"gopkg.in/fsnotify.v1"
@@ -64,6 +63,19 @@ const (
sslPort = "8443"
)

+// grpcHandlerFunc returns an http.Handler that delegates to grpcServer on incoming gRPC
+// connections or otherHandler otherwise. Copied from https://github.com/philips/grpc-gateway-example.
+func grpcHandlerFunc(grpcServer http.Handler, otherHandler http.Handler) http.Handler {
Member:

Oh that's clever!

@roberthbailey we probably should have done this on the sdk sidecar! oh well!

Contributor Author:

The other option is apparently to use a dedicated muxer but since Agones doesn't have one pulled in atm I figured it is easier to use this approach.

Member:

Is this backwards compatible? I'm wondering if we could take the existing grpc port on the sidecar and add a similar handler to handle both grpc and http.

If so, we could migrate the default ports to be the same for both and then deprecate the ability to select a separate http port. After a grace period we could remove the ability to select the http port.

Contributor Author:

My understanding is that it is backwards compatible in the sense that grpc clients will not be aware of the change when grpcHandlerFunc is added. I double checked it by running the example allocator-client.

+	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		// This is a partial recreation of gRPC's internal checks https://github.com/grpc/grpc-go/pull/514/files#diff-95e9a25b738459a2d3030e1e6fa2a718R61
+		// We switch on HTTP/1.1 or HTTP/2 by checking the ProtoMajor
+		if r.ProtoMajor == 2 && strings.Contains(r.Header.Get("Content-Type"), "application/grpc") {
Contributor:

I am still not clear why ProtoMajor == 2 and not 3 for the gRPC server.
Can you please add more documentation?

Contributor Author:

ProtoMajor refers to the major version of the HTTP protocol. We are checking whether the request is HTTP/1.1 or HTTP/2. Will add that as a comment to the code.

+			grpcServer.ServeHTTP(w, r)
+		} else {
+			otherHandler.ServeHTTP(w, r)
+		}
+	})
+}
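
The routing rule discussed in the thread above can be tried in isolation. The following is a minimal sketch, not the PR's code: it copies the `ProtoMajor` and `Content-Type` check from `grpcHandlerFunc`, while `named` and `route` are hypothetical helpers that stand in for the real gRPC server and gateway mux so the dispatch can be observed with `net/http/httptest`.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// grpcHandlerFunc applies the same rule as the diff: HTTP/2 requests whose
// Content-Type contains "application/grpc" go to grpcServer, all others to otherHandler.
func grpcHandlerFunc(grpcServer, otherHandler http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.ProtoMajor == 2 && strings.Contains(r.Header.Get("Content-Type"), "application/grpc") {
			grpcServer.ServeHTTP(w, r)
			return
		}
		otherHandler.ServeHTTP(w, r)
	})
}

// named is a hypothetical stand-in handler that writes its own name.
func named(name string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, name)
	})
}

// route builds a fake request and reports which handler answered it.
func route(protoMajor int, contentType string) string {
	req := httptest.NewRequest(http.MethodPost, "/gameserverallocation", nil)
	req.ProtoMajor = protoMajor
	req.Header.Set("Content-Type", contentType)
	rec := httptest.NewRecorder()
	grpcHandlerFunc(named("grpc"), named("rest")).ServeHTTP(rec, req)
	return rec.Body.String()
}

func main() {
	fmt.Println(route(2, "application/grpc")) // prints "grpc"
	fmt.Println(route(1, "application/json")) // prints "rest"
	fmt.Println(route(2, "application/json")) // prints "rest"
}
```

Existing gRPC clients always send HTTP/2 with an `application/grpc` content type, which is why they should not notice the extra handler; everything else falls through to the REST gateway.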
func main() {
conf := parseEnvFlags()

@@ -101,11 +113,6 @@ func main() {

h := newServiceHandler(kubeClient, agonesClient, health, conf.MTLSDisabled, conf.TLSDisabled, conf.remoteAllocationTimeout, conf.totalRemoteAllocationTimeout)

-	listener, err := net.Listen("tcp", fmt.Sprintf(":%s", sslPort))
-	if err != nil {
-		logger.WithError(err).Fatalf("failed to listen on TCP port %s", sslPort)
-	}
-
if !h.tlsDisabled {
watcherTLS, err := fsnotify.NewWatcher()
if err != nil {
@@ -179,11 +186,39 @@ func main() {
grpcServer := grpc.NewServer(opts...)
pb.RegisterAllocationServiceServer(grpcServer, h)

-	// serve GRPC for allocation
+	mux := gw_runtime.NewServeMux()
+	err = pb.RegisterAllocationServiceHandlerServer(context.Background(), mux, h)
+	if err != nil {
+		panic(err)
+	}
+
+	cfg := &tls.Config{}
+	if !h.tlsDisabled {
+		cfg.GetCertificate = h.getTLSCert
+	}
+	if !h.mTLSDisabled {
+		cfg.ClientAuth = tls.RequireAnyClientCert
+		cfg.VerifyPeerCertificate = h.verifyClientCertificate
+	}
+
+	// Create a Server instance to listen on port 8443 with the TLS config
+	server := &http.Server{
+		Addr:      ":8443",
+		TLSConfig: cfg,
+		Handler:   grpcHandlerFunc(grpcServer, mux),
+	}
+
 	go func() {
-		err := grpcServer.Serve(listener)
-		logger.WithError(err).Fatal("allocation service crashed")
-		os.Exit(1)
+		if !h.tlsDisabled {
+			err = server.ListenAndServeTLS("", "")
+		} else {
+			err = server.ListenAndServe()
+		}
+
+		if err != nil {
+			logger.WithError(err).Fatal("unable to start HTTP/HTTPS listener")
+			os.Exit(1)
+		}
 	}()

// Finally listen on 8080 (http) and block the main goroutine
@@ -255,24 +290,10 @@ func readTLSCert() (*tls.Certificate, error) {
// getServerOptions returns a list of GRPC server options.
// Current options are TLS certs and opencensus stats handler.
func (h *serviceHandler) getServerOptions() []grpc.ServerOption {
-	if h.tlsDisabled {
-		return []grpc.ServerOption{grpc.StatsHandler(&ocgrpc.ServerHandler{})}
-	}
-
-	cfg := &tls.Config{
-		GetCertificate: h.getTLSCert,
-	}
-
-	if !h.mTLSDisabled {
-		cfg.ClientAuth = tls.RequireAnyClientCert
-		cfg.VerifyPeerCertificate = h.verifyClientCertificate
-	}
-
// Add options for creds and OpenCensus stats handler to enable stats and tracing.
Contributor:

Update the documentation to remove options for creds?

Contributor Author:

Not sure what you mean. Do you mean that I should remove/change the comment that says:
// Add options for creds and OpenCensus stats handler to enable stats and tracing.
or that I need to document changes to options?
I don't think this change changes the command line options as h.tlsDisabled and h.mTLSDisabled are still used.

Contributor:

The code comment says it adds options for creds and ...; my point was to remove creds from that comment, since you removed the grpc.Creds(credentials.NewTLS(cfg)), line. So change it to // Add an option for OpenCensus stats handler to enable stats and tracing.

Contributor Author:

Makes sense! Fixed it now.

// The keepalive options are useful for efficiency purposes (keeping a single connection alive
// instead of constantly recreating connections), when placing the Agones allocator behind load balancers.
return []grpc.ServerOption{
-		grpc.Creds(credentials.NewTLS(cfg)),
grpc.StatsHandler(&ocgrpc.ServerHandler{}),
grpc.KeepaliveEnforcementPolicy(keepalive.EnforcementPolicy{
MinTime: 1 * time.Minute,
68 changes: 67 additions & 1 deletion site/content/en/docs/Advanced/allocator-service.md
@@ -5,12 +5,20 @@ publishDate: 2019-10-25T05:45:05Z
description: >
Agones provides an mTLS based allocator service that is accessible from outside the cluster using a load balancer. The service is deployed and scales independently of the Agones controller.
---

{{% feature expiryVersion="1.11.0" %}}
In addition to {{< ghlink href="pkg/apis/allocation/v1/gameserverallocation.go" >}}GameServerAllocations{{< /ghlink >}}, Agones provides a gRPC service with mTLS authentication, called `agones-allocator`, for allocating game servers.

The gRPC service is accessible through a Kubernetes service that is externalized using a load balancer. For the gRPC request to succeed, a client certificate must be provided that is in the authorization list of the allocator service.

The remainder of this article describes how to manually make a successful allocation request using the gRPC API.
{{% /feature %}}

{{% feature publishVersion="1.11.0" %}}
In addition to {{< ghlink href="pkg/apis/allocation/v1/gameserverallocation.go" >}}GameServerAllocations{{< /ghlink >}}, Agones provides a gRPC and REST service with mTLS authentication, called `agones-allocator`, for allocating game servers.

Both services are accessible through a Kubernetes service that is externalized using a load balancer and they run on the same port. For requests to succeed, a client certificate must be provided that is in the authorization list of the allocator service.
The remainder of this article describes how to manually make a successful allocation request using the API.
{{% /feature %}}
The guide assumes you have command line tools installed for [jq](https://stedolan.github.io/jq/), [go](https://golang.org/) and [openssl](https://www.openssl.org/).

## Find the external IP
@@ -111,6 +119,7 @@ kubectl get secret allocator-client-ca -o json -n agones-system | jq '.data["cli

The last command creates a new entry in the secret data map for `allocator-client-ca` for the client CA. This is for the `agones-allocator` service to accept the newly generated client certificate.

{{% feature expiryVersion="1.11.0" %}}
## Send allocation request

After setting up `agones-allocator` with server certificate and allowlisting the client certificate, the service can be used to allocate game servers. To start, take a look at the allocation gRPC client examples in {{< ghlink href="examples/allocator-client/main.go" >}}golang{{< /ghlink >}} and {{< ghlink href="examples/allocator-client-csharp/Program.cs" >}}C#{{< /ghlink >}} languages. In the following, the {{< ghlink href="examples/allocator-client/main.go" >}}golang gRPC client example{{< /ghlink >}} is used to allocate a Game Server in the `default` namespace.
@@ -139,6 +148,63 @@ go run examples/allocator-client/main.go --ip ${EXTERNAL_IP} \
--cert ${CERT_FILE} \
--cacert ${TLS_CA_FILE}
```
{{% /feature %}}

{{% feature publishVersion="1.11.0" %}}
## Send allocation request

After setting up `agones-allocator` with a server certificate and allowlisting the client certificate, the service can be used to allocate game servers. Make sure you have a [fleet]({{< ref "/docs/Getting Started/create-fleet.md" >}}) with ready game servers in the game server namespace.

Set the following environment variables:
```
NAMESPACE=default # replace with any namespace
EXTERNAL_IP=$(kubectl get services agones-allocator -n agones-system -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
KEY_FILE=client.key
CERT_FILE=client.crt
TLS_CA_FILE=ca.crt
```

### Using gRPC

To start, take a look at the allocation gRPC client examples in {{< ghlink href="examples/allocator-client/main.go" >}}golang{{< /ghlink >}} and {{< ghlink href="examples/allocator-client-csharp/Program.cs" >}}C#{{< /ghlink >}} languages. In the following, the {{< ghlink href="examples/allocator-client/main.go" >}}golang gRPC client example{{< /ghlink >}} is used to allocate a Game Server in the `default` namespace.

```bash
#!/bin/bash

# allocator-client.default secret is created only when using helm installation. Otherwise generate the client certificate and replace the following.
# In case of MacOS replace "base64 -d" with "base64 -D"
kubectl get secret allocator-client.default -n "${NAMESPACE}" -ojsonpath="{.data.tls\.crt}" | base64 -d > "${CERT_FILE}"
kubectl get secret allocator-client.default -n "${NAMESPACE}" -ojsonpath="{.data.tls\.key}" | base64 -d > "${KEY_FILE}"
kubectl get secret allocator-tls-ca -n agones-system -ojsonpath="{.data.tls-ca\.crt}" | base64 -d > "${TLS_CA_FILE}"
Contributor:

Can we move L176-L178 to the section above as they are common between the two?

Contributor Author:

Moved the lines and renamed the section to be: Reuse following snippet for both grpc and rest examples:
Not sure whether this is a good heading for it, so I am open to suggestions.

Contributor:

Maybe something like Set the environment variables and store the client secrets before allocating using gRPC or REST APIs?

Also, please make the same change for the multi-cluster-allocation.md file to keep them in sync.


go run examples/allocator-client/main.go --ip ${EXTERNAL_IP} \
--port 443 \
--namespace ${NAMESPACE} \
--key ${KEY_FILE} \
--cert ${CERT_FILE} \
--cacert ${TLS_CA_FILE}
```

### Using REST

```bash
#!/bin/bash

# allocator-client.default secret is created only when using helm installation. Otherwise generate the client certificate and replace the following.
# In case of MacOS replace "base64 -d" with "base64 -D"
kubectl get secret allocator-client.default -n "${NAMESPACE}" -ojsonpath="{.data.tls\.crt}" | base64 -d > "${CERT_FILE}"
kubectl get secret allocator-client.default -n "${NAMESPACE}" -ojsonpath="{.data.tls\.key}" | base64 -d > "${KEY_FILE}"
kubectl get secret allocator-tls-ca -n agones-system -ojsonpath="{.data.tls-ca\.crt}" | base64 -d > "${TLS_CA_FILE}"

curl --key ${KEY_FILE} --cert ${CERT_FILE} --cacert ${TLS_CA_FILE} -H "Content-Type: application/json" --data '{"namespace":"'${NAMESPACE}'"}' https://${EXTERNAL_IP}/gameserverallocation -XPOST
```

Member:

Is it worth providing a sample output? It might be nice to know the format of the JSON?

Contributor Author:

Added

Contributor:

Can you factor out the section for setting the env variable in a separate section before gRPC and REST?

Contributor Author:

I have modified both allocator-service.md and multi-cluster-allocation.md to have a factored out variable setting. Let me know if you think it looks ok now?
I think as a user I don't mind a bit of code duplication in the example scripts as it allows me to copy paste the script from documentation and have it run straight away, whereas with factored out env setting I need to do 2 steps.

Contributor:

Eventually the code snippets will be turned into a script that can be run multiple times, but clear documentation of the snippets may make it easier for a first-time user to understand the flow. @markmandel do you have any recommendation on this? I am not experienced with the user experience of documentation.

Member:

Looks good to me - I get it 👍

You should expect to see the following output:

```
{"gameServerName":"game-server-name","ports":[{"name":"default","port":7463}],"address":"1.2.3.4","nodeName":"node-name"}
```
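
A client can decode this response into a small struct. The sketch below assumes only the four fields visible in the sample output; the type and function names (`allocationResponse`, `port`, `parseAllocation`) are hypothetical and are not the generated protobuf types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// port and allocationResponse model only the fields shown in the sample output.
type port struct {
	Name string `json:"name"`
	Port int32  `json:"port"`
}

type allocationResponse struct {
	GameServerName string `json:"gameServerName"`
	Ports          []port `json:"ports"`
	Address        string `json:"address"`
	NodeName       string `json:"nodeName"`
}

// parseAllocation (hypothetical helper) decodes the REST response body.
func parseAllocation(raw []byte) (allocationResponse, error) {
	var resp allocationResponse
	err := json.Unmarshal(raw, &resp)
	return resp, err
}

func main() {
	sample := []byte(`{"gameServerName":"game-server-name","ports":[{"name":"default","port":7463}],"address":"1.2.3.4","nodeName":"node-name"}`)
	resp, err := parseAllocation(sample)
	if err != nil {
		panic(err)
	}
	// The address and first port are what a game client would connect to.
	fmt.Printf("%s:%d\n", resp.Address, resp.Ports[0].Port) // prints "1.2.3.4:7463"
}
```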
{{% /feature %}}

## Secrets Explained

26 changes: 25 additions & 1 deletion site/content/en/docs/Advanced/multi-cluster-allocation.md
@@ -91,10 +91,18 @@ EOF

To enable multi-cluster allocation, set `multiClusterSetting.enabled` to `true` in {{< ghlink href="proto/allocation/allocation.proto" >}}allocation.proto{{< /ghlink >}} and send allocation requests. For more information visit [agones-allocator]({{< relref "allocator-service.md">}}). In the following, using {{< ghlink href="examples/allocator-client/main.go" >}}allocator-client sample{{< /ghlink >}}, a multi-cluster allocation request is sent to the agones-allocator service.

Follow [agones-allocator]({{< relref "allocator-service.md#send-allocation-request">}}) to set the environment variables.
Set the following environment variables:
```
NAMESPACE=default # replace with any namespace
EXTERNAL_IP=$(kubectl get services agones-allocator -n agones-system -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
KEY_FILE=client.key
CERT_FILE=client.crt
TLS_CA_FILE=ca.crt
```

```bash
#!/bin/bash

go run examples/allocator-client/main.go --ip ${EXTERNAL_IP} \
--namespace ${NAMESPACE} \
--key ${KEY_FILE} \
@@ -103,6 +111,22 @@ go run examples/allocator-client/main.go --ip ${EXTERNAL_IP} \
--multicluster true
```

{{% feature publishVersion="1.11.0" %}}
If using REST, use:

```bash
#!/bin/bash

# allocator-client.default secret is created only when using helm installation. Otherwise generate the client certificate and replace the following.
# In case of MacOS replace "base64 -d" with "base64 -D"
kubectl get secret allocator-client.default -n "${NAMESPACE}" -ojsonpath="{.data.tls\.crt}" | base64 -d > "${CERT_FILE}"
kubectl get secret allocator-client.default -n "${NAMESPACE}" -ojsonpath="{.data.tls\.key}" | base64 -d > "${KEY_FILE}"
kubectl get secret allocator-tls-ca -n agones-system -ojsonpath="{.data.tls-ca\.crt}" | base64 -d > "${TLS_CA_FILE}"

curl --key ${KEY_FILE} --cert ${CERT_FILE} --cacert ${TLS_CA_FILE} -H "Content-Type: application/json" --data '{"namespace":"'${NAMESPACE}'", "multi_cluster_settings":{"enabled":"true"}}' https://${EXTERNAL_IP}/gameserverallocation -XPOST
```
{{% /feature %}}

## Troubleshooting

If you encounter problems, explore the following potential root causes:
81 changes: 77 additions & 4 deletions test/e2e/allocator_test.go
@@ -15,16 +15,20 @@
package e2e

import (
"bytes"
"context"
"crypto/rand"
"crypto/rsa"
"crypto/tls"
"crypto/x509"
"crypto/x509/pkix"
"encoding/json"
"encoding/pem"
"fmt"
"io/ioutil"
"math/big"
"net"
"net/http"
"testing"
"time"

@@ -99,6 +103,68 @@ func TestAllocator(t *testing.T) {
assert.NoError(t, err)
}

func TestRestAllocator(t *testing.T) {
Member:

Yay! E2E test!

ip, port := getAllocatorEndpoint(t)
requestURL := fmt.Sprintf(allocatorReqURLFmt, ip, port)
tlsCA := refreshAllocatorTLSCerts(t, ip)

flt, err := createFleet(framework.Namespace)
if !assert.Nil(t, err) {
return
}
framework.AssertFleetCondition(t, flt, e2e.FleetReadyCount(flt.Spec.Replicas))
request := &pb.AllocationRequest{
Namespace: framework.Namespace,
RequiredGameServerSelector: &pb.LabelSelector{MatchLabels: map[string]string{agonesv1.FleetNameLabel: flt.ObjectMeta.Name}},
PreferredGameServerSelectors: []*pb.LabelSelector{{MatchLabels: map[string]string{agonesv1.FleetNameLabel: flt.ObjectMeta.Name}}},
Scheduling: pb.AllocationRequest_Packed,
MetaPatch: &pb.MetaPatch{Labels: map[string]string{"gslabel": "allocatedbytest"}},
}
tlsCfg, err := getTLSConfig(allocatorClientSecretNamespace, allocatorClientSecretName, tlsCA)
if !assert.Nil(t, err) {
return
}
client := &http.Client{
Transport: &http.Transport{
TLSClientConfig: tlsCfg,
},
}
jsonRes, err := json.Marshal(request)
if !assert.Nil(t, err) {
return
}
req, err := http.NewRequest("POST", "https://"+requestURL+"/gameserverallocation", bytes.NewBuffer(jsonRes))
if !assert.Nil(t, err) {
logrus.WithError(err).Info("failed to create rest request")
return
}

// wait for the allocation system to come online
err = wait.PollImmediate(2*time.Second, 5*time.Minute, func() (bool, error) {
resp, err := client.Do(req)
if err != nil {
logrus.WithError(err).Info("failed Allocate rest request")
return false, nil
}
defer resp.Body.Close() // nolint: errcheck
body, err := ioutil.ReadAll(resp.Body)
if err != nil {
logrus.WithError(err).Info("failed to read Allocate response body")
return false, nil
}
var response pb.AllocationResponse
err = json.Unmarshal(body, &response)
if err != nil {
logrus.WithError(err).Info("failed to unmarshal Allocate response")
return false, nil
}
validateAllocatorResponse(t, &response)
return true, nil
})

assert.NoError(t, err)
}

// Tests multi-cluster allocation by reusing the same cluster but across namespace.
// Multi-cluster is represented as two namespaces A and B in the same cluster.
// Namespace A received the allocation request, but because namespace B has the highest priority, A will forward the request to B.
@@ -229,6 +295,15 @@ func getAllocatorEndpoint(t *testing.T) (string, int32) {

// createRemoteClusterDialOption creates a grpc client dial option with proper certs to make a remote call.
func createRemoteClusterDialOption(namespace, clientSecretName string, tlsCA []byte) (grpc.DialOption, error) {
+	tlsConfig, err := getTLSConfig(namespace, clientSecretName, tlsCA)
+	if err != nil {
+		return nil, err
+	}
+
+	return grpc.WithTransportCredentials(credentials.NewTLS(tlsConfig)), nil
+}
+
+func getTLSConfig(namespace, clientSecretName string, tlsCA []byte) (*tls.Config, error) {
kubeCore := framework.KubeClient.CoreV1()
clientSecret, err := kubeCore.Secrets(namespace).Get(clientSecretName, metav1.GetOptions{})
if err != nil {
@@ -253,12 +328,10 @@ func createRemoteClusterDialOption(namespace, clientSecretName string, tlsCA []b
return nil, errors.New("could not append PEM format CA cert")
}

-	tlsConfig := &tls.Config{
+	return &tls.Config{
 		Certificates: []tls.Certificate{cert},
 		RootCAs:      rootCA,
-	}
-
-	return grpc.WithTransportCredentials(credentials.NewTLS(tlsConfig)), nil
+	}, nil
}
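
The refactoring above lets the gRPC and REST tests share one `*tls.Config`. The following standalone sketch shows the general shape of such a helper. It is an illustration, not the test's code: the middle of `getTLSConfig` is elided in this diff, so the use of `tls.X509KeyPair` is an assumption, and `clientTLSConfig` and `selfSignedPEM` are hypothetical names. The throwaway self-signed certificate only exists so the example runs without a cluster.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// clientTLSConfig builds a client-side TLS config from PEM-encoded material.
// Loading the key pair with tls.X509KeyPair is an assumption about the elided code.
func clientTLSConfig(certPEM, keyPEM, caPEM []byte) (*tls.Config, error) {
	cert, err := tls.X509KeyPair(certPEM, keyPEM)
	if err != nil {
		return nil, err
	}
	rootCA := x509.NewCertPool()
	if !rootCA.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("could not append PEM format CA cert")
	}
	return &tls.Config{
		Certificates: []tls.Certificate{cert},
		RootCAs:      rootCA,
	}, nil
}

// selfSignedPEM (hypothetical helper) generates a throwaway certificate and key.
func selfSignedPEM() (certPEM, keyPEM []byte, err error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "example-client"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	keyDER, err := x509.MarshalECPrivateKey(key)
	if err != nil {
		return nil, nil, err
	}
	certPEM = pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
	keyPEM = pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: keyDER})
	return certPEM, keyPEM, nil
}

func main() {
	certPEM, keyPEM, err := selfSignedPEM()
	if err != nil {
		panic(err)
	}
	// The self-signed certificate doubles as its own CA in this sketch.
	cfg, err := clientTLSConfig(certPEM, keyPEM, certPEM)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(cfg.Certificates)) // prints 1
}
```

The same config can be handed to `credentials.NewTLS` for a gRPC dial option or to an `http.Transport` for REST, which is the reuse the test relies on.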

func createFleet(namespace string) (*agonesv1.Fleet, error) {
1 change: 1 addition & 0 deletions vendor/modules.txt
@@ -232,6 +232,7 @@ golang.org/x/net/context
golang.org/x/net/context/ctxhttp
golang.org/x/net/http/httpguts
golang.org/x/net/http2
golang.org/x/net/http2/h2c
golang.org/x/net/http2/hpack
golang.org/x/net/idna
golang.org/x/net/internal/timeseries