Cluster Design

#Cluster Design Overview

Currently NATS is a single-server design, which limits scalability, performance, and high availability (HA).

NATS will be extended to have multiple servers participate in a cluster. This will allow client failover, client sharding, and higher system performance with high availability guarantees.

#Design

The current system will be extended to allow routed connections between servers. These connections will direct the interest graph and control message flow in an optimized fashion between servers.

Cluster topologies will be controlled by the route type: Full Mesh One-Hop or Directed Acyclic Graph (DAG). Topologies that use the multi-hop DAG route type should avoid cycles.
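As an illustration of the "avoid cycles" requirement, the following is a minimal sketch of how a proposed multi-hop topology could be checked before deployment. It assumes the topology is modeled as a map from a server name to the servers it forwards to. The `hasCycle` function and that model are hypothetical and are not part of the NATS server.

```go
package main

// hasCycle reports whether a directed route graph contains a cycle.
// routes maps a server name to the servers it forwards to. This model
// is an assumption made for illustration, not a NATS data structure.
func hasCycle(routes map[string][]string) bool {
	const (
		unvisited = iota
		inProgress
		done
	)
	state := make(map[string]int) // zero value is unvisited

	var visit func(s string) bool
	visit = func(s string) bool {
		switch state[s] {
		case inProgress:
			return true // reached a server still on the current path: cycle
		case done:
			return false // already fully explored, no cycle through here
		}
		state[s] = inProgress
		for _, next := range routes[s] {
			if visit(next) {
				return true
			}
		}
		state[s] = done
		return false
	}

	for s := range routes {
		if visit(s) {
			return true
		}
	}
	return false
}
```

A depth-first search with three states is enough here because a cycle exists exactly when the search reaches a server that is still on the current path.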

The configuration will allow for active and passive connections for both security concerns and network issues. Each listed route will be actively solicited by the server. Each server can also listen on a specified port for incoming connections.

##Sample Config

In the following sample config, the NATS server will listen on port 4244 for incoming route connections. It will require authorization similar to client-based authorization. It will also actively try to connect to the two other servers listed in the routes section.

cluster:
  port: 4244

  authorization:
    user: route_user
    password: cafebabe
    token: deadbeef
    timeout: 1

  # These are actively connected from this server. Other servers
  # can connect to us if they supply the correct credentials from
  # above.

  routes:
    nats-route://foo:[email protected]:4220
    nats-route://foo:[email protected]:4221
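Each entry in the routes section carries the credentials and address of the server to solicit. The sketch below shows one way such a URL could be split into its parts with Go's standard `net/url` package. The `RouteTarget` type, the `parseRoute` function, and the example URL in the usage note are hypothetical; this is not the server's actual parsing code.

```go
package main

import (
	"fmt"
	"net/url"
)

// RouteTarget holds the parts of a route URL. Hypothetical type,
// introduced only for this example.
type RouteTarget struct {
	User, Password, Host, Port string
}

// parseRoute splits a URL of the form nats-route://user:pass@host:port.
func parseRoute(raw string) (RouteTarget, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return RouteTarget{}, err
	}
	if u.Scheme != "nats-route" {
		return RouteTarget{}, fmt.Errorf("unexpected scheme %q", u.Scheme)
	}
	var target RouteTarget
	if u.User != nil {
		target.User = u.User.Username()
		target.Password, _ = u.User.Password() // empty if no password given
	}
	target.Host, target.Port = u.Hostname(), u.Port()
	return target, nil
}
```

For example, `parseRoute("nats-route://route_user:cafebabe@10.0.0.2:4244")` would yield the user `route_user`, the password `cafebabe`, the host `10.0.0.2`, and the port `4244`.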

#Security

Security for the routes will be similar to client-based authorization, including the authorization timeout.

#Clients

The servers will add information to the INFO protocol message describing other active servers, which clients will track. Credential information will not be part of this exchange; clients will be expected to have the correct credentials for any server they wish to connect to.

Older clients will continue to work as they do today, and will not process the additional information.

When a new client detects a connection drop from its current server, it can optionally try to connect to another server that was advertised to it in an INFO protocol message. The server can send INFO protocol messages to all connected clients at any time, so the list of available alternative servers can change dynamically.
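The failover behavior described above can be sketched as a small server pool that a client refreshes from each INFO message and consults after a disconnect. This is a minimal sketch under stated assumptions: the JSON field name `connect_urls`, the `serverPool` type, and the round-robin selection are choices made for illustration, since this design does not specify the field name or the selection policy.

```go
package main

import (
	"encoding/json"
	"strings"
)

// serverInfo models only the part of INFO used here. The field name
// "connect_urls" is an assumption; the design does not name it.
type serverInfo struct {
	ConnectURLs []string `json:"connect_urls"`
}

// serverPool tracks the servers a client may fail over to (hypothetical type).
type serverPool struct {
	urls []string
	next int
}

// update refreshes the pool from one raw protocol line. Lines that are
// not INFO, or INFO payloads without a server list, leave the pool unchanged.
func (p *serverPool) update(line string) error {
	if !strings.HasPrefix(line, "INFO ") {
		return nil
	}
	var info serverInfo
	payload := strings.TrimPrefix(line, "INFO ")
	if err := json.Unmarshal([]byte(payload), &info); err != nil {
		return err
	}
	if len(info.ConnectURLs) > 0 {
		p.urls = info.ConnectURLs
		p.next = 0
	}
	return nil
}

// pick returns the next candidate in round-robin order, or false if
// no alternative servers are known yet.
func (p *serverPool) pick() (string, bool) {
	if len(p.urls) == 0 {
		return "", false
	}
	u := p.urls[p.next%len(p.urls)]
	p.next++
	return u, true
}
```

A client would call `update` for every INFO line it receives and call `pick` after a connection drop. Because credentials are not part of the exchange, the client still has to supply its own credentials when it connects to the returned address.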
