Elasticsearch Cluster HA

This infrastructure deploys a self-managed, horizontally scalable Elasticsearch cluster on AWS. The managed service provided by AWS is not very flexible and lags behind on updates. It can also be expensive: we have seen it rank among the top expenses on the average AWS customer's bill.

Schema

Deployment instructions

Get an AWS account with a user that has CloudFormation and S3 permissions
Create a role dedicated to the CloudFormation service with admin permissions (or narrowed to what the stacks need: EC2, IAM, Lambda, Route 53, EventBridge)
Configure a target environment in environments/mytarget.mvars
Build the Packer image: open a terminal in the packer directory and run env profile=fuzion-testing-cloudformation subnet_id=subnet-0100d50d248aff53e make build

Configuration

Key	Type	Description
accountId	String
roleName	String
vpcId	AWS::EC2::VPC::Id
publicSubnets	List<AWS::EC2::Subnet::Id>
privateSubnets	List<AWS::EC2::Subnet::Id>
availabilityZones	List<AWS::EC2::AvailabilityZone::Name>
elasticAmiId	AWS::EC2::Image::Id
keyPair	AWS::EC2::KeyPair::KeyName
bastionSecurityGroupId	AWS::EC2::SecurityGroup::Id
dataInstanceType	String
clusterName	String
privateDomain	String
privateHostedZoneId	AWS::Route53::HostedZone::Id
baseDomain	String
enableMasterHa	Boolean
enableKibana	Boolean
snapshotS3BucketName	String
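
As a rough illustration of how the keys above could be checked before a deployment, here is a minimal sketch. It assumes the configuration has already been loaded into a Python dict; the loader, the function name validate_config, and the mapping of CloudFormation types to Python types are all hypothetical and not part of this repository.

```python
# Hypothetical validator: the key names come from the configuration table,
# but the Python type mapping is an assumption made for illustration.
EXPECTED_KEYS = {
    "accountId": str,
    "roleName": str,
    "vpcId": str,
    "publicSubnets": list,
    "privateSubnets": list,
    "availabilityZones": list,
    "elasticAmiId": str,
    "keyPair": str,
    "bastionSecurityGroupId": str,
    "dataInstanceType": str,
    "clusterName": str,
    "privateDomain": str,
    "privateHostedZoneId": str,
    "baseDomain": str,
    "enableMasterHa": bool,
    "enableKibana": bool,
    "snapshotS3BucketName": str,
}


def validate_config(config):
    """Return a list of problems; an empty list means every key is present and typed as expected."""
    problems = []
    for key, expected_type in EXPECTED_KEYS.items():
        if key not in config:
            problems.append(f"missing key: {key}")
        elif not isinstance(config[key], expected_type):
            actual = type(config[key]).__name__
            problems.append(f"{key}: expected {expected_type.__name__}, got {actual}")
    for key in config:
        if key not in EXPECTED_KEYS:
            problems.append(f"unknown key: {key}")
    return problems
```

Running such a check before launching the stacks would surface a missing or mistyped key early, rather than partway through a CloudFormation deployment.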

Source

Resilience in small clusters: https://www.elastic.co/guide/en/elasticsearch/reference/current/high-availability-cluster-small-clusters.html
Node: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-node.html

Snapshot index

https://www.elastic.co/guide/en/elasticsearch/reference/current/snapshot-restore.html#snapshot-workflow
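
The workflow described in the linked documentation boils down to registering a snapshot repository and then creating a snapshot in it. The sketch below only builds the two HTTP requests and does not send them. The base URL, repository name, and snapshot name are placeholders, and it assumes the S3 repository type is available on the cluster and that the bucket is the one configured as snapshotS3BucketName. Authentication is left out.

```python
import json
import urllib.request


def build_snapshot_requests(base_url, repository, bucket, snapshot_name):
    """Build (without sending) the requests that register an S3 repository and create a snapshot."""
    # PUT _snapshot/<repository> registers the repository backed by the S3 bucket.
    register = urllib.request.Request(
        url=f"{base_url}/_snapshot/{repository}",
        data=json.dumps({"type": "s3", "settings": {"bucket": bucket}}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    # PUT _snapshot/<repository>/<snapshot> creates the snapshot and, with
    # wait_for_completion=true, blocks until it has finished.
    create = urllib.request.Request(
        url=f"{base_url}/_snapshot/{repository}/{snapshot_name}?wait_for_completion=true",
        method="PUT",
    )
    return register, create
```

Each request can then be passed to urllib.request.urlopen from a host that can reach the cluster, for example the bastion.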

GOALS

Improve the security of the Elasticsearch data nodes.

Use dedicated subnets
Use Traefik as a proxy gateway to secure Kibana
Implement additional auth providers, such as OAuth, to use a central identity service with our cluster

Consider integrating oauth2-proxy with Traefik to enable SSO on Kibana
https://github.com/oauth2-proxy/oauth2-proxy/issues/46

Scaling DATA nodes

We already have an ASG for the data nodes using Spot instances, but there are no rules yet to scale out instances.
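
One possible shape for such a rule, shown purely as a sketch: add a data node when the average disk usage across data nodes crosses a threshold. The function name, the 75% threshold, and the choice of disk usage as the signal are all assumptions; nothing like this exists in the stacks today.

```python
def desired_data_nodes(current_nodes, disk_used_percent, max_nodes, threshold=75.0):
    """Suggest a node count: one more node when average disk usage exceeds the threshold."""
    if not disk_used_percent:
        # No metrics available, so leave the group size unchanged.
        return current_nodes
    average = sum(disk_used_percent) / len(disk_used_percent)
    if average > threshold and current_nodes < max_nodes:
        return current_nodes + 1
    return current_nodes
```

The returned value could be used as the desired capacity of the ASG. Scaling in is deliberately left out, since removing a data node requires shards to be relocated first.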

Monitoring

Integrate monitoring metrics for the nodes

Prometheus
AWS Cloudwatch dashboard
Alarm when cluster is not green anymore
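
For the last item, the cluster health API (GET _cluster/health) returns a JSON document whose status field is green, yellow, or red. A minimal sketch of the check an alarm could be based on follows; the function name is hypothetical, and how the response is fetched and where the alarm is sent are left open.

```python
import json


def cluster_needs_alarm(health_response_body):
    """Return True when the status reported by _cluster/health is anything other than green."""
    health = json.loads(health_response_body)
    return health.get("status") != "green"
```

A scheduled job could call this with the body of the health response and publish a metric or notification when it returns True.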

PLAB-IO/aws-elasticsearch-cluster
