
Use Terraform instead of CloudFormation #546

Closed
saulshanabrook opened this issue Jun 20, 2015 · 25 comments

Comments

@saulshanabrook

Have you looked into using Terraform instead of CloudFormation for AWS provisioning?

I used it recently to deploy a number of services on AWS and found it very pleasant to work with. I liked how easy it made updating resources and how the readable syntax lends itself to extensibility and reuse. I have also found the developers to be incredibly responsive; bug reports I filed were resolved within days.
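
(For illustration only: a minimal sketch, in the interpolation syntax Terraform used at the time, of the kind of variable-driven resource definition that makes configs easy to reuse across environments. The resource names and AMI ID are placeholders, not anything from this thread.)

```hcl
variable "environment" {
  default = "staging"
}

variable "instance_type" {
  default = "t2.micro"
}

# A single definition that can be reused per environment just by
# overriding the variables above.
resource "aws_instance" "web" {
  ami           = "ami-00000000"            # placeholder AMI ID
  instance_type = "${var.instance_type}"

  tags {
    Name        = "web-${var.environment}"
    Environment = "${var.environment}"
  }
}
```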

More resources:

We selected Terraform over CloudFormation, primarily because it enabled us to write reusable scripts and it gave us better visibility on infrastructure updates. Though it is early to predict its future, we like the vision of Terraform. Of course, they have a long way to go to become a stable product. It is definitely a tool which will change the way we manage infrastructure.

As you can see, overall both Terraform and CloudFormation have their pros and cons. The reason we went ahead with Terraform was that the planning phase was quite important for our workflow. Also, we had a workaround in mind to solve the state management problem. Regarding the missing resources, we started contributing whatever was essential for us.

@phobologic
Contributor

Yep - I've looked at Terraform a few times, and while it's an awesome and ambitious project, there have been quite a few issues with stability. Internally at Remind we use https://github.com/remind101/stacker, which handles building the Empire stacks for us and does a good job of maintaining cross-stack dependencies. stacker also uses troposphere, which means that almost as soon as something gets added to CloudFormation, it's accessible to us. (My favorite example: on the 11th, troposphere got a pull request for all of the features that were added to CloudFormation that same day - cloudtools/troposphere#259.)

That doesn't mean we wouldn't accept a Terraform config for Empire to share alongside the CloudFormation one, but at the moment there's no reason to do away with CloudFormation itself.

@saulshanabrook
Author

@phobologic Thanks for the explanation, makes sense to me.

@phobologic
Contributor

Cool - also, it's in my backlog to work on sharing our stacker blueprints for a production Empire environment. There's a bit of Remind-specific stuff in the ones we have right now (mostly related to userdata on the hosts), but hopefully soon anyone will be able to use stacker to kick off and maintain their Empire clusters.

@pikeas

pikeas commented Jun 26, 2015

@phobologic Perhaps re-open this issue (or a new one) until the stacker blueprints are shared?

@phobologic
Contributor

@pikeas At this point I'm not sure stacker, or even Terraform, is something that necessitates an issue on Empire itself. It's definitely in the works though.

@pikeas

pikeas commented Jun 28, 2015

@phobologic That's fine. Do you have a rough estimate for when those blueprints might be available?

Regarding Terraform: given the issue I reported in #558, I've been translating the CloudFormation config to Terraform myself. When it's done, is there any interest in a PR to bring it into your docs (or possibly as an alternative bootstrap script)?

@phobologic
Contributor

@pikeas I'd have to talk to the rest of the team to see how they feel about it. I think the only downside is that none of us have much experience with Terraform, so we'd have to rely heavily on the community to support it (unless we later gain that experience, but we've been very happy with CloudFormation so far).

If nothing else we could certainly link to it in the docs :)

@saulshanabrook
Author

I also have some experience with Terraform and would be happy to work with @pikeas on a PR for it, if there is interest.

@rgabo
Contributor

rgabo commented Jun 29, 2015

We've largely ported the Empire CloudFormation template to fit into our Terraform infrastructure; you can find all the relevant .tf and .json files in this Gist: https://gist.github.com/rgabo/bfd0a78742572a9a7cd6

Disclaimer: this is the example Empire cluster that is open to the world and only uses public subnets. I've included our public/private subnets in vpc.tf in case it's useful for someone, and I'm more than happy to share our NAT setup, which is based on https://www.airpair.com/aws/posts/ntiered-aws-docker-terraform-guide.
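
(For anyone who wants a picture of that kind of setup before seeing the Gist: below is a rough, hypothetical sketch of a private subnet routed through a NAT instance, in the style of that guide. It assumes an aws_vpc.main and an aws_subnet.public defined elsewhere in vpc.tf; all CIDRs and AMI IDs are placeholders, not rgabo's values.)

```hcl
# Hypothetical private subnet that reaches the internet through a NAT instance.
resource "aws_subnet" "private" {
  vpc_id            = "${aws_vpc.main.id}"
  cidr_block        = "10.0.2.0/24"          # placeholder CIDR
  availability_zone = "us-east-1a"
}

# NAT instance in the public subnet; source/dest check must be disabled
# so it can forward traffic on behalf of the private subnet.
resource "aws_instance" "nat" {
  ami                         = "ami-00000000"   # placeholder NAT AMI
  instance_type               = "t2.micro"
  subnet_id                   = "${aws_subnet.public.id}"
  source_dest_check           = false
  associate_public_ip_address = true
}

# Route all egress from the private subnet through the NAT instance.
resource "aws_route_table" "private" {
  vpc_id = "${aws_vpc.main.id}"

  route {
    cidr_block  = "0.0.0.0/0"
    instance_id = "${aws_instance.nat.id}"
  }
}

resource "aws_route_table_association" "private" {
  subnet_id      = "${aws_subnet.private.id}"
  route_table_id = "${aws_route_table.private.id}"
}
```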

We will be working on hardening the Empire cluster based on @ejholmes's production notes in #557 and will deviate from the example cluster, but I think this is a great starting point for anyone. I know I would've loved to have it a couple of days ago ;)

One thing that Terraform does not handle perfectly is the Launch Configuration/Auto Scaling Group pair. Using them also means instances get created outside of Terraform, which is not ideal if you want to be able to quickly create and destroy Terraform environments. My current thinking is that we get rid of the LC/ASG, create the ECS instances directly with Terraform, and register them with the ECS cluster, as in the sketch below.
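
(A rough sketch of what that could look like, assuming an ECS-optimized AMI and an instance profile that already grants the ECS agent its permissions; every ID and name below is a placeholder, not the actual Empire config.)

```hcl
# Hypothetical ECS cluster plus a container instance managed directly by
# Terraform instead of through an LC/ASG.
resource "aws_ecs_cluster" "empire_minion" {
  name = "empire-minion"                     # placeholder cluster name
}

resource "aws_instance" "ecs" {
  ami                  = "ami-00000000"      # placeholder ECS-optimized AMI
  instance_type        = "m3.medium"
  subnet_id            = "${aws_subnet.private.id}"
  iam_instance_profile = "ecsInstanceProfile"   # assumed to exist already

  # The ECS agent reads /etc/ecs/ecs.config at boot to learn which cluster
  # to register with.
  user_data = <<USERDATA
#!/bin/bash
echo "ECS_CLUSTER=${aws_ecs_cluster.empire_minion.name}" >> /etc/ecs/ecs.config
USERDATA
}
```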

Once all of this is done (we're actually moving into production this week), I'm more than happy to update the Gist or create a new one that incorporates the control plane/minion horde separation 😉

Thank you all at Remind101, we love Empire.

@rgabo
Contributor

rgabo commented Jun 29, 2015

I obviously changed all subnet CIDRs, names, and IDs, so don't use this information to try to break our infra 😆

@phobologic
Contributor

Very cool @rgabo.

I'd be hesitant to stop using Auto Scaling groups. We rely on them quite a bit to ensure that if an instance dies, a new one is brought up to replace it. Also, I don't think it's that far-fetched a thought that we might, at some point, integrate with Auto Scaling groups directly.
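
(For reference, the LC/ASG pairing being discussed looks roughly like this in Terraform; the values are illustrative, not taken from the Empire stack.)

```hcl
# Generic sketch of a launch configuration plus Auto Scaling group.
resource "aws_launch_configuration" "ecs" {
  image_id      = "ami-00000000"             # placeholder AMI
  instance_type = "m3.medium"
}

resource "aws_autoscaling_group" "ecs" {
  name                 = "ecs-cluster"
  launch_configuration = "${aws_launch_configuration.ecs.name}"
  availability_zones   = ["us-east-1a", "us-east-1b"]
  min_size             = 2
  max_size             = 4
  desired_capacity     = 2

  # With EC2 health checks, an instance that dies is terminated and replaced
  # automatically -- the behavior being relied on here.
  health_check_type = "EC2"
}
```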

@rgabo
Contributor

rgabo commented Jun 29, 2015

Makes sense. We might as well put the same amount of effort into Terraform and make it handle ASGs better.

It's just that ASGs are harder to decommission, because the instances within the ASG have to be decommissioned first and Terraform does not know about those instances. It could potentially set the ASG's DesiredCapacity to 0 right before destroying it. At the same time, Launch Configurations cannot be deleted while they are still associated with an ASG, which again does not play too well with Terraform.

@phobologic
Contributor

To be fair, I honestly have zero experience with Auto Scaling groups outside of CloudFormation, so I'm not sure what the 'right way' to delete them is. CloudFormation just seems to know how to handle it. Do you normally have to scale the ASG down before deleting it?

Anyway, if you guys can make Terraform work for you and provide the same features, that'd be awesome for the project :)

@rgabo
Contributor

rgabo commented Jun 29, 2015

That's our plan :) I will keep posting as we learn more about running Empire in production and adjust our Terraform infrastructure accordingly. Definitely looking forward to the documentation being updated with production notes, but #557 is already more than enough for us.

One slightly unrelated bit which we'd love to incorporate: either Relay functionality or the new Runner implementation. Is there an ETA on that @ejholmes?

@ejholmes
Contributor

@rgabo hopefully gonna get the runner in within the next 2-3 days.

@saulshanabrook
Author

@rgabo What problems did you have with Auto Scaling groups and Terraform? It looks like Terraform does have some support for them.

@rgabo
Contributor

rgabo commented Jun 30, 2015

@saulshanabrook Absolutely, we use ASGs with Terraform. The problem is more about how the underlying AWS resources behave than about Terraform itself, and it's mostly with Launch Configurations. If you change a Launch Configuration in Terraform, it will try to recreate it by destroying it first, but a Launch Configuration cannot be deleted while it is still associated with an ASG. That does not play well with Terraform, which does not treat the ASG as changed and so does not handle the situation well.

A partial solution is to taint the ASG, which causes Terraform to recreate it, but this means all instances in the ASG are terminated, the ASG and the LC are recreated, and new instances are launched. Those instances then need to boot and find their way into the ELB, where it takes another 2x30 seconds by default before they are treated as healthy.

It's more of a hassle than a major issue, and when ASGs do kick in to automatically scale or replace unhealthy hosts, they're awesome. They can just be a headache to work with from Terraform, that's all.

@saulshanabrook
Author

@rgabo Do you know if there is an issue filed in the Terraform repo for this?

@pikeas

pikeas commented Jun 30, 2015

hashicorp/terraform#1109

Using create_before_destroy worked for a couple of people on that thread, but not everyone. I'm still working on my translation of the Empire CloudFormation template to Terraform; I'll be trying this solution myself in a few days.
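
(For later readers, the workaround referenced in that issue looks roughly like this; a sketch with placeholder values, not the actual Empire config.)

```hcl
# Workaround from hashicorp/terraform#1109: let Terraform create the
# replacement launch configuration under a fresh name *before* deleting the
# old one. Hard-coding "name" defeats this, since the new LC would collide
# with the old one; leave it unset or use name_prefix.
resource "aws_launch_configuration" "ecs" {
  name_prefix   = "ecs-"
  image_id      = "ami-00000000"             # placeholder AMI
  instance_type = "m3.medium"

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_autoscaling_group" "ecs" {
  name                 = "ecs"
  launch_configuration = "${aws_launch_configuration.ecs.name}"
  availability_zones   = ["us-east-1a"]
  min_size             = 2
  max_size             = 4

  # The ASG itself is only updated in place to point at the new LC, so the
  # old LC is no longer referenced by the time Terraform deletes it.
}
```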

@cordoval

@pikeas any newer updates?

UPDATE: @rgabo i think your gist is about empire, i was more looking into a terraform for conveyor. Thanks though.

@Almad

Almad commented Jun 3, 2016

@rgabo any news or experience with Empire + Terraform?

I'm looking into using that for my infra and would love to learn more.

@freerobby

@rgabo Looks like your gist has been taken down since you first posted it (this one: https://gist.github.com/rgabo/bfd0a78742572a9a7cd6). If by chance you still have a working TF config for Empire, I'd love to give it a whirl!

@rgabo
Contributor

rgabo commented Jul 9, 2018

Unfortunately the gist is gone, and I don't know how much Empire/ECS has changed, so I'm not sure our old code would still be relevant. Apologies!

As far as Empire goes, we moved away from ECS/Empire to more mainstream waters and are now running on Kubernetes. With Amazon EKS finally GA, I think that will only reinforce our decision.

I can say nothing but good things about Empire (the control layer) itself, but I would argue that the ecosystem around Kubernetes/Helm is a huge gravitational force.

@freerobby

@rgabo Ah, no worries. Thanks for the context.

I agree about Kubernetes. We use it here at Wistia and we love it. I am considering Empire for a personal project (http://astroswarm.com). It consists of about a dozen microservices, so it quickly becomes expensive to host on a PaaS like Heroku. I have very limited time to work on it, so I'm looking for a system that minimizes the amount of "ops" work, and Empire seemed like it could fit the bill nicely. Right now I'm running on multi-container Elastic Beanstalk, which works well, but I will outgrow it quickly, as there is no way to scale the container services independently of each other except by creating dedicated instances for them.

Glad to hear you had good experiences with Empire, even if you moved off of it. Are you using any orchestration layer on top of Kubernetes, or any other tool that provides an Empire-like/PaaS interface?

@rgabo
Contributor

rgabo commented Jul 9, 2018

@freerobby Re the orchestration layer: not right now, and it hurts. We're moving to EKS and to standardized tooling around Helm. Helm seems quite mature at this point, but I don't have much to share in terms of our own experience yet. Hand-rolling manifest.yaml files is a PITA, but it works and gives you full control. Helm takes some of that complexity away, but it still requires you to understand the Kubernetes building blocks. I think that's a good thing.

Let us know if you come across something that is worth mentioning.
