Data[base|store]s & Fly #293

Open
sudhirj opened this issue Jul 23, 2020 · 11 comments
Labels
explainer Article about a technical concept

Comments

@sudhirj

sudhirj commented Jul 23, 2020

Based on the number of questions about databases I'm seeing (and having myself), it might make sense to do an explainer on how to use the following databases/stores when deploying on Fly:

  • AWS DynamoDB
  • Postgres on Heroku
  • AWS RDS: Postgres & MySQL
  • AWS Aurora serverless

With emphasis on:

  • security: how best to configure credentials and access control for each system
  • scaling: which systems can scale up automatically, and cost-benefit notes on the configs
  • geographical replication: which systems can replicate across the world, which ones will offer strong vs weak consistency while doing so, the difference between sync/async replication and strong/eventual consistency.
  • Latency considerations with Fly containers running all over the world, and notes on how to decide which Fly regions to run in depending on where the database(s) are, maybe including a latency table from each Fly region to each AWS region.
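
One rough way to build the latency table mentioned in the last point is to time TCP connections from a VM in each Fly region to each database endpoint. The sketch below is only an illustration: the helper name `tcp_connect_ms` and the default sample count are assumptions, and it measures TCP handshake time, which is a lower bound on query latency rather than the real thing.

```python
import socket
import statistics
import time


def tcp_connect_ms(host: str, port: int, samples: int = 5, timeout: float = 3.0) -> float:
    """Return the median TCP connect time to host:port in milliseconds."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        # create_connection completes the TCP handshake before returning.
        with socket.create_connection((host, port), timeout=timeout):
            pass
        timings.append((time.perf_counter() - start) * 1000.0)
    # The median is less sensitive to a single slow handshake than the mean.
    return statistics.median(timings)
```

Running this from each Fly region against a database endpoint (for example a Postgres port such as 5432) would give one row of the matrix per region.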

Let me know if this is something we want; I can pick it up. And if there are other DBs that are commonly requested, we can add them in too.

@mrkurt
Member

mrkurt commented Jul 23, 2020

The #1 thing on my wish list is a guide to connecting to an RDS instance inside a VPC. We have a lot of people who want to do that securely. If you're cool tackling that first we're all for it, and I think it's about 80% of the work required for talking to Dynamo securely too.

My theoretical favorite option is a Wireguard connection into a VPC, but that's hard without some supporting infrastructure. You'd need Consul and something dynamically updating Wireguard configs as new Firecracker VMs came up for it to work.
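
To make the "dynamically updating Wireguard configs" part concrete, here is a minimal sketch of the rendering step only. It assumes something else (Consul or similar, not shown) supplies the current list of VMs; the `Peer` type, the function name, and all keys and addresses are hypothetical placeholders. The output uses the `[Interface]` / `[Peer]` layout that `wg-quick` reads.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Peer:
    public_key: str  # base64 public key of the VM (placeholder values in examples)
    allowed_ip: str  # tunnel address assigned to the VM, e.g. "10.10.0.2/32"


def render_wg_config(private_key: str, address: str, listen_port: int,
                     peers: Iterable[Peer]) -> str:
    """Render a wg-quick style config for the gateway from the current peer list."""
    lines = [
        "[Interface]",
        f"PrivateKey = {private_key}",
        f"Address = {address}",
        f"ListenPort = {listen_port}",
    ]
    # Sort peers so the output only changes when the peer set changes.
    for peer in sorted(peers, key=lambda p: p.allowed_ip):
        lines += [
            "",
            "[Peer]",
            f"PublicKey = {peer.public_key}",
            f"AllowedIPs = {peer.allowed_ip}",
        ]
    return "\n".join(lines) + "\n"
```

The hard part described above is everything around this function: discovering VMs as they start and stop, distributing keys, and reloading the gateway without dropping existing tunnels.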

Other than that, I'm not sure what's simplest. The AWS VPN services might work. Have you tried anything for this?

@sudhirj
Author

sudhirj commented Jul 23, 2020

The VPC-to-Fly ask seems like a bit of security-theatre, though. We run VPCs all the time, and the whole point of a VPC is that you don't let anything outside the VPC talk to what's in the VPC. And anyone deploying on a different network like Fly already knows those servers aren't in the VPC — which is also the case when deploying app servers in different regions, even on AWS.

To run VPC peering correctly Fly would have to know and share the IP addresses of the container hosts beforehand, because they'd have to be configurable in AWS security groups. And these container hosts couldn't host anybody else's containers, which I don't think is feasible, especially in IPv4.
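
As a small illustration of why the host addresses would need to be known up front: a security group ingress rule is, in effect, a CIDR match on the source address. The sketch below mimics that check with the standard library; the function name and the documentation-range addresses are assumptions for the example, not anything Fly or AWS publishes.

```python
import ipaddress
from typing import Iterable


def is_allowed(source_ip: str, allowed_cidrs: Iterable[str]) -> bool:
    """Return True if source_ip falls inside any of the allowed CIDR blocks."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


# A rule scoped to known hosts rejects everything else ...
print(is_allowed("198.51.100.1", ["203.0.113.0/24"]))
# ... while 0.0.0.0/0 accepts any IPv4 source, which is the trade-off discussed here.
print(is_allowed("198.51.100.1", ["0.0.0.0/0"]))
```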

Running a Wireguard proxy/tunnel inside a VPC may work out, but then it would be simpler for Fly to make the Wireguard network of all the app containers first, and add the AWS-resident tunnel machine into that Wireguard network via a Fly bastion host.

The current state of the art on security that I can see would be to open the servers to 0.0.0.0/0 network-wise; there's no getting around that. But there are IAM-based credentials for RDS as well (which is what DynamoDB originally started with), and they integrate pretty strongly with the rest of AWS.

The defaults on AWS have been to allow access to all managed services from 0.0.0.0/0 and enforce security with IAM credentials. That's how S3 and DynamoDB originally worked, with the VPC endpoints being added only to prevent extra networking costs. S3 and DynamoDB still don't care whether you're accessing them from inside or outside a VPC as long as you have the correct credentials, unless access is explicitly locked to a specific VPC. The same IAM capabilities have recently been added to RDS services as well.

With the current options available on Fly, the security story would be the same as a system like Heroku-RDS: open the databases to 0.0.0.0/0 (S3 and DynamoDB are and always have been open) and control access with IAM credentials.
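
For the RDS side, "control access with IAM credentials" roughly means attaching a policy that grants the `rds-db:connect` action on a specific database user. A minimal sketch of building such a policy document follows; the function name is made up for illustration, and the region, account ID, resource ID and user name are placeholders you would replace with your own values.

```python
import json


def rds_connect_policy(region: str, account_id: str,
                       db_resource_id: str, db_user: str) -> str:
    """Build an IAM policy document allowing IAM database authentication as db_user."""
    # ARN layout used for IAM database authentication:
    # arn:aws:rds-db:<region>:<account-id>:dbuser:<DbiResourceId>/<db-user-name>
    resource = f"arn:aws:rds-db:{region}:{account_id}:dbuser:{db_resource_id}/{db_user}"
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "rds-db:connect", "Resource": resource}
        ],
    }
    return json.dumps(policy, indent=2)


# Placeholder values only; none of these identify a real account or database.
print(rds_connect_policy("us-east-1", "123456789012", "db-EXAMPLERESOURCEID", "app_user"))
```

The app would then exchange its IAM credentials for a short-lived auth token and use that as the database password over TLS, so no long-lived database password has to be shipped to the Fly app.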

@sudhirj
Author

sudhirj commented Jul 23, 2020

> Running a Wireguard proxy/tunnel inside a VPC may work out, but then it would be simpler for Fly to make the Wireguard network of all the app containers first, and add the AWS-resident tunnel machine into that Wireguard network via a Fly bastion host.

This would also be more expensive and throttled on the AWS side, since all traffic would be going through a NAT gateway. It would really make me question whether using Fly is a contradiction to whatever policy or compliance measure is causing the DB to be stuck inside the VPC.

Basically if there's a rule saying that a database must be inside a VPC, then a likely implication would be that the service should not be using Fly. Or Heroku (standard). Or any other outside system that cannot be securely folded into the VPC.

@sudhirj
Author

sudhirj commented Jul 23, 2020

I think one good way to phrase it would be that when you make a VPC you create a network that you have complete control over. And you peer that VPC only with other networks that you also have complete control over.

If I peer my VPC with a network that Fly controlled, that would be a nonsensical decision because I would have no control over who or what else Fly added into that network — I would now have a combined network that I did not have complete control over, so I would never peer my VPC with a Fly network.

@sudhirj
Author

sudhirj commented Jul 23, 2020

> To run VPC peering correctly Fly would have to know and share the IP addresses of the container hosts beforehand, because they'd have to be configurable in AWS security groups. And these container hosts couldn't host anybody else's containers, which I don't think is feasible, especially in IPv4.

Heroku seems to be pulling this off by running inside AWS: they manage a CIDR block of servers as a VPC under their control and allow you to peer with that VPC if you trust them. https://blog.heroku.com/private-space-peering

@mrkurt
Member

mrkurt commented Jul 28, 2020

> The VPC-to-Fly ask seems like a bit of security-theatre, though. We run VPCs all the time, and the whole point of a VPC is that you don't let anything outside the VPC talk to what's in the VPC.

Tunneling into a VPC is a bit more like VPC peering or point to point VPN, really. VPCs are nice because they're private networks. Fly VMs could theoretically use Wireguard to talk to that private network (it wouldn't open it up to anything else).

Our longer-term plan is to (a) roll out private networks for apps and (b) give people a really easy way to "peer" with a VPC, but there's no technical reason that can't work right now.

Incidentally, Serverless Aurora is VPC-only; you can't provision those DBs with public IPs.

@sudhirj
Author

sudhirj commented Jul 28, 2020

Hmm. Yeah, might have to get around to it eventually. How would Fly do the VPC access? Would Fly maintain a VPC on AWS that users can peer with using the native peering, and run the Wireguard tunnel inside that (like Heroku does)?

Or would Fly ask customers to run an EC2 instance inside the VPC and run the Wireguard tunnel through that?

Will check out the Wireguard system, not sure how to put Fly containers into a Wireguard network right now.

@mrkurt
Member

mrkurt commented Jul 28, 2020

We'd likely give people a CloudFormation template to create Wireguard gateway(s) in a VPC. Having our own VPCs + peering would be a pretty interesting way to do that, although since we're not already on AWS it'll end up being different from Heroku.

Wireguard's config makes this harder than necessary right now. There are probably simpler things to start with. RDS doesn't support TLS client cert authentication natively, so some kind of gateway in a VPC that accepts TLS + validates client certs might be easier. HAProxy can definitely do this.
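
To show what "accepts TLS + validates client certs" amounts to, here is a minimal Python sketch of the server-side TLS settings such a gateway would need. It is not an HAProxy config and not a full proxy; the function name and the certificate file parameters are hypothetical, and the paths would point at a CA you control plus the gateway's own certificate and key.

```python
import ssl
from typing import Optional


def mtls_server_context(ca_file: Optional[str] = None,
                        cert_file: Optional[str] = None,
                        key_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a server-side TLS context that rejects clients without a valid cert."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED makes the handshake fail unless the client presents a
    # certificate that chains to a CA loaded below.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)  # CA that signed the client certs
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)  # gateway identity
    return ctx
```

A gateway would wrap its listening socket with this context and forward accepted connections to the database inside the VPC; each Fly app would carry a client certificate issued by the same CA.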

@sudhirj
Author

sudhirj commented Jul 28, 2020

> so some kind of gateway in a VPC that accepts TLS + validates client certs might be easier.

This seems like a lot of work to me, it's actually easier to just run containers in that region on Fargate + Load Balancer, which is a bit of an anti-climax.

Even when resources are required to be in a VPC, it should still be possible to put them in a public subnet, and especially with the RDS Proxy they all support IAM auth now — which is secure enough to be the primary access control for everything including S3 and DynamoDB. I'd suggest asking customers to go with that.

@mrkurt
Member

mrkurt commented Jul 28, 2020

> This seems like a lot of work to me, it's actually easier to just run containers in that region on Fargate + Load Balancer, which is a bit of an anti-climax.

That's a really good point. I think you've convinced me that the best thing to show/explain is how to use stuff like RDS for global apps. People ask about this a lot. If you want to do more writing and less demo app creation, that'd be a pretty good article.

@mrkurt mrkurt added the explainer Article about a technical concept label Jul 28, 2020
@sudhirj
Author

sudhirj commented Jul 29, 2020

Yeah... maybe this article can focus on RDS. Heroku Postgres is already a solved problem: just set the DATABASE_URL and it'll work. This can be a demo to:

  • Create an RDS database inside a VPC, but on a public subnet, with the DB locked down to allow only IAM-based access.
  • Show an example of using that with Fly, with suggestions to restrict the Fly regions to within a 1000 km radius of the DB, or to a sub-50 ms ping time. I can also calculate and publish the latency matrix for every Fly region to every AWS region.
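
As a rough sketch of the "within a 1000 km radius" suggestion, the great-circle distance between the database location and each candidate region can be computed with the haversine formula. Everything concrete here is an assumption for illustration: the function names, the region codes, and the approximate city coordinates, which are not taken from any official Fly or AWS list.

```python
import math
from typing import Dict, List, Tuple

EARTH_RADIUS_KM = 6371.0
LatLon = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_km(a: LatLon, b: LatLon) -> float:
    """Great-circle distance between two points in kilometres."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def regions_within(db_location: LatLon, regions: Dict[str, LatLon],
                   max_km: float = 1000.0) -> List[str]:
    """Return the names of regions no farther than max_km from the database."""
    return sorted(name for name, loc in regions.items()
                  if haversine_km(db_location, loc) <= max_km)


# Approximate, illustrative coordinates only.
CANDIDATE_REGIONS = {
    "iad": (39.04, -77.49),  # Ashburn, Virginia
    "lhr": (51.51, -0.13),   # London
    "syd": (-33.87, 151.21), # Sydney
}
DB_LOCATION = (39.04, -77.49)  # a database hosted in Northern Virginia

print(regions_within(DB_LOCATION, CANDIDATE_REGIONS))
```

Distance is only a proxy, since network paths are not straight lines, so a measured ping threshold is the better filter once real latency numbers are available.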

That's the main portion. The sides can be:

  • Explain the benefits of the RDS Proxy, which can also be set up inside a VPC on a public subnet with locked-down IAM authentication.

  • Note that apps could run in every Fly region if they use the local Redis as a cache. Can maybe add a pointer on how to use write triggers+DB notifications to invalidate a cache globally.

  • Notes and examples about the new AWS Multi-region database options and how they benefit Fly users.
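
For the cache invalidation pointer above, the cache-side logic can be sketched without any database at all. In a Postgres setup, a write trigger could call `pg_notify` and a listener in each region could `LISTEN` on that channel; here the notification is simulated by calling a method directly, a plain dict stands in for the local Redis, and the `{"key": ...}` payload format is a made-up convention for the example.

```python
import json
from typing import Any, Callable, Dict


class RegionalCache:
    """Read-through cache that evicts entries when told a row has changed."""

    def __init__(self, load_from_db: Callable[[str], Any]) -> None:
        self._load = load_from_db              # slow path: query the far-away database
        self._entries: Dict[str, Any] = {}     # stand-in for the region-local Redis

    def get(self, key: str) -> Any:
        # Serve locally when possible; otherwise pay the cross-region round trip once.
        if key not in self._entries:
            self._entries[key] = self._load(key)
        return self._entries[key]

    def handle_notification(self, payload: str) -> None:
        # Called by the listener for every change notification from the database.
        message = json.loads(payload)
        self._entries.pop(message["key"], None)


# Simulated database and a single regional cache.
db = {"user:1": "alice"}
cache = RegionalCache(lambda key: db[key])
cache.get("user:1")                                  # warms the cache
db["user:1"] = "bob"                                 # a write happens elsewhere
cache.handle_notification(json.dumps({"key": "user:1"}))
print(cache.get("user:1"))                           # reloaded after the eviction
```

Notifications can be missed while a listener is disconnected, so a real setup would probably also want a TTL on cached entries as a backstop.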

The examples can be sample code and AWS CLI commands to configure and lock down everything correctly.


2 participants