Data sharing #17
I'm putting this in draft for now because there's still a bit of work to finish off the update logic, and probably some cleanup as well. But I wanted to go ahead and open the PR to show what it will look like. Before marking this ready for review I'll probably squash the commits.
Force-pushed from 6203392 to 598db51: "…ove entire schema with all assets works. Modify manually-managed assets in datashare schema still not implemented."
One other thing to note... UPDATE: I've identified the issue, but it's kind of a deal-breaker for being able to manage functions in a data share. The issue is that when you manage UDFs in Redshift, you have to include the parameter list, like so: `ALTER DATASHARE ADD FUNCTION foo(varchar);` (NB: this is also the syntax you need in order to remove a function from the datashare). However, you can't read back the parameter list from Redshift, so there's no reliable way to reconstruct these statements later.
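To illustrate the asymmetry, here's a minimal Go sketch (hypothetical helper names, not code from this PR): the ADD statement can be built from the Terraform config, where the argument list is known, but there's nothing to build the matching REMOVE from, because the signature can't be read back.

```go
package provider

import "fmt"

// addFunctionSQL builds the ADD statement from the Terraform config, where
// the argument list (e.g. "varchar") is known. Quoting omitted for brevity.
func addFunctionSQL(share, fn, args string) string {
	return fmt.Sprintf("ALTER DATASHARE %s ADD FUNCTION %s(%s)", share, fn, args)
}

// removeFunctionSQL needs the same signature -- but when reading the
// datashare back from Redshift we only get the bare function name, so there
// is no reliable way to reconstruct the argument list for removal or diffing.
func removeFunctionSQL(share, fn, args string) string {
	return fmt.Sprintf("ALTER DATASHARE %s REMOVE FUNCTION %s(%s)", share, fn, args)
}
```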
I can think of 3 ways of handling this:
@winglot what is your preferred approach? I think option 1 makes the most sense from a user standpoint, but I don't want to rip out all of the code I've written yet if you think there's value in being able to manage specific tables within a datashare/schema.
Further to my comment above, I think what I will probably do is create an alternative branch/PR for option 1 and close this one for the time being. That will make things work in a reliable and expected way, and leave the door open to more fine-grained controls if AWS ever rolls out a fix for the above issue, without having to rewrite all the code from scratch.
Adds a `redshift_datashare` resource to manage data sharing between Redshift clusters. This should be defined on the producer cluster. For managing the schemas and objects, we use a nested attribute block.

Note that we're rolling up the management of schemas (and their objects) into two modes, `auto` and `manual`. This avoids a lot of weird edge cases that would otherwise come up if we'd tried to expose individual settings for `ALL TABLES`, `ALL FUNCTIONS`, and `INCLUDENEW` in the `ALTER DATASHARE` command:

- In `auto` mode, we add `ALL TABLES IN SCHEMA` and `ALL FUNCTIONS IN SCHEMA` and `SET INCLUDENEW=true FOR SCHEMA`, so that newly-created tables/functions are automatically exposed to the datashare by the Redshift cluster itself, without needing to re-run Terraform (sketched below).
- In `manual` mode, we only expose the specific tables/functions that are configured in the `schema` block.
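To make `auto` mode concrete, here's a rough sketch of the statements it implies for a schema. The helper name is hypothetical and quoting is omitted; it just mirrors the clauses listed above, not the PR's actual code path.

```go
package provider

import "fmt"

// autoModeStatements sketches the DDL issued for a schema in "auto" mode:
// share everything in the schema now, and tell Redshift to keep sharing
// newly-created objects without another terraform apply.
func autoModeStatements(share, schemaName string) []string {
	return []string{
		fmt.Sprintf("ALTER DATASHARE %s ADD SCHEMA %s", share, schemaName),
		fmt.Sprintf("ALTER DATASHARE %s ADD ALL TABLES IN SCHEMA %s", share, schemaName),
		fmt.Sprintf("ALTER DATASHARE %s ADD ALL FUNCTIONS IN SCHEMA %s", share, schemaName),
		fmt.Sprintf("ALTER DATASHARE %s SET INCLUDENEW = TRUE FOR SCHEMA %s", share, schemaName),
	}
}
```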
This PR turned out to be so big that I decided it was best not to include the corresponding data source. I intend for it to follow the same structure as what's in this PR, and to allow it to be defined on either the producer or the consumer.
The update code turned out to be way more involved than I'd originally hoped, due to issues with terraform-plugin-sdk. What I really wanted for the nested schema blocks was to use blocks, but treat them in the backend as a map instead of a set/list (in other words, treat `name` as the unique identifier for the nested schema block). Unfortunately, `schema.TypeMap` can only store primitive types. Doing a simple hash function on the name meant Terraform wasn't picking up changes to the tables/functions inside the schema block. I tried this in combination with `CustomizeDiff` but never could get it to properly detect the changes.
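For reference, the name-hash attempt looks roughly like this with terraform-plugin-sdk v2 (a simplified sketch with illustrative attribute names, not the code in this PR):

```go
package provider

import (
	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
)

// datashareSchemaAttr sketches the "hash on name" idea: the custom Set
// function identifies a nested schema block purely by its name, so the set
// treats an entry as unchanged even when its tables/functions change.
func datashareSchemaAttr() *schema.Schema {
	return &schema.Schema{
		Type:     schema.TypeSet,
		Optional: true,
		Set: func(v interface{}) int {
			m := v.(map[string]interface{})
			return schema.HashString(m["name"].(string))
		},
		Elem: &schema.Resource{
			Schema: map[string]*schema.Schema{
				"name": {Type: schema.TypeString, Required: true},
				"mode": {Type: schema.TypeString, Required: true},
				"tables": {
					Type:     schema.TypeSet,
					Optional: true,
					Elem:     &schema.Schema{Type: schema.TypeString},
				},
			},
		},
	}
}
```

Because the hash ignores everything but `name`, two configs that differ only in their tables hash identically, so the set sees nothing to diff.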
I ended up taking inspiration from the `aws_security_group` resource in terraform-provider-aws, which also has to work around this issue of doing incremental updates to nested attributes in a `schema.TypeSet`. The terraform plan output makes it appear that we're completely dropping the schema from the datashare and then re-adding it (I'd hoped that hashing on the name would solve that problem, but instead Terraform simply didn't detect changes to the schema configuration), but during the update there's a bunch of extra logic to figure out which tables/functions actually need adding or removing. It's uglier than I'd like, but it works when none of the other approaches I tried did.
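The core of that update logic reduces to a set difference on the old vs. new values; a minimal sketch assuming the SDK's default set hashing, with a hypothetical function name rather than the PR's actual code:

```go
package provider

import (
	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
)

// datashareSchemaDiff sketches the aws_security_group-style update: compare
// the old and new sets from the plan and translate the difference into
// incremental ALTER DATASHARE ADD/REMOVE statements instead of doing a
// drop-and-recreate of the whole schema.
func datashareSchemaDiff(d *schema.ResourceData) (toAdd, toRemove []interface{}) {
	o, n := d.GetChange("schema")
	oldSet, newSet := o.(*schema.Set), n.(*schema.Set)

	toRemove = oldSet.Difference(newSet).List() // configured before, gone now
	toAdd = newSet.Difference(oldSet).List()    // newly configured entries
	return toAdd, toRemove
}
```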