Add external schema support #14

Merged
merged 3 commits into from
Aug 6, 2021

Conversation

sworisbreathing
Contributor

This PR extends the redshift_schema resource and data source to support all of the available flavours of external schema documented at https://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_EXTERNAL_SCHEMA.html. There is a new nested configuration block in the redshift_schema resource, named external_schema.

Supported external schema sources are:

  • data_catalog_source for AWS Glue Data Catalog
  • hive_metastore_source for a Hive metastore (I think it really only supports EMR, but I haven't tested it because I don't have a Hadoop environment to test against, EMR or otherwise)
  • rds_postgres_source for federated query against RDS/Aurora PostgreSQL
  • rds_mysql_source for federated query against RDS/Aurora MySQL (note that this is currently a Preview feature and thus subject to change; your Redshift cluster must be on the Preview maintenance track to use it)
  • redshift_source for querying a datashare from another Redshift cluster (both clusters must use an instance family that supports data sharing; currently only the new RA3 instance types do)
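
As a rough illustration of how the five source blocks line up with the CREATE EXTERNAL SCHEMA syntax in the AWS documentation linked above, here is a minimal Go sketch. The FROM keywords come from the AWS docs; the helper names (fromKeywords, quoteIdent, externalSchemaPrefix) are hypothetical and are not taken from the provider's code.

```go
package main

import (
	"fmt"
	"strings"
)

// fromKeywords maps each source block named in this PR to the keyword that
// follows FROM in CREATE EXTERNAL SCHEMA, per the AWS documentation.
var fromKeywords = map[string]string{
	"data_catalog_source":   "DATA CATALOG",
	"hive_metastore_source": "HIVE METASTORE",
	"rds_postgres_source":   "POSTGRES",
	"rds_mysql_source":      "MYSQL",
	"redshift_source":       "REDSHIFT",
}

// quoteIdent wraps an identifier in double quotes, doubling any embedded
// quotes. Hypothetical helper for this sketch only.
func quoteIdent(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

// externalSchemaPrefix builds only the leading part of the statement; the
// source-specific options (IAM role, URI, and so on) are left out here.
func externalSchemaPrefix(schemaName, source string) (string, error) {
	keyword, ok := fromKeywords[source]
	if !ok {
		return "", fmt.Errorf("unsupported external schema source %q", source)
	}
	return fmt.Sprintf("CREATE EXTERNAL SCHEMA %s FROM %s", quoteIdent(schemaName), keyword), nil
}
```

Calling externalSchemaPrefix("spectrum", "data_catalog_source") would yield the prefix `CREATE EXTERNAL SCHEMA "spectrum" FROM DATA CATALOG`; an unknown source name returns an error instead of a statement.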

Updating a Terraform-managed schema forces a destroy/recreate if:

  • you add external_schema to an existing redshift schema resource (switch the schema from internal to external), or
  • you remove external_schema from an existing redshift schema resource (switch the schema from external to internal), or
  • you change any of the settings in the external_schema block

This is required because ALTER SCHEMA does not support external schemas (except for changing the owner).
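
The three conditions above boil down to "any difference in external_schema forces recreation". A minimal sketch of that decision in plain Go follows. The externalSchema type, its fields, and requiresRecreate are hypothetical stand-ins for illustration; the provider itself expresses this through its Terraform schema definition rather than a function like this.

```go
package main

import "reflect"

// externalSchema is a hypothetical stand-in for the external_schema block.
// A nil pointer represents an internal (non-external) schema.
type externalSchema struct {
	Source   string            // e.g. "data_catalog_source"
	Settings map[string]string // source-specific settings (keys are illustrative)
}

// requiresRecreate reports whether moving from oldCfg to newCfg needs a
// destroy/recreate, following the three conditions listed in the PR.
func requiresRecreate(oldCfg, newCfg *externalSchema) bool {
	if oldCfg == nil && newCfg == nil {
		return false // internal before and after: ALTER SCHEMA can handle it
	}
	if oldCfg == nil || newCfg == nil {
		return true // switched between internal and external
	}
	// Both external: any changed setting forces recreation.
	return !reflect.DeepEqual(oldCfg, newCfg)
}
```

Note that reflect.DeepEqual treats a nil map and an empty map as different, so a real implementation would want to normalise the settings before comparing.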

I have included acceptance tests for all data sources; however, they only run when specific environment variables are set. I have successfully run the ones for data_catalog_source, rds_postgres_source, and redshift_source.

@winglot winglot added the enhancement New feature or request label Aug 3, 2021
@winglot winglot merged commit 9bab168 into brainly:master Aug 6, 2021
@sworisbreathing sworisbreathing deleted the external-schema branch August 6, 2021 07:01
@sworisbreathing
Contributor Author

Just a quick note on this one for the sake of having a written record somewhere: AWS support have confirmed that svv_external_schemas.esoptions is truncated to 256 characters whenever you join on svv_schema_quota_state. Joining on svv_schema_quota_state turns the query into a "padb query", and padb queries treat text columns as varchar(256). They haven't made it clear to me why it becomes a padb query, but they said that's the reason.

So, if anyone is looking to make changes around the schema resource/data source, please make sure that the query on esoptions doesn't have any joins in it. Otherwise you risk introducing a regression that prevents the provider from reading back the external schema configuration.
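
To make the pitfall concrete, here is a hedged Go sketch of a join-free esoptions read plus a parser that flags probable truncation. It assumes esoptions holds a JSON object (as in the AWS examples) and a lib/pq-style `$1` placeholder. The names esoptionsQuery, truncatedLen, and parseEsoptions are hypothetical and are not the provider's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// esoptionsQuery reads esoptions from svv_external_schemas on its own.
// Quota data from svv_schema_quota_state, if needed, should be fetched in a
// separate query so this one never becomes a join.
const esoptionsQuery = `SELECT esoptions FROM svv_external_schemas WHERE schemaname = $1`

// truncatedLen is the varchar(256) limit described in the comment above.
const truncatedLen = 256

// parseEsoptions decodes the esoptions value (assumed to be a JSON object).
// If decoding fails and the value is exactly 256 bytes long, the error hints
// at truncation by a join. len counts bytes, which equals characters only
// for ASCII content.
func parseEsoptions(raw string) (map[string]interface{}, error) {
	var opts map[string]interface{}
	if err := json.Unmarshal([]byte(raw), &opts); err != nil {
		if len(raw) == truncatedLen {
			return nil, fmt.Errorf("esoptions is exactly %d bytes and not valid JSON, probably truncated by a join: %w", truncatedLen, err)
		}
		return nil, fmt.Errorf("parsing esoptions: %w", err)
	}
	return opts, nil
}
```

A check like this would not prevent the regression, but it would turn a confusing parse failure into an error message that points at the likely cause.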
