-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix docs generation for cross-db sources in REDSHIFT RA3 node #3408
Conversation
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: kostkaw.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution @kostek-pl! I'm in favor of the overall concept. There's a little bit of reorganization needed in order to get the tests passing.
def _get_catalog_schemas(self, manifest): | ||
# postgres/redshift only allow one database (the main one) | ||
# postgres/redshift(not ra3) only allow one database (the main one) | ||
schemas = super()._get_catalog_schemas(manifest) | ||
try: | ||
return schemas.flatten() | ||
return schemas.flatten(allow_duplicates=self.config.credentials.ra3_node) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Postgres credentials don't have an ra3_node
property, so this is raising an error and breaking tests. I think the right move here is to reimplement _get_catalog_schemas
within redshift/impl.py
, including this RA3-specific logic.
core/dbt/adapters/base/relation.py
Outdated
@@ -433,13 +433,14 @@ def search( | |||
for schema in schemas: | |||
yield information_schema_name, schema | |||
|
|||
def flatten(self): | |||
def flatten(self, allow_duplicates: bool=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a naming nit: this change isn't so much about duplicates per se; it's about allowing multiple databases at all. In the past, the only reason we'd expect to see multiple databases on Postgres/Redshift was if a metadata query accidentally returned duplicates.
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: kostkaw.
|
Thank you very much @jtcohen6 for the review and tips. |
Thanks @kostek-pl! The change here is looking good to me. Since we don't have an RA3 cluster available for automated tests, could you confirm that it resolves the issue for you locally? Namely, that you're able to generate docs that include sources from another database. A number of tests were failing because they were missing secrets, on account of running from the
|
Code adjustments according to flake8
Error message adjusted to be more precise
@jtcohen6 thank you a lot for the remarks. Yes, I can confirm that This PR resolves the problem locally with docs generating. with RA3_node not set or set to false I get an error: Error sending message, disabling tracking when RA3_node is set to true then docs are generated example used: profile: [...]
target: dev
outputs:
dev:
type: redshift
ra3_node: true
[...] schema: version: 2
sources:
- name: sources
database: sources
schema: public
tables:
- name: cross_db_test_table
models:
- name: cross_db_step1
- name: cross_db_step2 cross_db_step1: select * from {{ source('sources', 'cross_db_test_table') }} cross_db_step2: select * from {{ ref('cross_db_step1') }} result: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great @kostek-pl!
Could you add a changelog entry, and add yourself to the list of contributors? We've just released 0.20.0rc1, and this PR represents net-new functionality, so it should really go under a new section dbt next
(i.e. v0.21). If we end up putting out another release candidate for v0.20, however, I'll consider sneaking it in :)
CHANGELOG update
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this @kostek-pl!
Update: We will be putting out v0.20.0rc2 to fix some bugs and installation issues with v0.20.0rc1. That said, we're going to keep the scope of v0.21 narrow, and be disciplined about putting out a first prerelease within 6-8 weeks of v0.20.0-final. So I'm going to stay disciplined here, too, and keep the change in v0.21. |
* Fix docs generating for cross-db sources * Code reorganization * Code adjustments according to flake8 * Error message adjusted to be more precise * CHANGELOG update
* Fix docs generating for cross-db sources * Code reorganization * Code adjustments according to flake8 * Error message adjusted to be more precise * CHANGELOG update
* Fix docs generation for cross-db sources in REDSHIFT RA3 node (#3408) * Fix docs generating for cross-db sources * Code reorganization * Code adjustments according to flake8 * Error message adjusted to be more precise * CHANGELOG update * add static analysis info to parsing data * update changelog * don't use `meta`! need better separation between dbt internal objects and external facing data. hacked an internal field on the manifest to save off this parsing info for the time being * fix partial parsing case Co-authored-by: kostek-pl <[email protected]>
* Fix docs generation for cross-db sources in REDSHIFT RA3 node (#3408) * Fix docs generating for cross-db sources * Code reorganization * Code adjustments according to flake8 * Error message adjusted to be more precise * CHANGELOG update * add static analysis info to parsing data * update changelog * don't use `meta`! need better separation between dbt internal objects and external facing data. hacked an internal field on the manifest to save off this parsing info for the time being * fix partial parsing case Co-authored-by: kostek-pl <[email protected]>
* Fix docs generation for cross-db sources in REDSHIFT RA3 node (#3408) * Fix docs generating for cross-db sources * Code reorganization * Code adjustments according to flake8 * Error message adjusted to be more precise * CHANGELOG update * add static analysis info to parsing data * update changelog * don't use `meta`! need better separation between dbt internal objects and external facing data. hacked an internal field on the manifest to save off this parsing info for the time being * fix partial parsing case Co-authored-by: kostek-pl <[email protected]>
…) (#3500) * Fix docs generation for cross-db sources in REDSHIFT RA3 node (#3408) * Fix docs generating for cross-db sources * Code reorganization * Code adjustments according to flake8 * Error message adjusted to be more precise * CHANGELOG update * add static analysis info to parsing data * update changelog * don't use `meta`! need better separation between dbt internal objects and external facing data. hacked an internal field on the manifest to save off this parsing info for the time being * fix partial parsing case Co-authored-by: kostek-pl <[email protected]> Co-authored-by: kostek-pl <[email protected]>
* Fix docs generation for cross-db sources in REDSHIFT RA3 node (#3408) * Fix docs generating for cross-db sources * Code reorganization * Code adjustments according to flake8 * Error message adjusted to be more precise * CHANGELOG update * add static analysis info to parsing data * update changelog * don't use `meta`! need better separation between dbt internal objects and external facing data. hacked an internal field on the manifest to save off this parsing info for the time being * fix partial parsing case Co-authored-by: kostek-pl <[email protected]> automatic commit by git-black, original commits: 4d24656
* Fix docs generation for cross-db sources in REDSHIFT RA3 node (#3408) * Fix docs generating for cross-db sources * Code reorganization * Code adjustments according to flake8 * Error message adjusted to be more precise * CHANGELOG update * add static analysis info to parsing data * update changelog * don't use `meta`! need better separation between dbt internal objects and external facing data. hacked an internal field on the manifest to save off this parsing info for the time being * fix partial parsing case Co-authored-by: kostek-pl <[email protected]> automatic commit by git-black, original commits: 4d24656 a76ec42
* Fix docs generation for cross-db sources in REDSHIFT RA3 node (#3408) * Fix docs generating for cross-db sources * Code reorganization * Code adjustments according to flake8 * Error message adjusted to be more precise * CHANGELOG update * add static analysis info to parsing data * update changelog * don't use `meta`! need better separation between dbt internal objects and external facing data. hacked an internal field on the manifest to save off this parsing info for the time being * fix partial parsing case Co-authored-by: kostek-pl <[email protected]> automatic commit by git-black, original commits: 2cc0579 4d24656
* Fix docs generation for cross-db sources in REDSHIFT RA3 node (#3408) * Fix docs generating for cross-db sources * Code reorganization * Code adjustments according to flake8 * Error message adjusted to be more precise * CHANGELOG update * add static analysis info to parsing data * update changelog * don't use `meta`! need better separation between dbt internal objects and external facing data. hacked an internal field on the manifest to save off this parsing info for the time being * fix partial parsing case Co-authored-by: kostek-pl <[email protected]> automatic commit by git-black, original commits: 4d24656 7563b99 aadf3c7
resolves #3236(partially)
This PR is related to #3236. It fixes documentation generation for cross-db objects used as SOURCES in models. However it not resolves full cross-db support as it's not possible at this moment due to AWS cross-db restrictions.
Checklist
CHANGELOG.md
and added information about my change to the "dbt next" section.