[CT-2194] publication standards for public models #7062

Closed
MichelleArk opened this issue Feb 27, 2023 · 1 comment
Labels: model_groups_access (issues related to groups), multi_project, Refinement (maintainer input needed)

Comments

@MichelleArk
Contributor

From the model groups & access discussion:

Users may optionally define their own set of expectations, overriding the default, that would be checked against every public model in the project.

These expectations should be defined in a separate file. (Teams can take advantage of CODEOWNERS rules, e.g. to require reviews from repository maintainers whenever these expectations are updated.)
These rules would be validated during parsing. The idea is not for public models to magically inherit these configurations, but simply to make sure that they match up.
For example, a data team may want to enforce that every public model has persist_docs enabled (for integration with an external data catalog), is materialized as a view (on top of an underlying private table), and has at least a certain number of data quality tests. Imagine something like:

# public_models.yml
description: true  # every public model must be described
config:  # every public model must match these configs
  constraints_enabled: true
  persist_docs:
    relation: true
    columns: true
  materialized: view
columns:
  description: true  # every column must be described
tests:  # matches 'test_name', with optional package prefix
  unique: 1  # at least one unique test, on any column
  installed_package.totally_custom_test: 3  # at least 3 of whatever this is
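To make the intended semantics concrete, here is a minimal Python sketch of how such a rules file might be evaluated against one model. Everything in it is an assumption for illustration: `RULES`, `config_mismatches`, and `check_model` are hypothetical names, the model is a plain dict rather than a real dbt object, and the matching of test names with an optional package prefix is only one possible reading of the comment in the YAML above. Nothing like this exists in dbt.

```python
from __future__ import annotations

# Hypothetical in-memory form of the public_models.yml sketch above.
RULES = {
    "description": True,
    "config": {
        "persist_docs": {"relation": True, "columns": True},
        "materialized": "view",
    },
    "columns": {"description": True},
    "tests": {"unique": 1},
}


def config_mismatches(expected: dict, actual: dict, prefix: str = "config") -> list[str]:
    """Return one message per expected config key that the model does not match."""
    problems = []
    for key, want in expected.items():
        path = f"{prefix}.{key}"
        got = actual.get(key)
        if isinstance(want, dict):
            # Nested configs (e.g. persist_docs) are compared key by key.
            nested = got if isinstance(got, dict) else {}
            problems += config_mismatches(want, nested, path)
        elif got != want:
            problems.append(f"{path}: expected {want!r}, found {got!r}")
    return problems


def check_model(model: dict, test_names: list[str], rules: dict = RULES) -> list[str]:
    """Check one model against the rules; nothing is inherited, only compared."""
    problems = []
    if rules.get("description") and not model.get("description"):
        problems.append("model has no description")
    problems += config_mismatches(rules.get("config", {}), model.get("config", {}))
    if rules.get("columns", {}).get("description"):
        for name, col in model.get("columns", {}).items():
            if not col.get("description"):
                problems.append(f"column {name!r} has no description")
    for rule_name, minimum in rules.get("tests", {}).items():
        # Assumed matching: a bare rule name matches any package's test of that
        # name; a prefixed rule name must match exactly.
        count = sum(
            1
            for t in test_names
            if t == rule_name or ("." not in rule_name and t.split(".")[-1] == rule_name)
        )
        if count < minimum:
            problems.append(f"needs at least {minimum} {rule_name!r} test(s), found {count}")
    return problems
```

Note that the check only reports mismatches. It never writes configs back onto the model, which mirrors the point above that public models should match the expectations rather than inherit them.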

For totally custom & complex validation logic (e.g. "every column named email should have a BigQuery policy tag, a dbt pii tag, and a description containing the word 'pseudonymized'"), these rules could, as they can today, be written in:

Jinja macros, enforced at compile/runtime via hook (a la dbt_project_evaluator and dbt_meta_testing)
Custom scripts that parse dbt metadata artifacts (manifest.json)
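As a rough illustration of the second option, a custom script for part of the `email` example could look like the sketch below. It assumes a manifest in which each model node carries `resource_type`, `access`, and a `columns` mapping whose entries have `description` and `tags`; those field names should be verified against the manifest schema of the dbt version in use. The function names are hypothetical, and the BigQuery policy tag check is left out because where that value lands in the manifest is not something this sketch can assume.

```python
from __future__ import annotations

import json
from pathlib import Path


def public_models(manifest: dict) -> list[dict]:
    """Return model nodes marked public (assumes an `access` field on each node)."""
    return [
        node
        for node in manifest.get("nodes", {}).values()
        if node.get("resource_type") == "model" and node.get("access") == "public"
    ]


def email_column_problems(model: dict) -> list[str]:
    """Apply the example rule to every column named `email` in one model."""
    problems = []
    model_id = model.get("unique_id", "<unknown>")
    for name, col in model.get("columns", {}).items():
        if name.lower() != "email":
            continue
        if "pii" not in col.get("tags", []):
            problems.append(f"{model_id}.{name}: missing 'pii' tag")
        if "pseudonymized" not in (col.get("description") or "").lower():
            problems.append(f"{model_id}.{name}: description lacks 'pseudonymized'")
    return problems


def check_nodes(manifest: dict) -> list[str]:
    """Collect problems across all public models in an already-loaded manifest."""
    problems = []
    for model in public_models(manifest):
        problems += email_column_problems(model)
    return problems


def check_manifest(path: str | Path) -> list[str]:
    """Load a manifest.json file from disk and run the checks."""
    return check_nodes(json.loads(Path(path).read_text()))
```

In CI this might be called as `check_manifest("target/manifest.json")` after a parse or compile step, failing the build when the returned list is non-empty.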

@github-actions bot changed the title from "publication standards for public models" to "[CT-2194] publication standards for public models" on Feb 27, 2023
@MichelleArk added the model_groups_access label on Feb 27, 2023
@jtcohen6
Contributor

This is a really cool topic, and something we should go much deeper on in the future. This is more than just a one-off feature; it might be an entire package, plugin, or product.

I'm going to close this issue for now, and kick it out of scope for our nearer-term work on multi-project deployments. In the meantime, it will be possible to write similar rules (in Jinja) following the same pattern used by dbt_project_evaluator and dbt_meta_testing.

@jtcohen6 closed this as not planned on Mar 26, 2023