[CT-2194] publication standards for public models #7062

MichelleArk · 2023-02-27T15:25:11Z

From the model groups & access discussion:

Users may optionally define their own set of expectations, overriding the default, that would be checked against every public model in the project.

These expectations should be defined in a separate file. (Teams can take advantage of .CODEOWNERS rules, e.g. to require reviews from repository maintainers any time these expectations are updated.)
These rules would be validated during parsing. The idea is not for public models to magically inherit these configurations, but simply to make sure that they match up.
For example, a data team may want to enforce that every public model has persist_docs enabled (for integration with an external data catalog), is materialized as a view (on top of an underlying private table), and has at least a certain number of data quality tests. Imagine something like:

# public_models.yml
description: true  # every public model must be described
config:  # every public model must match these configs
  constraints_enabled: true
  persist_docs:
    relation: true
    columns: true
  materialized: view
columns:
  description: true  # every column must be described
tests:  # matches 'test_name', with optional package prefix
  unique: 1  # at least one unique test, on any column
  installed_package.totally_custom_test: 3  # at least 3 of whatever this is
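
To illustrate the "match up, don't inherit" idea for the `config:` block, here is a minimal sketch in Python. It is not how dbt implements anything: `config_mismatches` and the sample `model_config` are hypothetical names invented for this example, and the model's config is assumed to be available as a plain dictionary.

```python
from typing import Any


def config_mismatches(
    expected: dict[str, Any], actual: dict[str, Any], path: str = "config"
) -> list[str]:
    """Return one message per expected key whose actual value differs.

    Keys present only in `actual` are ignored: the expectations are a
    minimum contract, not a full specification of the model's config.
    """
    problems: list[str] = []
    for key, want in expected.items():
        here = f"{path}.{key}"
        got = actual.get(key)
        if isinstance(want, dict):
            # Recurse into nested settings such as persist_docs.
            nested = got if isinstance(got, dict) else {}
            problems += config_mismatches(want, nested, here)
        elif got != want:
            problems.append(f"{here}: expected {want!r}, found {got!r}")
    return problems


# The `config:` expectations from public_models.yml above.
expected_config = {
    "constraints_enabled": True,
    "persist_docs": {"relation": True, "columns": True},
    "materialized": "view",
}

# Hypothetical config of one public model (assumption for illustration).
model_config = {
    "constraints_enabled": True,
    "persist_docs": {"relation": True},
    "materialized": "table",
}

for problem in config_mismatches(expected_config, model_config):
    print(problem)
```

Nothing is written back to the model here; the check only reports where the model's own config diverges from the expectations, which is the behavior described above.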

For totally custom & complex validation logic (e.g. "every column named email should have a BigQuery policy tag, a dbt pii tag, and a description containing the word 'pseudonymized'"), these rules could, as they can today, be written in:

Jinja macros, enforced at compile/runtime via hook (a la dbt_project_evaluator and dbt_meta_testing)
Custom scripts that parse dbt metadata artifacts (manifest.json)
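
As a rough sketch of the second option, a custom script could walk the manifest and check descriptions and test counts for public models. The field names used below (`resource_type`, `access`, `columns`, `test_metadata`, `depends_on.nodes`) are assumptions about the `manifest.json` layout and may differ between dbt versions. `load_manifest`, `count_tests_by_model`, `check_public_models`, and the sample data are hypothetical names made up for this example.

```python
import json
from collections import Counter
from pathlib import Path


def load_manifest(path: str = "target/manifest.json") -> dict:
    """Read a dbt manifest from disk (the default path is an assumption)."""
    return json.loads(Path(path).read_text())


def count_tests_by_model(manifest: dict) -> dict[str, Counter]:
    """Map each node id to a Counter of the generic tests that depend on it."""
    counts: dict[str, Counter] = {}
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") != "test":
            continue
        meta = node.get("test_metadata") or {}  # assumed absent for singular tests
        name = meta.get("name")
        if not name:
            continue
        # Count under the bare name and, if packaged, under 'package.name'.
        names = {name}
        if meta.get("namespace"):
            names.add(f"{meta['namespace']}.{name}")
        # A test with several parents (e.g. relationships) counts for each one.
        for parent in node.get("depends_on", {}).get("nodes", []):
            counts.setdefault(parent, Counter()).update(names)
    return counts


def check_public_models(manifest: dict, required_tests: dict[str, int]) -> list[str]:
    """Return a list of human-readable failures for public models."""
    counts = count_tests_by_model(manifest)
    failures: list[str] = []
    for unique_id, node in manifest.get("nodes", {}).items():
        if node.get("resource_type") != "model" or node.get("access") != "public":
            continue
        if not node.get("description"):
            failures.append(f"{unique_id}: missing description")
        for column_name, column in node.get("columns", {}).items():
            if not column.get("description"):
                failures.append(f"{unique_id}.{column_name}: missing column description")
        found = counts.get(unique_id, Counter())
        for test_name, minimum in required_tests.items():
            if found[test_name] < minimum:
                failures.append(
                    f"{unique_id}: needs {minimum} x {test_name}, found {found[test_name]}"
                )
    return failures


# Tiny hand-written stand-in for a manifest (illustrative data only).
sample_manifest = {
    "nodes": {
        "model.shop.customers": {
            "resource_type": "model",
            "access": "public",
            "description": "One row per customer",
            "columns": {
                "id": {"description": "Primary key"},
                "email": {"description": ""},
            },
        },
        "model.shop.stg_customers": {
            "resource_type": "model",
            "access": "protected",
            "description": "",
            "columns": {},
        },
        "test.shop.unique_customers_id": {
            "resource_type": "test",
            "test_metadata": {"name": "unique", "namespace": None},
            "depends_on": {"nodes": ["model.shop.customers"]},
        },
    }
}

required_tests = {"unique": 1, "installed_package.totally_custom_test": 3}
failures = check_public_models(sample_manifest, required_tests)
for failure in failures:
    print(failure)
```

Against a real project, the same check would run as `check_public_models(load_manifest(), required_tests)` after a dbt command has written the manifest, and a CI job could fail when the returned list is non-empty.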

jtcohen6 · 2023-03-26T20:05:36Z

This is a really cool topic, and something we should go much deeper on in the future. This is more than just a one-off feature; it might be an entire package, plugin, or product.

I'm going to close this issue for now, and kick it out of scope for our nearer-term work on multi-project deployments. In the meantime, it will be possible to write similar rules (in Jinja) following the same pattern used by dbt_project_evaluator and dbt_meta_testing.

MichelleArk added the Team:Language, Refinement (Maintainer input needed), and multi_project labels Feb 27, 2023

MichelleArk assigned jtcohen6 Feb 27, 2023

github-actions bot changed the title ~~publication standards for public models~~ [CT-2194] publication standards for public models Feb 27, 2023

MichelleArk mentioned this issue Feb 27, 2023

[CT-1915] [Epic] Multi-project collaboration - Milestone 1 #6747

Closed

MichelleArk added the model_groups_access (Issues related to groups) label Feb 27, 2023

jtcohen6 closed this as not planned Mar 26, 2023
