Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-2296] [Spike] Remove the distinction between configs and non-configs #7157

Open
Tracked by #9099
jtcohen6 opened this issue Mar 13, 2023 · 5 comments
Open
Tracked by #9099
Assignees
Labels
spike tech_debt Behind-the-scenes changes, with little direct impact on end-user functionality

Comments

@jtcohen6
Copy link
Contributor

copying from internal Slack

At first blush, these feel like desirable UX outcomes for our artifacts / metadata interface:

  • Flatten all "known" fields (defined & validated in dbt-core) to be top-level in generated artifacts.
  • Group together related fields (incremental-related, test-related, snapshot-related)
  • Possibly stick the "unresolved" forms of database/schema/alias in their own nested dictionary
  • Stick user-provided configs (unknown to dbt-core, can be any data type) into an extra dictionary. That would still be our "anything goes" grab bag.

And for the developer UX:

  • Support setting all attributes within config and top-level attributes equivalent in yaml files. (I foresee lots of little changes, e.g. we'd need to add doc blocks to model rendering context)
  • Support a mechanism to explicitly prohibit setting certain attributes in dbt_project.yml, for multiple models at once — e.g. access, description, tests
  • Make config.get('something') and model['something'] equivalent
  • We rewrite these docs (again): "Configs, properties, same thing!"

Internal Notion with more background & context: https://www.notion.so/dbtlabs/Why-Configs-0ab08bbf3bb34c84bc70c3e57b4f575e?pvs=4

Example

models:
  - name: my_model
    description: ...
    materialized: ...
    any_field_i_want: ...
{{ config(description = ..., materialized = ..., any_field_i_want = ...) }}

With the addition of frontmatter (#7100):

---
description: ...
materialized: ...
any_field_i_want: ...
---

select ...
@github-actions github-actions bot changed the title [Spike] Remove the distinction between configs and non-configs [CT-2296] [Spike] Remove the distinction between configs and non-configs Mar 13, 2023
@joellabes
Copy link
Contributor

Does any_field_i_want imply breaking out of the meta sandbox, or do you just mean any_already_defined_field_i_want?

@jtcohen6
Copy link
Contributor Author

Does any_field_i_want imply breaking out of the meta sandbox, or do you just mean any_already_defined_field_i_want?

Fair point! Users (and adapter maintainers) can already define whatever custom configs they want, and access them via {{ config.get() }}.

If this is the route we go down, there will be very little functional difference between sticking a totally custom field in the "extra" dict (implicitly), versus putting it in the meta dict (explicitly). We'd either want to:

  • Preserve both patterns, and document the reasons for doing each. E.g. "You should add custom top-level fields to change the model's runtime behavior in a custom way (custom config). By contrast, meta allows you to decorate it with some extra fields that appear in more-predictable ways in downstream metadata (artifacts & logs)."
  • Deprecate the ability to pass whatever fields you want into meta... knowing that we'd need to keep around this back-compat for a long long time, and we'd also need to get "adapter-specific" configs actually working again, lest we be recommending that adapter maintainers continue relying on an officially deprecated pattern.

@joellabes
Copy link
Contributor

Huh! TIL.

I would lean towards the first option - keep them both but explain what context is appropriate for each.

@jtcohen6
Copy link
Contributor Author

jtcohen6 commented Mar 17, 2023

If we totally removed the config dict, and allowed users to specify whatever (valid) keys they wanted as top-level properties, it would make it more difficult for us to catch typos & provide a helpful warning: #5605 (comment) / #6008

But it's already the case that we don't / can't catch a typo for a config entry, e.g.:

config:
  matrialized: table

@jtcohen6
Copy link
Contributor Author

@gshank and I discussed potentially pulling this as follow-on work, as part of stabilizing artifact (manifest) interfaces. It's still not top priority must-have, but it would be thematic with that effort.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
spike tech_debt Behind-the-scenes changes, with little direct impact on end-user functionality
Projects
None yet
Development

No branches or pull requests

3 participants