Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove the distinction between --select and --models flags when working with node types #3210

Closed
joellabes opened this issue Mar 30, 2021 · 7 comments · Fixed by #3791
Closed
Assignees
Labels
1.0.0 Issues related to the 1.0.0 release of dbt enhancement New feature or request node selection Functionality and syntax for selecting DAG nodes
Milestone

Comments

@joellabes
Copy link
Contributor

Describe the feature

It would be less confusing if all nodes were selectable using a single flag, instead of having to switch back and forth.

Describe alternatives you've considered

Getting over it

Additional context

I imagine it would make more sense to settle on --select and -s because they're more broadly applicable. The migration might be uncomfortable, but could be a slow-boiled frog in the same vein as config-version:

  • A minor version of dbt where --select is the default version in the docs etc for new users and -m users keep going without any issues
  • Some minor versions of dbt where -m users get a WARN during compilation
  • A future version of dbt (maybe 1.0?) where -m is deprecated and no longer respected.

Or, just add -s as an alias and leave the old ones around forever; they're not really doing any harm!

Who will this benefit?

New users in particular (this confused me ~9 months ago), but it also just bit me again 😑.

Are you interested in contributing this feature?

Sure! I'm sure I could work it out by looking at how the -model alias is currently handled. Most of the work feels like it would be in the deprecation warnings

@joellabes joellabes added enhancement New feature or request triage labels Mar 30, 2021
@jtcohen6
Copy link
Contributor

jtcohen6 commented Mar 30, 2021

@joellabes I agree! In all likelihood, I think we should:

  • Switch all commands / tasks to use -s, --select
  • Keep supporting -m, --models, forever, as an (undocumented) alias for -s, --select

There are two things we'll need to sort out before we do this:

Trivial: list

The list task / dbt ls command is the only one that supports both -m, --models and -s, --select, and they do slightly different things. Instead of -m and -s being direct aliases, --models xyz is functionally equivalent to --resource-type model --select xyz. I think we could deprecate --models and still have it perform that same functionality. We already do a good job today of raising an exception if --models is passed alongside either --select or --resource-type, since they're mutually exclusive.

Trickier: test selection

dbt test --models has a "magic jump" baked in, such that dbt test -m model_a really means, "Select all the tests that directly depend on model_a, i.e. dbt ls -s model_a+1 --resource-type test. It's a bit unintuitive, but it also enables really compelling syntax for "run and test model_a." But I think it _only_ makes more conceptual sense for test -m, i.e. execute tests that directly apply to a model, even if the tests themselves are not directly selected—whereas test -s`, to me, sounds like it should execute only the tests that are directly selected.

Before we fully switch dbt test -m to dbt test -s, I think we should first make the change outlined in #2891:

We could modulate the magic, and make it only select tests if all their parents are all the way present.

This still gets us the compelling syntax we'd want while handling the worst surprises. dbt test -s model_a model_b would run:

  • any test named model_a or model_b
  • any test that is defined on / directly depends on model_a, model_b, or model_a + model_b (but not, say, a relationships test between model_b + model_c)

I realize that, by saying this, I'm blocking a semantic change behind a meatier functional change—maybe it's not fair to use one as the forcing function for the other—let me know if you disagree! In any case, I think this is something we should seriously consider doing ahead of v1.0.

@jtcohen6 jtcohen6 added 1.0.0 Issues related to the 1.0.0 release of dbt and removed triage labels Mar 30, 2021
@joellabes
Copy link
Contributor Author

I think you've explained the magic jump to me a couple of times in the past, but I get it this time! Thanks.

test -s, to me, sounds like it should execute only the tests that are directly selected.

by saying this, I'm blocking a semantic change behind a meatier functional change

I'd make a half-hearted argument that flags' meanings (especially short flags) can be anything we want them to be. The strongest argument to back this up is that dbt test --models my_custom_data_test selects that test, with nary a model in sight. It's never occurred to me that that's weird, because I was thinking of -m as the thing I type to make dbt do what I want, as opposed to deeply considering the flag's name.

Side note: realistically, this all only applies to custom data tests, right? Am I right that (for selection purposes) schema tests' True Names are the kwarg portmanteaus we all know and love? If so, then I doubt anyone is typing dbt test --models dbt_utils_source_not_null_where_tasks_tasks_datedeleted__isdeleted_1 on a regular basis. I think hope that the fact that someone is only ever selecting

  • data tests, or
  • schema tests that touch a model

glosses over a lot of the complexity. Having those two PLUS selecting a single instance of a schema test would probably be gnarlier to build a mental model upon.

For the record, I am on board with the magic modulation! But I don't think it has to block this. With that said, this isn't at all urgent, so if we wanted to put it on the backburner until after #2891 then I'd be fine.

Also, agreed that supporting -m forever is the right choice.

@jtcohen6
Copy link
Contributor

jtcohen6 commented Apr 6, 2021

Ok, I think all your instincts here are right! We should do this, and we should also do #2891; we should both before v1.0. Let's hold off on it for the moment, not on the backburner per se, but in the spirit of planning a number of nomenclature changes for the minor versions between v0.20 and v1.0.

@jtcohen6 jtcohen6 added this to the Oh-Twenty-One milestone Apr 27, 2021
@iknox-fa iknox-fa mentioned this issue May 27, 2021
4 tasks
@leahwicz
Copy link
Contributor

leahwicz commented Jun 4, 2021

@jtcohen6
Copy link
Contributor

@joellabes I'd love to get this in for v0.21. Is this still something you'd be interested in contributing? We can offer some helping hands along the way :)

@joellabes
Copy link
Contributor Author

@jtcohen6 yep let's give it a whirl!

@jtcohen6 jtcohen6 added the node selection Functionality and syntax for selecting DAG nodes label Aug 2, 2021
@jtcohen6
Copy link
Contributor

jtcohen6 commented Aug 6, 2021

@gshank I'd really like to prioritize this one and get it in for v0.21. This will go nicely with the more-consistent dbt source freshness and the net-new dbt build. The build command (in beta) currently uses --models instead of --select, which doesn't make a lot of sense!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1.0.0 Issues related to the 1.0.0 release of dbt enhancement New feature or request node selection Functionality and syntax for selecting DAG nodes
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants