Return to eager test selection by default, with an option to tone it down #4082
Comments
Thanks for opening this @joellabes! We've discussed this extensively offline, and you've managed to convince me. In retrospect, we should have introduced non-greedy test selection in v0.20 as an option (as called for by #2891!) rather than a change to the default. In my hubris, I thought I had managed to square the circle of test selection in a way that would make everyone happy. Instead, it made everyone confused. We live and we learn :) I think this change should be accompanied by:
Then, the actual change just looks like:
Now that greedy selection is going to be the default, a few questions:
Thrilled to see that you and @VersusFacit are going to take a swing at this :) We've got a couple of weeks ahead of v1.0.0-rc1. Let's get this in before then!
The only way to be right all the time is to change your mind when you get one wrong! You're still on a pretty solid batting average ⚾️ Thinking a bit more about the naming, I don't love greedy's implications. I know there's a ton of precedent for the term in technical contexts, and I'm not going to die on this hill, but do you think we could get the same thing done while calling it eager-selection or similar?
No I think this can be removed again. If someone has a compelling reason for needing it, adding logging will be a non-breaking change in the future.
It's only in the last few weeks that I've even realised that my_model+1 is a first-class way to access tests (see also my comments about discoverability of them in the visual DAG).
As far as slugging percentage is concerned, we have to take some big swings, and more than a few will be misses!
Certainly! I'd be all for renaming this to Since we're already breaking this by turning it on by default, we should also take the opportunity to rename it to something we'll be happy with for a long while. Let's think a bit more about whether less-eager selection should mean "less eager for multi-parent tests only," vs. "avoid indirect selection entirely" (a much more sweeping change). There's a real chance that the latter is the right move in the 5% of cases where "full control mode" is desirable: Only execute the tests that are themselves selected. Or, to put it in the terms you described above: Do what's asked only, and not at all what might be meant (but not asked). It will come as no surprise that I think selection is really important, especially as projects get bigger, and complementary tools like deferral/cloning get more powerful. The right answer, ultimately, might be to support multiple configurations, or gradations of eagerness. At one time, that would have felt like overdoing it; nowadays, it's clear to me that there's a real need here. This is something that folks are willing to think about in great depth and detail, especially if we're able to give them solid tools to do that thinking with, e.g., a DAG for visualizing test selection, too.
I don't think cautious correctly covers it as a standalone word, but I'm on board with an off-by-default naming pattern. I'll have a ponder. For the rest of this I'll call it I'm wary of making more big, sweeping behavioural changes right before 1.0 ships (especially ones I don't use so don't have strong opinions on their behaviour), but I'm on board with keeping the door open for us to do it in the future. How about this?
Or perhaps better still (writing is thinking):
I like this because it means we're not constrained to a boolean representation of a spectrum, and eager and cautious as the two opposite options are very grokkable ☯️ . The naming gets very fraught if we ever plan to have a fourth option...
I'm on board! Your second proposal tickles me. The naming and string type is a bit against convention, but the thing we're asking people to do here is more complex than a standard on/off global config. It feels more like "dbt language," wherein we speak about ephemeral materializations, high-maturity exposures, etc.
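To make the two ends of that spectrum concrete, here is a minimal sketch. It assumes "eager" means a test is selected when any of its parent models is selected, and "cautious" means a test is selected only when all of its parents are. The model names, test names, and the select_tests helper are hypothetical; this is an illustration of the idea, not dbt's implementation.

```python
from typing import Dict, Set

# Hypothetical toy project: each test maps to the set of models it depends on.
TEST_PARENTS: Dict[str, Set[str]] = {
    "unique_orders_id": {"orders"},
    "relationships_orders_customers": {"orders", "customers"},
    "not_null_customers_id": {"customers"},
}


def select_tests(selected_models: Set[str], mode: str = "eager") -> Set[str]:
    """Return the tests picked up indirectly from a set of selected models.

    Assumed semantics (not taken from dbt's source):
      eager    -> a test runs if ANY of its parents is selected
      cautious -> a test runs only if ALL of its parents are selected
    """
    if mode not in ("eager", "cautious"):
        raise ValueError(f"unknown mode: {mode!r}")
    chosen: Set[str] = set()
    for test, parents in TEST_PARENTS.items():
        if mode == "eager":
            hit = bool(parents & selected_models)  # at least one parent selected
        else:
            hit = parents <= selected_models  # every parent selected
        if hit:
            chosen.add(test)
    return chosen


# Suppose only "orders" carries tag x.
tagged_x = {"orders"}        # stands in for: --select tag:x
not_tagged_x = {"customers"}  # stands in for: --exclude tag:x

eager_union = select_tests(tagged_x, "eager") | select_tests(not_tagged_x, "eager")
cautious_union = select_tests(tagged_x, "cautious") | select_tests(not_tagged_x, "cautious")
```

Under these assumptions, eager_union covers every test, while cautious_union misses the two-parent relationships test, because neither selection contains both of its parents. That is the "select plus exclude no longer adds up to the full test run" surprise described in the issue body.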
Is there an existing feature request for this?
Describe the Feature
@ and + can cause models to be tested that don't exist (1827, 2132). Singular tests that touch a model are run even if they are only indirectly selected (3832)
--greedy flag. As I dug into the edge cases, my thinking changed and I now think we should restore the 0.19.0 behaviour as the default, with an option to disable greedy selection for the use cases that necessitated the changes in the first place.
Matrix of functionality/options:
Describe alternatives you've considered
Make the new behaviour much more discoverable.
Change nothing: there's a risk of creating more confusion by flipping back to the old way and annoying the people who liked it
Try harder to internalise the new system.
NB: Venn diagram is entirely unscientific
--select flag (or -s flag), followed by the name of the model"
Who will this benefit?
dbt run -m some_model and dbt test -m some_model and have all tests attached to some_model run.
dbt test --exclude tag:x and dbt test --select tag:x is no longer the same as dbt test when one model in a test is tagged x and one isn't.
--greedy and not. All of the cases devolved towards "yes, but..." or "it depends". If the settings were flipped, I think a lot of the guidance becomes easier because we can describe the handful of cases where it becomes desirable with less ambiguity: "Exclude the other parents when they won't exist"
Are you interested in contributing this feature?
Sure am! @VersusFacit and I are going to tag-team it
Anything else?
No response