Checkpointing in dbt #3891
Comments
@fclesio Agree big-time! The way I've been thinking about this: dbt should be able to read the results artifact from a prior run. In order to make this work, we'd need to teach dbt about remote results, in the same way we taught dbt about remote manifests.

Does that artifact-based approach make sense to you? Is there anything you're thinking of that I'm missing?
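For concreteness, this is roughly what "reading the results artifact from a prior run" could look like. It is a minimal sketch, not anything proposed in this thread: it assumes the default `target/run_results.json` location and a recent artifact schema in which each result entry carries a `unique_id` and a `status`.

```python
# Minimal sketch: list the nodes from the previous run that would need a retry.
# Assumes the default artifact path and a run_results.json schema where each
# result has "unique_id" and "status" fields.
import json
from pathlib import Path

def nodes_to_retry(artifact_path: str = "target/run_results.json") -> list[str]:
    """Return unique_ids of nodes that errored or were skipped in the prior run."""
    results = json.loads(Path(artifact_path).read_text())["results"]
    return [r["unique_id"] for r in results if r["status"] in ("error", "skipped")]

if __name__ == "__main__":
    print(nodes_to_retry())
```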
@jtcohen6 thanks for the prompt response. The main issue that I see with the artifact-based approach is that dbt would have to consult the prior (remote) state on every run, even when nobody wants to resume anything. For instance: let's say an execution fails partway through. From the API perspective, I see something like an explicit option to establish a checkpoint, so that:
2.1) When the checkpoint is not chosen, dbt does not need to care about the remote state;
2.2) Whoever chooses the checkpoint explicitly knows that it will have some toll in terms of execution time, due to the remote state.
I do not know if that's clear, but I'm happy to clarify more if needed.
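A minimal sketch of the opt-in behaviour described above, written as a hypothetical wrapper script rather than a real dbt option: the `--checkpoint` flag, the artifact path, and the choice to rerun failed models plus their descendants are all assumptions for illustration. It also assumes a dbt version where `dbt run --select` is available.

```python
# Hypothetical wrapper: prior state is only consulted when the user opts in.
# The --checkpoint flag is NOT a real dbt option; it illustrates the proposal.
import argparse
import json
import subprocess
from pathlib import Path

def failed_models(artifact: str) -> list[str]:
    """Model names whose status was 'error' in the previous run's artifact."""
    results = json.loads(Path(artifact).read_text())["results"]
    # unique_id looks like "model.my_project.my_model"; keep just the model name
    return [r["unique_id"].split(".")[-1] for r in results if r["status"] == "error"]

def main() -> None:
    parser = argparse.ArgumentParser(description="Opt-in checkpointed dbt run (illustrative only)")
    parser.add_argument("--checkpoint", action="store_true",
                        help="resume from the previous run's failures (explicit opt-in)")
    parser.add_argument("--artifact", default="target/run_results.json")
    args = parser.parse_args()

    cmd = ["dbt", "run"]
    if args.checkpoint:
        failed = failed_models(args.artifact)
        if failed:
            # select the failed models plus everything downstream of them
            cmd += ["--select"] + [f"{m}+" for m in failed]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    main()
```

When the flag is absent, the wrapper never touches the prior artifact, which matches point 2.1; when it is present, the extra artifact read is an explicit, accepted cost, which matches point 2.2.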
I prefer the artifact-based approach. I can also imagine a combination of the two ideas.
Sorry for the stubbornness @sungchun12, but considering a scenario of automatic retry, to avoid dbt running everything from scratch, shouldn't it implicitly have an option establishing the checkpoint? Is the idea that on every execution I should run against the previous state?
@fclesio Thanks for the thoughtful responses! I think there's an important distinction here: between automatically retrying models that fail for transient reasons, and manually retrying a run from the point where it previously failed.
I took the thrust of this issue to mean the latter more than the former. If a specific model fails because its query timed out, or ran into intermittent network and/or database issues, the right answer feels like automatic retry, based on logic coded into the database adapter that checks for the timeout or transient error code. If that's more the thing you're after, I think #3303 is the right place to continue that conversation. I'd be happy to open a separate issue for "smart manual retry."

@matt-winkler @sungchun12 To that end: I'd be thrilled to work with both of you on making this happen.

One question: Should …
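For reference, the "smart manual retry" idea can be expressed with result-based node selection against a previous run's artifacts, assuming a dbt version (1.0 or later) that supports the `result:` selector method and the `--state` flag; the `./prev_artifacts` path is an assumption for illustration.

```python
# Re-run only the nodes that errored last time, plus everything downstream,
# by comparing against a saved copy of the previous run's artifacts.
# "./prev_artifacts" is an assumed directory holding the prior run's target/ files.
import subprocess

subprocess.run(
    ["dbt", "run", "--select", "result:error+", "--state", "./prev_artifacts"],
    check=True,
)
```

The `+` operator extends the selection to everything downstream of the errored nodes, which mirrors the "start from the failure point" behaviour requested in this issue.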
@fclesio I don't see your response as stubborn at all! I welcome it, and you raise a valid point about retrying. I expect the best of both worlds, even with implicit logic for transient errors. Given this paradigm already exists for specific database adapters, we shouldn't need an explicit flag for it. For result-based selection criteria, I recommend keeping the scope limited.

@jtcohen6 I would love to help out and branch off your experiment branch. @fclesio @matt-winkler, I would love to work with you too to make this a reality.

@fclesio, I understand if you want to focus on the transient error checkpoint logic for your specific database instead. What database are you using today for dbt?
@sungchun12 I'm using Redshift.
@fclesio, cool! When we have a working pull request, we'd love for you to test drive it!
Describe the feature
A checkpointing mechanism that can be used to restart from the failure point when running a `dbt run` statement.

Describe alternatives you've considered
As I'm running `dbt run` via the CLI, I'm using some shell/Python scripts to create a checkpoint file (a txt file with the name of the last executed model) so that, if the execution fails, I can re-run `dbt run` and it will start from the last executed model.
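A minimal sketch of what such a checkpoint-file workaround might look like; the original scripts are not included in this issue, so the file name, the use of `run_results.json` to find the failed model, and the decision to resume from that model and its descendants are all assumptions.

```python
# Sketch of a checkpoint-file workaround around "dbt run" (illustrative only).
import json
import subprocess
from pathlib import Path

CHECKPOINT = Path("checkpoint.txt")          # assumed checkpoint file name
ARTIFACT = Path("target/run_results.json")   # dbt's run results artifact

def run_with_checkpoint() -> None:
    select = []
    if CHECKPOINT.exists():
        # Resume from the recorded model and everything downstream of it.
        select = ["--select", f"{CHECKPOINT.read_text().strip()}+"]
    outcome = subprocess.run(["dbt", "run", *select])

    if outcome.returncode == 0:
        CHECKPOINT.unlink(missing_ok=True)   # clean run: drop the checkpoint
        return

    # Record the first errored model so the next invocation can resume from it.
    results = json.loads(ARTIFACT.read_text())["results"]
    errored = [r["unique_id"].split(".")[-1] for r in results if r["status"] == "error"]
    if errored:
        CHECKPOINT.write_text(errored[0])

if __name__ == "__main__":
    run_with_checkpoint()
```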
Additional context

Using the `dbt run` command, if any problem occurs we need to either (a) manually restart the execution from the model that failed, or (b) re-run the full execution again.

Who will this benefit?
Anyone who, after an interruption, currently has to run `dbt run` again or run `dbt run model_1`, `dbt run model_n` manually, instead of starting from the last point before the interruption.

Are you interested in contributing this feature?
Yes.