-
When a target throws an error, the default behavior is to stop the whole pipeline immediately. Other choices are available, either globally through the |
-
Can you help me understand why the default behaviour is to stop running the whole pipeline rather than just the steps that are downstream of the error?
-
Just to make sure we're talking about the same thing, I've drawn a little diagram: (Apologies for not matching the conventions of targets, which I'm not that familiar with, but hopefully you get the gist). If the red circle is a failure, I'm suggesting that the other branches in A would be cancelled, and B certainly wouldn't be run, but C and D would still run. I think what you're saying is that once the red circle errors, everything stops? Is that correct? What happens if A contains 100 targets, and the first 99 succeed and only the 100th fails? What happens to the results of the 99 jobs that ran successfully? What happens if it fails on the 10th job while 5 other jobs are running in parallel? Are they all automatically stopped? What happens to their results?
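To make the two behaviours in this question concrete, here is a toy model in Python. It is only a sketch: the graph, the target names (`a1`..`a3` standing in for the branches of A), and the policy names `"stop"` and `"skip_downstream"` are all made up for illustration and are not the `targets` API or its implementation. It runs targets one at a time, so it says nothing about what happens to jobs running in parallel, and under `"skip_downstream"` it lets the sibling branches of A finish rather than cancelling them.

```python
from graphlib import TopologicalSorter

# Hypothetical graph: each key maps to the set of targets it depends on.
# a1..a3 stand in for the branches of A, B consumes them, C and D are unrelated.
GRAPH = {
    "a1": set(), "a2": set(), "a3": set(),
    "B": {"a1", "a2", "a3"},
    "C": set(),
    "D": {"C"},
}


def run(graph, failing, policy):
    """Run targets sequentially in dependency order under a toy error policy.

    policy="stop": halt the whole run at the first error.
    policy="skip_downstream": keep going, skipping only targets that depend,
    directly or indirectly, on a target that errored.
    Targets missing from the returned dict were never started.
    """
    status = {}
    for name in TopologicalSorter(graph).static_order():
        if any(status.get(dep) in ("errored", "skipped") for dep in graph[name]):
            status[name] = "skipped"
        elif name in failing:
            status[name] = "errored"
            if policy == "stop":
                break
        else:
            status[name] = "completed"
    return status


stop_result = run(GRAPH, failing={"a2"}, policy="stop")
skip_result = run(GRAPH, failing={"a2"}, policy="skip_downstream")
print(stop_result)
print(skip_result)
```

Under `"stop"`, which targets completed before the failure depends on the order in which they happened to run; under `"skip_downstream"`, only `B` is skipped and `C` and `D` still complete.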
-
Hello Friends, As a long time I think it's the most ergonomic option for a pipeline in development. The main reason is that when an error happens in development, there is no guarantee that only targets downstream of the error will be affected by a fix. The root of the error could originate anywhere upstream due to faulty data or assumptions, and the resolution could affect any number of targets anywhere in the plan, since the plan code and/or data flows may need to be refactored in the fix. If the default were changed to this "reactive" mode (in discussion), I suspect most developers would just mash Ctrl-C as soon as an error is encountered anyway, for these reasons.
It's a strong assumption that people coming to But here's my take on why the two contexts are fundamentally different: In Shiny:
In Targets:
It's nice to have the option to keep processing, and I think the current "null" option already gets you quite close to the "reactive" mode, at least in terms of maximising valid work done. This is just my personal experience, but in all my years of R in prod with these frameworks I have turned on the option to keep processing just once. I think part of this is the way I design prod pipelines:
Great discussion!
-
Happy to! An important thing to keep in mind is that I am talking about a default suitable for pipelines in development. So immature, not yet established in prod etc. For these pipelines the graph structure is not set, and will likely be revised many times. An error can easily be a thing that leads to a change in graph structure, for example by carving a target representing a special case off from the main ‘trunk’. A specific and common example of where an error invalidates a chain of targets that are not descended from it is where the error manifests in one target, but the root cause is invalid data which is handled further up the chain, e.g. in a ‘filter on read’ type approach. Another example is one Will alluded to, where the error is caused by a tracked function (possibly in a package) that is a dependency of multiple targets in the plan, some of which did not error. So in the development context, where reproducing the error condition state is so fast and easy, as we have said, it feels better to just deal with problems as they arise rather than waiting for expensive targets to complete that have some chance of being rendered invalid anyway. For more mature pipelines I think the ‘keep processing’ approach is much more viable.
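The ‘filter on read’ case can be sketched with a small Python toy, under made-up assumptions: the pipeline, the target names, and the `descendants` helper are hypothetical and are not how `targets` tracks invalidation. The point it illustrates is only that the set of targets that must rerun after a fix can be larger than the set downstream of the target that errored.

```python
def descendants(graph, start):
    """Return every target that depends, directly or indirectly, on `start`."""
    # Invert the "depends on" mapping into a "is depended on by" mapping.
    children = {name: set() for name in graph}
    for name, deps in graph.items():
        for dep in deps:
            children[dep].add(name)
    seen, stack = set(), [start]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


# Hypothetical pipeline: each key maps to the targets it depends on.
PIPELINE = {
    "raw_data": set(),
    "filtered": {"raw_data"},          # the 'filter on read' step
    "model_a": {"filtered"},           # the target where the error shows up
    "model_b": {"filtered"},           # expensive sibling that did not error
    "report": {"model_a", "model_b"},
}

# What a "skip only downstream" policy would hold back after model_a errors.
downstream_of_error = descendants(PIPELINE, "model_a")

# What has to rerun if the fix turns out to belong in the filter step.
invalidated_by_fix = descendants(PIPELINE, "filtered") | {"filtered"}

# Targets that rerun even though they are neither the erroring target
# nor downstream of it.
rerun_outside_downstream = invalidated_by_fix - downstream_of_error - {"model_a"}
print(sorted(rerun_outside_downstream))
```

In this toy, `model_b` would have been allowed to finish under a "skip only downstream" policy, yet it reruns once `filtered` is fixed, which is the wasted work described above.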
-
#1332 adds |
-
Help
Description
I've read https://books.ropensci.org/targets/debugging.html, but I'm looking for a higher-level description of how errors affect the computation of the reactive graph.