-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: when key not present assume NodeRunning. Fixes 11843 #11847
Conversation
Signed-off-by: isubasinghe <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is equivalent logic as before, as you wrote in your analysis in #11843 (comment), so I'm fine approving.
But I am wondering if there is a deeper bug (that silently existed before) that we should try to fix here? Since the current logic means it got a nodeId
that doesn't exist in the nodes map??
boundaryNode.children
should all be valid nodes, so my only guess is that getRetryNodeChildrenIds
might be getting an incorrect id, although the logic of that function seemed fine at a glance.
@agilgur5 I think its somewhat impossible to say if the default initialisation was intended or not. The nodeId doesn't exist yet because the controller didn't create it yet, this implies code is yet to run. It makes sense to me given the overall design of the controller. I do not like the fact that the controller even attempts to run this code, because it should know that these nodes have not been created. Making the controller more "context aware" would be nice, it wouldn't need to try paths that it knows it doesn't need to run yet, would probably reduce bugs as well. |
To make a more general point, this is why default initialised values are bad, they do not scale (with respect to #users). When a programmer uses default initialised values, they make an implicit assumption that this behaviour is fine/valid. The So yeah I think we should only allow for code relying on default initialised values, if it has been commented at least. |
Oh for sure, agreed on that. We're fighting the language design a bit clearly. I do wonder if there's a linter check for this scenario as well though. |
Wait so does the id exist but the struct around it wasn't created? I actually didn't expect that scenario initially but I suppose that could happen. 😵💫
So as far as my current understanding of the controller goes, it seems that speculation like this is primarily limited to DAGs as in the But yea agreed that making things more deterministic is usually better. Some of the templating changes I'd like to make are conceptually similar (e.g. erroring out earlier at validation with type-checking etc instead of run-time issues) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving since, as I mentioned above, this fixes the regression and is equivalent behavior to the previous version.
But would still like to continue the conversation here, both for my own edification, and to help with finding root cause
I definitely prefer mypy and TS over Go, Go has weird footguns like this that it invented for no real reason.
Yup the node id exists because it is something that is generated at runtime (which is also not ideal), but its possible the struct wasn't created yet.
I maybe wrong, but my assessment was that this kind of speculation is actually relatively common. Feel like a lot of issues I've dealt with were at least indirectly because of this. Might just be my bias. |
well that would explain why I couldn't seem to find any deletions in #11451 (comment) -- it's not that they were deleted, it's that they didn't exist yet 😵💫
You probably know this part of the codebase better than me, to be fair. But as a piece of anecdotal evidence, I never encountered these issues while using Workflows and most of our usage was with |
Signed-off-by: isubasinghe <[email protected]>
…1847) Signed-off-by: isubasinghe <[email protected]> Signed-off-by: Dillen Padhiar <[email protected]>
Fixes #11843
When a node is not present in the nodes map, this implies that the DAG is still running.
We errored out which is what causes #11843, this fixes that by assuming NodeRunning.
Verification
Tested on the workflow posted in #11843