WX-757 Fix stdout, stderr in workflow body causing crashes #7386

Merged
aednichols merged 9 commits into develop from aen_wx_757_2
Mar 14, 2024

aednichols
Collaborator

Same symptom as #7385, different cause. This is @sjfleming's report in the ticket.

After:

INFO  - WorkflowManagerActor: Workflow b23e299b-5fbd-4b12-8389-9aa73321fba1 failed (during ExecutingWorkflowState): cromwell.engine.workflow.lifecycle.execution.WdlRuntimeException: Failed to evaluate 'break_with_stderr.load_data_csv' (reason 1 of 2): Evaluating select_first([stdout(), stderr()]) failed: stdout is not implemented at the workflow level, Failed to evaluate 'break_with_stderr.load_data_csv' (reason 2 of 2): Evaluating select_first([stdout(), stderr()]) failed: stderr is not implemented at the workflow level
INFO  - WorkflowManagerActor: Workflow actor for b23e299b-5fbd-4b12-8389-9aa73321fba1 completed with status 'Failed'. The workflow will be removed from the workflow store.
{
	"status": "Failed",
	"id": "b23e299b-5fbd-4b12-8389-9aa73321fba1"
}

Before:

ERROR - stdout is not implemented at the workflow level
java.lang.UnsupportedOperationException: stdout is not implemented at the workflow level
	at cromwell.core.io.WorkflowCorePathFunctionSet.fail(CorePathFunctionSet.scala:12)
	at cromwell.core.io.WorkflowCorePathFunctionSet.stdout(CorePathFunctionSet.scala:20)
	at wdl.transforms.base.linking.expression.values.EngineFunctionEvaluators$$anon$1.evaluateValue(EngineFunctionEvaluators.scala:54)
	at wdl.transforms.base.linking.expression.values.EngineFunctionEvaluators$$anon$1.evaluateValue(EngineFunctionEvaluators.scala:48)
	at wdl.model.draft3.graph.expression.ValueEvaluator$Ops.evaluateValue(ValueEvaluator.scala:10)
	at wdl.model.draft3.graph.expression.ValueEvaluator$Ops.evaluateValue$(ValueEvaluator.scala:10)
	at wdl.model.draft3.graph.expression.ValueEvaluator$ops$$anon$1.evaluateValue(ValueEvaluator.scala:10)
	at wdl.draft3.transforms.linking.expression.values.package$$anon$1.evaluateValue(package.scala:75)
	at wdl.draft3.transforms.linking.expression.values.package$$anon$1.evaluateValue(package.scala:22)
	at wdl.model.draft3.graph.expression.ValueEvaluator$Ops.evaluateValue(ValueEvaluator.scala:10)
	at wdl.model.draft3.graph.expression.ValueEvaluator$Ops.evaluateValue$(ValueEvaluator.scala:10)
	at wdl.model.draft3.graph.expression.ValueEvaluator$ops$$anon$1.evaluateValue(ValueEvaluator.scala:10)
	at wdl.transforms.base.linking.expression.values.LiteralEvaluators$$anon$6.$anonfun$evaluateValue$15(LiteralEvaluators.scala:103)
	at cats.data.Chain$.$anonfun$traverseViaChain$3(Chain.scala:795)
	at cats.Eval$.loop$1(Eval.scala:317)
	at cats.Eval$.cats$Eval$$evaluate(Eval.scala:363)
	at cats.Eval$FlatMap.value(Eval.scala:284)
	at cats.data.Chain$.traverseViaChain(Chain.scala:817)
	at cats.instances.ListInstances$$anon$1.traverse(list.scala:100)
	at cats.instances.ListInstances$$anon$1.traverse(list.scala:17)
	at cats.Traverse$Ops.traverse(Traverse.scala:181)
	at cats.Traverse$Ops.traverse$(Traverse.scala:180)
	at cats.Traverse$ToTraverseOps$$anon$3.traverse(Traverse.scala:206)
	at wdl.transforms.base.linking.expression.values.LiteralEvaluators$$anon$6.evaluateValue(LiteralEvaluators.scala:102)
	at wdl.transforms.base.linking.expression.values.LiteralEvaluators$$anon$6.evaluateValue(LiteralEvaluators.scala:95)
	at wdl.model.draft3.graph.expression.ValueEvaluator$Ops.evaluateValue(ValueEvaluator.scala:10)
	at wdl.model.draft3.graph.expression.ValueEvaluator$Ops.evaluateValue$(ValueEvaluator.scala:10)
	at wdl.model.draft3.graph.expression.ValueEvaluator$ops$$anon$1.evaluateValue(ValueEvaluator.scala:10)
	at wdl.draft3.transforms.linking.expression.values.package$$anon$1.evaluateValue(package.scala:37)
	at wdl.draft3.transforms.linking.expression.values.package$$anon$1.evaluateValue(package.scala:22)
	at wdl.model.draft3.graph.expression.ValueEvaluator$Ops.evaluateValue(ValueEvaluator.scala:10)
	at wdl.model.draft3.graph.expression.ValueEvaluator$Ops.evaluateValue$(ValueEvaluator.scala:10)
	at wdl.model.draft3.graph.expression.ValueEvaluator$ops$$anon$1.evaluateValue(ValueEvaluator.scala:10)
	at wdl.transforms.base.linking.expression.values.EngineFunctionEvaluators$$anon$23.evaluateValue(EngineFunctionEvaluators.scala:544)
	at wdl.transforms.base.linking.expression.values.EngineFunctionEvaluators$$anon$23.evaluateValue(EngineFunctionEvaluators.scala:537)
	at wdl.model.draft3.graph.expression.ValueEvaluator$Ops.evaluateValue(ValueEvaluator.scala:10)
	at wdl.model.draft3.graph.expression.ValueEvaluator$Ops.evaluateValue$(ValueEvaluator.scala:10)
	at wdl.model.draft3.graph.expression.ValueEvaluator$ops$$anon$1.evaluateValue(ValueEvaluator.scala:10)
	at wdl.draft3.transforms.linking.expression.values.package$$anon$1.evaluateValue(package.scala:100)
	at wdl.draft3.transforms.linking.expression.values.package$$anon$1.evaluateValue(package.scala:22)
	at wdl.model.draft3.graph.expression.ValueEvaluator$Ops.evaluateValue(ValueEvaluator.scala:10)
	at wdl.model.draft3.graph.expression.ValueEvaluator$Ops.evaluateValue$(ValueEvaluator.scala:10)
	at wdl.model.draft3.graph.expression.ValueEvaluator$ops$$anon$1.evaluateValue(ValueEvaluator.scala:10)
	at wdl.transforms.base.wdlom2wom.expression.WdlomWomExpression.evaluateValue(WdlomWomExpression.scala:37)
	at wom.graph.expression.ExpressionNode.evaluateAndCoerce(ExpressionNode.scala:35)
	at wom.graph.expression.ExpressionNode.$anonfun$evaluate$2(ExpressionNode.scala:45)
	at scala.util.Either.flatMap(Either.scala:352)
	at wom.graph.expression.ExpressionNode.evaluate(ExpressionNode.scala:44)
	at cromwell.engine.workflow.lifecycle.execution.keys.ExpressionKey.processRunnable(ExpressionKey.scala:31)
	at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor.$anonfun$startRunnableNodes$7(WorkflowExecutionActor.scala:644)
	at scala.collection.immutable.List.map(List.scala:246)
	at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor.cromwell$engine$workflow$lifecycle$execution$WorkflowExecutionActor$$startRunnableNodes(WorkflowExecutionActor.scala:636)
	at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor$$anonfun$5.applyOrElse(WorkflowExecutionActor.scala:235)
	at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor$$anonfun$5.applyOrElse(WorkflowExecutionActor.scala:233)
	at scala.PartialFunction$OrElse.apply(PartialFunction.scala:266)
	at akka.actor.FSM.processEvent(FSM.scala:710)
	at akka.actor.FSM.processEvent$(FSM.scala:704)
	at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor.akka$actor$LoggingFSM$$super$processEvent(WorkflowExecutionActor.scala:57)
	at akka.actor.LoggingFSM.processEvent(FSM.scala:847)
	at akka.actor.LoggingFSM.processEvent$(FSM.scala:829)
	at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor.processEvent(WorkflowExecutionActor.scala:57)
	at akka.actor.FSM.akka$actor$FSM$$processMsg(FSM.scala:701)
	at akka.actor.FSM$$anonfun$receive$1.applyOrElse(FSM.scala:695)
	at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:35)
	at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor$$anonfun$receive$1.applyOrElse(WorkflowExecutionActor.scala:576)
	at akka.actor.Actor.aroundReceive(Actor.scala:539)
	at akka.actor.Actor.aroundReceive$(Actor.scala:537)
	at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor.akka$actor$Timers$$super$aroundReceive(WorkflowExecutionActor.scala:57)
	at akka.actor.Timers.aroundReceive(Timers.scala:51)
	at akka.actor.Timers.aroundReceive$(Timers.scala:40)
	at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor.aroundReceive(WorkflowExecutionActor.scala:57)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:614)
	at akka.actor.ActorCell.invoke(ActorCell.scala:583)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:268)
	at akka.dispatch.Mailbox.run(Mailbox.scala:229)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:241)
	at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
	at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
2024-03-12 20:24:51 cromwell-system-akka.actor.default-dispatcher-4 INFO  - Message [cromwell.engine.workflow.lifecycle.EngineLifecycleActorAbortCommand$] from Actor[akka://cromwell-system/user/cromwell-service/WorkflowManagerActor/WorkflowActor-a06a4c5e-fbf7-4c1d-ac71-b036aaf48fbc#2096097107] to Actor[akka://cromwell-system/user/cromwell-service/WorkflowManagerActor/WorkflowActor-a06a4c5e-fbf7-4c1d-ac71-b036aaf48fbc/WorkflowExecutionActor-a06a4c5e-fbf7-4c1d-ac71-b036aaf48fbc#659989485] was not delivered. [1] dead letters encountered, no more dead letters will be logged. If this is not an expected behavior, then [Actor[akka://cromwell-system/user/cromwell-service/WorkflowManagerActor/WorkflowActor-a06a4c5e-fbf7-4c1d-ac71-b036aaf48fbc/WorkflowExecutionActor-a06a4c5e-fbf7-4c1d-ac71-b036aaf48fbc#659989485]] may have terminated unexpectedly, This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
{
	"status": "Aborting",
	"id": "a06a4c5e-fbf7-4c1d-ac71-b036aaf48fbc"
}
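
For readers comparing the two logs: the "Before" trace shows an `UnsupportedOperationException` escaping from expression evaluation into `WorkflowExecutionActor`, while the "After" log shows the same message reported as an ordinary workflow failure. A minimal sketch of that general idea follows, assuming a hypothetical `evaluateSafely` helper and a stand-in `workflowStdout` function. Neither name comes from Cromwell, and this is not necessarily how the PR implements the fix.

```scala
import scala.util.{Failure, Success, Try}

object SafeEvaluation {
  // Hypothetical stand-in for an engine function that is unsupported at the workflow level.
  def workflowStdout(): String =
    throw new UnsupportedOperationException("stdout is not implemented at the workflow level")

  // Capture a thrown (non-fatal) exception as a Left so the caller can report
  // a failure value instead of having the exception propagate up the call stack.
  def evaluateSafely[A](expr: => A): Either[String, A] =
    Try(expr) match {
      case Success(value) => Right(value)
      case Failure(error) => Left(error.getMessage)
    }
}
```

With this shape, `SafeEvaluation.evaluateSafely(SafeEvaluation.workflowStdout())` yields a `Left` carrying the message, which a caller could turn into a "Failed" status rather than crashing.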

@aednichols aednichols requested a review from a team as a code owner March 12, 2024 20:47
workflowName: break_with_stderr
status: Failed
"failures.0.message": "Workflow failed"
"failures.0.causedBy.0.message": "Failed to evaluate 'break_with_stderr.load_data_csv' (reason 1 of 2): Evaluating select_first([stdout(), stderr()]) failed: stdout is not implemented at the workflow level, Failed to evaluate 'break_with_stderr.load_data_csv' (reason 2 of 2): Evaluating select_first([stdout(), stderr()]) failed: stderr is not implemented at the workflow level"
Collaborator Author

I really appreciate how our error handling reports both errors on the same line!
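
To illustrate the single-line format in the expected message above, here is a small sketch that joins several failure reasons using the same "reason i of n" wording. The `formatFailures` helper is hypothetical and only mirrors the log text; it is not Cromwell's actual error-accumulation code.

```scala
object ReasonFormatting {
  // Hypothetical helper: renders each reason with its position and the total count,
  // then joins everything onto one line, mirroring the message in the test expectation.
  def formatFailures(what: String, reasons: List[String]): String =
    reasons.zipWithIndex
      .map { case (reason, index) =>
        s"Failed to evaluate '$what' (reason ${index + 1} of ${reasons.size}): $reason"
      }
      .mkString(", ")
}
```

For example, calling `ReasonFormatting.formatFailures("break_with_stderr.load_data_csv", List(...))` with the two "not implemented at the workflow level" reasons produces one comma-separated line of the form shown in `failures.0.causedBy.0.message`.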

* An exception specific to conditions inside the executing WDL, as opposed to one that is "Cromwell's fault"
* @param message Description suitable for user display
*/
final case class WdlRuntimeException(message: String) extends RuntimeException with NoStackTrace {
Collaborator Author

Copied from #7385 so I could capture stack-trace-free logs with complete honesty.
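
As a rough illustration of why mixing in `NoStackTrace` gives stack-trace-free logs: the trait from `scala.util.control` skips filling in the stack trace when the exception is constructed (unless suppression is disabled via the `scala.control.noTraceSuppression` system property). The class below is a hypothetical stand-in named `WdlRuntimeExceptionSketch`; the `getMessage` override is an assumption, since the body of the real class is not shown in the diff.

```scala
import scala.util.control.NoStackTrace

// Hypothetical stand-in for the exception in the diff above.
final case class WdlRuntimeExceptionSketch(message: String)
    extends RuntimeException
    with NoStackTrace {
  // Assumption: expose the user-facing description through the standard accessor.
  override def getMessage: String = message
}

object NoStackTraceDemo {
  def main(args: Array[String]): Unit = {
    val userError = WdlRuntimeExceptionSketch("stderr is not implemented at the workflow level")
    val engineError = new UnsupportedOperationException("stderr is not implemented at the workflow level")

    // With default JVM settings the NoStackTrace exception carries no frames,
    // while the ordinary exception records the frames at its construction site.
    println(s"user-facing error frames: ${userError.getStackTrace.length}")
    println(s"ordinary exception has frames: ${engineError.getStackTrace.nonEmpty}")
  }
}
```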

@aednichols aednichols enabled auto-merge (squash) March 13, 2024 02:52
@aednichols aednichols merged commit 171f3e9 into develop Mar 14, 2024
34 checks passed
@aednichols aednichols deleted the aen_wx_757_2 branch March 14, 2024 15:47
3 participants