fix(engine): prevent StackOverFlowError from breaking partition
When a child instance is terminated, we bubble up to its flow scope to let
it know the child has been terminated. The flow scope can then check whether
it can terminate itself as well.
If there are many nested children (usually due to a modelling error),
this bubbling up can cause a StackOverflowError. Currently this
breaks the partition and puts the cluster in an unrecoverable state.

This commit catches the StackOverflowError and rethrows it as a
RuntimeException. The engine can handle this and treats it as an
UNEXPECTED_ERROR. As a result, the instance gets banned.

(cherry picked from commit 357b98d)
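
The idea in the commit message can be illustrated outside the engine with a minimal sketch. Everything in it is hypothetical: `Scope`, `TooDeeplyNestedException`, `buildChain`, and `terminate` are illustrative stand-ins, not Zeebe types or APIs. The sketch also catches the error once at the outermost call, whereas the commit places the catch around `containerProcessor.onChildTerminated`. It only shows that recursive bubbling through a deep chain of parents can exhaust the thread stack, and that the resulting `StackOverflowError` can be caught and rethrown as a `RuntimeException` that a caller is able to handle.

```java
final class TerminationBubblingSketch {

  /** Hypothetical stand-in for a flow scope; not a Zeebe type. */
  static final class Scope {
    final long key;
    final Scope parent;
    boolean terminated;

    Scope(final long key, final Scope parent) {
      this.key = key;
      this.parent = parent;
    }
  }

  /** Hypothetical exception that wraps the overflow in a RuntimeException. */
  static final class TooDeeplyNestedException extends RuntimeException {
    TooDeeplyNestedException(final String message) {
      super(message);
    }
  }

  /** Builds a chain of nested scopes iteratively and returns the innermost one. */
  static Scope buildChain(final int depth) {
    Scope current = new Scope(0, null);
    for (int i = 1; i < depth; i++) {
      current = new Scope(i, current);
    }
    return current;
  }

  /** Bubbles up recursively: one stack frame per nesting level. */
  static void onChildTerminated(final Scope scope) {
    scope.terminated = true;
    if (scope.parent != null) {
      onChildTerminated(scope.parent);
    }
  }

  /** Catches the error at the outermost call and rethrows it as a RuntimeException. */
  static void terminate(final Scope innermost) {
    try {
      onChildTerminated(innermost);
    } catch (final StackOverflowError e) {
      throw new TooDeeplyNestedException(
          String.format(
              "Scope `%d` has too many nested parents and could not be terminated.",
              innermost.key));
    }
  }

  public static void main(final String[] args) {
    // Shallow nesting completes normally.
    terminate(buildChain(10));

    // Assumption: one million levels is deep enough to exhaust a typical default thread stack.
    try {
      terminate(buildChain(1_000_000));
    } catch (final TooDeeplyNestedException e) {
      System.out.println("Handled: " + e.getMessage());
    }
  }
}
```

By the time the catch block in this sketch runs, the stack has already unwound, so building the message is safe. Scopes visited before the overflow remain marked as terminated, which is one reason the commit describes its own change as a quick fix rather than a full solution.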
remcowesterhoud authored and github-actions[bot] committed Jul 3, 2023
1 parent 2863648 commit 79afa3d
Showing 1 changed file with 20 additions and 2 deletions.
@@ -33,7 +33,6 @@
 import java.util.function.Function;

 public final class BpmnStateTransitionBehavior {
-
   private static final String ALREADY_MIGRATED_ERROR_MSG =
       "The Processor for the element type %s is already migrated no need to call %s again this is already done in the BpmnStreamProcessor for you. Happy to help :) ";
   private static final String NO_PROCESS_FOUND_MESSAGE =
@@ -421,7 +420,19 @@ public void onElementTerminated(
         element,
         childContext,
         (containerProcessor, containerScope, containerContext) -> {
-          containerProcessor.onChildTerminated(containerScope, containerContext, childContext);
+          try {
+            containerProcessor.onChildTerminated(containerScope, containerContext, childContext);
+          } catch (final StackOverflowError stackOverFlow) {
+            // This is a dirty quick "fix" for https://github.com/camunda/zeebe/issues/8955
+            // It's done so a cluster doesn't die when a user encounters this.
+            final var message =
+                String.format(
+                    """
+                    Process instance `%d` has too many nested child instances and could not be terminated. \
+                    The deepest nested child instance has been banned as a result.""",
+                    containerContext.getProcessInstanceKey());
+            throw new ChildTerminationStackOverflowException(message);
+          }
           return Either.right(null);
         });
   }
@@ -515,6 +526,13 @@ public <T extends ExecutableFlowElement> void terminateChildProcessInstance(
         () -> containerProcessor.onChildTerminated(element, context, null));
   }

+  private static final class ChildTerminationStackOverflowException extends RuntimeException {
+
+    public ChildTerminationStackOverflowException(final String message) {
+      super(message);
+    }
+  }
+
   @FunctionalInterface
   private interface ElementContainerProcessorFunction {
     Either<Failure, ?> apply(