Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes Internal Server Error returned on simultaneous SignalWithStart requests for the same WF ID #719

Merged
merged 2 commits into from
Sep 10, 2020
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
36 changes: 31 additions & 5 deletions service/history/handler.go
Original file line number Diff line number Diff line change
Expand Up @@ -982,12 +982,38 @@ func (h *Handler) SignalWithStartWorkflowExecution(ctx context.Context, request
return nil, h.error(err1, scope, namespaceID, workflowID)
}

resp, err2 := engine.SignalWithStartWorkflowExecution(ctx, request)
if err2 != nil {
return nil, h.error(err2, scope, namespaceID, workflowID)
}
for {
resp, err2 := engine.SignalWithStartWorkflowExecution(ctx, request)
if err2 == nil {
return resp, nil
}

return resp, nil
// Two simultaneous SignalWithStart requests might try to start a workflow at the same time.
// This can result in one of the requests failing with one of two possible errors:
// 1) If it is a brand new WF ID, one of the requests can fail with WorkflowExecutionAlreadyStartedError
// (createMode is persistence.CreateWorkflowModeBrandNew)
// 2) If it an already existing WF ID, one of the requests can fail with a CurrentWorkflowConditionFailedError
// (createMode is persisetence.CreateWorkflowModeWorkflowIDReuse)
// If either error occurs, just go ahead and retry. It should succeed on the subsequent attempt.
// For simplicity, we keep trying unless the context finishes or we get an error that is not one of the
// two mentioned above.
_, isExecutionAlreadyStartedErr := err2.(*persistence.WorkflowExecutionAlreadyStartedError)
_, isWorkflowConditionFailedErr := err2.(*persistence.CurrentWorkflowConditionFailedError)

isContextDone := false
select {
case <-ctx.Done():
isContextDone = true
mastermanu marked this conversation as resolved.
Show resolved Hide resolved
if ctxErr := ctx.Err(); ctxErr != nil {
err2 = ctxErr
}
default:
}

if (!isExecutionAlreadyStartedErr && !isWorkflowConditionFailedErr) || isContextDone {
return nil, h.error(err2, scope, namespaceID, workflowID)
}
}
}

// RemoveSignalMutableState is used to remove a signal request ID that was previously recorded. This is currently
Expand Down