[STRMCMP-844] Fix rollback when SubmittingJob times out #172

Merged: 3 commits, Feb 28, 2020

@mwylde mwylde commented Feb 19, 2020

If we time out after the new job has been submitted (generally because the tasks never all reach the running state, possibly due to misconfigured parallelism), we enter the rollback phase. However, because the new job was submitted, the job ID is already set in our status, and so [this check](https://github.com/lyft/flinkk8soperator/blob/4142437353666b8692e62acf075a9b2c70514dd9/pkg/controller/flinkapplication/flink_state_machine.go#L396) prevents us from submitting the job to the old cluster.

This PR fixes that issue by clearing the job ID before moving to the RollingBack phase.
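
To make the failure mode concrete, here is a minimal, self-contained sketch of the behavior described above. It is not the operator's actual code: the `Application` and `JobStatus` types, the `submitIfNeeded`, `rollBack`, and `simulate` functions, and the generated job ID strings are all hypothetical simplifications. It only assumes that submission is skipped whenever a job ID is already recorded in the status.

```go
package main

import "fmt"

// JobStatus and Application are hypothetical, simplified stand-ins for the
// operator's status types; only the fields needed for the sketch are kept.
type JobStatus struct {
	JobID string
}

type Application struct {
	Phase     string
	JobStatus JobStatus
}

// submitIfNeeded mimics the guard described above: a job is submitted only
// when no job ID is recorded in the status. It reports whether it submitted.
func submitIfNeeded(app *Application, cluster string) bool {
	if app.JobStatus.JobID != "" {
		return false // a job ID is already recorded, so submission is skipped
	}
	app.JobStatus.JobID = "job-on-" + cluster // made-up ID for illustration
	return true
}

// rollBack moves the application to the RollingBack phase. When clearJobID
// is true, it resets the job ID first, which is the fix this PR describes.
func rollBack(app *Application, clearJobID bool) {
	if clearJobID {
		app.JobStatus.JobID = ""
	}
	app.Phase = "RollingBack"
}

// simulate submits a job to the new cluster, rolls back after a timeout,
// and reports whether the job could be resubmitted to the old cluster.
func simulate(clearJobID bool) bool {
	app := &Application{Phase: "SubmittingJob"}
	submitIfNeeded(app, "new") // new job submitted, job ID recorded
	rollBack(app, clearJobID)  // timeout: tasks never all running
	return submitIfNeeded(app, "old")
}

func main() {
	fmt.Println("without fix, resubmitted to old cluster:", simulate(false))
	fmt.Println("with fix, resubmitted to old cluster:", simulate(true))
}
```

In this sketch, `simulate(false)` leaves the stale job ID in place, so the guard skips resubmission during rollback, while `simulate(true)` clears it first and the old cluster gets its job back.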

@@ -460,6 +460,7 @@ func (s *FlinkStateMachine) handleSubmittingJob(ctx context.Context, app *v1beta
 	// Something's gone wrong; roll back
 	s.flinkController.LogEvent(ctx, app, corev1.EventTypeWarning, "JobSubmissionFailed",
 		fmt.Sprintf("Failed to submit job: %s", reason))
+	app.Status.JobStatus.JobID = ""

Want to add a small unit test?

mwylde commented Feb 27, 2020

Added a test

@mwylde mwylde merged commit be521f6 into master Feb 28, 2020
@mwylde mwylde deleted the micah_submitjob_rollback branch February 28, 2020 19:23