Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[STRMCMP-558] Event improvements #44

Merged
merged 2 commits into from
Jul 11, 2019
Merged

[STRMCMP-558] Event improvements #44

merged 2 commits into from
Jul 11, 2019

Conversation

mwylde
Copy link
Contributor

@mwylde mwylde commented Jul 11, 2019

This PR makes two improvements to the events that the operator emits for FlinkApplication resources.

First, it ensures that all deploy-related events are unique to a particular deploy (generally by including the hash of the resource in the message). This ensures that if two deploys occur in rapid succession, the second one will still produce events (if more than 10 events with the same message are emitted within a 10 minute period, later events are dropped). This prevents a situation where a developer does a deploy, encounters an error that produces several events, tries the deploy again but this time doesn't seen any events because we have already exhausted our limit.

The second change is to fix the reason field of the events we emit. When I originally added events I misunderstood what the reason field was. Re-reading the docs, I now understand it to be basically a machine-readable code for the event, with the message as its human-readable counterpart:

'reason' is the reason this event is generated. 'reason' should be short and unique; it should be in UpperCamelCase format (starting with a capital letter). "reason" will be used to automate handling of events, so imagine people writing switch statements to handle them. You want to make that easy.

Using unique reasons for each event also prevents distinct events from being improperly combined.

fmt.Sprintf("Failed to create job managers: %v", err))
f.LogEvent(ctx, application, corev1.EventTypeWarning, "CreateClusterFailed",
fmt.Sprintf("Failed to create job managers for deploy %s: %v",
HashForApplication(application), err))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: hash := HashForApplication(app), and use hash.

return err
}

if newlyCreatedJm || newlyCreatedTm {
f.LogEvent(ctx, application, corev1.EventTypeNormal, "Flink cluster created")
f.LogEvent(ctx, application, corev1.EventTypeNormal, "CreatingCluster",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, Move reason to constant variable ?

@@ -388,7 +391,8 @@ func (f *Controller) DeleteOldResourcesForApp(ctx context.Context, app *v1alpha1
}

for k := range deletedHashes {
f.LogEvent(ctx, app, corev1.EventTypeNormal, fmt.Sprintf("Deleted old cluster with hash %s", k))
f.LogEvent(ctx, app, corev1.EventTypeNormal, "ToreDownCluster",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same. Nit, Move reason to constant variable ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really see any reason to. It's only used here and is descriptive of this event.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing major. Just a nit, so that they can be reused later.

@anandswaminathan
Copy link
Contributor

+1

@mwylde mwylde merged commit 1a62f03 into master Jul 11, 2019
@mwylde mwylde deleted the micah_unique_events branch July 11, 2019 21:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants