This repository includes a sample .NET Core application and a Helm template with probes for checking the liveness of the underlying Service Bus subscription connection.
The repository contains the following folders:
- .github: containing the GitHub Actions definition for building and pushing the Docker image of the sample application
- deploy: containing the Helm template for deploying the application on your Kubernetes Cluster
- src: containing the source files for the application and some Unit Tests
This repository contains a Helm chart to deploy the processor to a Kubernetes cluster.
In order to run the sample application, you need to authenticate and authorize requests to Azure Service Bus, using either a connection string or AAD Pod Identity.
To use a connection string:
- Retrieve the connection string
- Navigate to the deploy/helm folder
- Properly format and run the following command:
helm install release-name SubscriptionProcessorChart --set ServiceBusConfiguration.ConnectionString="ConnectionStringYouRetrievedBefore" --set ServiceBusConfiguration.EntityPath="TopicName"
To use AAD Pod Identity with a Managed Identity instead:
- Follow the instructions in the AAD Pod Identity repo to deploy the required prerequisites
- Assign your Managed Identity the "Contributor" role scoped to the Azure Service Bus Namespace
- Assign your Managed Identity the "Azure Service Bus Data Receiver" role scoped to the Azure Service Bus Topic
- Navigate to the deploy/helm folder
- Properly format and run the following command:
helm install release-name SubscriptionProcessorChart --set ServiceBusConfiguration.Namespace="Namespace" --set ServiceBusConfiguration.EntityPath="TopicName" --set PodIdentity.Enabled=true --set PodIdentity.BindingLabel="The Label You Specified In Step1"
After the solution is successfully deployed and running, disable the Service Bus subscription that the application created on the fly.
If you look at the logs (kubectl logs -f pod-name) and at the events (kubectl get events), you will see that:
- Multiple exceptions are thrown in quick succession
- The liveness probe is executed by Kubernetes every 10 seconds
- After 3 consecutive "Unhealthy" statuses, the pod is killed
- The pod restarts and creates a new subscription, after which everything works normally again
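The restart behavior in the list above comes down to a counter of consecutive failed probes. As a rough illustration only, the Python sketch below models it, assuming the probe period of 10 seconds and the threshold of 3 failures mentioned above. The function `seconds_until_restart` and its parameters are hypothetical names and are not part of the chart or of Kubernetes.

```python
def seconds_until_restart(probe_results, period_seconds=10, failure_threshold=3):
    """Return the time in seconds (counted from the first probe) at which the
    pod would be killed, or None if the threshold is never reached.

    probe_results is an iterable of booleans, one per probe run (True = healthy).
    """
    consecutive_failures = 0
    for index, healthy in enumerate(probe_results):
        if healthy:
            # Any healthy result resets the counter.
            consecutive_failures = 0
        else:
            consecutive_failures += 1
        if consecutive_failures >= failure_threshold:
            # Probe number `index` runs `index * period_seconds` after the first one.
            return index * period_seconds
    return None
```

Under these assumptions, a single healthy result between failures resets the count, so only an uninterrupted run of unhealthy results leads to a restart.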
The following table lists the configurable parameters of the chart and their default values.
| Parameter | Description | Required | Default |
|---|---|---|---|
| `image.repository` | The image repository to pull from | | `danigian/aks-servicebus-health` |
| `image.pullPolicy` | The image pull policy | | `Always` |
| `imagePullSecrets` | The image secrets for pulling | | `[]` |
| `resources.requests.cpu` | CPU resource requests | | `300m` |
| `resources.limits.cpu` | CPU resource limits | | `300m` |
| `resources.requests.memory` | Memory resource requests | | `256Mi` |
| `resources.limits.memory` | Memory resource limits | | `256Mi` |
| `PodIdentity.Enabled` | Boolean for enabling AAD PodIdentity Binding | | `false` |
| `PodIdentity.BindingLabel` | Label for AAD PodIdentity Binding | if `PodIdentity.Enabled` is true | `""` |
| `ServiceBusConfiguration.SbMinimumAllowedBackoffTime` | Minimum backoff time for the Exponential Retry Policy (in seconds) | | `0` |
| `ServiceBusConfiguration.SbMaximumAllowedBackoffTime` | Maximum backoff time for the Exponential Retry Policy (in seconds) | | `30` |
| `ServiceBusConfiguration.SbMaximumAllowedRetries` | Maximum number of retries for the Exponential Retry Policy | | `5` |
| `ServiceBusConfiguration.SbMonitorGracePeriod` | Grace period for the internal application liveness health check (in seconds) | | `120` |
| `ServiceBusConfiguration.ConnectionString` | Connection string to Azure Service Bus Namespace | required if `PodIdentity.Enabled` is false | `""` |
| `ServiceBusConfiguration.EntityPath` | ServiceBus Topic Name | always required | `""` |
| `ServiceBusConfiguration.Namespace` | ServiceBus Namespace | required if `PodIdentity.Enabled` is true | `""` |
The source code in this repository evaluates the health of the system based on the exceptions raised by the SubscriptionClient.
These exceptions can be transient, and therefore retryable, or non-transient.
In the C# SDK, the default RetryExponential policy inherits from the abstract class RetryPolicy.
If an exception is transient, the operation is retried according to the defined policy; otherwise the exception is thrown immediately.
In this application, every exception that is raised is reported to the SubscriptionMonitor.
If multiple non-transient exceptions are thrown, something is definitely wrong and Kubernetes should kill the pod as soon as possible.
If transient exceptions are thrown, we need to determine whether the application recovered within the grace period. Therefore, at startup, the SubscriptionMonitor calculates the maximum number of exceptions that can be thrown, given the number of retries that are possible within the defined grace period.
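To make the idea of an upper bound on exceptions more concrete, here is a minimal Python sketch of one way such a calculation could look. It is not the repository's SubscriptionMonitor code, and it does not reproduce the SDK's actual backoff formula. It assumes a simplified, deterministic model in which the delay before retry k is `min(min_backoff + 2**(k-1), max_backoff)` seconds, every attempt fails and raises one exception, and a new operation starts immediately once the retries are exhausted. The function name and parameters are hypothetical; the default values mirror the chart defaults.

```python
def max_exceptions_in_grace_period(min_backoff=0, max_backoff=30,
                                   max_retries=5, grace_period=120):
    """Count the failed attempts that fit into grace_period (all in seconds)
    under the simplified backoff model described above (an assumption)."""
    if max_retries < 1 or max_backoff <= 0:
        raise ValueError("model needs max_retries >= 1 and max_backoff > 0")
    elapsed = 0.0
    exceptions = 0
    while True:
        # One operation = the initial attempt plus up to max_retries retries.
        for attempt in range(max_retries + 1):
            if attempt > 0:
                # Assumed delay: doubles with each retry, capped at max_backoff.
                elapsed += min(min_backoff + 2 ** (attempt - 1), max_backoff)
            if elapsed > grace_period:
                return exceptions
            exceptions += 1  # this attempt fails and raises one exception


# With the chart defaults, this simplified model yields 23.
threshold = max_exceptions_in_grace_period()
```

A monitor built on such a bound could report "unhealthy" once the number of transient exceptions observed within the grace period reaches it, since at that point no attempt has succeeded for the whole period.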
WARNING:
The logic of this sample solution depends on the specific implementation of the RetryExponential retry policy. If that implementation changes in the future, the code may break and stop working as expected.
Although the suggested approach works, other alternatives should be considered.
If sending a heartbeat message to the topic does not affect other applications listening to that topic, your application health check should be based on the actual receipt of that heartbeat message.
This would verify both the liveness of the underlying connection and the proper acceptance of the message.
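As a sketch of the heartbeat alternative, the Python snippet below tracks when the last heartbeat was received and reports the application as alive only while that receipt is recent enough. The class `HeartbeatMonitor`, its methods, and the `max_silence_seconds` parameter are hypothetical names chosen for illustration. Sending the heartbeat and wiring `record_heartbeat` into the message handler are left out.

```python
import time


class HeartbeatMonitor:
    """Tracks when the last heartbeat message was received (hypothetical helper)."""

    def __init__(self, max_silence_seconds, clock=time.monotonic):
        self._max_silence = max_silence_seconds
        self._clock = clock
        # Treat startup as the first heartbeat so the pod gets an initial window.
        self._last_seen = clock()

    def record_heartbeat(self):
        """Call this from the message handler when a heartbeat message arrives."""
        self._last_seen = self._clock()

    def is_alive(self):
        """Liveness result: True while the last heartbeat is recent enough."""
        return self._clock() - self._last_seen <= self._max_silence
```

The injectable `clock` is a design choice that keeps the check easy to unit test without waiting for real time to pass. The liveness probe endpoint would then return the result of `is_alive()`.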