- error rate - percentage of requests that result in errors (errors in logs, or 4xx/5xx responses reported by CloudWatch)
- availability yield - percentage of well-formed requests that succeeded
- availability harvest - data returned for a request / total data that should have been returned
- Latency
- Traffic
- Errors
- Saturation
- monitoring of all resources in path of request ("there is a problem and it doesn't look like code")
- AWS managed resource monitoring
- host monitoring at fleet level
- host monitoring at host level
- alerting on errors in logs, or metrics made from logs
- tracing that shows the path of a request through services in a visual format and correlates each trace with its log messages
- dashboards
- endpoint monitoring of the Starz app, and possibly of third parties as well
- robust API for third-party integrations and for getting data out for retention
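
A rough sketch of how the first three metrics (error rate, availability yield, availability harvest) might be computed from per-request records. Everything here is an assumption for illustration: the `RequestRecord` fields, the rule that any status of 400 or above counts as an error, and reading harvest as data returned over total data. It is not tied to CloudWatch or any particular tool.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class RequestRecord:
    """Hypothetical per-request record; field names are illustrative only."""
    status: int          # HTTP status code of the response
    well_formed: bool    # whether the request itself was valid
    data_returned: int   # units of data actually included in the response
    data_total: int      # units of data a complete response would contain


def _percent(part: float, whole: float) -> float:
    # Returns 0.0 when there is nothing to measure (an assumption, not a rule).
    return 100.0 * part / whole if whole else 0.0


def error_rate(records: Iterable[RequestRecord]) -> float:
    """Percentage of all requests that ended in a 4xx or 5xx status."""
    recs: List[RequestRecord] = list(records)
    errors = sum(1 for r in recs if r.status >= 400)
    return _percent(errors, len(recs))


def availability_yield(records: Iterable[RequestRecord]) -> float:
    """Percentage of well-formed requests that succeeded (status below 400)."""
    well_formed = [r for r in records if r.well_formed]
    succeeded = sum(1 for r in well_formed if r.status < 400)
    return _percent(succeeded, len(well_formed))


def availability_harvest(records: Iterable[RequestRecord]) -> float:
    """Percentage of the total data that was actually returned."""
    recs = list(records)
    returned = sum(r.data_returned for r in recs)
    total = sum(r.data_total for r in recs)
    return _percent(returned, total)
```

Yield and harvest are kept separate on purpose: a request can succeed (counting toward yield) while returning only part of the data (lowering harvest).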