Generate diagnostic report from system menu #399

ghukill · 2019-04-18T14:30:49Z

One consistent pain point is diagnosing problems in the operations of Combine. This is due, in part, to the variety of services that Combine relies on:

MySQL
MongoDB
ElasticSearch
Livy and Spark
Celery

Each has its own logs, which provide helpful information, but these are not readily available through the GUI.

Proposing a "Run Diagnostics" button that would generate a zip file full of potentially helpful information. Perhaps even a "Diagnostics" page that shows which services are up and operational.
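
A minimal sketch of what the zip step behind such a button might look like, assuming the service logs are readable from the host Combine runs on. The function name `build_diagnostic_zip` and the example log path are hypothetical, not taken from Combine.

```python
import zipfile
from pathlib import Path


def build_diagnostic_zip(log_paths, zip_path):
    """Bundle whichever of the given log files exist into one zip archive.

    `log_paths` maps a service name to the path of its log file.
    Returns the list of paths that were missing, so the report can say
    which services could not contribute a log.
    """
    skipped = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for service, path in log_paths.items():
            path = Path(path)
            if not path.is_file():
                skipped.append(str(path))
                continue
            # Prefix with the service name so same-named logs do not collide.
            archive.write(path, arcname=f"{service}/{path.name}")
    return skipped


# Example call; the log location is a placeholder, not Combine's real path:
# build_diagnostic_zip({"mysql": "/var/log/mysql/error.log"}, "/tmp/diagnostics.zip")
```

Returning the skipped paths rather than raising keeps the report useful when one service's log is absent, which is likely to be the case exactly when something is broken.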

richardcadler · 2019-07-09T19:12:49Z

This looks like a very helpful thing to me.

antmoth · 2019-07-31T17:10:14Z

Two possible things to go with this ticket:

An array of green/yellow/red status lights, per service
A pile of viewable logs somewhere

I'm thinking that what we need to do is to get the logs for everything co-located into a spot on the filesystem that Combine has access to, and allow the user to view them...

Only certain services will be amenable to the 'array of green lights' option if used from inside Django, I think. But it seems like we could potentially set up a 'status'/'diagnostics' page that bypasses all of the running services, allowing us to Check Stuff Out when Django is down?

antmoth · 2019-07-31T20:23:56Z

I'm thinking that what we may want to do for the array-of-green-lights is to stitch together health-checks for all our services into a little command-line script, then set up (somehow?) a /status endpoint that doesn't rely on any of those services being up to run. The endpoint can call the script and construct HTML based on it? Am I totally off-base here?
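
A rough sketch of the "little command-line script" idea, assuming each service check can be written as a plain function that returns a color. The names `run_checks` and `main` are made up for illustration, and the JSON output format is just one option a /status endpoint could parse.

```python
import json
import sys


def run_checks(checks):
    """Run each named check and collect a green/yellow/red status per service.

    Each check is a zero-argument callable returning "green", "yellow" or
    "red". A check that raises is reported as "red" instead of aborting
    the whole report, since a down service is exactly what we want to see.
    """
    report = {}
    for name, check in checks.items():
        try:
            status = check()
        except Exception as exc:
            report[name] = {"status": "red", "detail": str(exc)}
            continue
        if status not in ("green", "yellow", "red"):
            report[name] = {"status": "red", "detail": f"unexpected result: {status!r}"}
        else:
            report[name] = {"status": status, "detail": ""}
    return report


def main(checks):
    """Print the report as JSON and return a shell-style exit code."""
    report = run_checks(checks)
    json.dump(report, sys.stdout, indent=2)
    # Non-zero exit code if anything is red, so a shell or cron job can react.
    return 1 if any(entry["status"] == "red" for entry in report.values()) else 0
```

Because the script only depends on the standard library, it could still run when Django or any of the backing services is down.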

Celery: There's apparently a web monitor program called Flower (as in flow, not as in botany). It's also possible to query redis-cli to monitor queue lengths.

Livy/Spark: Here, getting the status of all the Livy sessions might be the best we can do.

ElasticSearch: Actually has a health-check endpoint: GET _cluster/health.
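
For illustration, a small sketch of calling that endpoint using only the standard library. The `status` field in the response is already green/yellow/red, so it maps straight onto the status lights. The base URL assumes ElasticSearch's default port on localhost, and the function names are hypothetical.

```python
import json
import urllib.request


def parse_cluster_health(body):
    """Extract the status field from a _cluster/health JSON response."""
    data = json.loads(body)
    status = data.get("status") if isinstance(data, dict) else None
    # Anything other than a known color is treated as a failure.
    return status if status in ("green", "yellow", "red") else "red"


def elasticsearch_status(base_url="http://localhost:9200", timeout=3):
    """GET _cluster/health and return "green", "yellow" or "red".

    Connection failures and unparseable responses are reported as "red".
    """
    try:
        url = f"{base_url}/_cluster/health"
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return parse_cluster_health(resp.read())
    except (OSError, ValueError):
        return "red"
```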

Mongo: The mongo CLI has a ping command.

MySQL: It might be that the best way to check MySQL health is to try connecting to the db and performing a SELECT 1;.
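
A sketch of the connect-and-SELECT-1 probe, written against the generic Python DB-API so it does not assume a particular MySQL driver. The function name `sql_ping` is hypothetical, and the sqlite3 call at the bottom is only a stand-in to show the shape of the call; in practice the callable would open a connection to Combine's MySQL database.

```python
import sqlite3


def sql_ping(connect):
    """Return True if a connection opens and SELECT 1 returns a row of 1.

    `connect` is any zero-argument callable returning a DB-API 2.0
    connection whose cursor yields tuple-style rows.
    """
    try:
        conn = connect()
    except Exception:
        # Could not connect at all: the database is treated as down.
        return False
    try:
        cursor = conn.cursor()
        cursor.execute("SELECT 1")
        row = cursor.fetchone()
        return row is not None and row[0] == 1
    except Exception:
        return False
    finally:
        conn.close()


# Stand-in demo with an in-memory SQLite database instead of MySQL.
healthy = sql_ping(lambda: sqlite3.connect(":memory:"))
```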

ghukill closed this as completed Apr 23, 2019

ghukill reopened this Apr 23, 2019

antmoth self-assigned this Jul 9, 2019

antmoth added Hard High Priority August labels Jul 31, 2019
