compute online/offline transitions on subs-notifs side instead of relying on online/online-prev in NE status #9

Open
konstan opened this issue Feb 1, 2023 · 0 comments
konstan commented Feb 1, 2023

There might be cases where the combined work of the API server, the job-engine NE state updater, and the ES-to-Kafka exporter on nuvlabox-status creates a situation in which we miss an offline-to-online transition, expressed as online-prev=False and online=True. Here is an example:

"current-time"  "online-prev" "online" "updated"                           "updated-by"
# NE online
2023-01-31T13:02:03Z  True  True  2023-01-31T13:02:16.534Z  nuvlabox/30eb7dd1-3e14-4f13-a71c-fd47c5e8d474
2023-01-31T13:02:03Z  True  True  2023-01-31T13:03:09.633Z  nuvlabox/30eb7dd1-3e14-4f13-a71c-fd47c5e8d474
# NE goes offline as detected by job-engine
2023-01-31T13:02:03Z  True  False  2023-01-31T13:04:37.594Z  group/nuvla-admin
# NE goes online - telemetry published by NE
2023-01-31T13:02:03Z  False  True  2023-01-31T13:05:21.604Z  nuvlabox/30eb7dd1-3e14-4f13-a71c-fd47c5e8d474
# NE continues to be online (it is sending telemetry)
2023-01-31T13:05:37Z  True  True  2023-01-31T13:05:50.595Z  nuvlabox/30eb7dd1-3e14-4f13-a71c-fd47c5e8d474
2023-01-31T13:06:06Z  True  True  2023-01-31T13:06:19.497Z  nuvlabox/30eb7dd1-3e14-4f13-a71c-fd47c5e8d474
2023-01-31T13:06:35Z  True  True  2023-01-31T13:06:48.755Z  nuvlabox/30eb7dd1-3e14-4f13-a71c-fd47c5e8d474
2023-01-31T13:07:04Z  True  True  2023-01-31T13:07:17.640Z  nuvlabox/30eb7dd1-3e14-4f13-a71c-fd47c5e8d474
# NE goes offline as detected by job-engine
2023-01-31T13:07:04Z  True  False  2023-01-31T13:08:38.026Z  group/nuvla-admin
# >>> NE is online... but the nuvlabox-status with "False to True" transition is missing
2023-01-31T13:08:02Z  True  True  2023-01-31T13:08:50.914Z  nuvlabox/30eb7dd1-3e14-4f13-a71c-fd47c5e8d474
2023-01-31T13:09:06Z  True  True  2023-01-31T13:09:20.854Z  nuvlabox/30eb7dd1-3e14-4f13-a71c-fd47c5e8d474

In the case marked `# >>> NE is online...` above, the nuvlabox-status record with the "False to True" transition is missing, so the user doesn't get the online notification. The next time job-engine detects that the NE is offline, the user receives the offline notification, with no online notification in between.

So, instead, the new implementation should rely purely on transitions of the online flag, computed on the subs-notifs side. This will require state (a memory of the last seen online value per NE) that is shared across the scaled instances of the service.
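A minimal sketch of the idea, not the actual subs-notifs implementation: remember the last seen `online` value per NE and emit a notification only when it changes, ignoring `online-prev` entirely. The names `StateStore`, `InMemoryStore`, and `detect_transition` are hypothetical, and the in-memory store only stands in for whatever shared storage the scaled instances would use.

```python
from typing import Dict, List, Optional, Protocol


class StateStore(Protocol):
    """Hypothetical interface for the memory shared across service instances."""

    def get(self, key: str) -> Optional[bool]: ...

    def set(self, key: str, value: bool) -> None: ...


class InMemoryStore:
    """Single-process stand-in for the shared store (assumption, for illustration)."""

    def __init__(self) -> None:
        self._data: Dict[str, bool] = {}

    def get(self, key: str) -> Optional[bool]:
        return self._data.get(key)

    def set(self, key: str, value: bool) -> None:
        self._data[key] = value


def detect_transition(store: StateStore, ne_id: str, online: bool) -> Optional[str]:
    """Return "online" or "offline" if the flag changed, otherwise None.

    Only the current `online` value is used; `online-prev` is not consulted.
    Note: get followed by set is not atomic. With a real shared store, an
    atomic get-and-set would be needed to avoid races between instances.
    """
    previous = store.get(ne_id)
    store.set(ne_id, online)
    if previous is None or previous == online:
        return None  # first observation or no change: nothing to notify
    return "online" if online else "offline"


# Replay the `online` column from the example above, in order.
NE_ID = "nuvlabox/30eb7dd1-3e14-4f13-a71c-fd47c5e8d474"
ONLINE_COLUMN = [True, True, False, True, True, True, True, True, False, True, True]

store = InMemoryStore()
events: List[str] = []
for flag in ONLINE_COLUMN:
    event = detect_transition(store, NE_ID, flag)
    if event is not None:
        events.append(event)

print(events)
```

With this approach the second return to online is detected from the `online` value alone, even though no record with online-prev=False and online=True was ever produced.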

@konstan konstan added the +++ label Feb 1, 2023
@konstan konstan self-assigned this Feb 1, 2023