What happened?
When computing `changed_count` (which requires generating a diff), we end up executing an O(n^2) loop over millions of records. The loop hangs without a timeout or error, which prevents the cron from finishing. The only error raised is the Grafana "filter not uploaded" error.
grafana: https://earthangel-b40313e5.influxcloud.net/d/IWZpIQgMk?orgId=1
logs: https://console.cloud.google.com/kubernetes/pod/us-west1/webservices-high-prod/amo-prod/addons-server-v1-cronjob-upload-mlbf-to-remote-setti-28860rv9s9/details?invt=Abhivw&project=moz-fx-webservices-high-prod&cloudshell=true
What did you expect to happen?
The loop should complete in a matter of seconds and have O(n) complexity.
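To illustrate the kind of O(n) diff meant here, below is a minimal sketch using hash-based set operations. It is not the actual addons-server code: the function name `diff_changed_count`, the `(guid, version)` record shape, and the definition of the changed count as "added plus removed" are all assumptions made for the example.

```python
def diff_changed_count(previous, current):
    """Diff two collections of hashable records in roughly O(n) time.

    Hypothetical helper: the record shape and the meaning of
    "changed count" (added + removed) are assumptions for illustration.
    """
    # Building each set is O(n) on average; membership checks are O(1),
    # so no record is compared against every other record.
    prev_set = set(previous)
    curr_set = set(current)

    added = curr_set - prev_set    # present now, absent before
    removed = prev_set - curr_set  # present before, absent now
    return added, removed, len(added) + len(removed)


previous = [("addon-a", "1.0"), ("addon-b", "1.0")]
current = [("addon-b", "1.0"), ("addon-c", "2.0")]
added, removed, changed_count = diff_changed_count(previous, current)
```

By contrast, checking `item in some_list` inside a loop over another list scans the list each time, which is the usual way such a diff becomes O(n^2).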
Additionally, if this type of error were to happen, we should be able to catch it on stage before prod. Finally, we should get more explicit warnings/errors when this kind of loop runs for a long time. It has been over 12 hours with no notifications.
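One possible shape for such a warning is a small watchdog around the slow step. This is only a sketch: `warn_if_slow`, the `mlbf.cron` logger name, and the `on_slow` callback are hypothetical and not part of the existing cron. It logs a warning once if the wrapped block is still running after a threshold, and it does not interrupt the block.

```python
import logging
import threading
import time
from contextlib import contextmanager

log = logging.getLogger("mlbf.cron")  # hypothetical logger name


@contextmanager
def warn_if_slow(step_name, warn_after_seconds, on_slow=None):
    """Log a warning if the wrapped block exceeds warn_after_seconds."""
    started = time.monotonic()

    def _warn():
        elapsed = time.monotonic() - started
        log.warning("%s still running after %.1f seconds", step_name, elapsed)
        if on_slow is not None:
            on_slow(step_name)  # e.g. send an alert; assumed hook

    timer = threading.Timer(warn_after_seconds, _warn)
    timer.daemon = True  # never keep the process alive just for the timer
    timer.start()
    try:
        yield
    finally:
        timer.cancel()  # no-op if the warning has already fired


# Usage: the block sleeps past the threshold, so the warning fires once.
with warn_if_slow("generate diff", warn_after_seconds=0.05):
    time.sleep(0.2)
```

A warning like this only makes the hang visible; enforcing a hard limit on the job would need a separate mechanism.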