[Monitoring] Testcase upload metrics for the triage lifecycle #4364

Merged
vitorguidi merged 26 commits into master from feat/upload-time on Nov 13, 2024

@vitorguidi vitorguidi commented Oct 30, 2024

Motivation

Chrome security shepherds manually upload testcases through App Engine. Each upload triggers the analyze task and, if the crash is legitimate, the follow-up progression tasks:

  • Minimize
  • Analyze
  • Impact
  • Regression
  • Cleanup cronjob, when it updates the bug to inform the user that all of the above stages have finished

This PR adds instrumentation to track the time elapsed between the user upload and the completion of each of the above events.
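
A minimal sketch of the measurement, assuming the upload time is stored as a naive UTC `datetime`. The helper name `hours_since_upload`, the sample timestamps, and the choice of hours as the unit are illustrative assumptions, not the code added in this PR.

```python
import datetime


def hours_since_upload(upload_time, now=None):
  """Returns the hours elapsed between the user upload and `now`.

  Hypothetical helper: both arguments are assumed to be naive UTC datetimes.
  """
  if now is None:
    # Current time in UTC, with tzinfo dropped to match the stored value.
    now = datetime.datetime.now(datetime.timezone.utc).replace(tzinfo=None)
  elapsed = now - upload_time
  return elapsed.total_seconds() / 3600


# Illustrative values only.
uploaded_at = datetime.datetime(2024, 10, 30, 12, 0, 0)
minimize_done_at = datetime.datetime(2024, 10, 30, 15, 30, 0)
print(hours_since_upload(uploaded_at, minimize_done_at))  # 3.5
```

Each stage would call such a helper once, on completion, and report the result under its own stage name.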

Attention points

  • TestcaseUploadMetadata.timestamp was being mutated in the preprocess stage of the analyze task. This mutation was removed, so that this entity can be the source of truth for when a testcase was in fact uploaded by the user.

  • The job name could be retrieved from the JOB_NAME env var within the uworker; however, this does not work for the cleanup use case. For this reason, the job name is fetched from datastore instead.

  • The query_testcase_upload_metadata method was moved from analyze_task.py to a helpers file, so that it can be reused across tasks and in the cleanup cronjob.
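
The last two points can be sketched as follows. This is a rough illustration under stated assumptions, not the PR's implementation: the dataclasses and the in-memory dictionaries stand in for datastore entities and queries, the signature given to `query_testcase_upload_metadata` is assumed, and `job_name_for`, `upload_metric_labels` and all field names are hypothetical.

```python
import dataclasses
import datetime


@dataclasses.dataclass
class Testcase:
  """Stand-in for the stored testcase entity (hypothetical fields)."""
  testcase_id: int
  job_type: str


@dataclasses.dataclass
class UploadMetadata:
  """Stand-in for TestcaseUploadMetadata (hypothetical fields)."""
  testcase_id: int
  timestamp: datetime.datetime


# In-memory stand-ins for datastore lookups.
TESTCASES = {1: Testcase(1, 'example_job')}
UPLOADS = {1: UploadMetadata(1, datetime.datetime(2024, 10, 30, 12, 0))}


def query_testcase_upload_metadata(testcase_id):
  """Returns the upload metadata, or None if the testcase was not uploaded."""
  return UPLOADS.get(testcase_id)


def job_name_for(testcase_id):
  """Reads the job name from the stored testcase, not from JOB_NAME."""
  return TESTCASES[testcase_id].job_type


def upload_metric_labels(testcase_id, stage):
  """Builds metric labels for one stage, usable by tasks and the cronjob."""
  metadata = query_testcase_upload_metadata(testcase_id)
  if metadata is None:
    return None  # Not a user upload, so there is nothing to report.
  return {
      'job': job_name_for(testcase_id),
      'stage': stage,
      'uploaded_at': metadata.timestamp,
  }


print(upload_metric_labels(1, 'cleanup'))
```

Because nothing here reads the process environment, the same code path works whether it runs inside a task or in the cleanup cronjob.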

Testing strategy

Every task mentioned was executed locally, with a valid uploaded testcase. The codepath for the metric emission was hit and produced the desired output, both for the tasks and the cronjob.

Part of #4271

@vitorguidi vitorguidi changed the title [WIP] Testcase upload metrics for the triage lifecycle Testcase upload metrics for the triage lifecycle Nov 1, 2024
@vitorguidi vitorguidi changed the title Testcase upload metrics for the triage lifecycle [Monitoring] Testcase upload metrics for the triage lifecycle Nov 8, 2024
@jonathanmetzman

> TestcaseUploadMetadata.timestamp was being mutated on the preprocess stage for analyze task. This mutation was removed, so that this entity can be the source of truth for when a testcase was in fact uploaded by the user.

Why was it OK to remove this?

@jonathanmetzman jonathanmetzman left a comment

lgtm

@alhijazi alhijazi left a comment

LGTM % Jonathan's comments

@vitorguidi

> TestcaseUploadMetadata.timestamp was being mutated on the preprocess stage for analyze task. This mutation was removed, so that this entity can be the source of truth for when a testcase was in fact uploaded by the user.
>
> Why was it OK to remove this?

The only place where this timestamp gets used is here:

query = datastore_query.Query(data_types.TestcaseUploadMetadata)

Removing the mutation would only change how the uploaded testcases page is presented in App Engine, so I expect no problems from this change.

@vitorguidi vitorguidi merged commit 2073870 into master Nov 13, 2024
7 checks passed
@vitorguidi vitorguidi deleted the feat/upload-time branch November 13, 2024 15:37