OJ-36874: introduce ado adapter to agent #354

gavinpitt-jf · 2024-08-05T17:11:35Z

Description

Introduce the ADO adapter to Agent.

Much of the diff appears to be isort and black reformatting.

The bulk of the logic lets someone provide an ADO config/creds combo and run the Agent. For proof of testing, see this PR (which is in Jellyfish, to protect a customer name).

Testing

To test backwards compatibility, I ran the Agent for orthog against both Jira and Git. I also did more testing for a specific customer, but I am keeping that test plan in the private PR for customer privacy reasons.

Logging setup complete with handlers for log file, stdout, and streaming.
Will write output files into ./output/20240805_175013
Running ingestion healthcheck validation!
Validating configuration...

Jira details:
  URL:      https://jelly-ai.atlassian.net
  Username: [email protected]
  Password: **********
==> Testing Jira connection...
Authenticating to Jira API at https://jelly-ai.atlassian.net using the username and password secrets for [email protected] of company orthogonal-networks
==> Getting Jira version...
Found Jira version as 1001.0.0-SNAPSHOT
==> Getting Jira deployment type...
Response headers does not contain X-ANODEID! Customer is NOT running Jira Data Center.
==> Getting Jira permissions...
Found granted permissions as ['DELETE_OWN_WORKLOGS', 'CREATE_ISSUES', 'WORK_ON_ISSUES', 'DELETE_OWN_COMMENTS', 'MODIFY_REPORTER', 'EDIT_ISSUES', 'ADD_COMMENTS', 'EDIT_OWN_COMMENTS', 'ASSIGN_ISSUES', 'BROWSE_PROJECTS', 'EDIT_OWN_WORKLOGS', 'EDIT_ALL_WORKLOGS', 'EDIT_ALL_COMMENTS', 'CLOSE_ISSUES', 'SET_ISSUE_SECURITY', 'SCHEDULE_ISSUES', 'USER_PICKER', 'ADMINISTER_PROJECTS', 'DELETE_ALL_COMMENTS', 'RESOLVE_ISSUES', 'DELETE_ISSUES', 'VIEW_READONLY_WORKFLOW', 'MOVE_ISSUES', 'ASSIGNABLE_USER', 'TRANSITION_ISSUES', 'DELETE_ALL_WORKLOGS', 'LINK_ISSUES']
==> Testing Jira user browsing permissions...
Downloading Users...
Done downloading Users! Found 505 users
We can access 505 Jira users.
==> Testing Jira project permissions...
With provided credentials, the following projects are discoverable: {'JFR', 'OJ'}.
Checking project access.
Testing access for project: "JFR"
With provided credentials, we can access issues, versions, and components within project JFR
Testing access for project: "OJ"
With provided credentials, we can access issues, versions, and components within project OJ
Checking access to fields
Checking access to resolutions
Checking access to issue types
Checking access to issue link types
Checking access to priorities
Checking access to boards
Checking access to sprints
. Skipping Git Validation.

Memory & Disk Usage:
  Available memory: 743.87 MB
  Disk usage for jf_agent/output: 299 GB / 460 GB
  Size of jf_agent/output dir:  24K
Attempting to upload healthcheck result to s3...
Successfully uploaded healthcheck.json
Successfully uploaded jf_agent.log
Successfully uploaded healthcheck result to s3!

Done
Obtained Jira configuration, attempting download...
Attempting to use JF Ingest for Jira Ingestion
Set global value INGESTION_TYPE to AGENT
Beginning load_and_push_jira_to_s3
Using local version of ingest
Authenticating to Jira API at https://jelly-ai.atlassian.net using the username and password secrets for [email protected] of company orthogonal-networks
Data will not be saved locally
Data will be submitted to jellyfish
Downloading Jira Projects...
Done downloading Projects!
Downloading Jira Project Components...
Done downloading Project Components!
Downloading Jira Versions...
Done downloading Jira Versions!
Done downloading Jira Project, Components, and Version. Found 2 projects
Downloading Jira Fields...
Done downloading Jira Fields! Found 220 fields
Downloading Users...
Done downloading Users! Found 505 users
Downloading Jira Resolutions...
Done downloading Jira Resolutions! Found 9 resolutions
Downloading IssueTypes...
Done downloading IssueTypes! found 34 Issue Types
Downloading IssueLinkTypes...
Done downloading IssueLinkTypes! Found 10 Issue Link Types
Downloading Jira Priorities...
Done downloading Jira Priorities! Found 5 priorities
Downloading Jira Statuses...
Done downloading Jira Statuses! Found 124
Downloading Boards...
Done downloading Boards! Found 58 boards
Downloading Sprints...: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 58/58 [00:16<00:00,  3.48it/s]
Attempting to pull issue metadata for 2 projects, with a pull from date set as 2017-01-01 00:00:00+00:00
Getting total issue counts for 2 projects (Thread Count: 10): 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00,  3.20it/s]
Pulling issue data across 2 projects by Date (Thread Count: 10): 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 37943/37943 [00:31<00:00, 1203.29it/s]
Attempting to pull metadata for an additional 13 issues, which represents issue parents that we need to potentially redownload. (Parent Search Depth = 1)
Pulling issue data for 13 Jira Issue IDs (Thread Count: 10): 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 13/13 [00:00<00:00, 29.91it/s]
Only grabbing the first level of parents, because recursively_download_parents is False
Using IssueMetadata we have detected that 336 issues are missing, 745 issues are out of date, 0 issues need to be redownloaded (because of rekey and parent relations), for a total of 1081 issues to download
Attempting to pull 1081 full issues
Pulling issue data for 1081 Jira Issue IDs (Thread Count: 10): 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1081/1081 [00:08<00:00, 126.51it/s]
Successfully saved 1081 Jira Issues in 2 separate batches, with each batch limited to 50MB per batch
22096 issues have been detected as being deleted
Downloading Jira Worklogs...
Fetching updated worklogs
Done fetching updated worklogs
Fetching deleted worklogs
Done fetching deleted worklogs
Done downloading Worklogs! Found 0 worklogs and 0 deleted worklogs
Data has not been saved locally, because save_locally was set to false in the ingest config!
Data has been submitted to jellyfish
Starting Git download for 1 provided git configurations
downloading github users... ✓
downloading github projects... ✓
downloading github repos... ✓
downloading commits on branch develop for jellyfish: 340commits [00:03, 104.12commits/s]
downloading PRs for jellyfish: 34prs [01:01,  1.81s/prs]
Shutting down Systems Diagnostics Thread
Closing Diagnostics file
Compressing ./output/20240805_175013/diagnostics.json
Compressing ./output/20240805_175013/healthcheck.json
Sending data to Jellyfish...
Starting 8 threads
Successfully uploaded healthcheck.json.gz
Successfully uploaded status.json.gz
Successfully uploaded git_8v1crHqhmq/bb_projects.json.gz
Successfully uploaded diagnostics.json.gz
Successfully uploaded git_8v1crHqhmq/bb_users.json.gz
Successfully uploaded git_8v1crHqhmq/bb_repos.json.gz
Successfully uploaded git_8v1crHqhmq/bb_prs.json.gz
Successfully uploaded git_8v1crHqhmq/bb_commits.json.gz
Successfully uploaded config.yml
Agent run succeeded: True
Successfully uploaded jf_agent.log
Successfully uploaded .done
Done!
Closing the agent log stream.
Log stream stopped.

gavinpitt-jf · 2024-08-05T17:39:41Z

pyproject.toml

@@ -18,7 +18,7 @@ dependencies = [
    "click~=8.0.4",
    "requests>=2.31.0",
    "python-dotenv>=1.0.0",
-    "jf-ingest==0.0.105",
+    "jf-ingest==0.0.120",


TODO: After this PR gets approved and deployed, this should be bumped to 121: https://github.com/Jellyfish-AI/jf_ingest/pull/160

jruel4 · 2024-08-06T16:33:08Z

jf_agent/config_file_reader.py

@@ -371,6 +388,24 @@ def get_ingest_config(
    if config.jira_url and (
        (creds.jira_username and creds.jira_password) or creds.jira_bearer_token
    ):
+        issue_metadata: List[IssueMetadata] = IssueMetadata.from_json(


Q: Is this to handle a change in jf_ingest?

Also did we do a quick Jira test to validate this?

This is logic we were already doing: extracting the issue_metadata.

Directly below it, though, we're building a new dictionary that maps project IDs to pull_from dates, which is something the new Jira sync uses (Kirk's speedup). We need to supply JF Ingest with a pull_from date for each project in order to get all the updated issues for that project.

The new sync is not currently used in Agent, so this is pretty low impact. I did, however, run a Jira test with orthogonal-networks just to make sure, and it behaved as expected.
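For anyone reading along, here is a minimal sketch of the "project ID to pull_from" idea described above. It is not the code in this PR or in jf_ingest: `IssueMeta`, its fields, and `build_project_pull_froms` are hypothetical names, and taking the latest known `updated` timestamp per project is only one assumed way to derive the date.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Iterable


@dataclass(frozen=True)
class IssueMeta:
    """Hypothetical stand-in for an issue metadata record (fields are assumed)."""

    key: str
    project_id: str
    updated: datetime


def build_project_pull_froms(
    issue_metadata: Iterable[IssueMeta],
    project_ids: Iterable[str],
    default_pull_from: datetime,
) -> Dict[str, datetime]:
    """Map each project ID to the date we should pull updated issues from."""
    # Start every project at the default, so projects with no known issues
    # still get a full pull.
    pull_froms = {project_id: default_pull_from for project_id in project_ids}
    for meta in issue_metadata:
        if meta.project_id not in pull_froms:
            # Ignore metadata for projects we were not asked to pull.
            continue
        # Assumption: the newest issue we already hold marks where to resume.
        if meta.updated > pull_froms[meta.project_id]:
            pull_froms[meta.project_id] = meta.updated
    return pull_froms


if __name__ == "__main__":
    default = datetime(2017, 1, 1, tzinfo=timezone.utc)
    known = [IssueMeta("OJ-1", "10001", datetime(2024, 8, 1, tzinfo=timezone.utc))]
    print(build_project_pull_froms(known, ["10001", "10002"], default))
```

Keeping one date per project, rather than one global date, means a project that has never been pulled can start from the default while the others resume from where they left off.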

jruel4 · 2024-08-06T16:40:03Z

jf_agent/main.py

+                        ]
+                    )
+                    # Jira is supported by all customers, always skip it
+                    directories_to_skip_uploading_for.add('jira')


Q: Previously, were we dual-submitting, or is this due to changes in jf_ingest version bump?

Yeah, there was a bug where we were double-submitting GitHub data. Not a huge deal, but it's better practice not to. It might have had some runtime performance impact if a customer tried to upload a massive file, but the upload is heavily threaded, so the impact was likely minimal.
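As a rough illustration of the skip-set approach in the diff above, the sketch below filters an output directory so that anything under a skipped top-level directory is left out of the final upload pass. Only the set name `directories_to_skip_uploading_for` comes from the diff; `files_to_upload` and the directory layout are assumptions, not the Agent's actual upload code.

```python
from pathlib import Path
from typing import List, Set


def files_to_upload(output_dir: Path, directories_to_skip: Set[str]) -> List[Path]:
    """Return output files (relative to output_dir) that still need uploading."""
    selected: List[Path] = []
    for path in sorted(output_dir.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(output_dir)
        # For nested files, parts[0] is the top-level directory name.
        # Top-level files (a single part) are never skipped.
        if len(relative.parts) > 1 and relative.parts[0] in directories_to_skip:
            continue
        selected.append(relative)
    return selected


if __name__ == "__main__":
    # Hypothetical usage: data under ./output/<run>/jira was already submitted
    # elsewhere, so it is excluded from this pass.
    directories_to_skip_uploading_for = {"jira"}
    for relative_path in files_to_upload(Path("./output"), directories_to_skip_uploading_for):
        print(relative_path)
```

Filtering by top-level directory keeps the check cheap and makes it easy to add further sources to the skip set without touching the upload loop.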

Gavin Pitt added 2 commits August 5, 2024 13:11

OJ-36874: introduce ado adapter to agent

eb3ee96

OJ-36874: updating JF Ingest for backwards compatibility check

5617262

gavinpitt-jf commented Aug 5, 2024


gavinpitt-jf requested a review from a team August 5, 2024 17:48

Gavin Pitt added 3 commits August 6, 2024 11:58

OJ-36874: updating jf ingest

acc9da9

OJ-36874: merge conflicts

c4a8b9e

OJ-36874: add colorama back in

c62ca82

jruel4 reviewed Aug 6, 2024


jruel4 approved these changes Aug 6, 2024


gavinpitt-jf merged commit ca0617e into master Aug 6, 2024
5 checks passed

gavinpitt-jf deleted the OJ-36874-introduce-ado-adpater-to-agent branch August 6, 2024 20:15
