Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add some level of synchronization to the root of the API #283

Merged
merged 3 commits into from
Sep 9, 2017

Conversation

mmitche
Copy link
Contributor

@mmitche mmitche commented May 16, 2016

This adds some synchronization to the maps at the root of the API to avoid duplicated calls to the actual GH REST API. Specifically this is targeted around the two maps, orgs and users. This fix makes the GHPRB jenkins plugin behave much better when there are lots of projects that could build for a specific repo (even if only a few are actually triggered)

@kohsuke
Copy link
Collaborator

kohsuke commented Jun 3, 2016

OTOH this change introduces contention when multiple threads try to access different users and orgs. I wonder if it'd be worthwhile to take the fix one step further and address that.

@mmitche
Copy link
Contributor Author

mmitche commented Jun 3, 2016

Wouldn't that still cause problems? We should avoid concurrent modification of the maps. What are you thinking?

I'm also thinking that this is an incomplete fix. There are a lot of non-final fields around the API that could be modified by multiple threads if someone passed...say...a pull request object to multiple threads and made multiple concurrent calls to certain APIs. However, the maps at the root appear to be the primary source of concurrency issues.

// map, the point is to avoid making unnecessary GH API calls, which are expensive from
// an API rate standpoint
synchronized (users) {
GHUser cachedUser = users.get(this.login);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kohsuke This is invalid right now (we call getMyself in the constructor). I'm fixing this to check for a valid login, but in the meantime don't merge this.

@@ -200,7 +200,7 @@ public String getMergeCommitSha() throws IOException {
* Depending on the original API call where this object is created, it may not contain everything.
*/
private void populate() throws IOException {
if (merged_by!=null) return; // already populated
if (mergeable_state != null) return; // already populated by id
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

merged_by may be null if the PR was unmerged, leading to an API call every time we call populate()

This adds some synchronization to the maps at the root of the API to avoid duplicated calls to the actual GH REST API.  Specifically this is targeted around the two maps, orgs and users.  This fix makes the GHPRB jenkins plugin behave much better when there are lots of projects that could build for a specific repo (even if only a few are actually triggered)

There are also a few fixes around GHUser and GHPullRequest
* GHPullRequest was checking a field that may be null (merged_by) when determining whether to fetch details.  An unmerged PR would make a bunch of Github API calls for each property accessed.
* Where GHUser was returned in various objects, we weren't going through the caching mechanism at the root, so calls to APIs on GHUSer often resulted in new REST calls.  Instead, return from the cache wherever possible.
@mmitche
Copy link
Contributor Author

mmitche commented Jun 8, 2016

@kohsuke I added a bunch of logging to the API and made a bunch of additional fixes that should improve overall performance (few REST calls). PTAL.

@mmitche
Copy link
Contributor Author

mmitche commented Jun 21, 2016

@kohsuke Have you had a chance to look at the updated version here?

This reduced our GH API calls by around 50% in a large jenkins installation

@akostadinov
Copy link

In a small install we are hitting the rate_limit issue. Any update on this pull request?

@mmitche
Copy link
Contributor Author

mmitche commented Aug 29, 2017

@kohsuke Addressed issues. Changed over to using a ConcurrentHashMap, which has fast reads (the common case) and only locks on write.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants