dotnet build run into 429 and ignore Retry-After Header #10558
Issue moved from dotnet/sdk#15619
From @dotnet-issue-labeler[bot] on Tuesday, February 2, 2021 1:11:18 PM:

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.
Consider using maxHttpRequestsPerSource as a client-side workaround. Sample (tweak the connection count):

```xml
<config>
  <add key="maxHttpRequestsPerSource" value="16" />
</config>
```

Related: https://github.com/NuGet/docs.microsoft.com-nuget/issues/2033
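For context, that key belongs in the config section of a NuGet.Config file. A minimal sketch of a complete file is below; the package source and the value 16 are just examples, and lowering the value reduces the number of concurrent requests sent to each source:

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <packageSources>
    <add key="nuget.org" value="https://api.nuget.org/v3/index.json" />
  </packageSources>
  <config>
    <!-- Limit concurrent HTTP requests per package source to avoid
         tripping server-side rate limits (HTTP 429) -->
    <add key="maxHttpRequestsPerSource" value="16" />
  </config>
</configuration>
```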
Thanks for the hotfix @nkolev92, I will try this. Anyway, will proper handling of 429 ever be implemented? I work for a large organization and will not be the only one with this problem. Furthermore, I also expect that other companies will face the same issue when they use the NuGet package manager.
Hey @fotto1, we prioritize issues based on customer reports, available workarounds, etc. At this point, I can't make a commitment on when this will be fixed. The NuGet client/server protocol currently does not handle 429.
We might need to raise the priority of this. Our latest insertions into the .NET SDK failed because of this. PR has failed checks: dotnet/sdk#15854. This is in no way specific to NuGet's insertion into the .NET SDK, hence all of dnceng might be affected by these HTTP responses. We'll also need to work with the Azure Artifacts team to understand if their limits are too low. But maybe we need to implement at least a basic solution to this request.
To the people who 👍 my comment, or others who would like this feature request implemented, please react 👍 to the original comment/post, so that this issue will bubble up the list of all open issues sorted by 👍.
@marcpopMSFT @nkolev92 @heng-liu Would you accept a community contribution for this?
I wanted to be helpful and write a comment with links to help you get started, but then I realized that all the product code that needs to be changed is in NuGet.Protocol. However, I haven't yet tested this whatsoever. If you'd like to try to figure out NuGet.Protocol's tests, or build and then somehow manually test this code against a server you know (or can force) to return HTTP 429 responses, then I'd love that help. Otherwise, I'll try to figure out unit tests over the upcoming days/weeks.
Wow, nice! Happy to help however I can! Is there a straightforward way to get your changes into a custom dotnet build for testing?
https://github.com/NuGet/NuGet.Client/blob/dev/docs/debugging.md#debugging-restore-task-in-dotnetexe

Replace msbuild with dotnet, and it's the same, as mentioned in debugging.md. Also, I'm pretty sure only NuGetRestoreTargets needs to be defined, because the other two properties are defined in nuget.targets.
@zivkan Thanks a lot for the great help setting this up! It worked out great and I was able to play around with your branch a bit. Unfortunately the changes don't seem to be effective for ..., as already indicated by your comment at https://github.com/NuGet/NuGet.Client/compare/dev-zivkan-handle-http-429#diff-2ae20246c40c717c164cb40b9858455a72a6bc262b23e4d0a14fc7824fc1d860R265-R267 😀

I guess we're now left with either a) implementing the "retry-after" logic at the call site, or b) removing the retry logic from the call site so that the new code becomes effective... 🤷
@zivkan I noticed that you removed this from the sprint. Can I help bring this forward somehow? Also happy to dig deeper into the code if this does not have enough priority for you at the moment. |
I was looking into this in December, but I only had a day or two of time available. The fix for this is needed to also fix #11226, which is currently in this sprint, but honestly I'm not confident that I'll have enough time given all the other work I have planned for the end of the month (and this doesn't include work I have scheduled in private repos, Visual Studio's Developer Community, and Microsoft-internal issue tracking, but most of our planned work is here on GitHub). If you click that link after January 2022, it'll be empty except for completed work, since all work I had assigned will be bumped to other sprints.

Anyway, my idea for the fix would be to add the Retry-After handling in the HTTP retry layer. This is a little tricky/time consuming because most of these APIs are public, and I feel there's a high probability that developers using them would be affected by changes.

So now it's just a question of whether I can find time in a sprint to work on it. I currently have 11226 in my quality week, but I'm definitely also overcommitted. However, I've also made good progress on my non-quality-week work this sprint, so if I'm lucky I'll finish early and have extra time for these types of issues before the end of the month. If someone else wants to try it and send us a PR, that's fine, but I will be looking very closely at every change.
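As an illustration of the logic being discussed, here is a sketch of a retry loop that honors Retry-After on 429 and otherwise falls back to exponential back-off. This is Python, not NuGet's actual C# HttpRetryHandler, and all names and defaults here are hypothetical:

```python
import time
import datetime
from email.utils import parsedate_to_datetime

def parse_retry_after(value, now=None):
    """Parse a Retry-After value (RFC 7231): either delta-seconds or an
    HTTP-date. Returns a delay in seconds, or None if unparseable."""
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():   # delta-seconds form, e.g. "120"
        return int(value)
    try:                  # HTTP-date form, e.g. "Tue, 02 Feb 2021 13:02:00 GMT"
        target = parsedate_to_datetime(value)
        now = now or datetime.datetime.now(datetime.timezone.utc)
        return max(0.0, (target - now).total_seconds())
    except (TypeError, ValueError):
        return None

def send_with_retry(send, max_tries=3, base_delay=1.0, sleep=time.sleep):
    """Call send() -> (status, headers) until success, honoring Retry-After
    on 429 and falling back to exponential back-off otherwise."""
    for attempt in range(max_tries):
        status, headers = send()
        if status < 400:
            return status
        if attempt == max_tries - 1:
            break  # out of retries; surface the last status
        delay = None
        if status == 429:
            delay = parse_retry_after(headers.get("Retry-After"))
        if delay is None:
            delay = base_delay * (2 ** attempt)  # exponential back-off
        sleep(delay)
    return status
```

Note that RFC 7231 allows both the delta-seconds and the HTTP-date forms of the header, which is why the parser above handles both.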
Thanks @zivkan for the in-depth answer, I can fully understand your situation! Your input is surely super helpful to implement the necessary changes. It looks like a lot of work, so I am not sure when I'll find the time needed to provide a high-quality contribution. I'll definitely ping you before starting bigger changes! |
Hey all, In 6.2, we added a fix that changed how NuGet did retries and added an exponential back-off, NuGet/NuGet.Client@dc37ed5, NuGet/NuGet.Client@73c1eea. While this isn't a fix specific to the 429 status code, it is very likely to improve this scenario as well. Please let us know if you are still seeing similar issues after this fix by upvoting this comment. |
@danmoseley, in reference to dotnet/sdk#27843 (comment): as noted above, proper handling of 429 means respecting the Retry-After header.
Thanks @kasperk81 . IIRC the changes linked by @nkolev92 were actually originally motivated by making our CI builds more stable, and we certainly have them already. Are you suggesting the HttpRetryHandler should add special treatment for 429 (HttpStatusCode.TooManyRequests)? |
While I fully support honoring the retry-after field and am happy to comment, I only added retries and backoff in places where it was lacking. I made no efforts to try to honor the Retry-After field as there was nothing in place as far as I can tell to handle HTTP 429s at the time, and I was trying to change as little as possible while working around intermittent known issues with Azure DevOps package feeds which in most cases was achieved by having any retry-with-delay at all. |
@MattGal is there any way to query how often our infra fails with 429? E.g., grep the world for 429 responses?
Ulises on my team (can't tag him here, but linked the issue) has been working on the ability to grep through build logs for this. Short answer is "I don't think so" but I will check out what I can find in the 1ES AzureDevOps Kusto instance, and get back to you. |
tagging should work here. @ulisesh |
Memorializing the failure in question here, anyway, until we have data on whether it's common --
For completeness: we do not know, of course, whether doing anything special for 429 for this server would help. Rerunning the build succeeded.
I will add that the SDK still hits 429s quite regularly. This was more common before we started predownloading many packages (we can't predownload current-version runtime-specific packages easily, as we'd be constantly chasing those versions). However, I still see this pretty much every time we do branding or branching, as those usually trigger lots of PRs all at the same time.
Just in the dotnet/sdk repo, I found 377 failed work items containing the string 429.
@marcpopMSFT @danmoseley is it ok if I create a known issue (in this repo) to track how often we get 429s? |
@ulisesh it's not my repo, but I can't imagine how they'd mind. Seems like a good idea if it helps. |
8.4% of failures are 429s, which is worth looking into. Cloud services are known to issue 429 errors based on load and scale; honoring Retry-After would help here.
@aortiz-msft do you know who might know when |
I just started an internal email thread about this, since it's discussing Azure DevOps and Azure services, that are not on GitHub. |
Issue moved from dotnet/sdk#15619
From @fotto1 on Tuesday, February 2, 2021 1:11:15 PM
We are using the docker image mcr.microsoft.com/dotnet/core/sdk:3.1, digest mcr.microsoft.com/dotnet/core/sdk@sha256:0fece15a102530aa2dad9d247bc0d05db6790917696377fc56a8465604ef1aff, for our project to run the dotnet CLI.
Our NuGet packages are stored in a GitLab package registry, version 13.8.
After too many HTTP requests, NuGet receives a 429 response per RFC 6585 with a Retry-After header, as described in https://tools.ietf.org/html/rfc6585#section-4.
Instead of retrying later, dotnet build continues downloading the next packages and exits with 1 after the third download fails with 429. Can you please check how 429 is handled in dotnet build when NuGet is configured this way?