Set up CI with Azure Pipelines #6495
@btholt Is it possible for the pipeline to be configured to run on multiple systems? It seems to be currently running on Ubuntu, but could I run it on OSX and Windows as well? I found how to make matrices work for the node executable, but not yet the OS. |
Well, that was easy. Let's see if we can factor out the build steps a bit ... |
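For readers following along, the OS matrix asked about above can be sketched roughly as below. This is a minimal sketch and not the configuration from this PR: the job name, the `imageName` variable, and the hosted image labels are assumptions, and the labels Azure offers change over time.

```yaml
# Hypothetical sketch: one job fanned out over three hosted images.
jobs:
- job: test  # assumed job name
  strategy:
    matrix:
      linux:
        imageName: 'ubuntu-latest'   # assumed image label
      mac:
        imageName: 'macOS-latest'    # assumed image label
      windows:
        imageName: 'windows-latest'  # assumed image label
  pool:
    vmImage: $(imageName)  # each matrix entry supplies its own value
  steps:
  - script: node --version
    displayName: 'show the Node version'
```

Each matrix entry becomes its own job, so the Node version could be added as a second variable in every entry to cover both dimensions.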
It also seems like the pipeline is configured as a public project, but the organization's "Retention & parallel jobs" tab reports it as private (70 of the 1800 minutes are already used!) |
@arcanis I'm asking about it showing up as private. |
@btholt Something weird: it seems that my https://dev.azure.com/yarnpkg/yarn/_build/results?buildId=7&view=logs I've checked the logs for the "install and build" step, and as you can see the
Contrast that with the Linux job, that clearly shows that the
Any idea what I'm doing wrong? |
Interesting - using |
I think there's still something wrong - we have 122 synchronous tests with a timeout of 5 seconds each, which should give at most about 10 minutes of execution (122 × 5 s = 610 s). But the tests have been running for 45 minutes now with no output 😢 I guess we'll have to ask someone with Windows to investigate and find what could be blocking the execution. |
Seems like:
https://dev.azure.com/yarnpkg/yarn/_build/results?buildId=11&view=logs
|
A few more passed (20/122), but something seems to be blocking the processes, which end up stuck in most cases. This is annoying, especially since we don't even access the real network (the tests boot a local mock server and communicate with it; they never need to reach the npm registry) 🙁 |
I'm a Program Manager on Azure Pipelines. Were you able to get in contact with someone about why your builds are getting marked as private? If not, I can follow up. Can you try setting |
@kaylangan I was going to send an email this morning, but since you're here could you follow up on it? |
@arcanis regarding the hang in the step:

```yaml
- script: |
    cd packages/pkg-tests
    yarn jest yarn --detectOpenHandles
    echo done
  displayName: 'run the acceptance tests'
```

I'm curious whether yarn is hanging, or whether the script task is hanging. If the script task is hanging, then the bug is on me :) |
Hey @kaylangan! 🙂 Regarding private/public, no, not yet. For the record, this is what I see in my interface. It's a bit strange since the settings also say that the project has a public visibility. Hey @ericsciple, thanks for the support! From my investigation, it appears that it's Jest that doesn't exit, not the script task itself. The exact reason why it doesn't exit isn't clear (likely an open handle somewhere), but I think it's related to how most of the individual tests are timing out - it's possible they leak subprocesses somewhere. So what I'm trying to figure out first is why all the tests are hanging on Windows, even though they appear to work fine on Linux / OSX 🤔 |
@arcanis if all processes within the tree are still intact (can build tree from pid/ppid), then a tool like Process Monitor may help to show the tree. Of course you would need to be logged in to a Windows machine since it is a graphical tool. At the end of the job, the agent kills all orphaned processes that it can detect. We add a new guid environment variable per-job, and child processes typically inherit the env vars. At the end of the job, we search for all processes with that env var, and kill them. All killed processes are logged to the worker diagnostic log. If you set a variable agent.diagnostic=true, the agent/worker diagnostic logs are uploaded and you should be able to download them from the build summary page. Otherwise if the process is already gone at that point, I have another idea that might work. You could start a background process during your step, that scans for any processes with that specific environment variable and logs them to a file. And set a timeout on your step. Then in a subsequent always-run step, you could upload that log. If that approach sounds like it would help, email ersciple and we can iterate to get a script working to troubleshoot (at microsoft com). If it helps, I can publish it on our troubleshooting doc too (may help others). |
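The troubleshooting setup described above could look something like this in the pipeline YAML. It is only a sketch under assumptions: the 15-minute step timeout, the `process-scan.log` file name, and the artifact name are hypothetical, and the background scanner that would write the log is not shown.

```yaml
variables:
  agent.diagnostic: true  # per the comment above: upload the agent/worker diagnostic logs

steps:
- script: |
    cd packages/pkg-tests
    yarn jest yarn --detectOpenHandles
  displayName: 'run the acceptance tests'
  timeoutInMinutes: 15  # assumed value; stops the step instead of waiting for the job timeout

# Runs even when the previous step failed or timed out.
- task: PublishBuildArtifacts@1
  displayName: 'upload the process scan log'
  condition: always()
  inputs:
    pathToPublish: '$(Build.ArtifactStagingDirectory)/process-scan.log'  # hypothetical file
    artifactName: 'process-scan'  # hypothetical name
```

The upload step will fail if nothing wrote the log file, so the scanner would need to create it before the test step times out.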
Do you have any other private projects in the Azure DevOps account or was this project private at any point? |
Never tried Azure before. The weird thing is that I did select private project at the beginning, but I don't remember ever switching it into private (but I can't switch it "back" to private, so maybe I'm mistaken?) |
I ran one of the tests with When I look at the process spawned by the test it is just hanging there. I added some logging and it looks like the reason is |
Yes I've spent some time debugging it yesterday, and it seems the problem is there (or around here): https://github.com/yarnpkg/yarn/blob/master/src/fetchers/tarball-fetcher.js#L151 It seems that during the Fetch step (where we fetch the tgz) we might reach a state where the I have no idea why it would only apply on Windows, or why we don't see it on the regular testsuite. Maybe it could be related to the download size (bigger archives don't trigger the bug, but small ones would)? |
So long as the source is public and the project in Azure Pipelines is public, the minutes shouldn't count against the private quota. Are you still seeing that number go up after you turned the project public? We're taking a look on our side. Also, I see your builds are timing out at 60 minutes. You can set the timeout to be as high as 6 hours. |
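As a rough illustration of the timeout setting mentioned above: the job-level `timeoutInMinutes` key takes a value in minutes, so 6 hours is 360. The job name, image label, and test command below are assumptions, not taken from this PR.

```yaml
jobs:
- job: windows_tests  # hypothetical job name
  timeoutInMinutes: 360  # 6 hours, the upper bound mentioned above
  pool:
    vmImage: 'windows-latest'  # assumed image label
  steps:
  - script: yarn test  # assumed test command
    displayName: 'run the tests'
```

Raising the limit only helps when the work is genuinely slow; it does not address a process that never exits.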
Yup, the number went up after each build, but as mentioned I don't remember even turning it public in the first place, so that would make sense (but then it doesn't make sense that it's listed as public in another part of the dashboard and that the links can be accessed by anyone 😅 ). Re: the timeout, the problem appears to be software - our tests shouldn't take more than 10 minutes tops. |
@arcanis we've tracked down the issue to a bug on our side. The bug won't affect the concurrency of your jobs; we're just posting the minutes back as the wrong type. We've got a fix ready and it'll be deployed soon. |
Btw @pablonete, regarding 6830dd3: the test is to make sure that when you run |
Thank you for clarifying, @arcanis, I guess I didn't dig deep enough. I'll take another look at that test and see why fake binaries were being executed then. Btw, do you think it's the right time to enable Linux and macOS builds on Azure Pipelines? It would help us detect regressions on other OSes while fixing Windows tests. I see it's commented out, did you find any issue with running on the 3 platforms? |
Note that this same testsuite is being executed on Linux via the
Nope, they were working perfectly fine, I just wanted to decrease the load on Azure to ensure quick feedback for the Windows builds (especially to avoid problems caused by queuing and such) - I'll reenable them and deprecate their CircleCI counterparts as soon as we get Windows working 😃 |
Almost there! Only two remaining issues:
|
So if code is in D: but temporary folder is in C:, don't try to find a relative path but stay on the absolute one.
…instead of cmd-shim.
Everything works! Many thanks to you, @pablonete! 🥇 Interestingly, CircleCI seems generally faster than Azure except for the OSX builds, which are always queued for a very long time. I think it's fine in our case, but I wonder if maybe there's a cache somewhere I need to enable. @kaylangan I received the following email from Azure - is there a way to keep using the free plan for the Yarn organization? I admit I'm not too sure what the various options are. At the moment CircleCI has the advantage of being free (which matters since we're an independent org with no financing). |
Checked your Azure DevOps organization and it's all properly set up for free open source builds, so it's all good there. The email from Azure looks to be related to an Azure Free Trial subscription, which is probably related to something set up on that same email address? If you are just using hosted Linux/Windows/Mac build jobs in Azure DevOps then you are all good. If you have some of your own Azure compute resources such as your own virtual machines, web apps, DNS, etc., then that might be what the email is about. Feel free to message me if you want me to dig in more or if the yarn project needs some additional Azure hosted services. Sorry about the confusing email message though. |
* Increase timeout in Windows, we're seeing tests failing randomly and others close to default 5 sec.
* Distinguish tests published from each job.
* Pass name as vmImage is not available
* Remove unnecessary detect unfinished tests.
* Using strategy var instead of parameter
* Use variables instead of strategy
Awesome 😃 Ok, I think this PR is ready to be merged. I'll just temporarily revert my commit that disables CircleCI until @Daniel15 can take a look at how we could integrate the Azure testsuite inside our release process (if I remember correctly what he explained about Appveyor, there's a webhook that needs to be configured somewhere). Thanks a lot for all your help! |
This reverts commit 8f72462.
@arcanis The webhook is used to archive the 'nightly' builds (https://yarnpkg.com/en/docs/nightly) and to publish them when a release is tagged. The webhook doesn't actually run any of the tests. Having said that, it's on my todo list to try and get some time to see if we can move those webhooks to use the Azure DevOps stuff. Likely towards the end of the month. For now we should keep AppVeyor and CircleCI running as-is 😃 It would simplify a lot if we could use one system, as currently we need to grab build artifacts from both CircleCI and AppVeyor, and only publish the release once we receive the webhook calls from both. |
Does this use GitHub permissions, or is permissioning separate? Wondering if we need to add all the core Yarn team as admins for the project. |
@Daniel15 for now, permissioning is separate. We're working on integrating with GitHub permissions. |