Document CI / DevOps pipelines #1731
base: main
Conversation
@ruffsl I think you need to rebase; all of those errors have been dealt with. Let me know when you want me to take a look at this.
A couple of suggestions
The section headers for CircleCI cover the bases, thank you.
Force-pushed from 2c2a51a to 78d2bb6
@ruffsl what's the state of this?
Had a workshop deadline, but I hope to finish and merge before the Foxy release so I can reference this in the official Docker Hub library docs as an assortment of Dockerfile and ROS2 workspace examples.
Hope the workshop proposal went well 😄 Awesome, thanks for the update.
I know it's not related to this documentation, but do you know if we can trigger codecov to upload results if the debug test succeeds but the release test fails? That's a common occurrence, and it would be nice to have updated results there as I'm actively tracking coverage. 39% was rookie numbers, 65% is a C-, 90% or bust.
No, not sure. You're seeing missing code coverage results on the web UI for nightly CI workflows where only the release jobs are failing? Would https://docs.codecov.io/docs/commit-status#if_ci_failed help? Although, given that CI uploads code coverage regardless of whether the debug test fails, I'm not sure we'd want to set the status to success even if the CI fails. Per the docs, Codecov treats a failed CI as not complete because not all tests were executed: https://docs.codecov.io/docs/ci-service-relationship#if-ci-fails-then-the-coverage-results-are It looks like we could ignore a CI provider, but I don't see an option for ignoring one CI job.
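For reference, a minimal codecov.yml sketch of the `if_ci_failed` behavior linked above (the settings shown are illustrative, not the repository's actual configuration):

```yaml
# Illustrative codecov.yml sketch; not the repository's actual config.
coverage:
  status:
    project:
      default:
        # Post the project status even if the CI run is reported as failed,
        # e.g. when the release job fails after the debug job already uploaded.
        if_ci_failed: success
```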
Correct, I'll look into the `if_ci_failed` setting.
@SteveMacenski you can take a look at the dockerfile.md and dockerhub.md docs. Still working on the other sections.
Overall, I think it needs to be more organized. There are a ton of open questions not answered about the specifics of "what's going on" and "why is it done that way". In the dockerfile.md file, there's a bunch of open questions about how you structured the document; you reference "stages" that don't exist in any of the dockerfiles, and it seems to jump out of order. You also never actually explain how the caching happens, where it happens, how to break it, etc.
I think you need good introduction sections to introduce the concepts and the motivation for why things are important before getting into it. I would say that I generally understand what's going on, but I have to say that the documentation did not answer any of my questions and it was really confusing to read.
Lightly poking on this
As to how CircleCI caching is used: because you create a new cache key for each run of CI, the cache is never re-used between CI runs. You can see this in the config around line 46.
That is incorrect. Although a unique cache key is created, the lookup strategy for caching is the most recent, longest matching prefix; thus the unix epoch simply serves to version bump the cache to restore from, if you will. The restore step omits appending the epoch for this reason.
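A minimal sketch of that pattern, with hypothetical key names rather than the exact ones in the config:

```yaml
# Illustrative CircleCI cache steps; key names are hypothetical.
- restore_cache:
    keys:
      # No epoch suffix here: CircleCI restores the most recently generated
      # cache whose key matches this prefix.
      - ccache-{{ arch }}-{{ .Branch }}
# ... build steps that read from and update ~/.ccache ...
- save_cache:
    # Appending the unix epoch makes each saved key unique, effectively
    # version-bumping the cache for the next restore.
    key: ccache-{{ arch }}-{{ .Branch }}-{{ epoch }}
    paths:
      - ~/.ccache
```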
Based on this (https://circleci.com/docs/2.0/caching/#partial-dependency-caching-strategies) and my testing, I don't think that is correct. I think you have to specify the level of prefix you are OK with restoring from. For example, you might do something like this:
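A sketch of the kind of multi-key restore being described, with hypothetical key and file names:

```yaml
# Illustrative restore with progressively less specific fallback keys.
- restore_cache:
    keys:
      - workspace-{{ arch }}-{{ .Branch }}-{{ checksum "underlay.repos" }}
      - workspace-{{ arch }}-{{ .Branch }}-
      - workspace-{{ arch }}-
```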
One more thing I haven't figured out yet: I'm trying to cache the workspaces, but it is re-building them even though I restored from a cache.
@tylerjw, perhaps it would help to dissect an executed workflow for nav2 as an example, e.g.: https://circleci.com/workflow-run/83b96516-b005-4f29-b587-98cf5c7f9f7f

For the release build, we can see the cache was busted because the nightly image pulled for the executor has been updated, so the attempts to restore the workspace from the CI cache miss. However, given the checked-out PR makes no changes to the underlay used, the underlay provided by the docker image can be saved to seed the CI cache for this PR: https://circleci.com/gh/ros-planning/navigation2/12882

Restoring the cache for the overlay workspace similarly misses, but the job can use the ccache that was seeded from the docker image to speed the overlay build to a little under 4 min. From the ccache stats command, you can see the hit rate for ccache was about 99%, meaning that though the source overlay was changed, very little of it differs from the master branch. The updated ccache is then itself cached for later workflows that start from the same PR and use the same nightly image sha. The release test job then successfully restores the workspace built by the build job.
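As a rough sketch of how caches can be tied to the nightly image so they bust when its digest changes (key names, file paths, and the environment variable below are illustrative, not the exact steps in Nav2's config):

```yaml
# Illustrative CircleCI steps; key names and paths are hypothetical.
- run:
    name: Record base image digest
    # Write the digest of the nightly image this job runs on to a file, so it
    # can be folded into the cache key below via `checksum`.
    command: echo "${NIGHTLY_IMAGE_DIGEST}" > /tmp/image_digest
- restore_cache:
    keys:
      # Misses whenever the nightly image digest (and thus the underlay) changes.
      - overlay-ws-{{ checksum "/tmp/image_digest" }}-{{ .Branch }}
      - overlay-ws-{{ checksum "/tmp/image_digest" }}-
- run:
    name: Show ccache statistics
    command: ccache --show-stats
```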
Should I look at this again?
I need to update it a little to reflect #1934 now that it's using matrix stanzas and RAM disks for speeding up ccache disk IO.
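A rough sketch of what a matrix stanza plus a RAM-backed ccache directory could look like (job names, parameters, and the base image are illustrative, not a copy of #1934):

```yaml
# Illustrative CircleCI 2.1 config; names and images are hypothetical.
version: 2.1

jobs:
  build:
    parameters:
      build_type:
        type: string
    docker:
      - image: cimg/base:stable
    environment:
      # /dev/shm is a tmpfs (RAM disk), so ccache reads/writes avoid slow disk IO.
      CCACHE_DIR: /dev/shm/ccache
    steps:
      - checkout
      - run: echo "Building with CMAKE_BUILD_TYPE=<< parameters.build_type >>"

workflows:
  build_and_test:
    jobs:
      - build:
          matrix:
            parameters:
              build_type: ["Release", "Debug"]
```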
now that `Autobuild` is disabled for the build rules on Dockerhub

The caches for release and debug jobs in the same workflow don't collide thanks to the different prefix in the cache key names. Add two arrows to make that separation more apparent. Also, correct the fact that test jobs don't reuse caches from prior test jobs, only from build jobs in the same workflow.
Co-authored-by: Steve Macenski <[email protected]>
@ruffsl I went through everything - from here on in the review, can you not hit the "resolve" button? It will help me keep track of which of these I already went through and 'OKed' your fix vs the open items. Then once it's done, I'll just check that the intent of my comment was met and hit the resolve button. This just helps me not have to go back and click every resolved discussion and look it over again and again. Normally it's not a problem, but there are a lot of comments on this one and that would suck up a lot of time.
@ruffsl any progress?
@ruffsl any update? This is the oldest pending PR in Nav2 and I'd love to drive this to conclusion with the new Galactic release.
No updates. My writing time is being spent on thesis work. We could meet up offline and I could answer any lingering questions over a call, so you could do any last editing for sections you feel are still unclear.
Unfortunately, I really don't have the cycles to commit to finishing this myself. I'm a bit overloaded by the current stack of work I have and can't add this to the queue. I suppose this work is on pause for the foreseeable future?
This pull request is in conflict. Could you fix it @ruffsl?
We approve Mergify and it attacks! 🏴☠️
Just Nav2, right?! Yeah, I'm testing setting this up. Looks like one of my rules' regexes got messed up, but the others are working great! There are so many boilerplate things I have to tell people; this is going to greatly reduce that burden. Now any time people nuke my required PR template fields, it'll give them back to me in a comment; if there are conflicts, it tells them; if builds fail, it tells them; if they open PRs against inappropriate branches, it tells them; etc.
We can talk about it more in this ticket #2593 if you like. I don't want to get off topic in this PR.
@ruffsl, your PR has failed to build. Please check CI outputs and resolve issues.
This pull request is in conflict. Could you fix it @ruffsl?
Also updates the accompanying Dockerfiles.
Fixes #1716