Major documentation structure for helping our team debug issues #1167
Comments
A more structured way to think about this is in terms of the 'objects' we have and the 'actions' that can be performed on them. A starter pack would be:
A lot of these have detailed documentation and training provided elsewhere (
I think it'll also help us identify required training that we can ask folks to take as part of onboarding so they have time set aside to learn how to use and navigate the tools we use. We stand on the shoulders of giants and we gotta use them!
This exercise would also help us identify which pieces we can build to help in the process (i.e. a quick way to get a cmd line ready to write
We are working on a structure for more docs to be written, to make support steward work less stressful: https://hackmd.io/omFosDsjS3-2UEtAIZstcA
I see that the two PRs attached to this one have now been merged. Can we close this one? If not, can we define some glanceable deliverables that we can use to know when this issue should be closed?
@choldgraf I've lifted the list of documents from the HackMD into the issue body. We can close this when those are written!
- Reword title to complete 'How do I...?'
- Remove cmd-access.md, and link instead to tutorials/getting-started, which has the same content.
- Document the new behavior of health checks
- Reword to emphasize that local deploys are ok but you *must* get them into CI asap, with reasoning.

Ref 2i2c-org#1167
From #1314 (comment), we should also develop guides for GPU debugging.
Added to the list at the top 👍
Docker images with data science related packages can be *huge*, and very difficult to build locally! We run a remote [docker-in-docker hack](https://gist.github.com/yuvipanda/48100eb9e15dae808052c7dc9fb22edb) on our 2i2c cluster to make this a lot more painless. This document describes *how* you can use this to build docker images from your laptop much faster. This frees up your laptop's resources and gives you datacenter-scale upload / download speeds. Ref 2i2c-org#1167
I'd like to unassign myself from this one. I am more than happy to help with documentation, but I don't think that I will have the time to spearhead any of the efforts on this one. I would be happy to be tagged in as a support person though, or to review PRs etc.
Comment migrated to #1826 (comment).
@sgibson91, I moved your comment over into the dedicated issue referenced above.
I'm going to close this one, as I think the improvements made during this push are complete by now.
Context
During our latest incident on CarbonPlan, @yuvipanda and I talked about ways to lower the barrier for people to debug our cloud infrastructure when incidents occur. We agreed it would be helpful to have some basic documentation to help our team members get started with debugging common things.
Proposal
We've got an outline of major areas of documentation to write to make it easier for people to use our docs in the debugging/operations process. Here's that document:
https://hackmd.io/omFosDsjS3-2UEtAIZstcA
Here are the documents outlined there that we should create
We should flesh out the major sections in that HackMD, and update this issue as we do so with refs to PRs that implement things. We can close the issue once we've got docs that cover the major parts of that HackMD.
Updates and actions