Expand documentation about our support charts #1891

Closed
16 tasks
Tracked by #1167
yuvipanda opened this issue Nov 9, 2022 · 1 comment

Comments

@yuvipanda
Member

yuvipanda commented Nov 9, 2022

(child of #1890)

We currently have a bunch of 'support' charts (under helm-charts/support) that we install once on each cluster we maintain. We must document these better!

  • nginx-ingress
    • Why was this picked over other ingress providers?
  • cluster-autoscaler
    • Why only for AWS?
  • prometheus
    • How to determine resources to give to this?
    • How to access from local machine?
    • Network topology - why is Prometheus not exposed to the internet?
  • Grafana
    • Authentication
    • Where are the graph definitions stored?
  • nvidiaDevicePlugin (see #996, "Document setting up GPUs on our clusters and access for our hubs")
  • cryptnono
    • What does it do?
  • Upgrading versions of these charts
  • How-to: deploying these charts in a new cluster
@yuvipanda
Member Author

Per @sgibson91 in #1906 (comment), we should also document how to debug Prometheus outages in more detail.
