Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature]: API can get process stats #681

Closed
acosta11 opened this issue Feb 17, 2022 · 2 comments · Fixed by #700
Closed

[Feature]: API can get process stats #681

acosta11 opened this issue Feb 17, 2022 · 2 comments · Fixed by #700
Assignees

Comments

@acosta11
Copy link
Member

Blockers/Dependencies

Depends on the spike branch.

Background

As a API client
I want to view stats for given cf processes
So that I can observe my applications

Acceptance Criteria

Happy Path

GIVEN cluster with cf-k8s-controllers, api shim, and metrics server installed
WHEN I make a request that looks like

curl "https://api.example.org/v3/processes/[guid]/stats" \
-X GET \
-H "Authorization: bearer [token]"

THEN I get a response that looks like

HTTP/1.1 200 OK
Content-Type: application/json

{
  "resources": [
    {
      "type": "web",
      "index": 0,
      "state": "RUNNING",
      "usage": { "time": "2016-03-23T23:17:30.476314154Z", "cpu": 0.00038711029163348665, "mem": 19177472, "disk": 69705728 },
      "host": "10.244.16.10",
      "instance_ports": [],
      "uptime": 9042,
      "mem_quota": 268435456,
      "disk_quota": 1073741824,
      "fds_quota": 0,
      "isolation_segment": null,
      "details": null
    }
  ]
}

Dev Notes

API docs: https://v3-apidocs.cloudfoundry.org/version/3.113.0/index.html#get-stats-for-a-process

Because our installation scripts do not currently include metrics-server, we will need to update them to enable end to end testing.

Notes on particular fields:

  • Instance ports meaning in this case is unclear, so ignore the block in the response for this story.
  • Similarly, assume the file descriptor quota is 0 as we do not get insight into that information. (And may not have that concern on K8s)
  • If uptime is unclear from k8s metrics response, make a note and we can also circle back.

Note on RBAC:

  • The metrics api group is separate from corev1, so the existing process/pod RBAC is likely insufficient, but we should be able to determine the necessary RBAC on the metrics pod resource

Fallback:

  • If metrics server is not available, continue to return the empty usage block
    Do we also include a runtime log for debugging as well?
@gnovv
Copy link
Contributor

gnovv commented Feb 18, 2022

@acosta11 @tcdowney hypothetically when a field on the PodMetrics isn't present like memory or disk- should we present the values as 0, or nil, or not include the fields at all in the final response?

@tcdowney
Copy link
Member

@gnovv it's debatable what the best thing to do is. I think as long as the endpoint doesn't fail and still responds OK (but maybe with some warning headers) then we can do either. Tradeoffs are:

  1. Returning 0 is valid for the type, but is also a valid value and can be misleading. Returning a negative value might have strange implications on existing clients that are doing math and using this to make scaling decisions.
  2. Returning null is more semantically correct, but might break clients that are only expecting an integer (for the CLI it will just case it to 0 since it's Go, so it's OK).

Linking a couple issues/PRs where this was discussed relatively recently:

I think I ended up making the whole usage object an empty {} object in Cloud Controller because it wasn't possible for these to partially come back? Have you found that's possible? I think returning null then for each field is more in line with what we're doing for other fields that could be missing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

7 participants