Add new SamzaApplicationMaster metric to track allocated containers buffered in AM #1677

Merged: 3 commits, Jul 18, 2023


@jia-gao (Contributor) commented Jun 30, 2023

LISAMZA-28234

Symptom:
We have observed that the Samza YARN AM requests 2-3x more containers than needed during job startup. When large jobs are launched, this can temporarily exhaust cluster resources, causing unrelated deployments to time out or fail.

Change:
This PR adds a Samza AM metric that tracks how many containers have been allocated by the RM and are buffered in the AM.
The metric increments when the AM receives an allocated container from the RM and adds it to the buffer. The metric decrements when the AM releases the overallocated containers (after all processors have started).

By observing this metric, we can see how many containers were allocated to a job during startup and how long it took before the overallocated ones were released.
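
For readers who want a concrete picture of the increment/decrement behavior described above, here is a minimal sketch. It is not the code in this PR: the class `BufferedContainersGauge` and its method names are hypothetical, and it models only the two events named in the description (a container arriving from the RM, and an overallocated container being released).

```java
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical illustration only; not Samza's actual metrics class.
 * Tracks containers allocated by the RM that are sitting in the AM's buffer.
 */
class BufferedContainersGauge {
    // Thread-safe counter, since RM callbacks may arrive on a different thread.
    private final AtomicInteger value = new AtomicInteger(0);

    /** Assumed hook: the AM received an allocated container and buffered it. */
    int onContainerAllocated() {
        return value.incrementAndGet();
    }

    /** Assumed hook: the AM released one overallocated container. */
    int onOverallocatedContainerReleased() {
        return value.decrementAndGet();
    }

    /** Current gauge reading. */
    int getValue() {
        return value.get();
    }
}
```

Under these assumptions, a job that receives three allocations would read 3 during startup, and the reading would fall by one for each overallocated container released once all processors have started.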

@jia-gao jia-gao changed the title Add new SamzaApplicationMaster metric to track containers allocated b… Add new SamzaApplicationMaster metric to track allocated containers buffered in AM Jun 30, 2023
@shanthoosh shanthoosh merged commit 7dee997 into apache:master Jul 18, 2023
1 check passed