chore: remove redundant rm.ExternalPreemptionPending interface #10071

Merged
stoksc merged 1 commit into main from rm-useless-rm-method
Oct 17, 2024

@stoksc stoksc commented Oct 16, 2024

Ticket

Description

The ExternalPreemptionPending method essentially always called allocation.Signal, but only after two or three pointless intermediate calls and message passes. This change just cuts out those middlemen.

I did it now because it stood in the way of a test @rb-determined-ai was trying to remove.
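
A minimal sketch of the shape of this refactor, not the actual Determined code: every type and function below (`allocationService`, `passThroughRM`, `handlePreemptionBefore`, `handlePreemptionAfter`) is a hypothetical stand-in. Only the method name `ExternalPreemptionPending`, the `Signal` call, the `TerminateAllocation` signal, and the message string come from this PR. It shows that a handler calling `Signal` directly records the same thing as one that goes through a pass-through interface method.

```go
package main

import "fmt"

// AllocationID and AllocationSignal are hypothetical stand-ins for the
// master's real allocation ID and signal types.
type AllocationID string
type AllocationSignal string

const TerminateAllocation AllocationSignal = "terminate"

// allocationService is a hypothetical stand-in for the service that owns
// allocations and exposes Signal.
type allocationService struct {
	signals map[AllocationID][]string
}

func newAllocationService(ids ...AllocationID) *allocationService {
	s := &allocationService{signals: map[AllocationID][]string{}}
	for _, id := range ids {
		s.signals[id] = nil // register the allocation with no signals yet
	}
	return s
}

// Signal records a signal for a known allocation, or errors if it is unknown.
func (s *allocationService) Signal(id AllocationID, sig AllocationSignal, reason string) error {
	if _, ok := s.signals[id]; !ok {
		return fmt.Errorf("allocation %q not found", id)
	}
	s.signals[id] = append(s.signals[id], fmt.Sprintf("%s: %s", sig, reason))
	return nil
}

// Before: the handler went through a resource manager interface whose only
// job was to forward to Signal.
type resourceManager interface {
	ExternalPreemptionPending(id AllocationID) error
}

type passThroughRM struct{ svc *allocationService }

func (rm passThroughRM) ExternalPreemptionPending(id AllocationID) error {
	return rm.svc.Signal(id, TerminateAllocation, "preempted by the scheduler")
}

func handlePreemptionBefore(rm resourceManager, id AllocationID) error {
	return rm.ExternalPreemptionPending(id)
}

// After: the handler calls Signal itself, so the interface method can go.
func handlePreemptionAfter(svc *allocationService, id AllocationID) error {
	return svc.Signal(id, TerminateAllocation, "preempted by the scheduler")
}

func main() {
	before := newAllocationService("a1")
	after := newAllocationService("a1")
	_ = handlePreemptionBefore(passThroughRM{svc: before}, "a1")
	_ = handlePreemptionAfter(after, "a1")
	// Both paths record the identical signal.
	fmt.Println(before.signals["a1"][0] == after.signals["a1"][0])
}
```

Under these assumptions the two handlers are behaviorally identical, which is why dropping the interface method removes code without removing features.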

Test Plan

This is code removal, so it needs no testing. It even switches to already-tested code instead!

Checklist

  • Changes have been manually QA'd
  • New features have been approved by the corresponding PM
  • User-facing API changes have the "User-facing API Change" label
  • Release notes have been added as a separate file under docs/release-notes/
    See Release Note for details.
  • Licenses have been included for new code which was copied and/or modified from any external code

@stoksc stoksc requested review from a team as code owners October 16, 2024 21:15
@cla-bot cla-bot bot added the cla-signed label Oct 16, 2024
netlify bot commented Oct 16, 2024

Deploy Preview for determined-ui canceled.

Name Link
🔨 Latest commit bae95b9
🔍 Latest deploy log https://app.netlify.com/sites/determined-ui/deploys/67102ce8de24330008046ac3

err := task.DefaultService.Signal(
model.AllocationID(req.AllocationId),
task.TerminateAllocation,
"preempted by the scheduler",
Contributor Author


kept the same message but it's not a particularly great message

stoksc commented Oct 16, 2024

Hilarious: I got "code review required" from both backend and cluster mgmt. We may need to tweak those filters some.

codecov bot commented Oct 16, 2024

Codecov Report

Attention: Patch coverage is 14.28571% with 6 lines in your changes missing coverage. Please review.

Project coverage is 54.43%. Comparing base (a14525f) to head (bae95b9).
Report is 6 commits behind head on main.

Files with missing lines         Patch %   Lines
master/internal/api_trials.go    14.28%    6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10071      +/-   ##
==========================================
+ Coverage   54.42%   54.43%   +0.01%     
==========================================
  Files        1262     1262              
  Lines      158901   158886      -15     
  Branches     3631     3632       +1     
==========================================
+ Hits        86474    86487      +13     
+ Misses      72293    72265      -28     
  Partials      134      134              
Flag      Coverage Δ
backend   45.53% <14.28%> (+0.03%) ⬆️
harness   72.74% <ø> (ø)
web       53.95% <ø> (ø)

Flags with carried forward coverage won't be shown.

Files with missing lines                                  Coverage Δ
...ster/internal/rm/agentrm/agent_resource_manager.go     49.45% <ø> (+0.10%) ⬆️
...nal/rm/dispatcherrm/dispatcher_resource_manager.go     18.89% <ø> (+0.25%) ⬆️
...nal/rm/kubernetesrm/kubernetes_resource_manager.go     27.67% <ø> (+0.08%) ⬆️
master/internal/rm/multirm/multirm.go                     64.96% <ø> (-0.28%) ⬇️
master/internal/rm/resource_manager_iface.go              70.00% <ø> (ø)
master/internal/api_trials.go                             56.05% <14.28%> (-0.17%) ⬇️

... and 6 files with indirect coverage changes

@rb-determined-ai rb-determined-ai left a comment

Sick! Almost pure deletion, without any reduction in actual features.

@stoksc stoksc merged commit 34e4749 into main Oct 17, 2024
87 of 100 checks passed
@stoksc stoksc deleted the rm-useless-rm-method branch October 17, 2024 23:26
rb-determined-ai added a commit that referenced this pull request Oct 22, 2024
The only test in this series was an e2e test to make sure that we caught
slurm's SIGTERM, reported that to the master via an API call, and that
training code would then see should_preempt() return True.

This didn't need to be an e2e test; it can be a unit test to make sure
our SIGTERM handler is working, plus rerouting the API logic to pass
through our normal preemption handling (#10071).  There is already
plenty of coverage ensuring should_preempt() works in python code.

So by adding that unit test here and following the refactor of #10071,
it is safe to remove the e2e_slurm_preemption series entirely.

This is part of a larger effort to get rid of our znode tests, which
are notoriously unreliable.