chore: check task config policy priority limit for [CM-490] #9958

Merged
kkunapuli merged 6 commits into main from kunapuli/manage-job-tcps2 on Sep 26, 2024

Conversation

Contributor

@kkunapuli kkunapuli commented Sep 18, 2024

Ticket

CM-490

Description

Adds a check to the "manage job" workflow for a priority limit set via task config policies. The check supports both NTSC (notebook, TensorBoard, shell, and command) tasks and experiments, and it applies to policies set at either workspace or global scope.
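For readers following along, here is a minimal sketch of what such a limit check could look like. The name `priorityWithinLimit` and its argument order appear in a hunk quoted later in this conversation; the body below is an assumption about its behavior, not the code merged in this PR. The assumed rule is that a limit caps how high a priority can be requested, so the comparison flips depending on whether the resource manager treats smaller values as higher priority.

```go
package main

import "fmt"

// priorityWithinLimit reports whether a requested priority respects a limit.
// smallerHigher states whether the resource manager treats smaller numbers
// as higher priority. The body is an illustrative assumption.
func priorityWithinLimit(priority, limit int, smallerHigher bool) bool {
	if smallerHigher {
		// Smaller number means higher priority, so a request may not go below the limit.
		return priority >= limit
	}
	// Larger number means higher priority, so a request may not go above the limit.
	return priority <= limit
}

func main() {
	fmt.Println(priorityWithinLimit(50, 42, true))  // lower priority than the limit: allowed
	fmt.Println(priorityWithinLimit(10, 42, true))  // higher priority than the limit: rejected
	fmt.Println(priorityWithinLimit(50, 42, false)) // higher priority than the limit: rejected
	fmt.Println(priorityWithinLimit(10, 42, false)) // lower priority than the limit: allowed
}
```

A priority equal to the limit is allowed under either ordering in this sketch.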

Test Plan

Covered by unit and integration tests.

Checklist

  • Changes have been manually QA'd
  • New features have been approved by the corresponding PM
  • User-facing API changes have the "User-facing API Change" label
  • Release notes have been added as a separate file under docs/release-notes/
    See Release Note for details.
  • Licenses have been included for new code which was copied and/or modified from any external code

@cla-bot cla-bot bot added the cla-signed label Sep 18, 2024

netlify bot commented Sep 18, 2024

Deploy Preview for determined-ui ready!

Name Link
🔨 Latest commit 3ac85df
🔍 Latest deploy log https://app.netlify.com/sites/determined-ui/deploys/66f30d78f8919d0008887cb0
😎 Deploy Preview https://deploy-preview-9958--determined-ui.netlify.app

codecov bot commented Sep 18, 2024

Codecov Report

Attention: Patch coverage is 54.54545% with 35 lines in your changes missing coverage. Please review.

Project coverage is 54.53%. Comparing base (dbeea99) to head (3ac85df).
Report is 7 commits behind head on main.

Files with missing lines                                Patch %   Lines
master/internal/experiment_job_service.go               0.00%     16 Missing ⚠️
master/internal/command/command_job_service.go          0.00%     11 Missing ⚠️
...ternal/configpolicy/postgres_task_config_policy.go   86.36%    3 Missing ⚠️
master/internal/configpolicy/task_config_policy.go      85.71%    2 Missing ⚠️
...ster/internal/rm/agentrm/agent_resource_manager.go   0.00%     1 Missing ⚠️
...nal/rm/dispatcherrm/dispatcher_resource_manager.go   0.00%     1 Missing ⚠️
...nal/rm/kubernetesrm/kubernetes_resource_manager.go   0.00%     1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9958      +/-   ##
==========================================
- Coverage   54.53%   54.53%   -0.01%     
==========================================
  Files        1257     1258       +1     
  Lines      156933   157009      +76     
  Branches     3614     3612       -2     
==========================================
+ Hits        85589    85619      +30     
- Misses      71211    71257      +46     
  Partials      133      133              
Flag      Coverage Δ
backend   45.15% <54.54%> (-0.01%) ⬇️
harness   72.74% <ø> (ø)
web       54.34% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines                                Coverage Δ
master/internal/rm/multirm/multirm.go                   65.23% <100.00%> (+1.56%) ⬆️
master/internal/rm/resource_manager_iface.go            70.00% <ø> (ø)
...ster/internal/rm/agentrm/agent_resource_manager.go   48.47% <0.00%> (-0.11%) ⬇️
...nal/rm/dispatcherrm/dispatcher_resource_manager.go   18.64% <0.00%> (-0.02%) ⬇️
...nal/rm/kubernetesrm/kubernetes_resource_manager.go   27.59% <0.00%> (-0.09%) ⬇️
master/internal/configpolicy/task_config_policy.go      85.71% <85.71%> (ø)
...ternal/configpolicy/postgres_task_config_policy.go   88.46% <86.36%> (-0.83%) ⬇️
master/internal/command/command_job_service.go          0.00% <0.00%> (ø)
master/internal/experiment_job_service.go               5.45% <0.00%> (-2.05%) ⬇️

... and 7 files with indirect coverage changes

@kkunapuli kkunapuli marked this pull request as ready for review September 18, 2024 21:17
@kkunapuli kkunapuli requested review from a team as code owners September 18, 2024 21:17
@kkunapuli kkunapuli requested review from jesse-amano-hpe and amandavialva01 and removed request for jesse-amano-hpe September 18, 2024 21:17
}
}()

// No limit set.
Contributor

// No global priority NTSC limit set

return priorityWithinLimit(priority, limit, smallerHigher), nil
}

// No priority limit has been set.
Contributor

Nice comments throughout this function!


// SmallerValueIsHigherPriority returns true if smaller priority values indicate a higher priority level.
func (m *DispatcherResourceManager) SmallerValueIsHigherPriority() (bool, error) {
	return false, fmt.Errorf("priority not implemented")
Contributor

Just to confirm, does this mean that Determined doesn't support setting job priority on Slurm clusters?


// SmallerValueIsHigherPriority returns true if smaller priority values indicate a higher priority level.
func (m *MultiRMRouter) SmallerValueIsHigherPriority() (bool, error) {
	return false, nil
Contributor

Just to confirm, are we setting this to false because multiRM implies the cluster must be all k8s RMs, and k8s RMs use higher value -> higher priority?

Contributor Author

Yes; does that approach sound reasonable to you?

Contributor

Yeaa I like this a lot! Super clean way to handle this Kristine!
Can we add a comment mentioning this?

Contributor Author

Yes, great idea!

Contributor Author

I was writing a comment, and felt bad about stating that kind of assumption so I rewrote it to call the underlying RM. 😅

Contributor

Ahh lol comments keep us honest 😆. FWIW, I do think your initial method of returning false was clean as well, but it's definitely more intuitive / easy to understand from an outside perspective to call the underlying RM, nice work!
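To illustrate the approach this thread settled on, here is a rough sketch of a router that asks an underlying resource manager for its priority ordering instead of hard-coding `false`. Only the method name `SmallerValueIsHigherPriority` comes from the hunk above. The `priorityOrdering` interface, the `fixedRM` and `router` types, and the choice to delegate to a default RM looked up by name are all hypothetical, used here for illustration only.

```go
package main

import (
	"errors"
	"fmt"
)

// priorityOrdering is a hypothetical stand-in for the slice of the resource
// manager interface discussed in this thread.
type priorityOrdering interface {
	SmallerValueIsHigherPriority() (bool, error)
}

// fixedRM is a hypothetical resource manager with a fixed ordering.
type fixedRM struct{ smallerHigher bool }

func (f fixedRM) SmallerValueIsHigherPriority() (bool, error) {
	return f.smallerHigher, nil
}

// router is a hypothetical multi-RM router. Delegating to a default RM
// is an assumption of this sketch, not something the thread confirms.
type router struct {
	defaultName string
	rms         map[string]priorityOrdering
}

// SmallerValueIsHigherPriority defers to the underlying RM, so the router
// does not need to assume anything about the cluster type.
func (r *router) SmallerValueIsHigherPriority() (bool, error) {
	rm, ok := r.rms[r.defaultName]
	if !ok {
		return false, errors.New("default resource manager not found")
	}
	return rm.SmallerValueIsHigherPriority()
}

func main() {
	r := &router{
		defaultName: "k8s",
		rms:         map[string]priorityOrdering{"k8s": fixedRM{smallerHigher: false}},
	}
	smallerHigher, err := r.SmallerValueIsHigherPriority()
	fmt.Println(smallerHigher, err)
}
```

The trade-off matches the discussion: returning a constant is shorter, while delegating keeps the answer correct if the set of underlying RMs ever changes.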

wkspIDDoesNotExist := 404
_, found, err = GetPriorityLimit(ctx, &wkspIDDoesNotExist, model.NTSCType)
require.NoError(t, err)
require.False(t, found)
Contributor

Nice! I like the graceful exit for non-existent workspaces
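The behavior praised here, a lookup that reports "not found" rather than an error for an unknown workspace, can be sketched with an in-memory store. This is an illustration under assumptions: the real `GetPriorityLimit` in the test above takes a context and a `model` workload type and reads from the database, while `policyStore`, `policyKey`, and the string workload types below are hypothetical.

```go
package main

import "fmt"

// policyKey identifies a policy by scope and workload type (hypothetical).
type policyKey struct {
	global       bool
	workspaceID  int
	workloadType string
}

// policyStore is a hypothetical in-memory stand-in for the policy table.
type policyStore map[policyKey]int

// getPriorityLimit returns the limit for the given scope, if one is set.
// A nil workspaceID means global scope. A missing policy, including one for
// a workspace that does not exist, yields found == false and no error.
func (s policyStore) getPriorityLimit(workspaceID *int, workloadType string) (limit int, found bool, err error) {
	if workloadType != "NTSC" && workloadType != "EXPERIMENT" {
		return 0, false, fmt.Errorf("invalid workload type: %q", workloadType)
	}
	key := policyKey{global: workspaceID == nil, workloadType: workloadType}
	if workspaceID != nil {
		key.workspaceID = *workspaceID
	}
	limit, found = s[key]
	return limit, found, nil
}

func main() {
	store := policyStore{
		{workspaceID: 1, workloadType: "NTSC"}: 42,
	}
	missing := 404
	limit, found, err := store.getPriorityLimit(&missing, "NTSC")
	fmt.Println(limit, found, err)
}
```

Separating `found` from `err` lets callers treat "no limit set" as a normal outcome and reserve errors for invalid input or lookup failures.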

Contributor

@amandavialva01 amandavialva01 left a comment

Awesome job with the code and the test cases! 🎉 Left a few comments

Contributor

@amandavialva01 amandavialva01 left a comment

LGTM! Awesome work with this

@kkunapuli kkunapuli merged commit ac8fbf6 into main Sep 26, 2024
81 of 94 checks passed
@kkunapuli kkunapuli deleted the kunapuli/manage-job-tcps2 branch September 26, 2024 14:46