Releases: determined-ai/determined

0.35.1

09 Nov 01:04

Release Notes

0.35.1

Changelog

  • 9d4bed2 chore: bump version: 0.35.1-rc0 -> 0.35.1
  • 46b3761 fix: perf issue with too many API reqs when listing pods in all ns (#10202)
  • 5b03599 chore: bump version: 0.35.0 -> 0.35.1-rc0
  • 4182da4 chore: bump current environment image versions to 0.35.1

v0.38.0-rc8

04 Nov 17:58
bb6f140
Pre-release

Release Notes

v0.38.0-rc8

Changelog

  • bb6f140 [AUTO-BACKPORT 10160] fix: maxPoolSlotCapacity bug (#10195)

v0.38.0-rc7

04 Nov 15:11
7db183e
Pre-release

Release Notes

v0.38.0-rc7

Changelog

  • 7db183e [AUTO-BACKPORT 10182] docs: docs changes for searcher context removal (#10194)
  • 23f9793 [AUTO-BACKPORT 10192] fix: keras continue from cloud checkpoint (#10193)

v0.38.0-rc6

01 Nov 20:58
508d400
Pre-release

Release Notes

v0.38.0-rc6

Changelog

  • 508d400 [AUTO-BACKPORT 10174] docs: update docs for non-Trial-centric world (#10186)
  • 87f5ff8 [AUTO-BACKPORT 10188] fix: include max_length in continue expconf (#10190)

v0.38.0-rc5

01 Nov 14:51
e725918
Pre-release

Release Notes

v0.38.0-rc5

Changelog

  • e725918 [AUTO-BACKPORT 10183] docs: fix typos in the release note (#10185)
  • 23687db [AUTO-BACKPORT 10178] docs: known issue of tb_plugin (#10181)
  • 5427a68 [AUTO-BACKPORT 10172] fix: ban archive columns in filter for experiment/search search (#10176)
  • 88c8887 [AUTO-BACKPORT 10173] fix: client.logout() re-enables client.login() (#10177)
  • 42f74e6 [AUTO-BACKPORT 10168] chore: ignore test_e2e_longrunning tests when merging auto-backports (#10179)
  • 020fc43 [AUTO-BACKPORT 10161] fix: fix diffusion example [DET-10470] (#10169)

v0.38.0-rc4

31 Oct 18:07
c69aa68
Pre-release

Release Notes

v0.38.0-rc4

Changelog

  • c69aa68 [AUTO-BACKPORT 10140] fix: set max slots and checkpoint gc policy should comply with config policies (#10167)
  • b5e6315 fix: set max slots and checkpoint gc policy should comply with config policies (#10140)
  • 8e6a658 [AUTO-BACKPORT 10105] chore: change det deploy aws's default deployment type to simple-rds (#10162)
  • 6fc6710 [AUTO-BACKPORT 10153] docs: checkpoint storage note for config policies (#10165)
  • b366f80 [AUTO-BACKPORT 10138] feat: determined_master_host and friends helm support, better defaults (#10159)

v0.38.0-rc3

30 Oct 23:38
d8afc57
Pre-release

Release Notes

v0.38.0-rc3

Changelog

  • d8afc57 [AUTO-BACKPORT 10155] fix: fix iris example to use reported metric name (#10156)
  • 38ae54b [AUTO-BACKPORT 10149] fix: error message fix for duplicate model name (#10154)
  • 47ba6a9 build: INFENG-943: GoReleaser configure prerelease (#10146)

v0.38.0-rc2

28 Oct 22:32
Pre-release

Release Notes

v0.38.0-rc2

Changelog

v0.38.0-rc0

28 Oct 20:19
Pre-release

Release Notes

v0.38.0-rc0

Changelog

  • d7f0bbf chore: lock published urls to preserve redirects
  • e3c31f0 Temporarily disable GitHub Actions credentials.
  • 3be954b build: INFENG-938: Update version format in Makefiles (#10142)
  • 69b93b0 build: INFENG-940: Fix logic error in CircleCI config make-component job (#10143)
  • 00870f5 build: INFENG-937: Publish Helm chart release candidates (#10141)
  • 3910426 feat: remove searcher context from harness and master [MD-498] (#10131)
  • 27bebdd build: INFENG-938: Tweak version string format (#10139)
  • 30ad3c0 feat: add master configurations for access token max and default lifespans [DET-10464] (#10101)
  • 782f7a0 revert: "chore: determined_master_host and friends helm support, better defaults" (#10134)
  • 233e095 chore: add checkpoint and max slots config policy enforcements in PATCH experiment (#10125)
  • b3f928b chore: determined_master_host and friends helm support, better defaults (#10092)
  • 6755467 chore: bump Go version used by CI builds to 1.22.8 (#10127)
  • 834eeda feat: add actual select all to glide tables [ET-238] (#10081)
  • c7e0fb5 docs: add log signal release note and update docs (#10126)
  • 02fcc74 test: Add test for filtering user by Role Id (#10095)
  • f97fb5a build: INFENG-933: add GitHub action to start a minor release (#10112)
  • 685918d docs: Add aurora postgres release note (#10115)
  • a84f8c6 chore: SSO improvement feature requires Enterprise Edition. (#10124)
  • c71617c feat: Log Signal Exp Config and Monitoring (#9947)
  • 06b0b31 chore: fix merge exp flake (#10122)
  • 962810a chore: improve messaging when workspace configs conflict with global … (#10121)
  • 6158ef7 docs: Update postgres aurora info (#10116)
  • 4b0c065 docs: log policies restore exp config (#10120)
  • 186962c chore: add config policies to CLI reference docs (#10118)
  • 11ea6f4 chore: clarify version overrides during helm installs (#10094)
  • 4394f29 chore: standardize status api errors for task config policies (#10119)
  • e834302 fix: Add on delete cascade to system_metrics (#10113)
  • 3c59233 chore: populate final merged config with defaults when merging invariant configs (#10107)
  • deb3772 feat: additional APIs to support "actual select all" functions [ET-238] (#10102)
  • fd9cd8a feat: Allow master configuration for ssh key type (#10072)
  • 5e9df7c docs: Update release notes (#10114)
  • c655f33 docs: fix internal link in multi-rm docs page. (#10074)
  • e7186fe docs: Update log policies (#10098)
  • 993296b fix: update copy in experiment and trial headers (#10111)
  • d74a462 docs: Describe sso improvements (#10110)
  • 24d3390 chore: conditionally create VolumeSnapshotClass (#10103)
  • f45ebb9 chore: improve documentation surrounding slot caps helm configuration (#10090)
  • 0013fd0 ci: shorten test_pending_hpc.py (#10104)
  • 22ad457 fix: version upgrade notification bug [CM-411] (#10069)
  • 935fa66 fix: Log searche feedbacks (#10088)
  • 29a08ec Revert "docs: Describe arbitrary metadata logging" (#10099)
  • c6c476c chore: remove e2e_slurm_preemption test series (#10053)
  • e6182ed docs: Describe arbitrary metadata logging (#10073)
  • 539df5e chore: update CLI commands to work with global APIs (#10089)
  • 1f2bea0 feat: update ConfigPolicies with docs link [CM-558] (#10055)
  • 4afc15f build: INFENG-926: Fix version.sh version string output (#10085)
  • 04861dd chore: return error if workspace config violates global constraints (#10076)
  • 912f91e docs: task config policies release note (#10087)
  • 6d56101 fix: remove flake-inducing logretention global singleton (#10016)
  • b70a622 fix: correct token creation CLI to ensure it works with default expiry (#10084)
  • b155332 docs: Describe task config policies (#9969)
  • 27a014b fix: Tensorboard broken on unified install [CM-578] (#10080)
  • bdb56a4 chore: INFENG-922: use correct gh_team tag for infrastructure (#10077)
  • 91e358a INFENG-382: Release redesign (#10002)
  • 34e4749 chore: remove redundant rm.ExternalPreemptionPending interface (#10071)
  • 28bc072 feat: SSO Improvement - alter user_sessions table to include access token, implement CRUD ops, GET, POST, PATCH APIs and det token CLIs (#9867)
  • 472baf9 feat: Add copy task id to task list (#10058)
  • 2e822b7 chore: fix update invariant config and constraints (#10078)
  • d69f7cc chore(deps): bump google.golang.org/grpc from 1.64.0 to 1.64.1 (#9910)
  • e796b92 fix: run checkpoint GC more aggressively to ensure tensorboards are GC'd (#10017)
  • a14525f fix: nil deref in usage of incomplete experiment config policies (#10068)
  • 6c46a46 refactor: remove annotations requiring search ids in bulk action js (ET-241) (#10062)
  • 3ca3418 Docs: describe data files apptainer (#10020)
  • 315f65d chore: ntsc config not supported (#10056)
  • 2e8de9b test: User Management test updates [CM-468] (#10051)
  • 3fc9fed chore: experiment config slots to comply with constraint max slots (#10054)
  • 1d5c984 chore: fix slices and maps merge test (#10063)
  • 219409b chore: fix helptext for det user (#10060)
  • 7d6a1a7 docs: add k8s RP example to the helm values.yaml. (#10027)
  • 9efd96d fix: apply config policy constraints to PATCH /experiments/:id (#10048)
  • dd6aeda chore: change error code back (#10042)
  • 5a39ecb chore: check config policies on 'det notebook set priority' (#10047)
  • 2ef2f12 feat: bulk actions matching filters (ET-241) (#9895)
  • ac82b3c chore: default priority earlier to ensure constraints are satisfied [CM-553] (#10043)
  • 34557ef feat: Extend LogViewer to support scrollable search (#10005)
  • dadf75e chore: take invariant_config priority into account with manage job workflow (#10025)
  • 2356f91 chore: remove e2e_slurm_misconfigured series tests (#10023)
  • b243c26 ci: deflake test_disable_agent_zero_slots (#10040)
  • 4e0f1c4 chore: validate global, admin input against task config policies & constraints (#10028)
  • 3c1630f test: add e2e tests to the "move project" functionality on the "List View" (#10037)
  • 0613cc6 docs: revise postgres permission setup instructions. (#10039)
  • 2594d90 chore: remove e2e_slurm_gpu series tests (#10021)
  • 1f7ccad chore: exp invariant config silent override during add or update (#10019)
  • 30b197d feat: Global Config Policies UI [CM-522] (#10022)
  • c27054d feat: add e2e tests for multi-sort filter on experiments list (#9992)
  • 9faa0cb chore: wait_for_task_state shows logs on failure (#10029)
  • a166826 fix: Workspace Projects and Tasks test flakes [CM-554] (#10026)
  • 33dfdaf test: Workspace Models tests [CM-538] (#9998)
  • 7e8dbac fix: Update action bar row layout in UserManagement page (#9862)
  • 5b1380c chore: check experiment constraints (#10018)
  • f609a2d fix: remove formatDatetime (#10011)
  • 9b6f0ac docs: Update release notes date (#9999)
  • f5400ea feat: Add regex search to task logs API (#9994)
  • ddca766 fix: correct expToWebhookConfig cache locking (#10014)
  • 80b29fa feat: Config Policies UI, Workspaces Experiments [CM-521] (#10009)
  • 262b4a9 chore: check task config policies against slots and max_slots (#10015)
  • a0cc818 ci: replace no_op fixture with a noop api (#9997)
  • 987b2a5 test: add e2e experiment list pagination test (#9993)
  • 1297899 fix: use UID not username to set HOME dir (#10010)
  • 49e72a8 chore: reword jsonschema extension docs (#9965)
  • 63d728c fix: display archived column for runs and searches (#9987)
  • 83a779e feat: check task config policy constraints before scheduling NTSC wor… (#9991)
  • 0083d7e feat: add CLI commands for config policies [CM-423] (#9911)
  • ac54cf8 ci: delete pointless test (#10004)
  • 7f88390 fix: reset settings not working properly due to url encoding (#10000)
  • 25ca6d0 fix: import missing time module (#9985)
  • 8ab2145 chore: bump version: 0.37.0-dev0 -> 0.37.1-dev0
  • 0760f74 chore: add docs dropdown link for new version
  • 23f1f30 docs: add release notes for 0.37.0 (#9995)
  • 9989475 test: Workspace Task tests [CM-476] (#9982)
  • ad66d3f chore: implement PUT APIs for task config policies (#9983)
  • 036336b docs: fix broken links (#9996)
  • ac8fbf6 chore: check task config policy priority limit for [CM-490] (#9958)
  • 8bc08e5 feat: Read and display log signal from DB (#9959)
  • c8b1910 ci: increase datagrid rightclick timeout/ reduce worker count (#9951)
  • e92c474 fix: fix default id search for runs (#9988)
  • 3ca3d30 test: increase Reactivate test step timeout (#9986)
  • bc3b2a6 fix: Reactivate User test flake (#9979)
  • f2277f1 fix: fix hf on_save raise exception (#9977)
  • dbeea99 fix: Cluster page height (#9975)
  • d02495b fix: Deactivate User test flake (#9974)
  • a8effe8 fix: show search progress in run table (#9976)
  • cf9bdc8 feat: workspace task config policies UI [CM-478] (#9950)
  • 924f663 ci: remove default arg from utils.run_command() (#9973)
  • a96c5af docs: add docstring for PyTorchContext.current_train_epoch (#9972)
  • 66f7a70 fix: grid hp sampling ignored empty nests (#9966)
  • 8c4f7a0 fix: correct dataPath for hyperparameters (#9971)
  • 5c4be96 feat: add database snapshot functionality to Helm chart (#9956)
  • 31d9573 fix: show - for empty data in searches table [ET-749] (#9963)

0.37.0

30 Sep 15:28

Release Notes

0.37.0

Changelog

  • c415087 chore: bump version: 0.37.0-rc4 -> 0.37.0
  • 736fba6 docs: add release notes for 0.37.0 (#9995)
  • 73dee98 docs: fix broken links (#9996)
  • ecf8ac7 chore: bump version: 0.37.0-rc3 -> 0.37.0-rc4
  • 1b50305 fix: fix default id search for runs (#9988)
  • 0990c11 chore: bump version: 0.37.0-rc2 -> 0.37.0-rc3
  • a78b190 fix: fix hf on_save raise exception (#9977)
  • 0560939 fix: bring in handleEmptyCell from #9963 (#9984)
  • 7caf18a chore: bump version: 0.37.0-rc1 -> 0.37.0-rc2
  • 08d782a fix: show search progress in run table (#9976)
  • 478c78f fix: Cluster page height (#9975)
  • 2772a3c fix: correct dataPath for hyperparameters (#9971)
  • 94f2d95 chore: bump version: 0.37.0-rc0 -> 0.37.0-rc1
  • 63e7df0 chore: 0.37.0 environment images (#9967)
  • b2267d1 chore: bump version: 0.37.0-dev0 -> 0.37.0-rc0
  • f758303 chore: lock published urls to preserve redirects
  • 2a8e7dd chore: lock api state for backward compatibility check
  • 3f54d07 chore: bump version: 0.36.1-dev0 -> 0.37.0-dev0
  • baf451f chore: do not log error for resource pools with zero agents (#9960)
  • 6a8606e docs: Add hpc installation guide (#9945)
  • 3241edb fix: fix flaky generic task pause test (#9962)
  • 43556e9 fix: Remove CSS rule for hiding the Form.Item error message (#9872)
  • 5906001 perf: improve the initial page load speed (#9939)
  • eb1b0de docs: Add workload alerting (#9938)
  • cedfcfe chore: refactor and test RBAC config policies work [CM-530] (#9943)
  • 2d884b9 docs: Add cluster overview (#9936)
  • e17d12c feat: release notes and improvements for workload alerting (#9944)
  • 0db2e3b ci: deflake make slurmcluster, hopefully (#9957)
  • 95f079d feat: add GET global config policies API (#9952)
  • d943d85 chore: fix global PUT for task config policies (#9941)
  • 410edf6 fix: broken MNIST download in e2e tests (#9937)
  • 004c194 ci: fix flaky test_allocation_csv tests (#9953)
  • 88a4c67 feat: add Config Policies GET API and modify CRUD functions to accept both Workload types (#9946)
  • a73c8db test: debug auth [TESTENG-95] (#9942)
  • 13db674 test: experiment list show archived filter [ET-753] (#9932)
  • 02e302f chore: remove unused languages from code editor (#9898)
  • f6d874d docs: Replace slack links (#9919)
  • 26b0954 chore: implement Delete config policies API handlers (#9927)
  • 2d12be1 test: add projects tests [CM-467] (#9928)
  • 062cb52 fix: use different modules for Trial and Cluster topology (#9917)
  • 0928958 chore: change log level for log retention policies (#9935)
  • b559467 chore: bump coverage target (#9920)
  • 3a2ea56 fix: do not filter slots for mixed-slot-type pools (#9902)
  • a58ed7c chore: reassign RM code to CM in CODEOWNERS (#9926)
  • cb3515e fix: update LogRetentionDays from master config when master starts/upgrades (#9930)
  • 13b7b3f ci: increase timeout for k8s intg tests (#9929)
  • 6f36969 fix: flaky workspace test (#9931)
  • 867eb31 fix: update huggingface example (#9925)
  • 5b2275f fix: Refactor sorting logic in WorkspaceProjects for filtering projects (#9903)
  • fd7f77a fix: move validation dataloader check in PyTorchTrial [MD-515] (#9923)
  • db2881f chore: fix config policy unmarshal tests (#9924)
  • 3900742 chore: update test log pattern webhook cache (#9922)
  • f44687d chore: create config policies table and add NTSC CRUD operations (#9915)
  • de89f68 feat: support updating web hook url [MD-482] (#9890)
  • 02fbdbb fix: huggingface callback raise process preempted exception (#9913)
  • 8c799b8 chore: prune cruft out of no_op fixture (#9912)
  • 11de119 chore(deps): bump path-to-regexp and express in /webui/react (#9909)
  • 03961b5 test: add workspace tests (#9905)
  • c877383 fix: GetTrialRemainingLogRetentionDays should take global log retention days into account [CM-518] (#9914)
  • fb0d5f9 fix: change workspace name and set resource quota simultaneously (#9847)
  • 8fb9f6b docs: Update ROCM support (#9893)
  • 481bddb chore(deps): bump github.com/docker/docker from 24.0.9+incompatible to 25.0.6+incompatible (#9780)
  • c1499ac chore: removing model_hub references from Makefile (#9901)
  • c961dbd feat: new run object for Run Centric API (#9897)
  • bfeb418 feat: Implement custom trigger for webhooks (#9879)
  • b6eb05e chore: Remove model hub (#9869)
  • 4a28c10 chore: add unmarshal functions for task config policies (#9896)
  • d842383 fix: timezone handling error in queued allocation time update (#9892)
  • 55b3f9b test: cover project id filtering on bulk actions [ET-138] (#9870)
  • 036477b chore: stub new APIs for task config policies [CM-485] (#9880)
  • be2622a test: Delete workspace after webhook test (#9891)
  • a30bc25 feat: Add rbac for config policies (#9873)
  • 8c83d31 chore: create WorkloadType enum and Go config + constraints structs (#9885)
  • 0a18c5a fix: add backwards compatibility for Pods to Jobs for k8s <v1.27 [CM-461] (#9878)
  • 8e6bba8 ci: fix master-config syntax (#9889)
  • d5d647a fix: inconsistent timezone handling in daily allocation aggregation (#9888)
  • b4209ef test: login redirect with nested route (#9881)
  • 8cacba6 ci: add e2e bulk kill test (#9868)
  • 590c362 fix: Hf callback metric naming (#9887)
  • 61fd26b fix: reset Model Registry page number on pageload [ET-640] (#9876)
  • ce27f81 fix: show - for empty data in run table (#9871)
  • b1c0814 fix: prevent hyperparameter search modal submitting the same request multiple times (#9883)
  • d54713c fix: use new ruamel yaml APIs (#9882)
  • ad5fe5a fix: prevent out of bounds navigation on new list views (#9875)
  • a605f00 fix: reject reconnecting agents with different resource pool configuration (#9815)
  • db92bad feat: Support RBAC in webhook (#9859)
  • 0ef81aa fix: sorting by arbitrary metadata (#9874)
  • c1b7767 feat: Auto-Populate POSIX Information on sign in using SSO [CM-399] (#9755)
  • 54b6165 feat: Logic of different modes for webhook (#9865)
  • a773551 fix: allow for objects inside array metadata to be typed properly (#9864)
  • ee269c8 test: successful login with weak or strong password (#9858)
  • e21fc6f ci: pin chromadb version to avoid incompatibility (#9849)
  • a1234a1 chore: bump version: 0.36.0-dev0 -> 0.36.1-dev0
  • d79c90d chore: add docs dropdown link for new version
  • ce6da74 docs: add release notes for 0.36.0 (#9854)
  • a55af74 fix: use task sessions in Core API [MD-509] (#9860)
  • 3ee88bb fix: replace tree with code mirror for metadata view (#9853)
  • 8dd46d5 chore: Improve CompareTrials performance (#9807)
  • 6e08303 fix: fix error toast popping up in Workspace Creator view (#9855)
  • fb95df8 chore: add backport github action (#9835)
  • a37e6e7 fix: prevent loading issues with ipynb files (#9850)
  • 9de4f72 feat: configurable preemption timeout [MD-500] (#9833)
  • 640126b feat: Add workspaceId, mode, name to webhook (#9820)
  • d436c23 fix: reset pinned column state when resetting columns (#9852)
  • 3a91552 fix: fix fallback logic for partially provided custom logos (#9842)
  • 707ad07 Revert "chore: add tracing info to some backend APIs" (#9843)
  • 73a756a fix: update broken tensorflow & certbot links (#9846)
  • 771bbe4 ci: sequential metric count sweep test [Scale-35] (#9791)
  • 32fafdd perf: remove duplicate ids in ExpMetricNames api (#9848)
  • a8fa015 docs: Fix broken links (#9845)
  • 2b1856a fix: model version name overflow on mobile [ET-384] (#9827)
  • e13de20 docs: Document rbac editorprojectrestricted role (#9844)
  • 2838af4 chore: add tracing info to some backend APIs (#9841)
  • e3dfb0a fix: change filter form to say "Show runs" in flat runs view [ET-740] (#9840)
  • 52f2b9f chore: add release notes for PR 9822 (#9837)
  • a37d482 fix: experiment single trial tabs don't scroll on load (#9831)
  • aff486c feat: Rocm bumpenvs (#9830)
  • 13622ad feat: Add report_progress to TrainContext (#9826)
  • d831461 fix: replace rawsource attribute with node directly, due to removal of rawsource in Docutil 2.0 (#9838)
  • 7ed9e83 feat: add EOL notice regarding Aurora V1 & Postgres 12 along with Master Log warnings for Postgres <=12 [CM-413] [CM-416] (#9832)
  • 5c5f107 docs: Minor docs enhancements (#9836)