Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Squashed commit of the following: commit 3b07678 Author: Hamid Zare <[email protected]> Date: Mon Jun 8 16:16:48 2020 -0700 feat: add task list page route and placeholder [DET-3220] (determined-ai#636) commit 4c2d0a6 Author: Hamid Zare <[email protected]> Date: Mon Jun 8 16:15:23 2020 -0700 feat: remember last logged in username [DET-3274] (determined-ai#660) commit 18c8125 Author: Hamid Zare <[email protected]> Date: Mon Jun 8 13:25:46 2020 -0700 refactor: set up experiments context [DET-3255] (determined-ai#640) commit 5e5b188 Author: Neil Conway <[email protected]> Date: Mon Jun 8 12:49:34 2020 -0700 chore: add license to pip metadata (determined-ai#669) Without this change, `pip show determined-cli | grep License` returns: License: UNKNOWN commit 05aa3d2 Author: Brian Friedenberg <[email protected]> Date: Mon Jun 8 11:53:17 2020 -0700 feat: support TF Keras EarlyStopping callbacks [DET-3240] (determined-ai#666) commit 4056146 Author: aaron276h <[email protected]> Date: Mon Jun 8 14:21:13 2020 -0400 docs: add to FAQ how to port a TF core graph model (determined-ai#650) commit c8bb942 Author: Brian Friedenberg <[email protected]> Date: Mon Jun 8 10:45:39 2020 -0700 feat: support Estimator early stopping hooks [DET-3239] (determined-ai#661) commit 3ab90a6 Author: Brian Friedenberg <[email protected]> Date: Mon Jun 8 10:44:17 2020 -0700 test: temporarily disable AMP test since it causes NaNs (determined-ai#670) commit 629f106 Author: Brian Friedenberg <[email protected]> Date: Mon Jun 8 09:22:37 2020 -0700 feat: treat NaN metrics as an error (determined-ai#667) The expected behavior when we hit a NaN is to error the trial. This will restart the trial if we have not restarted max_restarts times. Before we would convert the NaN to the maximum float. commit db76932 Author: Yoni Ben-tzur <[email protected]> Date: Mon Jun 8 08:50:48 2020 -0700 fix: set auth cookie path to apply site wide (determined-ai#668) commit 6588f77 Author: aaron276h <[email protected]> Date: Mon Jun 8 08:47:11 2020 -0400 feat: decouple agent information from workloads starting tasks [DET-3178] (determined-ai#631) This is a step toward adding a generic resource provider interface. As part of this change, Tasks now receive a single `TaskAssigned` message when they are assigned rather than one `Assigned` message per container. They also start containers by sending specs back to the cluster rather than directly to agents. commit f604a28 Author: Yoni Ben-tzur <[email protected]> Date: Fri Jun 5 16:44:11 2020 -0700 feat: read cookies in the new API auth module (determined-ai#665) commit 9da1063 Author: Danny Zhu <[email protected]> Date: Fri Jun 5 15:43:32 2020 -0700 fix: space out WebUI plot x-axis ticks a bit more (determined-ai#658) The ticks could get bunched up too much with long-running experiments before; this just makes sure the spacing is a bit wider by enough to work well in reasonable cases. The ticks may now occasionally look a tiny bit wider than ideal with very short experiments, but it still works fine in practice. Similarly, a long last tick could get cut off at the end, so this also bumps the right side spacing correspondingly. commit b9d9324 Author: Brian Friedenberg <[email protected]> Date: Fri Jun 5 15:24:54 2020 -0700 feat: support early stopping callbacks on a validation step (determined-ai#662) commit cfb3f51 Author: Yoni Ben-tzur <[email protected]> Date: Fri Jun 5 15:22:11 2020 -0700 feat: add user auth to new api (determined-ai#649) commit 414bfdf Author: Hamid Zare <[email protected]> Date: Fri Jun 5 14:07:22 2020 -0700 fix: set authentication failure reason synchronously. (determined-ai#659) commit ed94d86 Author: aaron276h <[email protected]> Date: Fri Jun 5 16:15:25 2020 -0400 feat: decouple agents from transmitting container status changes [DET-3174] (determined-ai#646) This is a step toward adding a generic resource provider interface. commit f27146a Author: Caleb Hoyoul Kang <[email protected]> Date: Fri Jun 5 13:52:03 2020 -0600 fix: address minor login issues (determined-ai#611) * Increase frequency of auto login check to once per second. * Add catch-all routes to redirect invalid route requests to valid routes. commit d014500 Author: Brian Friedenberg <[email protected]> Date: Fri Jun 5 11:47:43 2020 -0700 revert: "revert: "feat: support stopping training in trial code [DET-3238] (determined-ai#648)" (determined-ai#654)" (determined-ai#656) This reverts commit 5baea6a. commit 44a398a Author: Caleb Hoyoul Kang <[email protected]> Date: Fri Jun 5 10:38:40 2020 -0600 feat: ensure WebUI version is up to date with platform version (determined-ai#632) commit 5baea6a Author: Brian Friedenberg <[email protected]> Date: Fri Jun 5 08:55:58 2020 -0700 revert: "feat: support stopping training in trial code [DET-3238] (determined-ai#648)" (determined-ai#654) This reverts commit ee1314f. commit ee1314f Author: Brian Friedenberg <[email protected]> Date: Fri Jun 5 08:02:13 2020 -0700 feat: support stopping training in trial code [DET-3238] (determined-ai#648) commit fa09a74 Author: Brian Friedenberg <[email protected]> Date: Fri Jun 5 07:46:24 2020 -0700 ci: download protoc install to /tmp (determined-ai#653) commit 9759ce7 Author: determined-dsw <[email protected]> Date: Thu Jun 4 18:49:47 2020 -0700 docs: release notes for 0.12.5 (determined-ai#595) (determined-ai#651) (cherry picked from commit 071b3eb) Co-authored-by: Justin Chen <[email protected]> commit 5f476df Author: Hamid Zare <[email protected]> Date: Thu Jun 4 15:30:17 2020 -0700 chore: remove yarn mentions from tests (determined-ai#635) we removed yarn as a dependency a while ago and shouldn't have references to it anymore. commit 8662fda Author: Danny Zhu <[email protected]> Date: Thu Jun 4 13:03:22 2020 -0700 fix: correct filename in Elm Makefile (determined-ai#647) Due to a mismatch between different instances of what should've been the same name, the CSS file was getting built unnecessarily every time, considerably increasing the Elm build time. commit 0e7ca0a Author: Sidney Wijngaarde <[email protected]> Date: Thu Jun 4 12:42:41 2020 -0400 feat: add checkpoint metadata to cli describe commands (determined-ai#645) [DET-3210] commit 84e875a Author: aaron276h <[email protected]> Date: Thu Jun 4 11:56:52 2020 -0400 test: fix nightly nas and iris tf keras tests [DET-3264] (determined-ai#644) * docs: update NAS example to use correct gradient clipping * test: set random seed for nightly iris tf_keras test commit 4ff9fa0 Author: Sidney Wijngaarde <[email protected]> Date: Thu Jun 4 10:47:38 2020 -0400 feat: checkpoint metadata api (determined-ai#619) Adds checkpoint metadata management REST endpoints and python client methods. [DET-3207] [DET-3208] [DET-3209] PR includes an integration test covering the feature commit cbbe117 Author: Yoni Ben-tzur <[email protected]> Date: Wed Jun 3 20:11:39 2020 -0700 chore: move proto files to determined namespace (determined-ai#639) commit fafd686 Author: Yoni Ben-tzur <[email protected]> Date: Wed Jun 3 17:01:33 2020 -0700 feat: add template endpoints to new api (determined-ai#638) commit 4bad652 Author: Brian Friedenberg <[email protected]> Date: Wed Jun 3 16:51:57 2020 -0700 feat: support USER_CANCELLED exited reason (determined-ai#637) commit d1146d3 Author: Caleb Hoyoul Kang <[email protected]> Date: Wed Jun 3 17:37:04 2020 -0600 refactor: update link to support secure blank targets (determined-ai#612) commit f71d64e Author: Hamid Zare <[email protected]> Date: Wed Jun 3 15:33:15 2020 -0700 feat: add page component [DET-3232] (determined-ai#614) commit 25e725e Author: aaron276h <[email protected]> Date: Wed Jun 3 17:25:12 2020 -0400 feat: support gradient clipping in PyTorchTrial via callbacks (determined-ai#615) Breaking Change: we no longer accept gradient clipping as a special hyperparameter for PyTorchTrial. commit 80e39d0 Author: Hamid Zare <[email protected]> Date: Wed Jun 3 12:28:48 2020 -0700 feat: add antd breadcrumb stories [DET-3002] (determined-ai#582) commit 5c9afa2 Author: Hamid Zare <[email protected]> Date: Wed Jun 3 12:27:26 2020 -0700 feat: add activate, pause, and cancel actions to task cards [DET-2934] (determined-ai#585) commit a3e121a Author: aaron276h <[email protected]> Date: Wed Jun 3 13:34:10 2020 -0400 feat: add end of training callback to EstimatorTrial (determined-ai#621) This will allow users to do any post-experiment cleanup they may need to do. commit 8056055 Author: Shiyuan <[email protected]> Date: Wed Jun 3 10:24:40 2020 -0700 feat: make agent starting period configurable [DET-3219] (determined-ai#624) * feat: make agent starting period configurable The provisioner would retry launching an agent after reaching a duration, which is set by max agent starting period. Previously, this configuration is hardcoded to 300 seconds. Now, change it to be in the configuration of the provisioner. DET-3219 #Done. * test: update tests for max agent starting period * chore: clean up provisioner code for better readability * chore: clean up provisioner tests for better readability * docs: update docs for agent starting period * chore: update gcp deployment tool for agent starting period * chore: update packaged master.yaml with agent starting period * chore: update aws deployment tool with agent starting perioo commit 8fdc371 Author: Yoni Ben-tzur <[email protected]> Date: Wed Jun 3 10:12:13 2020 -0700 chore: upgrade proto libraries (determined-ai#630) commit bdfd980 Author: Danny Zhu <[email protected]> Date: Wed Jun 3 09:47:02 2020 -0700 fix: correct logic for checking if a validation is the best one seen (determined-ai#601) Previously, every validation would be marked as the best one seen when `smallerIsBetter` was true. The new version avoids that and also separates the code a bit more cleanly into "decide whether the validation is best" followed by "do things accordingly". commit f590fc3 Author: Yoni Ben-tzur <[email protected]> Date: Wed Jun 3 09:44:41 2020 -0700 chore: remove container recovery (determined-ai#629) commit a8c1bb2 Author: Yoni Ben-tzur <[email protected]> Date: Wed Jun 3 09:22:16 2020 -0700 feat: add master endpoint to new api (determined-ai#627) commit 678d53d Author: Yoni Ben-tzur <[email protected]> Date: Wed Jun 3 08:31:45 2020 -0700 chore: ignore pkg dir in proto sub project (determined-ai#628) commit 65b5c17 Author: Brian Friedenberg <[email protected]> Date: Wed Jun 3 07:07:19 2020 -0700 chore: bump version: 0.12.5.dev0 -> 0.12.6.dev0 (determined-ai#625) commit 13c0db2 Author: Yoni Ben-tzur <[email protected]> Date: Tue Jun 2 21:47:53 2020 -0700 chore: move proto to separate top level package (determined-ai#620) * chore: move proto to separate top level package * fix build issues * downgrade buf build for go1.13 * build proto before publish * proto build before package commit 897f2f6 Author: Shiyuan <[email protected]> Date: Tue Jun 2 17:29:30 2020 -0700 revert: make agent starting period configurable [DET-3219] (determined-ai#623) This reverts commit 7f83e97. commit 7f83e97 Author: Shiyuan <[email protected]> Date: Tue Jun 2 17:11:05 2020 -0700 feat: make agent starting period configurable [DET-3219] (determined-ai#610) * feat: make agent starting period configurable The provisioner would retry launching an agent after reaching a duration, which is set by max agent starting period. Previously, this configuration is hardcoded to 300 seconds. Now, change it to be in the configuration of the provisioner. DET-3219 #Done. * test: update tests for max agent starting period * chore: clean up provisioner code for better readability * chore: clean up provisioner tests for better readability * docs: update docs for agent starting period * chore: update gcp deployment tool for agent starting period * chore: update packaged master.yaml with agent starting period * chore: update aws deployment tool with agent starting perioo commit b01b560 Author: Armand McQueen <[email protected]> Date: Tue Jun 2 14:28:02 2020 -0700 fix: read docker config file from HOME directory (determined-ai#587) The path to the docker config file varies depending on location of HOME directory. When an agent was run somewhere where /root is not the HOME directory, this prevented using docker credential stores. commit e0d0447 Author: Shiyuan <[email protected]> Date: Tue Jun 2 13:37:48 2020 -0700 feat: make GCP operation tracker timeout configuration [DET-3182] (determined-ai#598) * feat: make GCP provider operation timeout configurable * docs: update the cluster configuration doc for gcp operation timeout * docs: update gcp topic guide with operation timeout * docs: update default master configuration for gcp operation timeout * feat: update GCP terraform script for operation timeout * feat: update GCP deploy tool for operation timeout
- Loading branch information