merge master into branch
Squashed commit of the following:

commit 3b07678
Author: Hamid Zare <[email protected]>
Date:   Mon Jun 8 16:16:48 2020 -0700

    feat: add task list page route and placeholder [DET-3220] (determined-ai#636)

commit 4c2d0a6
Author: Hamid Zare <[email protected]>
Date:   Mon Jun 8 16:15:23 2020 -0700

    feat: remember last logged in username [DET-3274] (determined-ai#660)

commit 18c8125
Author: Hamid Zare <[email protected]>
Date:   Mon Jun 8 13:25:46 2020 -0700

    refactor: set up experiments context [DET-3255] (determined-ai#640)

commit 5e5b188
Author: Neil Conway <[email protected]>
Date:   Mon Jun 8 12:49:34 2020 -0700

    chore: add license to pip metadata (determined-ai#669)

    Without this change, `pip show determined-cli | grep License` returns:

        License: UNKNOWN

commit 05aa3d2
Author: Brian Friedenberg <[email protected]>
Date:   Mon Jun 8 11:53:17 2020 -0700

    feat: support TF Keras EarlyStopping callbacks [DET-3240] (determined-ai#666)

commit 4056146
Author: aaron276h <[email protected]>
Date:   Mon Jun 8 14:21:13 2020 -0400

    docs: add to FAQ how to port a TF core graph model (determined-ai#650)

commit c8bb942
Author: Brian Friedenberg <[email protected]>
Date:   Mon Jun 8 10:45:39 2020 -0700

    feat: support Estimator early stopping hooks [DET-3239] (determined-ai#661)

commit 3ab90a6
Author: Brian Friedenberg <[email protected]>
Date:   Mon Jun 8 10:44:17 2020 -0700

    test: temporarily disable AMP test since it causes NaNs (determined-ai#670)

commit 629f106
Author: Brian Friedenberg <[email protected]>
Date:   Mon Jun 8 09:22:37 2020 -0700

    feat: treat NaN metrics as an error (determined-ai#667)

    The expected behavior when we hit a NaN is to error the trial, which restarts it unless it has already been restarted max_restarts times. Previously, we converted the NaN to the maximum float.
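
As a rough illustration of the behavior described above (not the project's actual code), the check could be sketched in Python as follows. The names `InvalidMetricError`, `check_metrics`, and `run_with_restarts` are hypothetical, and the restart loop only approximates the `max_restarts` semantics mentioned in the message.

```python
import math


class InvalidMetricError(Exception):
    """Raised when a reported metric is NaN (hypothetical name)."""


def check_metrics(metrics):
    """Return metrics unchanged, or raise if any float value is NaN."""
    for name, value in metrics.items():
        if isinstance(value, float) and math.isnan(value):
            # Error out instead of silently replacing NaN with the max float.
            raise InvalidMetricError(f"metric {name!r} is NaN")
    return metrics


def run_with_restarts(step, max_restarts):
    """Call step() until it yields valid metrics, allowing up to max_restarts restarts."""
    restarts = 0
    while True:
        try:
            return check_metrics(step())
        except InvalidMetricError:
            if restarts >= max_restarts:
                raise  # restart budget exhausted: the trial fails
            restarts += 1
```

Raising rather than clamping keeps a diverged trial from looking like one with a very large but valid metric.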

commit db76932
Author: Yoni Ben-tzur <[email protected]>
Date:   Mon Jun 8 08:50:48 2020 -0700

    fix: set auth cookie path to apply site wide (determined-ai#668)

commit 6588f77
Author: aaron276h <[email protected]>
Date:   Mon Jun 8 08:47:11 2020 -0400

    feat: decouple agent information from workloads starting tasks [DET-3178] (determined-ai#631)

    This is a step toward adding a generic resource provider interface.
    As part of this change, Tasks now receive a single `TaskAssigned` message
    when they are assigned rather than one `Assigned` message per container.
    They also start containers by sending specs back to the cluster rather
    than directly to agents.

commit f604a28
Author: Yoni Ben-tzur <[email protected]>
Date:   Fri Jun 5 16:44:11 2020 -0700

    feat: read cookies in the new API auth module (determined-ai#665)

commit 9da1063
Author: Danny Zhu <[email protected]>
Date:   Fri Jun 5 15:43:32 2020 -0700

    fix: space out WebUI plot x-axis ticks a bit more (determined-ai#658)

    The ticks could get bunched up too much with long-running experiments
    before; this just makes sure the spacing is a bit wider by enough to
    work well in reasonable cases. The ticks may now occasionally look a
    tiny bit wider than ideal with very short experiments, but it still
    works fine in practice. Similarly, a long last tick could get cut off at
    the end, so this also bumps the right side spacing correspondingly.

commit b9d9324
Author: Brian Friedenberg <[email protected]>
Date:   Fri Jun 5 15:24:54 2020 -0700

    feat: support early stopping callbacks on a validation step (determined-ai#662)

commit cfb3f51
Author: Yoni Ben-tzur <[email protected]>
Date:   Fri Jun 5 15:22:11 2020 -0700

    feat: add user auth to new api (determined-ai#649)

commit 414bfdf
Author: Hamid Zare <[email protected]>
Date:   Fri Jun 5 14:07:22 2020 -0700

    fix: set authentication failure reason synchronously. (determined-ai#659)

commit ed94d86
Author: aaron276h <[email protected]>
Date:   Fri Jun 5 16:15:25 2020 -0400

    feat: decouple agents from transmitting container status changes [DET-3174] (determined-ai#646)

    This is a step toward adding a generic resource provider interface.

commit f27146a
Author: Caleb Hoyoul Kang <[email protected]>
Date:   Fri Jun 5 13:52:03 2020 -0600

    fix: address minor login issues (determined-ai#611)

    * Increase frequency of auto login check to once per second.
    * Add catch-all routes to redirect invalid route requests to valid routes.

commit d014500
Author: Brian Friedenberg <[email protected]>
Date:   Fri Jun 5 11:47:43 2020 -0700

    revert: "revert: "feat: support stopping training in trial code [DET-3238] (determined-ai#648)" (determined-ai#654)" (determined-ai#656)

    This reverts commit 5baea6a.

commit 44a398a
Author: Caleb Hoyoul Kang <[email protected]>
Date:   Fri Jun 5 10:38:40 2020 -0600

    feat: ensure WebUI version is up to date with platform version (determined-ai#632)

commit 5baea6a
Author: Brian Friedenberg <[email protected]>
Date:   Fri Jun 5 08:55:58 2020 -0700

    revert: "feat: support stopping training in trial code [DET-3238] (determined-ai#648)" (determined-ai#654)

    This reverts commit ee1314f.

commit ee1314f
Author: Brian Friedenberg <[email protected]>
Date:   Fri Jun 5 08:02:13 2020 -0700

    feat: support stopping training in trial code [DET-3238] (determined-ai#648)

commit fa09a74
Author: Brian Friedenberg <[email protected]>
Date:   Fri Jun 5 07:46:24 2020 -0700

    ci: download protoc install to /tmp (determined-ai#653)

commit 9759ce7
Author: determined-dsw <[email protected]>
Date:   Thu Jun 4 18:49:47 2020 -0700

    docs: release notes for 0.12.5 (determined-ai#595) (determined-ai#651)

    (cherry picked from commit 071b3eb)

    Co-authored-by: Justin Chen <[email protected]>

commit 5f476df
Author: Hamid Zare <[email protected]>
Date:   Thu Jun 4 15:30:17 2020 -0700

    chore: remove yarn mentions from tests (determined-ai#635)

    We removed yarn as a dependency a while ago and shouldn't have references to it anymore.

commit 8662fda
Author: Danny Zhu <[email protected]>
Date:   Thu Jun 4 13:03:22 2020 -0700

    fix: correct filename in Elm Makefile (determined-ai#647)

    Due to a mismatch between different instances of what should've been the
    same name, the CSS file was getting built unnecessarily every time,
    considerably increasing the Elm build time.

commit 0e7ca0a
Author: Sidney Wijngaarde <[email protected]>
Date:   Thu Jun 4 12:42:41 2020 -0400

    feat: add checkpoint metadata to cli describe commands (determined-ai#645)

    [DET-3210]

commit 84e875a
Author: aaron276h <[email protected]>
Date:   Thu Jun 4 11:56:52 2020 -0400

    test: fix nightly nas and iris tf keras tests [DET-3264] (determined-ai#644)

    * docs: update NAS example to use correct gradient clipping
    * test: set random seed for nightly iris tf_keras test

commit 4ff9fa0
Author: Sidney Wijngaarde <[email protected]>
Date:   Thu Jun 4 10:47:38 2020 -0400

    feat: checkpoint metadata api (determined-ai#619)

    Adds checkpoint metadata management REST endpoints and python client methods.

    [DET-3207] [DET-3208] [DET-3209]

    PR includes an integration test covering the feature

commit cbbe117
Author: Yoni Ben-tzur <[email protected]>
Date:   Wed Jun 3 20:11:39 2020 -0700

    chore: move proto files to determined namespace (determined-ai#639)

commit fafd686
Author: Yoni Ben-tzur <[email protected]>
Date:   Wed Jun 3 17:01:33 2020 -0700

    feat: add template endpoints to new api (determined-ai#638)

commit 4bad652
Author: Brian Friedenberg <[email protected]>
Date:   Wed Jun 3 16:51:57 2020 -0700

    feat: support USER_CANCELLED exited reason (determined-ai#637)

commit d1146d3
Author: Caleb Hoyoul Kang <[email protected]>
Date:   Wed Jun 3 17:37:04 2020 -0600

    refactor: update link to support secure blank targets (determined-ai#612)

commit f71d64e
Author: Hamid Zare <[email protected]>
Date:   Wed Jun 3 15:33:15 2020 -0700

    feat: add page component [DET-3232] (determined-ai#614)

commit 25e725e
Author: aaron276h <[email protected]>
Date:   Wed Jun 3 17:25:12 2020 -0400

    feat: support gradient clipping in PyTorchTrial via callbacks (determined-ai#615)

    Breaking Change: we no longer accept gradient clipping as a special
    hyperparameter for PyTorchTrial.

commit 80e39d0
Author: Hamid Zare <[email protected]>
Date:   Wed Jun 3 12:28:48 2020 -0700

    feat: add antd breadcrumb stories [DET-3002] (determined-ai#582)

commit 5c9afa2
Author: Hamid Zare <[email protected]>
Date:   Wed Jun 3 12:27:26 2020 -0700

    feat: add activate, pause, and cancel actions to task cards [DET-2934] (determined-ai#585)

commit a3e121a
Author: aaron276h <[email protected]>
Date:   Wed Jun 3 13:34:10 2020 -0400

    feat: add end of training callback to EstimatorTrial (determined-ai#621)

    This will allow users to do any post-experiment cleanup they may need to do.

commit 8056055
Author: Shiyuan <[email protected]>
Date:   Wed Jun 3 10:24:40 2020 -0700

    feat: make agent starting period configurable [DET-3219] (determined-ai#624)

    * feat: make agent starting period configurable

    The provisioner retries launching an agent once a set duration,
    the max agent starting period, has elapsed. Previously, this
    value was hardcoded to 300 seconds. It is now part of the
    provisioner configuration.

    DET-3219 #Done.

    * test: update tests for max agent starting period

    * chore: clean up provisioner code for better readability

    * chore: clean up provisioner tests for better readability

    * docs: update docs for agent starting period

    * chore: update gcp deployment tool for agent starting period

    * chore: update packaged master.yaml with agent starting period

    * chore: update aws deployment tool with agent starting period
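
The retry condition described in this commit can be sketched roughly as below. This is only an illustration in Python, not the provisioner's code: the function name, the constant name, and the strict `>` comparison are assumptions, and only the 300-second former default comes from the message above.

```python
from datetime import datetime, timedelta

# Formerly hardcoded value, per the commit message above.
DEFAULT_MAX_AGENT_STARTING_PERIOD = timedelta(seconds=300)


def should_retry_launch(launched_at, now,
                        max_starting_period=DEFAULT_MAX_AGENT_STARTING_PERIOD):
    """True if an agent has been starting for longer than the allowed period."""
    return now - launched_at > max_starting_period
```

Passing the period as a parameter, rather than reading a constant, is what makes the value configurable per deployment.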

commit 8fdc371
Author: Yoni Ben-tzur <[email protected]>
Date:   Wed Jun 3 10:12:13 2020 -0700

    chore: upgrade proto libraries (determined-ai#630)

commit bdfd980
Author: Danny Zhu <[email protected]>
Date:   Wed Jun 3 09:47:02 2020 -0700

    fix: correct logic for checking if a validation is the best one seen (determined-ai#601)

    Previously, every validation would be marked as the best one seen when
    `smallerIsBetter` was true. The new version avoids that and also
    separates the code a bit more cleanly into "decide whether the
    validation is best" followed by "do things accordingly".
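
The "decide, then act" split described above might look like the following Python sketch. The function names and the `state` dictionary are hypothetical, and treating the first validation as the best and using strict comparisons are assumptions of this sketch. Only the `smallerIsBetter` flag comes from the message.

```python
def is_best_validation(metric, best_so_far, smaller_is_better):
    """Decide whether `metric` beats the best value seen so far."""
    if best_so_far is None:
        return True  # assumption: the first validation is trivially the best
    if smaller_is_better:
        return metric < best_so_far
    return metric > best_so_far


def record_validation(state, metric, smaller_is_better):
    """First decide whether the validation is best, then act on that decision."""
    if is_best_validation(metric, state.get("best"), smaller_is_better):
        state["best"] = metric
        return True
    return False
```

Keeping the comparison in its own function makes the `smaller_is_better` branch easy to test in isolation, which is where the earlier bug was.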

commit f590fc3
Author: Yoni Ben-tzur <[email protected]>
Date:   Wed Jun 3 09:44:41 2020 -0700

    chore: remove container recovery (determined-ai#629)

commit a8c1bb2
Author: Yoni Ben-tzur <[email protected]>
Date:   Wed Jun 3 09:22:16 2020 -0700

    feat: add master endpoint to new api (determined-ai#627)

commit 678d53d
Author: Yoni Ben-tzur <[email protected]>
Date:   Wed Jun 3 08:31:45 2020 -0700

    chore: ignore pkg dir in proto sub project (determined-ai#628)

commit 65b5c17
Author: Brian Friedenberg <[email protected]>
Date:   Wed Jun 3 07:07:19 2020 -0700

    chore: bump version: 0.12.5.dev0 -> 0.12.6.dev0 (determined-ai#625)

commit 13c0db2
Author: Yoni Ben-tzur <[email protected]>
Date:   Tue Jun 2 21:47:53 2020 -0700

    chore: move proto to separate top level package (determined-ai#620)

    * chore: move proto to separate top level package

    * fix build issues

    * downgrade buf build for go1.13

    * build proto before publish

    * proto build before package

commit 897f2f6
Author: Shiyuan <[email protected]>
Date:   Tue Jun 2 17:29:30 2020 -0700

    revert: make agent starting period configurable [DET-3219] (determined-ai#623)

    This reverts commit 7f83e97.

commit 7f83e97
Author: Shiyuan <[email protected]>
Date:   Tue Jun 2 17:11:05 2020 -0700

    feat: make agent starting period configurable [DET-3219] (determined-ai#610)

    * feat: make agent starting period configurable

    The provisioner retries launching an agent once a set duration,
    the max agent starting period, has elapsed. Previously, this
    value was hardcoded to 300 seconds. It is now part of the
    provisioner configuration.

    DET-3219 #Done.

    * test: update tests for max agent starting period

    * chore: clean up provisioner code for better readability

    * chore: clean up provisioner tests for better readability

    * docs: update docs for agent starting period

    * chore: update gcp deployment tool for agent starting period

    * chore: update packaged master.yaml with agent starting period

    * chore: update aws deployment tool with agent starting period

commit b01b560
Author: Armand McQueen <[email protected]>
Date:   Tue Jun 2 14:28:02 2020 -0700

    fix: read docker config file from HOME directory (determined-ai#587)

    The path to the docker config file depends on the location of the HOME directory. Previously, when an agent was run where /root is not the HOME directory, docker credential stores could not be used.
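
A minimal Python sketch of the idea, for illustration only (the agent itself is not written in Python): resolve the config path from HOME instead of assuming /root. The function name and the /root fallback when HOME is unset are assumptions. `.docker/config.json` is the standard location of the docker client config.

```python
import os
import posixpath


def docker_config_path(environ=None):
    """Return the expected location of the docker client config file."""
    environ = os.environ if environ is None else environ
    # Fall back to /root only when HOME is unset (an assumption of this sketch).
    home = environ.get("HOME", "/root")
    return posixpath.join(home, ".docker", "config.json")
```

For example, `docker_config_path({"HOME": "/home/det"})` resolves to `/home/det/.docker/config.json`.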

commit e0d0447
Author: Shiyuan <[email protected]>
Date:   Tue Jun 2 13:37:48 2020 -0700

    feat: make GCP operation tracker timeout configurable [DET-3182] (determined-ai#598)

    * feat: make GCP provider operation timeout configurable

    * docs: update the cluster configuration doc for gcp operation timeout

    * docs: update gcp topic guide with operation timeout

    * docs: update default master configuration for gcp operation timeout

    * feat: update GCP terraform script for operation timeout

    * feat: update GCP deploy tool for operation timeout
hamidzr committed Jun 9, 2020
1 parent cd5f856 commit db47652
Showing 195 changed files with 3,228 additions and 1,588 deletions.
2 changes: 1 addition & 1 deletion .bumpversion.cfg
@@ -1,5 +1,5 @@
 [bumpversion]
-current_version = 0.12.5.dev0
+current_version = 0.12.6.dev0
 commit = true
 tag = true
 tag_name = {new_version}
19 changes: 14 additions & 5 deletions .circleci/config.yml
@@ -22,7 +22,7 @@ executors:
 parameters:
 det-version:
 type: string
-default: 0.12.5.dev0
+default: 0.12.6.dev0

 release-and-rc-filters: &release-and-rc-filters
 branches:
@@ -111,19 +111,20 @@ commands:

 install-protoc:
 steps:
-- run: curl -LO https://github.com/protocolbuffers/protobuf/releases/download/v3.12.1/protoc-3.12.1-linux-x86_64.zip
-- run: unzip protoc-3.12.1-linux-x86_64.zip -d $HOME/.local
+- run: curl -o /tmp/protoc.zip -L https://github.com/protocolbuffers/protobuf/releases/download/v3.12.1/protoc-3.12.1-linux-x86_64.zip
+- run: unzip /tmp/protoc.zip -d $HOME/.local

 go-get-deps:
 steps:
 - install-protoc
 - restore_cache:
 keys:
-- det-go-deps-v1dev6-{{ checksum "master/go.sum" }}-{{ checksum "agent/go.sum" }}
+- det-go-deps-v1dev7-{{ checksum "master/go.sum" }}-{{ checksum "agent/go.sum" }}-{{ checksum "proto/go.sum" }}
+- run: make -C proto get-deps
 - run: make -C master get-deps
 - run: make -C agent get-deps
 - save_cache:
-key: det-go-deps-v1dev6-{{ checksum "master/go.sum" }}-{{ checksum "agent/go.sum" }}
+key: det-go-deps-v1dev7-{{ checksum "master/go.sum" }}-{{ checksum "agent/go.sum" }}-{{ checksum "proto/go.sum" }}
 paths:
 - "/home/circleci/go/pkg/mod/"
 elm-get-deps:
@@ -496,6 +497,7 @@ jobs:
 at: .
 - go-get-deps
 - setup_remote_docker
+- run: make -C proto build
 - run: make package
 - run: mkdir -p build/
 - run: docker save -o build/master.image determinedai/determined-master:${CIRCLE_SHA1}
@@ -520,6 +522,7 @@ jobs:
 at: .
 - go-get-deps
 - setup_remote_docker
+- run: make -C proto build
 - run: make package
 - login-docker
 - run: make -C master publish-dev
@@ -536,6 +539,7 @@ jobs:
 at: .
 - go-get-deps
 - setup_remote_docker
+- run: make -C proto build
 - run: make package
 - login-docker
 - run: make -C master publish
@@ -553,6 +557,7 @@ jobs:
 - go-get-deps
 - setup_remote_docker
 - login-docker
+- run: make -C proto build
 - run: make -C master release
 - run: make -C agent release

@@ -694,6 +699,8 @@ jobs:
 steps:
 - checkout
 - go-get-deps
+- run: make -C proto build
+- run: make -C proto check
 - run: make -C master check
 - run: make -C agent check

@@ -705,6 +712,7 @@ jobs:
 steps:
 - checkout
 - go-get-deps
+- run: make -C proto build
 - run: make -C master build
 - run: make -C agent build
 - persist_to_workspace:
@@ -721,6 +729,7 @@ jobs:
 steps:
 - checkout
 - go-get-deps
+- run: make -C proto build
 - run: make -C master test
 - run: make -C agent test

7 changes: 4 additions & 3 deletions Makefile
@@ -8,6 +8,7 @@ get-deps:
 pip install -r requirements.txt
 $(MAKE) -C master $@
 $(MAKE) -C agent $@
+$(MAKE) -C proto $@
 $(MAKE) -C webui $@

 .PHONY: package
@@ -22,7 +23,7 @@ build-%:
 build-docs: build-common build-harness build-cli build-deploy build-examples
 $(MAKE) -C docs build
 .PHONY: build-master
-build-master: build-docs build-webui-elm build-webui-react
+build-master: build-docs build-webui-elm build-webui-react build-proto
 $(MAKE) -C master build
 .PHONY: build
 build: build-master build-agent
@@ -31,13 +32,13 @@ build: build-master build-agent
 clean-%:
 $(MAKE) -C $(subst -,/,$*) clean
 .PHONY: clean
-clean: clean-tools clean-common clean-harness clean-cli clean-deploy clean-examples clean-docs clean-webui clean-master clean-agent
+clean: clean-tools clean-proto clean-common clean-harness clean-cli clean-deploy clean-examples clean-docs clean-webui clean-master clean-agent

 .PHONY: check-%
 check-%:
 $(MAKE) -C $(subst -,/,$*) check
 .PHONY: check
-check: check-common check-harness check-cli check-deploy check-e2e_tests check-master check-agent check-webui check-examples
+check: check-common check-proto check-harness check-cli check-deploy check-e2e_tests check-master check-agent check-webui check-examples

 .PHONY: fmt-%
 fmt-%:
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
-0.12.5.dev0
+0.12.6.dev0
4 changes: 3 additions & 1 deletion agent/go.mod
@@ -22,10 +22,12 @@ require (
 github.com/spf13/cobra v1.0.0
 github.com/spf13/pflag v1.0.5
 golang.org/x/sys v0.0.0-20200509044756-6aff5f38e54f
-golang.org/x/tools v0.0.0-20200502202811-ed308ab3e770
+golang.org/x/tools v0.0.0-20200522201501-cb1345f3a375
 gotest.tools v2.2.0+incompatible // indirect
 )

 replace github.com/determined-ai/determined/master => ../master

+replace github.com/determined-ai/determined/proto => ../proto

 replace github.com/docker/docker v1.13.1 => github.com/docker/engine v1.4.2-0.20191113042239-ea84732a7725