Get Started: Model Management #4883

Merged
merged 101 commits into from
Nov 16, 2023
f55a2a2
start with model management - skeleton
tibor-mach Sep 27, 2023
76f7331
Restyled by prettier (#4884)
restyled-io[bot] Sep 27, 2023
36053ac
placeholder gif
tibor-mach Sep 27, 2023
6c50cc7
prioritize DVCLive in MR getting started
tibor-mach Sep 27, 2023
b55bbbb
Restyled by prettier (#4886)
restyled-io[bot] Sep 27, 2023
0ce6414
mr guide in sidebar
tibor-mach Sep 27, 2023
53da935
updated model adding
tibor-mach Oct 3, 2023
1af6653
Restyled by prettier (#4896)
restyled-io[bot] Oct 3, 2023
f707d0f
merge conflict
tibor-mach Oct 4, 2023
df08842
Restyled by prettier (#4897)
restyled-io[bot] Oct 4, 2023
8677c1e
Merge branch 'main' into getting-started-mr
tibor-mach Oct 4, 2023
7810642
updated MR chapter
tibor-mach Oct 6, 2023
56558a6
Restyled by prettier (#4900)
restyled-io[bot] Oct 6, 2023
afccbe2
Merge branch 'main' into getting-started-mr
tibor-mach Oct 6, 2023
20aa683
removed duplicate section
tibor-mach Oct 6, 2023
3393725
downloading models docs
tibor-mach Oct 9, 2023
ffea603
Restyled by prettier (#4906)
restyled-io[bot] Oct 9, 2023
a95a112
Merge branch 'main' into getting-started-mr
tibor-mach Oct 9, 2023
5cd264b
cicd guide
tibor-mach Oct 19, 2023
5526ae4
Restyled by prettier (#4937)
restyled-io[bot] Oct 19, 2023
a2390d0
Merge branch 'main' into getting-started-mr
tibor-mach Oct 19, 2023
5bf53a4
bugfix
tibor-mach Oct 19, 2023
c9be325
Restyled by prettier (#4938)
restyled-io[bot] Oct 19, 2023
c1d9343
fixed sidebar link
tibor-mach Oct 19, 2023
3a8d057
typo
tibor-mach Oct 19, 2023
906db15
more links
tibor-mach Oct 19, 2023
05888f1
Restyled by prettier (#4940)
restyled-io[bot] Oct 19, 2023
e8e8319
typos
tibor-mach Oct 19, 2023
f040cc5
Restyled by prettier (#4941)
restyled-io[bot] Oct 19, 2023
c6b0185
updates based on review
tibor-mach Oct 20, 2023
3559c02
tweaking text about deployment workflow
tibor-mach Oct 20, 2023
2582109
Restyled by prettier (#4946)
restyled-io[bot] Oct 20, 2023
9b335b8
Merge branch 'main' into getting-started-mr
tibor-mach Oct 20, 2023
cbca2a3
addressing comments - version with dvc exp run
tibor-mach Oct 23, 2023
60e706b
Restyled by prettier (#4951)
restyled-io[bot] Oct 23, 2023
b13fb42
hide follow along steps
tibor-mach Oct 23, 2023
40f8270
Restyled by prettier (#4952)
restyled-io[bot] Oct 23, 2023
2f51742
Merge branch 'main' into getting-started-mr
tibor-mach Oct 23, 2023
070f498
updated model download options
tibor-mach Oct 27, 2023
fcf2760
Restyled by prettier (#4957)
restyled-io[bot] Oct 27, 2023
2b30f37
Merge branch 'main' into getting-started-mr
tibor-mach Oct 27, 2023
fb5330b
Update content/docs/start/model-management/model-registry.md
tibor-mach Oct 31, 2023
cd140d7
Update content/docs/start/model-management/model-registry.md
tibor-mach Oct 31, 2023
17f4f54
Update content/docs/start/model-management/model-registry.md
tibor-mach Oct 31, 2023
7be21d9
Update content/docs/start/model-management/model-registry.md
tibor-mach Oct 31, 2023
0261bc7
Update content/docs/start/model-management/model-registry.md
tibor-mach Oct 31, 2023
92a40ae
Update content/docs/start/model-management/model-registry.md
tibor-mach Oct 31, 2023
1bfbe3c
Update content/docs/start/model-management/model-registry.md
tibor-mach Oct 31, 2023
81a4c94
Update content/docs/start/model-management/model-registry.md
tibor-mach Oct 31, 2023
ac625ec
Update content/docs/start/model-management/model-registry.md
tibor-mach Oct 31, 2023
2af41cb
Update content/docs/start/model-management/model-registry.md
tibor-mach Oct 31, 2023
606eee3
Update content/docs/start/model-management/model-registry.md
tibor-mach Oct 31, 2023
f28018b
Update content/docs/start/model-management/model-cicd.md
tibor-mach Oct 31, 2023
1906216
Update content/docs/start/model-management/model-cicd.md
tibor-mach Oct 31, 2023
4793dac
Update content/docs/start/model-management/model-cicd.md
tibor-mach Oct 31, 2023
468bf8f
Update content/docs/start/model-management/model-cicd.md
tibor-mach Oct 31, 2023
73b999b
Update content/docs/start/model-management/model-cicd.md
tibor-mach Oct 31, 2023
bfcf4d9
Merge branch 'main' into getting-started-mr
tibor-mach Oct 31, 2023
39b0cd0
Restyled by prettier (#4963)
restyled-io[bot] Oct 31, 2023
7bcf4ac
Update content/docs/start/model-management/model-registry.md
tibor-mach Oct 31, 2023
e225c95
Update content/docs/start/model-management/model-registry.md
tibor-mach Oct 31, 2023
7187bb1
Update content/docs/start/model-management/model-registry.md
tibor-mach Oct 31, 2023
61962fc
Update content/docs/start/model-management/model-cicd.md
tibor-mach Oct 31, 2023
a78dae1
Merge branch 'main' into getting-started-mr
tibor-mach Oct 31, 2023
196ce07
some updates to mr chapter
tibor-mach Nov 1, 2023
600bd50
Merge branch 'main' into getting-started-mr
tibor-mach Nov 1, 2023
2181eb1
fix links
tibor-mach Nov 1, 2023
3bf70e6
Restyled by prettier (#4966)
restyled-io[bot] Nov 1, 2023
0f6b77b
clearer instructions for MR
tibor-mach Nov 1, 2023
13ea229
Restyled by prettier (#4967)
restyled-io[bot] Nov 1, 2023
37cc464
typos
tibor-mach Nov 1, 2023
bfff6ca
Restyled by prettier (#4968)
restyled-io[bot] Nov 1, 2023
a13e851
removed experiment training from mr guide
tibor-mach Nov 1, 2023
61863e7
Restyled by prettier (#4969)
restyled-io[bot] Nov 1, 2023
bdce8ec
updated images
tibor-mach Nov 1, 2023
ab8fd34
Restyled by prettier (#4971)
restyled-io[bot] Nov 1, 2023
aed3431
Update content/docs/start/model-management/model-cicd.md
tibor-mach Nov 1, 2023
6e386dc
CICD steps
tibor-mach Nov 1, 2023
bd081bb
Restyled by prettier (#4972)
restyled-io[bot] Nov 1, 2023
b149338
typos
tibor-mach Nov 1, 2023
3ae489f
upper-case git
tibor-mach Nov 1, 2023
b5c7c3c
Apply suggestions from code review
tibor-mach Nov 3, 2023
1f6c39f
suggestions
tibor-mach Nov 3, 2023
abc94d3
Restyled by prettier (#4974)
restyled-io[bot] Nov 6, 2023
82c7cd8
cicd guide update
tibor-mach Nov 16, 2023
62c85dd
Restyled by prettier (#4988)
restyled-io[bot] Nov 16, 2023
02ae32f
minor edits to cicd docs
tibor-mach Nov 16, 2023
5521a09
Restyled by prettier (#4989)
restyled-io[bot] Nov 16, 2023
63f6068
Merge branch 'main' into getting-started-mr
tibor-mach Nov 16, 2023
41ec307
higher quality gifs
tibor-mach Nov 16, 2023
e7af78b
typo
tibor-mach Nov 16, 2023
95967d1
trying out webm
tibor-mach Nov 16, 2023
a8f4056
Revert "trying out webm"
tibor-mach Nov 16, 2023
c752e19
trying html
tibor-mach Nov 16, 2023
f561653
replace gifs with webm
tibor-mach Nov 16, 2023
f559c4a
cleanup of gifs
tibor-mach Nov 16, 2023
c0cd442
cleanup of images
tibor-mach Nov 16, 2023
ca7f444
Apply suggestions from code review
tibor-mach Nov 16, 2023
7700f9d
Restyled by prettier (#4991)
restyled-io[bot] Nov 16, 2023
71c67f2
updated cicd workflow template
tibor-mach Nov 16, 2023
e87d4c3
Restyled by prettier (#4992)
restyled-io[bot] Nov 16, 2023
15 changes: 15 additions & 0 deletions content/docs/sidebar.json
@@ -66,6 +66,21 @@
"slug": "experiment-pipelines"
}
]
},
{
"label": "Model Management",
"slug": "model-management",
"source": "model-management/index.md",
"children": [
{
"label": "Model registry",
"slug": "model-registry"
},
{
"label": "Using and deploying models",
"slug": "model-cicd"
}
]
}
]
},
12 changes: 9 additions & 3 deletions content/docs/start/index.md
@@ -62,9 +62,10 @@ Now you're ready to DVC!

## Following This Guide

To help you understand and use DVC better, consider the following two use-cases:
**data management** and **experiment tracking**. You may pick either one to
start learning about how DVC helps you "solve" that scenario!
To help you understand and use DVC better, consider the following three
use-cases: **data management**, **experiment tracking** and **model
management**. You may pick any to start learning about how DVC helps you "solve"
that scenario!

Choose a trail to jump into its first chapter:

@@ -76,8 +77,13 @@ Choose a trail to jump into its first chapter:
by only instrumenting your code, and collaborate on ML experiments like
software engineers do for code.

- **[Model Management]** - Use the DVC model registry to manage the lifecycle of
your models in an auditable way. Easily access your models and integrate your
model registry actions into CICD pipelines to follow GitOps best practices.

[Data Management]: /doc/start/data-management/data-versioning
[Experiment Management]: /doc/start/experiments/experiment-tracking
[Model Management]: /doc/start/model-management/model-registry

<admon type="tip">

42 changes: 42 additions & 0 deletions content/docs/start/model-management/index.md
@@ -0,0 +1,42 @@
---
title: 'Get Started: Model Management'
description:
'Get started with DVC model management. Use the DVC model registry to manage
the lifecycle of your models in an auditable way. Easily access your models
and integrate your model registry actions into CICD pipelines to follow GitOps
best practices.'
---

# Get Started: Model Management

## Chapters

- **[Model registry]** - Set up a git-based model registry with DVC to track and
manage models, their versions and lifecycle stages.

- **[Using and deploying models]** - Easily download your models from the model
registry. Set up your CICD pipelines to be triggered by model registry actions
(such as assigning model stages) and deploy models directly from the model
registry.

[model registry]: /doc/start/model-management/model-registry
[Using and deploying models]: /doc/start/model-management/model-cicd

<admon type="tip">

These are captured in our [example-get-started-experiments] repo (you can [fork
it][example-get-started-experiments-fork] to follow along).

[example-get-started-experiments]:
https://github.com/iterative/example-get-started-experiments
[example-get-started-experiments-fork]:
https://github.com/iterative/example-get-started-experiments/fork

</admon>

## Where To Go Next

Pick a page from the list above, the left-side navigation bar, or just click
`NEXT` below!

Click [here](/doc/start/) to jump back to the Get Started landing page.
235 changes: 235 additions & 0 deletions content/docs/start/model-management/model-cicd.md
@@ -0,0 +1,235 @@
---
title: 'Get Started: Using and deploying models'
description:
'Easily download your models from the model registry. Set up your CICD
pipelines to be triggered by model registry actions (such as assigning model
stages) and deploy models directly from the model registry.'
---

# Get Started: Using and Deploying Models

In the [model registry chapter](/doc/start/model-management/model-registry) we
registered the model in the model registry and assigned it to some lifecycle
stages. In this chapter, we will learn how to access and use models and how to
use the model registry to trigger automated CICD model workflows.

If you are using the example repository, the models are already versioned on a
publicly readable DVC remote so you can access the model from there and use it.
If you are instead using your own repository you need to set up your own DVC
remote and push the data (including models) there. Have a look at our
[Data management guide](/doc/start/data-management/data-versioning#configuring-a-remote)
to see how this is done.

## Downloading models

Downloading model artifacts is useful, for example, for local testing or for
use in CICD workflows. With models versioned by DVC, this can be done easily
from the Studio UI.

Go to the detailed view of your model, select the desired model version under
"Version info" and then click on the "Access model" button.

Studio will present you with several ways of downloading models - with the CLI,
in Python code and directly from your web browser. You can see all the web
browser download steps here:

<video width="99%" height="540" autoplay loop muted>
<source src="/img/mr-studio-download-model.webm" type="video/webm">
</video>

And here's how to do it with the CLI:

First, configure the
[DVC Studio Access Token](https://dvc.org/doc/studio/user-guide/account-and-billing#studio-access-token)
(this only needs to be done once):

```console
dvc config --global studio.token <your Studio token>
```

Now you can use the following command to download the model:

```bash
dvc artifacts get https://github.com/<user>/example-get-started-experiments pool-segmentation
```

Here you just need to replace `<user>` with your GitHub username. This will download
the latest version of the `pool-segmentation` model from the DVC Remote
associated with the Git repository in the URL. You can also specify a different
artifact version or a model registry stage. See the `dvc artifacts get`
documentation for all options.
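
As a minimal sketch of the version and stage options mentioned above, the helper
below wraps `dvc artifacts get`. The function name `get_model` and its
`version`/`stage` selector argument are hypothetical, and the sketch assumes the
`--rev` and `--stage` flags behave as described in the `dvc artifacts get`
documentation; check that page before relying on it.

```bash
# Hypothetical helper (not part of DVC): download a model from the registry,
# selecting it by version, by stage, or taking the latest version by default.
# Usage: get_model <repo-url> <model-name> [version|stage <value>]
get_model() {
  local url="$1" name="$2" selector="${3:-latest}" value="${4:-}"
  case "$selector" in
    latest)  dvc artifacts get "$url" "$name" ;;
    # --rev selects a specific artifact version (assumed per the DVC docs)
    version) dvc artifacts get "$url" "$name" --rev "$value" ;;
    # --stage selects the version currently assigned to a stage (assumed per the DVC docs)
    stage)   dvc artifacts get "$url" "$name" --stage "$value" ;;
    *)       echo "unknown selector: $selector" >&2; return 2 ;;
  esac
}
```

For example, `get_model https://github.com/<user>/example-get-started-experiments pool-segmentation stage prod`
would request whichever version is currently assigned to "prod".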

If you don't have a Studio account at all, you can still use `dvc artifacts get`
to download models, but you will need to provide the correct Git and DVC Remote
credentials manually. You can see more details in the
[documentation](/doc/command-reference/artifacts/get#description).

## Connecting model registry actions to your CICD

As we [noted](/doc/start/model-management/model-registry#GTO-tip) in the model
registry chapter, all DVC model registry actions are captured in your Git
repository as Git tags with a specific format.

This also means that we can create CICD actions in our Git repository which will
be triggered whenever versions are registered or stages are assigned.
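
To make the idea of "tags with a specific format" concrete, here is an
illustrative sketch that classifies a tag name. It assumes two tag shapes,
`<name>@<version>` for a version registration and `<name>#<stage>#<n>` for a
stage assignment; treat these as assumptions and consult the GTO documentation
for the authoritative format. The function `classify_tag` is hypothetical, and
it deliberately ignores other events such as deregistration or unassignment.

```bash
# Illustrative only: classify a Git tag the way a registry-aware job might.
# Assumed tag shapes (see the GTO docs for the authoritative format):
#   <name>@<version>     -> version registration
#   <name>#<stage>#<n>   -> stage assignment
classify_tag() {
  local tag="$1"
  case "$tag" in
    *'#'*'#'*)
      # Everything before the first '#' is the model name,
      # the next '#'-delimited field is the stage.
      local name="${tag%%#*}" rest="${tag#*#}"
      echo "assignment name=$name stage=${rest%%#*}"
      ;;
    *@*)
      echo "registration name=${tag%%@*} version=${tag#*@}"
      ;;
    *)
      # Any other tag (for example a plain release tag) is not a registry action.
      echo "other"
      ;;
  esac
}
```

In the workflow itself this parsing is done by the GTO GitHub Action, so the
sketch is only meant to show why a plain tag push carries enough information to
drive CICD.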

In the following, we will have a look at an example CICD workflow on GitHub
which runs whenever we assign a version of our model to the "prod" stage in the
model registry. The workflow simulates model deployment without the need to
actually set up a deployment environment (so that you can test it more easily),
but it does include all the ingredients needed in an actual deployment job or
any other CICD action.

<admon type="tip">

To see a real-world model deployment example you can check out a
[similar workflow in our example repository](https://github.com/iterative/example-get-started-experiments/blob/main/.github/workflows/deploy-model-sagemaker.yml)
which deploys a specific version of the model to an Amazon Sagemaker endpoint
for inference whenever it is assigned to a stage.

</admon>

Open the `.github/workflows/deploy-model-template.yml` file. This is the file
that GitHub uses to run our CICD workflow. You can see
[runs of this workflow](https://github.com/iterative/example-get-started-experiments/actions/workflows/deploy-model-template.yml)
in our example repository.

At the beginning of the workflow file you will see this code:

```yaml
on:
  # the workflow is triggered whenever a tag is pushed to the repository
  push:
    tags:
      - '*'
```

The code tells GitHub to run the workflow every time a tag is pushed to the
repository.

This means that the workflow will run whenever we run model registry actions,
but we want to limit it to the specific actions relevant to our workflow. That's
where our GTO GitHub action comes into play: in the "parse" job of our workflow
it parses each pushed tag and, if it is a GTO tag, gives us the name of the
model, its version, stage (if any) and the event in the model registry.

This is captured in the "parse" job, which you can simply copy and paste into
most CICD workflows of your own.

```yaml
# This job parses the git tag with the GTO GitHub Action to identify model registry actions
parse:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v3
    - name: 'Parse GTO tag'
      id: gto
      uses: iterative/gto-action@v2
  outputs:
    event: ${{ steps.gto.outputs.event }}
    name: ${{ steps.gto.outputs.name }}
    stage: ${{ steps.gto.outputs.stage }}
    version: ${{ steps.gto.outputs.version }}
```

<admon type="tip">

If you are not using GitHub or if you don't want to use the GTO GitHub Action
you can also use GTO directly with the
[gto check-ref](/doc/gto/command-reference/check-ref) command.

</admon>

The next job called "deploy-model" actually performs the action. First, it uses
the outputs of the parse job and checks whether the action should be performed.
If the tag was produced by the model registry and if the corresponding action
was assignment to the "prod" stage, it proceeds with the rest of the workflow.

```yaml
deploy-model:
  needs: parse
  if:
    ${{ needs.parse.outputs.event == 'assignment' && needs.parse.outputs.stage
    == 'prod' }}
```
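
If your CI system has no expression syntax like the `if:` above, the same guard
can be written as a small shell check. This is only a sketch: the function name
`should_deploy` is hypothetical, and it assumes you already have the event and
stage values available as strings (for example from the parse step).

```bash
# Hypothetical guard mirroring the workflow condition above:
# succeed only for an assignment to the "prod" stage.
should_deploy() {
  local event="$1" stage="$2"
  [ "$event" = "assignment" ] && [ "$stage" = "prod" ]
}
```

A deployment script could then start with
`should_deploy "$EVENT" "$STAGE" || exit 0` to skip all other registry actions.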

The next step of the workflow sets up DVC (using a GitHub Action, but this can
also be done manually, for example with pip).

This allows us to run `dvc artifacts get` in the last step of the workflow to
download the correct version of the model which can then be deployed or
otherwise used in our CICD.

```yaml
steps:
  - uses: iterative/setup-dvc@v1
  - name: Get Model For Deployment
    run: |
      dvc config --global studio.token ${{ secrets.DVC_STUDIO_TOKEN }}
      dvc artifacts get ${{ github.server_url }}/${{ github.repository }} ${{ needs.parse.outputs.name }} --rev ${{ needs.parse.outputs.version }}
      echo "The right model is available and you can use the rest of this command to deploy it. Good job!"
```

Here, we first configure the DVC Studio token, which we stored in our GitHub
repository as a
[secret](https://docs.github.com/en/actions/security-guides/using-secrets-in-github-actions),
to manage authentication with the
[DVC remote storage](https://dvc.org/doc/user-guide/data-management/remote-storage#remote-storage).
This way we only need to keep the Studio token saved on GitHub and let Studio
manage the specific storage credentials for us. We then use the outputs of the
`parse` job to specify the correct model name and version.

Finally, `github.server_url` and `github.repository` are
[properties of the `github` context](https://docs.github.com/en/actions/learn-github-actions/contexts#github-context)
which together form the URL of our repository on GitHub. We could of course also
specify the URL manually.

If you don't use Studio, you can still use `dvc artifacts get` but you will need
to keep your remote storage credentials on GitHub and use them to configure DVC
in the CICD workflow. You will also need to check out the repository in the
workflow. You can see more details in the
[documentation](/doc/command-reference/artifacts/get#description).
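
As a rough sketch of what "configuring DVC with your storage credentials" could
look like, the helper below assumes an S3 remote and credentials exposed to the
job as the environment variables `AWS_ACCESS_KEY_ID` and
`AWS_SECRET_ACCESS_KEY` (for example, mapped from GitHub secrets). The function
name `configure_s3_remote` and the remote name are hypothetical; other storage
types use different configuration options.

```bash
# Hypothetical helper: write S3 credentials from the environment into the
# local (untracked) DVC config of a checked-out repository.
# Usage: configure_s3_remote <remote-name>
configure_s3_remote() {
  local remote="$1"
  if [ -z "${AWS_ACCESS_KEY_ID:-}" ] || [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo "missing storage credentials in the environment" >&2
    return 1
  fi
  # --local writes to .dvc/config.local, which is not tracked by Git
  dvc remote modify --local "$remote" access_key_id "$AWS_ACCESS_KEY_ID"
  dvc remote modify --local "$remote" secret_access_key "$AWS_SECRET_ACCESS_KEY"
}
```

Called as `configure_s3_remote myremote` after the checkout step, this would
leave the repository ready for `dvc artifacts get` without a Studio token.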

You can now use the following template to create your own Model Registry CICD
actions on GitHub!

```yaml
name: Template CICD Action for DVC Model Registry

on:
  # the workflow is triggered whenever a tag is pushed to the repository
  push:
    tags:
      - "*"

jobs:
  # This job parses the git tag with the GTO GitHub Action to identify model registry actions
  parse:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: "Parse GTO tag"
        id: gto
        uses: iterative/gto-action@v2
    outputs:
      event: ${{ steps.gto.outputs.event }}
      name: ${{ steps.gto.outputs.name }}
      stage: ${{ steps.gto.outputs.stage }}
      version: ${{ steps.gto.outputs.version }}

  perform-action-with-model:
    needs: parse
    if: # here paste the model registry event condition you want to trigger your action
    # every job needs a runner; adjust this to your environment
    runs-on: ubuntu-latest
    steps:
      - uses: iterative/setup-dvc@v1
      # this step uses DVC to download the model artifact from our remote repository so we can perform CICD actions with it
      # The DVC Studio token is used to avoid having to store specific remote storage credentials on GitHub
      - name: Get Model For Deployment
        run: |
          dvc config --global studio.token ${{ secrets.DVC_STUDIO_TOKEN }}
          dvc artifacts get ${{ github.server_url }}/${{ github.repository }} ${{ needs.parse.outputs.name }} --rev ${{ needs.parse.outputs.version }}

          ...

```