Get started - "top funnel" changes #4460

Merged
merged 10 commits into from
Apr 13, 2023
20 changes: 13 additions & 7 deletions content/docs/sidebar.json
@@ -44,7 +44,6 @@
"source": false,
"children": [
"data-versioning",
"data-and-model-access",
"data-pipelines",
{
"label": "Metrics, Parameters, and Plots",
@@ -53,13 +52,19 @@
]
},
{
"label": "Experiment Management",
"slug": "experiments",
"source": false,
"children": [
"experiment-versioning",
"experiment-management",
"building-pipelines",
"experiment-iterations"
"experiment-tracking",
{
"label": "Collaborating on Experiments",
"slug": "experiment-collaboration"
},
{
"label": "Experimenting Using Pipelines",
"slug": "experiment-pipelines"
}
]
}
]
@@ -129,7 +134,6 @@
"slug": "data-management",
"source": false,
"children": [
"large-dataset-optimization",
{
"slug": "remote-storage",
"source": "remote-storage/index.md",
@@ -161,8 +165,10 @@
]
},
"cloud-versioning",
"discovering-and-accessing-data",
"importing-external-data",
"managing-external-data"
"managing-external-data",
"large-dataset-optimization"
]
},
{
4 changes: 4 additions & 0 deletions content/docs/start/data-management/data-pipelines.md
@@ -25,6 +25,10 @@ later — exactly as they were built originally! For example, you could capture
simple ETL workflow, organize a data science project, or build a detailed
machine learning pipeline.

Later on, we will see that DVC runs
[machine learning experiments](/doc/start/experiments/experiment-pipelines) on
top of these pipelines, controlling their execution, injecting parameters, etc.

## Pipeline stages

Use `dvc stage add` to create _stages_. These represent processes (source code
9 changes: 9 additions & 0 deletions content/docs/start/data-management/data-versioning.md
@@ -287,6 +287,15 @@ layer. DVC in turn manipulates `.dvc` files, whose contents define the data file
versions. DVC also synchronizes DVC-tracked data in the <abbr>workspace</abbr>
efficiently to match them.

## Discovering and accessing data

DVC helps you with accessing and using your data artifacts from outside of the
project where they are versioned, and your tracked data can be imported and
fetched from anywhere. For example, you may want to download a specific version
of an ML model to a deployment server or import a dataset into another project.
To learn about how DVC allows you to do this, see the
[discovering and accessing data guide](/doc/user-guide/data-management/discovering-and-accessing-data).
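
As a rough sketch of the two use cases above, the following uses the `dvc list`, `dvc get`, and `dvc import` commands. The repository URL, file paths, and revision are hypothetical placeholders, not part of this project, and the guard only skips the commands when the `dvc` CLI is not installed.

```shell
# Hypothetical values: replace them with your own repository, paths, and revision.
REPO_URL="https://github.com/example/project"  # Git repo where the data is versioned with DVC
MODEL_PATH="model.pkl"                         # DVC-tracked model inside that repo
DATA_PATH="data/data.xml"                      # DVC-tracked dataset inside that repo
REV="v1.0"                                     # Git tag, branch, or commit to read from

# Skip the commands when the dvc CLI is not installed.
if command -v dvc >/dev/null 2>&1; then
  # Discover: list the files and directories tracked in the remote project.
  dvc list "$REPO_URL"

  # Access: download one specific version of the model, for example on a
  # deployment server. No link back to the source project is kept.
  dvc get "$REPO_URL" "$MODEL_PATH" --rev "$REV" -o model.pkl

  # Reuse: import the dataset into the current DVC project (run this inside
  # an initialized DVC repository). This writes a .dvc file that records
  # where the data came from.
  dvc import "$REPO_URL" "$DATA_PATH" -o data.xml
fi
```

`dvc get` suits one-off downloads, while `dvc import` is the better fit when the receiving project should keep track of the data's origin.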

## Large datasets versioning

In cases where you process very large datasets, you need an efficient mechanism
@@ -50,7 +50,7 @@ $ dvc repro

The `-O` option here specifies an output that will not be <abbr>cached</abbr> by
DVC, and `-M` specifies a metrics file (that will also not be cached).
`dvc stage add` generates a new stage in the `dvc.yaml` file:
`dvc stage add` will generate this new stage in the `dvc.yaml` file:

```yaml
evaluate:
@@ -84,7 +84,7 @@ files to be versioned by Git.
</details>

[`evaluate.py`] writes the model's [ROC-AUC] and [average precision] to
`eval/live/metrics.json` (previously marked as a [metrics file] with `-M`):
`eval/live/metrics.json` (designated a [metrics file] with `-M` above):

```json
{
@@ -160,7 +160,7 @@ plots:
- eval/importance.png
```

To generate them, you can run `dvc plots show` (shown below), which generates an
To render them, you can run `dvc plots show` (shown below), which generates an
HTML file you can open in a browser. Or you can load your project in VS Code and
use the [DVC Extension]'s [Plots Dashboard].

202 changes: 0 additions & 202 deletions content/docs/start/experiments/building-pipelines.md

This file was deleted.

@@ -1,16 +1,20 @@
---
title: 'Get Started: Experiment Management'
title: 'Get Started: Experiment Collaboration'
description:
'Manage your experiments and share them with others using software engineering
best practices.'
'Share your experiments with others, persist and apply your changes using Git
branches and software engineering best practices.'
---

# Get Started: Experiment Management
# Get Started: Experiment Collaboration

After having compared all the experiments, you still need to agree on which one
is the best and manage the remaining candidates. <abbr>DVC Experiments</abbr>
are fully compatible with Git workflows, so you can manage the experiments using
software engineering best practices.
After having compared some experiments' results and parameters, you still need
to agree on which one is the best, share it, track it, and do some housekeeping
on the rejected experiments.

<abbr>DVC Experiments</abbr> are fully compatible with Git workflows. You can
share, manage and collaborate on experiments and related code changes using
software engineering best practices. There is no need for a different system or
paradigm to track and version experiments.

## Sharing

@@ -19,7 +23,7 @@ Unless you have enabled
the <abbr>DVC experiments</abbr> only exist in your repo and people can't manage
or view them from other machines.

You can share an experiment with others from your machine:
To share an experiment with others from your machine:

<toggle>
