[WIP] cmd-ref: document exp init #3015

Merged
merged 9 commits into from
Dec 7, 2021
24 changes: 24 additions & 0 deletions content/docs/command-reference/config.md
@@ -221,6 +221,30 @@ connection settings, and configuring a remote is the way that can be done.
> hash overlaps: the hash of an external <abbr>output</abbr> could collide with
> that of a local file with different content.
### exp

This section overrides the default workspace paths used by `dvc exp init`, which
helps to avoid repeating these paths if all of your projects share a similar
structure.

The section contains the following options, which are only used as defaults and
can be overridden explicitly through CLI arguments or through responses to
prompts (in `--interactive` mode).

- `exp.code` - path to your source file or directory.

- `exp.data` - path to your data file or directory to track.

- `exp.models` - path to your models file or directory.

- `exp.metrics` - path to your metrics file.

- `exp.params` - path to your parameters file.

- `exp.plots` - path to your plots file or directory.

- `exp.live` - path to your dvclive outputs.
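
As a rough illustration, the options above might appear in a project's
`.dvc/config` file as shown here. This is only a sketch: it assumes the
INI-style layout that DVC config files use, and every path in it (`scripts`,
`datasets/raw`, and so on) is a hypothetical placeholder rather than a default.

```ini
# Hypothetical paths, chosen only to illustrate overriding the defaults.
[exp]
    code = scripts
    data = datasets/raw
    models = artifacts/models
    metrics = reports/metrics.json
    params = config/params.yaml
    plots = reports/plots
    live = reports/live
```

With values like these in place, `dvc exp init` would be expected to use them
as its defaults, unless a CLI argument or a prompt response overrides them.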

### state

> 📖 See
6 changes: 4 additions & 2 deletions content/docs/command-reference/exp/index.md
@@ -3,7 +3,8 @@
_New in DVC 2.0 (see `dvc version`)_

A set of commands to generate and manage <abbr>experiments</abbr>:
[run](/doc/command-reference/exp/run), [show](/doc/command-reference/exp/show),
[init](/doc/command-reference/exp/init), [run](/doc/command-reference/exp/run),
[show](/doc/command-reference/exp/show),
[diff](/doc/command-reference/exp/diff),
[apply](/doc/command-reference/exp/apply),
[branch](/doc/command-reference/exp/branch),
@@ -20,7 +21,7 @@ A set of commands to generate and manage <abbr>experiments</abbr>:

```usage
usage: dvc exp [-h] [-q | -v]
{show,apply,diff,run,gc,branch,list,push,pull,remove}
{show,apply,diff,run,gc,branch,list,push,pull,remove,init}
...

positional arguments:
@@ -37,6 +38,7 @@ positional arguments:
push Push a local experiment to a Git remote.
pull Pull an experiment from a Git remote.
remove Remove local experiments.
init Initialize experiments.
Contributor

@jorgeorpinel jorgeorpinel Nov 11, 2021

Uninformative and sounds like a requirement (like dvc init). How can we better describe what this does for the user? (I need to read the rest of the changes before I can suggest something.)

This comment was marked as resolved.

Contributor

@jorgeorpinel jorgeorpinel Nov 11, 2021

For the help output maybe a short form of a specific desc.

Codify a project variation and run it as an experiment

p.s. and re "sounds like a requirement (like dvc init)" - can we still reconsider the name? 😅
Maybe dvc exp new

```

## Description
127 changes: 127 additions & 0 deletions content/docs/command-reference/exp/init.md
@@ -0,0 +1,127 @@
# exp init

Codify a project using [DVC metafiles] to run [experiments].

## Synopsis

```usage
usage: dvc exp init [-h] [-q | -v] [--run] [--interactive] [-f]
                    [--explicit] [--name NAME] [--code CODE]
                    [--data DATA] [--models MODELS] [--params PARAMS]
                    [--metrics METRICS] [--plots PLOTS] [--live LIVE]
                    [--type {default,dl}]
                    [command]
```

## Description

`dvc exp init` helps you quickly get started with experiments. It reduces
boilerplate for initializing [pipeline](/doc/command-reference/dag) stages in a
`dvc.yaml` file by assuming defaults about the location of your data,
[parameters](/doc/command-reference/params), source code, models,
[metrics](/doc/command-reference/metrics) and
[plots](/doc/command-reference/plots), which can be customized through config.

It also offers a guided `--interactive` mode for creating a stage to be run
later with [`exp run`](/doc/command-reference/exp/run). `dvc exp init` supports
creating different types of stages, e.g. `dl` if you are doing deep learning,
which uses [dvclive](/doc/dvclive) to monitor and checkpoint progress during
training of machine learning models.

This command is intended to be a quick way to start running experiments. To
create more complex stages and pipelines, use `dvc stage add`.

### The `command` argument

The `command` argument is optional if you are using `--interactive` mode. The
`command` sent to `dvc exp init` can be anything your terminal would accept and
Comment on lines +34 to +37
Contributor

Should we put this under Options like for targets in https://dvc.org/doc/command-reference/repro#options (and other places I think) ? That was an initiative of @skshetry actually 🙂

Member Author

Let's keep it as it is for now, as we also need to do something for stage add/run?

Contributor

Why keep it as a section here when we moved it in repro ? We had lots of good reasons (which you proposed), what's different?

> do something for stage add/run

Out of scope but we can create an issue.

Member Author

Because it's better to do it at the same time for all commands. I am going with what we have right now, let's handle it later?

Contributor

@jorgeorpinel jorgeorpinel Dec 10, 2021

p.s. we don't always make changes for all commands at once, we can take them as they come. BTW I actually like a section better than having it under Options, I'm just confused since we discussed this a lot for repro and you (and Ivan) strongly argued for putting it under Options I think 🤷 now it's unclear which one is inconsistent.

Contributor

> Because it's better to do it at the same time for all commands. I am going with what we have right now, let's handle it later?

#3071 (review)

run directly, for example a shell built-in, expression, or binary found in
`PATH`. Please remember that any flags sent after the `command` are interpreted
by the command itself, not by `dvc exp init`.

⚠️ While DVC is platform-agnostic, the commands defined in your
[pipeline](/doc/command-reference/dag) stages may only work on some operating
systems and require certain software packages to be installed.

Wrap the command with double quotes `"` if there are special characters in it
like `|` (pipe) or `<`, `>` (redirection), otherwise they would apply to
`dvc exp init` itself. Use single quotes `'` instead if there are environment
variables in it that should be evaluated dynamically. Examples:

```dvc
$ dvc exp init "./a_script.sh > /dev/null 2>&1"
$ dvc exp init './another_script.sh $MYENVVAR'
```
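
The effect of the quoting rules above can be checked in any POSIX shell without
DVC installed. The sketch below uses `show_command`, a hypothetical stand-in for
a program that receives a single command string (as `dvc exp init` does), and
prints exactly what the program would see. The variable value `hello` is made
up for the demonstration.

```shell
# Hypothetical stand-in for a program that takes one command string;
# it prints exactly the argument it received.
show_command() {
  printf '%s\n' "$1"
}

MYENVVAR="hello"

# Double quotes: the shell expands $MYENVVAR before the program sees it.
show_command "./another_script.sh $MYENVVAR"

# Single quotes: the text is passed through literally, so the variable can
# be expanded later, when the stored command is actually run.
show_command './another_script.sh $MYENVVAR'

# Quoting keeps the redirection inside the argument instead of applying it
# to the program being called.
show_command "./a_script.sh > /dev/null 2>&1"
```

The first call prints the expanded value, while the second prints the literal
`$MYENVVAR`, which is the behavior you want when the variable should be
evaluated each time the stage runs.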

## Options

- `-i`, `--interactive` - prompts the user for the command to execute and for
the paths of outputs and dependencies to track, unless they are provided
explicitly as arguments. Interactive mode lets users accept the default
locations, enter different ones, or omit them.

- `--explicit` - `dvc exp init` assumes default locations for your outputs and
dependencies (which can be overridden in the config). With `--explicit`, it
does not use those default values while initializing experiments. In
`--interactive` mode, the prompts won't offer default values, and every value
needs to be explicitly provided or omitted.

- `--code` - override the path to the source file or directory which your
experiment depends on. The default is the `src` directory.

- `--data` - override the path to your data file or directory to track, which
your experiment depends on. The default is the `data` directory.

- `--params` - override the path to the
[parameters](/doc/command-reference/params) file which your experiment
depends on. The default parameters file name is `params.yaml`. Note that
`dvc exp init` may fail if the parameters file does not exist at the time of
the invocation, as DVC reads the file to find parameters to track for the
stage.

- `--models` - override the path to your models file or directory to track, which
your experiment produces. The default is the `models` directory.

- `--metrics` - override the path to the metrics file to track, which your
experiment produces. The default is `metrics.json`.

- `--plots` - override the path to the plots file or directory, which your
experiment produces. The default is `plots`.

- `--live` - override the directory path for [DVCLive](/doc/dvclive), which
your experiment will write logs to. The default is the `dvclive` directory, which
only takes effect when used with `--type=dl`.

- `--type` - selects the type of the stage to create. Currently it provides two
different kinds of stages: `default` and `dl`. If unspecified, a `default`
stage is created.

A `default` stage has `metrics` and `plots` tracked by DVC itself, and does not
track live-created artifacts (unless explicitly specified).

A `dl` stage is intended for deep-learning scenarios, where metrics and plots
are tracked by [dvclive](/doc/dvclive). It supports tracking progress while
training a deep-learning model with
[checkpoints](/doc/command-reference/exp/run#checkpoints).

- `-n <stage>`, `--name <stage>` - specify a custom name for the stage generated
by this command (e.g. `-n train`). By default, the name of the stage depends
on the `--type` of the stage that is being created. If `--type=default`, the
name of the stage will be `default`, and in case of `--type=dl`, the name of
the stage will be `dl`.

Note that the stage name can only contain letters, numbers, dash `-` and
underscore `_`.

- `-f`, `--force` - overwrite an existing stage in `dvc.yaml` file without
asking for confirmation.

- `--run` - runs the experiment after initializing it.

- `-h`, `--help` - prints the usage/help message, and exits.

- `-q`, `--quiet` - do not write anything to standard output. Exit with 0 if no
problems arise, otherwise 1.

- `-v`, `--verbose` - displays detailed tracing information.
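
The stage naming rule mentioned under `--name` can be checked before calling
`dvc exp init`. The following is a minimal sketch, not DVC's own validation:
`is_valid_stage_name` is a hypothetical helper, and rejecting an empty name is
an assumption added here.

```shell
# Hypothetical helper (not part of DVC): succeeds only if the name is
# non-empty (an assumption) and contains nothing but letters, numbers,
# dash `-`, and underscore `_`.
is_valid_stage_name() {
  case "$1" in
    ''|*[!A-Za-z0-9_-]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Try a few candidate names; the names themselves are made up.
for name in train "my stage" prepare_data-2; do
  if is_valid_stage_name "$name"; then
    echo "ok: $name"
  else
    echo "rejected: $name"
  fi
done
```

A name that passes this check could then be given with `-n`, for example
`-n train`.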
4 changes: 4 additions & 0 deletions content/docs/sidebar.json
@@ -260,6 +260,10 @@
"label": "exp show",
"slug": "show"
},
{
"label": "exp init",
"slug": "init"
},
{
"label": "exp diff",
"slug": "diff"