guide: best-practices section #1748

Closed
wants to merge 4 commits into from

imhardikj commented Sep 1, 2020

  • Creates the Best Practices guide (per guide: add "Best Practices" #72).
  • Also creates a Tips and Tricks doc (not in nav) for practices that ended up being too small. Should we dissolve that one into notes spread among other related docs?

@shcheklein shcheklein temporarily deployed to dvc-landing-guide-best--jsgyua September 1, 2020 12:47 Inactive
@imhardikj imhardikj mentioned this pull request Sep 1, 2020
1 task
@shcheklein shcheklein temporarily deployed to dvc-landing-guide-best--jsgyua September 1, 2020 13:12 Inactive
Comment on lines 31 to 36
## Experiments and tracking parameters

You can use DVC for tuning [parameters](doc/command-reference/params), improving
target [metrics](doc/command-reference/metrics) and visualizing the changes with
[plots](doc/command-reference/plots). As a first step, tune the parameters in the
default `params.yaml` file and reproduce the pipeline:

Not bad! But per #1705 (review), a best practice on how to organize DVC project experiments with Git is needed before this one. The two basic options are commits in a single branch (plus tags) or multiple branches (one per experiment).

@jorgeorpinel jorgeorpinel changed the title from "[WIP] guide: Best practices and tips & trick doc" to "[WIP] guide: best practices et al." on Sep 7, 2020

jorgeorpinel commented Sep 20, 2020

Hey @imhardikj, no need to worry about this one, but please take note of all the improvements I made in my last commit, c30116b. Hopefully most of them make sense to you and you can learn from them.

BTW, I fixed MANY basic grammar issues, and I really need to emphasize that we should avoid those going forward. Please review them for future reference. Thanks

Comment on lines +9 to +14
## Matching source code to data

One of DVC's basic uses is to avoid a disconnection between
[revisions](https://git-scm.com/docs/revisions) of source code and
[versions](/doc/use-cases/versioning-data-and-model-files) of data. DVC replaces
large data files and directories, models, etc. with small

This isn't really a best practice, just an intro to data tracking with DVC. I guess it could stay here... Not sure 🤔

Comment on lines +63 to +71
## Managing and sharing large data

Traditional or cloud storage can be used to store the project's data. You can
share the entire 147 GB of your ML project, with all of its data sources,
intermediate data files, and models with others by setting up DVC
[remote storage](doc/command-reference/remote) (optional).

This way you can share models trained in a GPU environment with colleagues who
don't have access to GPUs.

But what's the best practice?

Comment on lines 85 to 90
## Tracking experiments with Git

If you are training different models on your data files in the same project,
using Git commits, tags, or branches makes it easy to manage the project.

<!-- TODO: needs much elaboration! -->

Ditto (in the TODO)

@jorgeorpinel jorgeorpinel changed the title from "[WIP] guide: best practices et al." to "guide: best-practices section" on Sep 20, 2020
@jorgeorpinel jorgeorpinel marked this pull request as ready for review September 20, 2020 06:57
@jorgeorpinel

Going to close this as stale for now, but I should be able to pick it up again in a week or two.

@jorgeorpinel jorgeorpinel deleted the guide/best-practices branch November 28, 2020 00:42
3 participants