fix(example-get-started): track the whole eval dir for simplicity #251
Conversation
@@ -27,6 +27,7 @@ jobs:
        git fetch origin main:main
      fi
+     dvc pull eval
I'm not sure why we need this now (seems like a regression; without it, metrics are not shown in the `dvc exp diff` below) cc @dberenbaum. Also, I'm not sure I understand why this command is enough to get both revisions.
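For reference, the workflow step under discussion looks roughly like the fragment below. This is a sketch, not the exact workflow file: the step name and the `dvc exp diff` invocation are assumptions, and whether `dvc pull eval` also covers the baseline revision that `dvc exp diff` needs is exactly the open question in this thread.

```yaml
# Hypothetical CI step (names illustrative).
# `dvc pull eval` fetches the DVC-tracked eval/ outputs for the current
# revision before the metrics comparison runs.
- name: Compare experiment to main
  run: |
    if [ "$GITHUB_REF" != "refs/heads/main" ]; then
      git fetch origin main:main
    fi
    dvc pull eval
    dvc exp diff main
```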
Are you sure dvc-tracked metrics were ever pulled automatically? Up until now, we so often assumed metrics were git-tracked that I'm not sure we ever paid enough attention to this.
I guess we need to fix this if we plan to start recommending dvc-tracked metrics
Let's open an issue after this is deployed since it should be easy to reproduce from the deployed repo
No, I'm not sure at all. Agreed that we need to fix this; it becomes a priority, I guess. Since we always had to cache them, though, I would not be surprised if at some point we were fetching them.
One more thought: haven't had a chance to test, but it might work with stage-level metrics but not top-level metrics.
@skshetry WDYT about adding this to iterative/dvc#9722?
- Run [`dvc repro`](https://man.dvc.org/repro) to reproduce the
- [pipeline](https://dvc.org/doc/commands-reference/pipeline):
+ Run [`dvc exp run`](https://man.dvc.org/exp/run) to reproduce the
+ [pipeline](https://dvc.org/doc/user-guide/pipelines) and create a new
+ [experiment](https://dvc.org/doc/user-guide/experiment-management).

  ```console
- $ dvc repro
- Data and pipelines are up to date.
+ $ dvc exp run
+ Ran experiment(s): rapid-cane
+ Experiment results have been applied to your workspace.
  ```
No strong opinion on which we recommend here, but we still use `dvc repro` throughout the docs.
Yes, I plan to change the docs too when this is deployed - wdyt? (`dvc repro` becomes more of a low-level command.)
I go back and forth 😄. Now I'm slightly in favor of keeping `dvc repro` in this trail. Since we add one stage at a time, `dvc exp run` is a bit awkward to run after each stage is added. It's also unnecessarily heavy, and the name doesn't fit the initial workflow we introduce very well.
Most importantly, I don't think it is that impactful either way, so not something worth blocking on.
Okay, yes, I guess you are right. It could ultimately be a mix: `dvc repro` for iterating on the pipeline, and `dvc exp run` when everything is ready for that - wdyt?
I'm not sure there's that much benefit to introducing a new command there, but it might make sense there, or as a "next step." Again, not a blocker for me, and probably better to discuss in a docs PR anyway, where we can see how it actually looks.
@@ -100,7 +100,7 @@ def main():
     test, _ = pickle.load(fd)

     # Evaluate train and test datasets.
-    with Live(EVAL_PATH, cache_images=True, dvcyaml=False) as live:
+    with Live(EVAL_PATH, dvcyaml=False, report=None) as live:
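For context, `Live(EVAL_PATH, ...)` writes its metrics summary (and plots) under `EVAL_PATH`, which is what lets the stage track the whole directory as one output. Below is a minimal stdlib stand-in, not the real dvclive API, that only mimics the `metrics.json` layout to illustrate why pulling the directory covers the metrics file too. The metric names and values are made up.

```python
import json
import os

EVAL_PATH = "eval"  # assumption: matches the constant in evaluate.py


def fake_live_summary(path, metrics):
    """Stand-in for dvclive's Live: write a metrics summary under `path`.

    The real Live object also handles plots, images, and step logging;
    here we only write the summary file to show the directory layout.
    """
    os.makedirs(path, exist_ok=True)
    summary = os.path.join(path, "metrics.json")
    with open(summary, "w") as fd:
        json.dump(metrics, fd, indent=2)
    return summary


path = fake_live_summary(EVAL_PATH, {"avg_prec": 0.92, "roc_auc": 0.97})
print(path)
```

With a layout like this, a single `outs: - eval` entry (or `dvc pull eval`) moves metrics, plots, and images together.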
Very minor, but we could probably drop `report=None`, since at worst it will mean an extra file, and `report=None` will be the default in dvclive 3.0.
Yes, I'm following that ticket. I can change it when it's merged.
Thanks @shcheklein!
@dberenbaum another downside of tracking metrics is that we can't show a badge now: not a deal breaker, but a bit sad. I would say the metrics file is really small; it would be nice to have it in the Git history.
Fixes #249
Simplify the eval stage by tracking the whole eval directory (metrics, plots, etc.).
Probably the next step should be to get rid of the custom name and use `log_artifact` for models.
Tested it here https://github.com/shcheklein/example-get-started
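A sketch of what tracking the whole eval directory can look like in `dvc.yaml` (stage, script, and dependency names are assumptions based on the example repo, not the exact file from this PR; a top-level `metrics` entry like the one shown is one way to keep the metrics file visible to `dvc exp diff`):

```yaml
# Hypothetical evaluate stage: instead of listing metrics, plots, and
# images individually, the whole eval/ directory is a single output.
stages:
  evaluate:
    cmd: python src/evaluate.py
    deps:
      - model.pkl
      - src/evaluate.py
    outs:
      - eval
metrics:
  - eval/metrics.json
```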