DVCLive: Revisit docs organization and content #3923
I think people will start with an existing framework (not an index page of the API). We should be prepared for that as much as we can. Let's say I use Keras and I start from the Keras integration page: how many other pages will I have to read to get to a working project that I can understand? That's the metric we should be optimizing in the docs here. It probably means we can even introduce some duplication and/or a clear 1-2-3, where 2 and 3 are links to some general pages (`dvc.yaml`, code snippet with DVCLive, etc.).
I was thinking people start from Get Started, but will try to also optimize for this case.
I updated Get Started and moved the old one to API Reference. I tried to make the current Get Started a step-by-step guide with links to more details. Do you think the step-by-step should also be present on each ML Framework page?
So, the concept of loggers is more or less clear and familiar, I guess. That's why, the last time I tried, I went right to the Keras page... I expected to see a simple copy-paste to get started. Get Started itself is good for beginners, I think, in this case.
I think it would help a lot if we made those pages self-contained, with an easy way to copy-paste without jumping around and still get decent results.
Here are the top entry pages for DVCLive now: (screenshot omitted)
Looks like frameworks and DVC integration trump the Get Started quite a bit.
And for the DVCLive docs home page (top entry page), most traffic comes from Google (mainly direct tool name searches): (screenshot omitted)
Top queries: (screenshot omitted)
Should've probably discussed and planned in iterative/dvclive#273 first (sorry I didn't notice it earlier), but let's check this PR since we have it!
When using [DVC Checkpoints](/doc/user-guide/experiment-management/checkpoints)
and/or enabling DVCLive's [`resume`](/doc/dvclive/api-reference/live#parameters)
you need to add the flag
[`persist: true`](/doc/user-guide/project-structure/pipelines-files#output-subfields)
to all DVCLive outputs.
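As an illustration of the rule quoted above, here is a minimal sketch of what "add `persist: true` to all DVCLive outputs" amounts to. It works on a plain Python dict that mirrors the structure of one `dvc.yaml` stage. The helper name `persist_dvclive_outputs`, the `dvclive` path prefix, and the sample paths (`dvclive.json`, `dvclive/scalars`, `model.pt`) are assumptions made for the example, not part of DVC or DVCLive.

```python
import copy


def persist_dvclive_outputs(stage, prefix="dvclive"):
    """Return a copy of a stage dict with persist: true on DVCLive outputs.

    Hypothetical helper: `prefix` is an assumed naming convention for
    the paths DVCLive writes to.
    """
    updated = copy.deepcopy(stage)
    for section in ("outs", "metrics", "plots"):
        entries = updated.get(section, [])
        for i, entry in enumerate(entries):
            # An entry is either a bare path or a one-key mapping
            # of the form {path: subfields}.
            if isinstance(entry, str):
                path, fields = entry, {}
            else:
                path, fields = next(iter(entry.items()))
            if path.startswith(prefix):
                entries[i] = {path: {**(fields or {}), "persist": True}}
    return updated


# Hypothetical stage definition, written as the dict a YAML parser
# would produce from dvc.yaml.
stage = {
    "cmd": "python train.py",
    "deps": ["train.py"],
    "metrics": [{"dvclive.json": {"cache": False}}],
    "plots": [{"dvclive/scalars": {"cache": False}}],
    "outs": ["model.pt"],
}

result = persist_dvclive_outputs(stage)
print(result["metrics"])
print(result["plots"])
```

Only entries whose path starts with the assumed prefix are touched, so `model.pt` stays a bare path and the input dict is left unmodified.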
Probably needs more motivation (why do this, how is it useful), closer to what we have now in https://dvc.org/doc/dvclive/dvclive-with-dvc#checkpoints.
I was trying to make this a concise how-to page. I would expect the first linked page, `user-guide/experiment-management/checkpoints`, to cover the motivation for checkpoints, and people coming to this page to be looking for specific guidance on proper setup, not for motivation on the feature.
Adding `--type checkpoint` to `dvc exp init` will take care of doing this when
generating the `dvc.yaml`:

```dvc
$ dvc exp init \
```
Is this the recommended "happy path" method? The intro above is already a bit complicated conceptually (mentioning checkpoints and output persistence), so also introducing experiments isn't ideal. If possible, move it after the YAML sample as a tip or an alternative easy way to get there.
I changed the order. As commented above, the scope here was a concise explanation of how to set up resuming training, not to introduce the concepts to DVCLive users.
Thanks for addressing my feedback. Sorry I missed the updates until now...
# DVCLive with DVC

Even though DVCLive does not require DVC, they can integrate in a couple useful
ways:

- The [outputs](#outputs) DVCLive produces are recognized by `dvc exp`,
  `dvc metrics` and `dvc plots`. Those same outputs can be visualized in
  [Iterative Studio](#iterative-studio).

- DVCLive is also capable of generating [checkpoint](#checkpoints) signal files
  used by DVC <abbr>experiments</abbr>.
This summary was nice. If we remove it, should we mention these things more in DVC docs and link to the new DVCLive how-tos?
I believe the first bullet point is now covered in the get started https://dvc-org-dvclive-refacto-czwjzz.herokuapp.com/doc/dvclive/get-started#dvc
Regarding the second bullet point, there is a detailed page about checkpoints in DVC user guide and it already mentions/points to DVCLive.
Agree with @jorgeorpinel that I liked having a summary of all the DVC magic that DVCLive adds. Otherwise, it feels like readers aren't quite sure what DVCLive is doing.
<card href="/doc/dvclive/dvclive-with-dvc" heading="DVCLive with DVC">
Discover how DVCLive and DVC can integrate in several useful ways
</card>

<card href="/doc/dvclive/ml-frameworks" heading="ML Frameworks">
Use DVCLive alongside your favorite ML Framework
A step-by-step introduction
</card>
Not sure I get why we removed these 2 from the docs home page if that's what we want to focus on.
The idea is that `dvclive-with-dvc` is now covered in Get Started, and we want to drive users to Get Started as much as possible to clarify the workflow.
## Outputs

After you run your training code, you should see the following content in the
project:
I was never sure about the usage of the term "output" here. It can be confusing in the context of DVC (stage `outs`). Maybe "project structure" or "resulting files" or something like that?
Renamed to Output folder structure
wdyt?
Better, thanks. I still would remove "output", especially since the term is not even used in the contents (other than the title).
Looks great -- only minor comments this time. Once those are addressed, I think it's mergeable.
Co-authored-by: Dave Berenbaum <[email protected]>
Co-authored-by: Restyled.io <[email protected]>
Thanks @daavoo! Nice improvement here.
"^/doc/dvclive/dvclive-with-dvc$ /doc/dvclive/get-started",
"^/doc/dvclive/ml-frameworks$ /doc/dvclive/api-reference/ml-frameworks",
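For readers unfamiliar with this kind of redirect list, the sketch below shows how the two rules quoted above would resolve a request path. It assumes each rule is a regular expression and a target separated by a single space, and that the first matching rule wins. The `resolve` helper is hypothetical and is not the site's actual redirect code.

```python
import re

# The two redirect rules from the diff, copied verbatim.
RULES = [
    "^/doc/dvclive/dvclive-with-dvc$ /doc/dvclive/get-started",
    "^/doc/dvclive/ml-frameworks$ /doc/dvclive/api-reference/ml-frameworks",
]


def resolve(path, rules=RULES):
    """Return the redirect target for `path`, or None if no rule matches.

    Hypothetical helper: assumes "<regex> <target>" rules, first match wins.
    """
    for rule in rules:
        pattern, target = rule.split(" ", 1)
        if re.search(pattern, path):
            return re.sub(pattern, target, path, count=1)
    return None


print(resolve("/doc/dvclive/dvclive-with-dvc"))
print(resolve("/doc/dvclive/ml-frameworks"))
print(resolve("/doc/dvclive/ml-frameworks/keras"))
```

Because both patterns are anchored with `^` and `$`, only the exact old URLs are redirected. Under these assumptions a nested path such as `/doc/dvclive/ml-frameworks/keras` matches neither rule.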
Update on #3923 (comment) (relates to the redirects above):
> Looks like frameworks and DVC integration trump the Get Started quite a bit.

Traffic in general is up, esp. to DVCLive's home page and somewhat to the Get Started (👍🏼). But the ML frameworks and DVC integration pages have less traffic now 🤷🏼
p.s. I just realized it's only been a week since this was merged so let's check again later.
<toggle>
<tab title="Keras">

<toggle>
<tab title="Scalars">
<tab title="Python API">
Slightly unclear what this means: 1) all the samples are Python; 2) there's no "Python API" under ML Frameworks. Maybe link to https://dvc.org/doc/dvclive/api-reference/live for this one? Currently they all have the same comment underneath:
> Check the ML Frameworks page for more details and other supported frameworks.
Learn more in the
[Comparing Experiments](/doc/user-guide/experiment-management/comparing-experiments)
and [Visualizing Plots](/doc/user-guide/experiment-management/visualizing-plots)
pages of the user guide.
💅🏼
Suggested change: "pages of the user guide." → "pages."

## What next?
### Share Results
Seems like this should be an H2 to me (since it's more about Studio than the DVC integration). This would also create an entry in the right side CONTENTS nav.
Closes iterative/dvclive#273
Closes iterative/dvclive#289
Try to focus on the simplest case: Use existing ML integration alongside DVC.
The previous state jumped directly into the API overview and addressed DVC integration on a separate page.
Try to better reflect the high-level steps required to get everything running in the most common case.