Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Image support #166

Merged
merged 36 commits into from
Nov 8, 2021
Merged

Image support #166

merged 36 commits into from
Nov 8, 2021

Conversation

daavoo
Copy link
Contributor

@daavoo daavoo commented Sep 20, 2021

Close #150

  • Added support for logging PIL Image and numpy arrays.

@codecov-commenter
Copy link

codecov-commenter commented Sep 20, 2021

Codecov Report

Merging #166 (c658b21) into master (f6739c5) will increase coverage by 0.43%.
The diff coverage is 96.61%.

❗ Current head c658b21 differs from pull request most recent head 69beabc. Consider uploading reports for the commit 69beabc to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master     #166      +/-   ##
==========================================
+ Coverage   91.82%   92.25%   +0.43%     
==========================================
  Files          17       19       +2     
  Lines         416      465      +49     
==========================================
+ Hits          382      429      +47     
- Misses         34       36       +2     
Impacted Files Coverage Ξ”
dvclive/data/image_numpy.py 86.66% <86.66%> (ΓΈ)
dvclive/data/__init__.py 100.00% <100.00%> (ΓΈ)
dvclive/data/image_pil.py 100.00% <100.00%> (ΓΈ)
dvclive/data/scalar.py 100.00% <100.00%> (ΓΈ)
dvclive/live.py 98.07% <100.00%> (+0.16%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Ξ” = absolute <relative> (impact), ΓΈ = not affected, ? = missing data
Powered by Codecov. Last update f6739c5...69beabc. Read the comment docs.

@daavoo daavoo marked this pull request as ready for review September 21, 2021 08:03
dvclive/data/image_pil.py Outdated Show resolved Hide resolved
Copy link
Contributor Author

@daavoo daavoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added an option for using a step placeholder in the output path of the image. With the current html layout, overriding the image at each step is kind of the only option that looks nice.

Overriding (dvclive.log("image.png", img)):

r = 5
for i in range(r):

    img = Image.new('RGB', (200,200), (250,250,250))
    draw = ImageDraw.Draw(img)
    font = ImageFont.truetype("OpenSans-Regular.ttf", 100)
    draw.text((20, -5),str(i),(0,0,0), font=font)

    dvclive.log("image.png", img)
    dvclive.log("loss", (1-i/r) + (np.random.random() / 10))
    dvclive.log("accuracy", (i/r) + (np.random.random() / 10))
    dvclive.next_step()

ezgif-3-c8c7158d51cb

Using the placeholder (dvclive.log("image_{step}.png", img)):

r = 5
for i in range(r):

    img = Image.new('RGB', (200,200), (250,250,250))
    draw = ImageDraw.Draw(img)
    font = ImageFont.truetype("OpenSans-Regular.ttf", 100)
    draw.text((20, -5),str(i),(0,0,0), font=font)

    dvclive.log("image_{step}.png", img)
    dvclive.log("loss", (1-i/r) + (np.random.random() / 10))
    dvclive.log("accuracy", (i/r) + (np.random.random() / 10))
    dvclive.next_step()

ezgif-3-693d6fdc97f8

dvclive/data/image_pil.py Outdated Show resolved Hide resolved
setup.py Show resolved Hide resolved
@daavoo
Copy link
Contributor Author

daavoo commented Oct 4, 2021

So the image file is being overwritten at every step, right? That seems okay for DVC, but not great for non-DVC use cases since they will lose the images from all previous steps.

If they don't include the step yes, but we can emphasize that option in the docs, right?. The following would work as expected:

dvclive.log(f"image_{step}.png", image)

Anyhow, we kind of already have a similar ux regarding model_file it gets overwritten as it is intended to be mainly used alongside DVC checkpoints.

In order to let users write down images for each step we need to either make output for the particular key a directory or implement the structure that @daavoo proposes in iterative/dvc#6628.

Not sure if I am missing something but why would be having a separate subdir a requirement? Using dvclive.log(f"image_{step}.png or dvclive.log(f"{step}/image.png") would currently work, it is just not pretty when showed in the html.

@daavoo
Copy link
Contributor Author

daavoo commented Oct 4, 2021

I removed the subdir part from this P.R. as it is not a strict requirement and could be addressed in a separate P.R.

@pared
Copy link
Contributor

pared commented Oct 5, 2021

Not sure if I am missing something but why would be having a separate subdir a requirement? Using dvclive.log(f"image_{step}.png or dvclive.log(f"{step}/image.png") would currently work, it is just not pretty when shown in the HTML.

Right, its not a strict requirement. If we document the {step} properly I think this is a good way to let users decide how they want to store their images.

@dberenbaum
Copy link
Collaborator

TBH I'm pretty torn. The flexibility is nice, but it puts more burden on the user and moves away from a canonical tree structure. Would you recommend "image_{step}.png" for both DVC and non-DVC workflows?

Anyhow, we kind of already have a similar ux regarding model_file it gets overwritten as it is intended to be mainly used alongside DVC checkpoints.

This seems slightly different to me:

  • A model isn't an artifact that I would want to see in one place across multiple steps.
  • Model file location isn't necessarily within the dvclive directory.
  • Model saving is specific to certain frameworks as a convenience.

On the other hand, it might be nice to make model saving more similar so that, for example, integration with dvc could manage the model output instead of needing to specify this separately in dvc.yaml.

@pared
Copy link
Contributor

pared commented Oct 6, 2021

On the other hand, it might be nice to make model saving more similar so that, for example, integration with dvc could manage the model output instead of needing to specify this separately in dvc.yaml.

This brings us to the question how would we save the model? Do we log it? if so, how do we make sure DVC understands its a model and not a plot/metric? Make a special subdir in dvclive results? Do we need a special method for saving assets that are not metrics? Auto-type detection seems risky in this case as models come in all shapes and colors.

@dberenbaum
Copy link
Collaborator

dberenbaum commented Oct 6, 2021

@daavoo How would you recommend users log images in 1) DVC workflows and 2) non-DVC workflows? What path/pattern would you suggest for each?

Edit: Also, do you have an idea of how DVC would track these outputs?

@daavoo
Copy link
Contributor Author

daavoo commented Oct 27, 2021

@daavoo How would you recommend users log images in 1) DVC workflows and 2) non-DVC workflows? What path/pattern would you suggest for each?

Edit: Also, do you have an idea of how DVC would track these outputs?

I think that just to move forward, we can hide the "{step}" formatting from the users and be opinionated for now.

In my mind, the division would be more about With or Without Checkpoints.

For example, we can just tell the user to call:

live.log("{filename}", image)

And we would internally save the image under:

a) {live.dir}/images/{live.step}/{filename}" (Without checkpoints)

b) {live.dir}/images/{filename}" (With checkpoints)

AFAIK, both outputs will be tracked by DVC as expected (just like the .tsv files).

If later on, some changes on the DVC HTML layout require the images to be in a different folder structure, we can change it internally.

WDYT?

@pared
Copy link
Contributor

pared commented Oct 28, 2021

I wonder whether we should not allow some kind of configuration for that. It seems to me that both cases:

  • I want to store every image
  • I want to store only latest images
    are valid (and aligning with .tsv and summary.json approach) but on the other hand logging every image and showing them on dvc plots show might be annoying.

Checkpointing approach does resolve that, but what about users not using checpoints?

@dberenbaum
Copy link
Collaborator

If I'm ignoring DVC, checkpoints, the HTML output, etc., and only considering how I would expect a logger to save this data, it would be in the {live.dir}/images/{live.step}/{filename} format. This parallels the .tsv files and is intuitive to me. Even if I'm using checkpoints, it might be annoying to compare images rather than just have them available in this readable structure, and we are already saving them, so it's not like it takes up additional space.

Currently, this makes for an ugly visualization, but are there ways we can mitigate this in the future? We could visualize a folder of images in a lot of ways, like make a slider to scroll between images like tensorboard and wandb do. Even if we completely leave it out of the HTML, I could easily scroll through the photos locally in this format.

Is there any compelling reason other than the current limits of our visualization to save them in a different format?

@pared pared closed this Oct 29, 2021
@pared
Copy link
Contributor

pared commented Oct 29, 2021

Is there any compelling reason other than the current limits of our visualization to save them in a different format?

I don't think there are any.

@pared pared reopened this Oct 29, 2021
@pared
Copy link
Contributor

pared commented Oct 29, 2021

Sorry for the close, random shortcut has been hit.

@daavoo
Copy link
Contributor Author

daavoo commented Nov 8, 2021

I updated the code to use {live.dir}/images/{live.step}/{filename} as default in any case.

@daavoo daavoo merged commit 5227d6e into master Nov 8, 2021
@daavoo daavoo deleted the image-support branch November 8, 2021 15:44
@dberenbaum
Copy link
Collaborator

Is there a docs PR for this @daavoo?

@daavoo
Copy link
Contributor Author

daavoo commented Jan 4, 2022

Is there a docs PR for this @daavoo?

iterative/dvc.org#3013 (comment)

@daavoo daavoo mentioned this pull request Jan 14, 2022
2 tasks
@dberenbaum dberenbaum mentioned this pull request Jan 14, 2022
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

dvclive.log: Support images
4 participants