Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create Github Action workflow for reporting DVC image diffs #1104

Merged
merged 19 commits into from
Mar 29, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
19 commits
Select commit Hold shift + click to select a range
6d8fb52
Create Github Action workflow for reporting DVC image diffs
weiji14 Mar 22, 2021
03d95d5
Refactor test_legend_entries to use mpl_image_compare
weiji14 Mar 22, 2021
e14674a
Let actions/checkout fetch all history so that git diff works
weiji14 Mar 23, 2021
a314617
Cut by using whitespace instead of pipe
weiji14 Mar 23, 2021
b4c1da5
Ensure png image is viewable and add markdown header before image diff
weiji14 Mar 23, 2021
ee10108
Use specific versions of setup-dvc and setup-cml
weiji14 Mar 23, 2021
3061b80
Merge branch 'master' into dvc-diff-action
weiji14 Mar 25, 2021
1db5328
Use peter-evans/create-or-update-comment to publish image diff report
weiji14 Mar 25, 2021
bb6d5ff
Make GitHub comment more descriptive and easier to regex match
weiji14 Mar 26, 2021
6963611
Update inline comment on what the markdown diff report should look like
weiji14 Mar 26, 2021
8fa0afe
Update GitHub comment report using 'replace' mode instead of 'append'
weiji14 Mar 26, 2021
de763c9
Add bullet points with names for each of the images that have changed
weiji14 Mar 26, 2021
d482a97
Remove cml-send-comment lines
weiji14 Mar 26, 2021
6fa103f
Mention GitHub SHA in the report
weiji14 Mar 26, 2021
554ba6d
Collapsible image diff section and use correct git SHA
weiji14 Mar 26, 2021
3b890a8
Use awk instead of cut to handle multiple whitespace better
weiji14 Mar 26, 2021
6ea495e
List dvc-diff.yml under MAINTENANCE.md
weiji14 Mar 26, 2021
85df20a
Use awk 'NR>=7' instead of tail and add some whitespace indentation
weiji14 Mar 29, 2021
024c54e
Merge branch 'master' into dvc-diff-action
weiji14 Mar 29, 2021
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
95 changes: 95 additions & 0 deletions .github/workflows/dvc-diff.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,95 @@
# Checks for image diffs in a Pull Request and adds a GitHub comment showing the diff
name: DVC image diff

on:
pull_request:
paths:
- 'pygmt/tests/baseline/*.png.dvc'
Comment on lines +4 to +7
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to get this workflow to run only on commits where *.png.dvc files have changed? The dvc-diff workflow still appears to run on commits where no *.png.dvc files have changed, e.g. at e14674a.


jobs:
dvc-diff:
name: DVC image diff
runs-on: ubuntu-latest

steps:
- name: Checkout
uses: actions/[email protected]
with:
# fetch all history so that dvc diff works
fetch-depth: 0

- name: Setup data version control (DVC)
uses: iterative/[email protected]

- name: Setup continuous machine learning (CML)
uses: iterative/[email protected]

# Produce the markdown diff report, which should look like:
# ## Summary of changed images
#
# This is an auto-generated report of images that have changed on the DVC remote
#
# | Status | Path |
# |----------|-------------------------------------|
# | added | pygmt/tests/baseline/test_image.png |
- name: Put list of images that were added or changed into report
run: |
echo -e "## Summary of changed images\n" > report.md
echo -e "This is an auto-generated report of images that have changed on the DVC remote\n" >> report.md
dvc diff --show-md master HEAD >> report.md
weiji14 marked this conversation as resolved.
Show resolved Hide resolved
seisman marked this conversation as resolved.
Show resolved Hide resolved
cat report.md

- name: Pull image data from cloud storage
run: dvc pull --remote upstream
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently this line can only pull from the upstream dvc remote at https://dagshub.com/GenericMappingTools/pygmt. Will need to think about how to make it work for forks as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to care about forks, if people follow the contributing guides (i.e., asking for permission on DAGsHub)? If people fork the DAGsHub repository and upload DVC images to their fork, our Tests workflow will fail, because it can't download the DVC images, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can deal with fork on the second iteration of this dvc-diff-action workflow. Technically we should be able to do dvc pull from DAGsHub forks, but that'll be quite a bit of work to code up as we need to point to another DVC remote (and also not sure if this is the best way to do things).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe very few users will work with DAGsHub forks, so not a big issue for us.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, will kick the can down the road then.


- name: Put image diff(s) into report
env:
repo_token: ${{ secrets.GITHUB_TOKEN }}
seisman marked this conversation as resolved.
Show resolved Hide resolved
id: image-diff
run: |
# Get just the filename of the changed image from the report
awk 'NR>=7 {print $4}' report.md > diff_files.txt

# Append each image to the markdown report
echo -e "## Image diff(s)\n" >> report.md
echo -e "<details>\n" >> report.md

while IFS= read -r line; do
echo -e "- $line \n" >> report.md
cml-publish --title $line --md "$line" >> report.md
done < diff_files.txt

echo -e "</details>\n" >> report.md

# Mention git commit SHA in the report
echo -e "Report last updated at commit ${{ github.event.pull_request.head.sha }}" >> report.md

# Format report to escape newlines before publishing as GitHub comment
report=$(cat report.md)
report="${report//'%'/'%25'}"
report="${report//$'\n'/'%0A'}"
report="${report//$'\r'/'%0D'}"
echo ::set-output name=report::$report

- name: Find comment with image diff report
uses: peter-evans/[email protected]
id: fc
with:
issue-number: ${{ github.event.pull_request.number }}
comment-author: 'github-actions[bot]'
body-includes: 'This is an auto-generated report of images that have changed on the DVC remote'

- name: Create comment with image diff report
if: steps.fc.outputs.comment-id == ''
uses: peter-evans/[email protected]
with:
issue-number: ${{ github.event.pull_request.number }}
body: ${{ steps.image-diff.outputs.report }}

- name: Update comment with new image diff report
if: steps.fc.outputs.comment-id != ''
uses: peter-evans/[email protected]
with:
comment-id: ${{ steps.fc.outputs.comment-id }}
body: ${{ steps.image-diff.outputs.report }}
edit-mode: replace
6 changes: 6 additions & 0 deletions MAINTENANCE.md
Original file line number Diff line number Diff line change
Expand Up @@ -127,6 +127,12 @@ There are 9 configuration files located in `.github/workflows`:

This workflow is triggered in a PR if the slash command `/format` is used.

10. `dvc-diff.yml` (Report changes to test images on dvc remote)

This workflow is triggered in a PR when any *.png.dvc files have been added,
modified, or deleted. A GitHub comment will be published that contains a summary
table of the images that have changed along with a visual report.

## Continuous Documentation

We use the [Vercel for GitHub](https://github.com/apps/vercel) App to preview changes
Expand Down
Binary file removed pygmt/tests/baseline/test_legend_entries.png
Binary file not shown.
4 changes: 4 additions & 0 deletions pygmt/tests/baseline/test_legend_entries.png.dvc
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
outs:
- md5: 4d1ed417fd2625d0b2fa5c17dafd32fd
size: 73728
path: test_legend_entries.png
34 changes: 8 additions & 26 deletions pygmt/tests/test_legend.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,6 @@
from pygmt import Figure
from pygmt.exceptions import GMTInvalidInput
from pygmt.helpers import GMTTempFile
from pygmt.helpers.testing import check_figures_equal


@pytest.mark.mpl_image_compare
Expand Down Expand Up @@ -44,42 +43,25 @@ def test_legend_default_position():
return fig


@check_figures_equal()
@pytest.mark.mpl_image_compare
def test_legend_entries():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rewrite the tests using a simpler 1d array, rather than the @Table_5_11.txt file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but best to not modify the test too much in this PR as mentioned in #1104 (comment)

"""
Test different marker types/shapes.
"""
fig_ref, fig_test = Figure(), Figure()

# Use single-character arguments for the reference image
fig_ref = Figure()
fig_ref.basemap(J="x1i", R="0/7/3/7", B="")
fig_ref.plot(
data="@Table_5_11.txt",
S="c0.15i",
G="lightgreen",
W="faint",
l="Apples",
)
fig_ref.plot(data="@Table_5_11.txt", W="1.5p,gray", l='"My lines"')
fig_ref.plot(data="@Table_5_11.txt", S="t0.15i", G="orange", l="Oranges")
fig_ref.legend(D="JTR+jTR")

fig_test.basemap(projection="x1i", region=[0, 7, 3, 7], frame=True)
fig_test.plot(
fig = Figure()
fig.basemap(projection="x1i", region=[0, 7, 3, 7], frame=True)
fig.plot(
data="@Table_5_11.txt",
style="c0.15i",
color="lightgreen",
pen="faint",
label="Apples",
)
fig_test.plot(data="@Table_5_11.txt", pen="1.5p,gray", label='"My lines"')
fig_test.plot(
data="@Table_5_11.txt", style="t0.15i", color="orange", label="Oranges"
)
fig_test.legend(position="JTR+jTR")
fig.plot(data="@Table_5_11.txt", pen="1.5p,gray", label='"My lines"')
fig.plot(data="@Table_5_11.txt", style="t0.15i", color="orange", label="Oranges")
fig.legend(position="JTR+jTR")

return fig_ref, fig_test
return fig


@pytest.mark.mpl_image_compare
Expand Down