Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(testing) add new regression test to run exiv2 over every test file #2091

Merged
merged 3 commits into from
Feb 13, 2022

Conversation

hassec
Copy link
Member

@hassec hassec commented Feb 12, 2022

This PR introduces a new test which runs exiv2 over every file* that we have and compares the output with a reference output which is committed to the repository.

The goal is to become more sensitive to any kind of changes when we refactor code or make any other changes.

  • some files are excluded if:
  • utf-8 encoding problems of output (could possibly be fixed)
  • non-zero return code when running over a file
  • empty output

@hassec hassec added the testing Anything related to the tests and their infrastructure label Feb 12, 2022
@hassec hassec requested a review from piponazo February 12, 2022 22:15
@hassec
Copy link
Member Author

hassec commented Feb 12, 2022

I don't have any failures locally... 🙈

will look into it tomorrow

@codecov
Copy link

codecov bot commented Feb 12, 2022

Codecov Report

Merging #2091 (c2c2086) into main (a8a995f) will increase coverage by 0.90%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2091      +/-   ##
==========================================
+ Coverage   62.08%   62.98%   +0.90%     
==========================================
  Files          96       96              
  Lines       19153    19110      -43     
  Branches     9798     9781      -17     
==========================================
+ Hits        11891    12037     +146     
+ Misses       4967     4779     -188     
+ Partials     2295     2294       -1     
Impacted Files Coverage Δ
src/safe_op.hpp 69.23% <0.00%> (-27.65%) ⬇️
include/exiv2/slice.hpp 69.64% <0.00%> (-21.89%) ⬇️
include/exiv2/error.hpp 60.71% <0.00%> (-4.81%) ⬇️
src/value.cpp 72.76% <0.00%> (-1.59%) ⬇️
src/types.cpp 86.71% <0.00%> (-1.07%) ⬇️
src/tiffcomposite_int.cpp 74.60% <0.00%> (-0.19%) ⬇️
src/tiffimage_int.cpp 79.32% <0.00%> (-0.12%) ⬇️
src/utils.cpp 40.38% <0.00%> (ø)
include/exiv2/types.hpp 70.58% <0.00%> (ø)
src/actions.cpp 63.63% <0.00%> (+0.08%) ⬆️
... and 15 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a8a995f...c2c2086. Read the comment docs.

@clanmills
Copy link
Collaborator

This seems such a good idea that I don't know why we have not done this in the past!

My concern is the increase in the size of the download when you clone the repos.

1060 rmills@rmillsmm-local:~/gnu/github/exiv2/chasse_regression_test $ du -sh * | grep -v K | sort -n
1.1M	xmpsdk
1.3M	contrib
3.3M	tests
3.4M	src
13M	po
61M	test
1061 rmills@rmillsmm-local:~/gnu/github/exiv2/chasse_regression_test $ du -sh test/data/* | grep -v K | sort -n
1.1M	test/data/IMG_3578.heic
1.2M	test/data/Canon.HIF
...
26M	test/data/test_reference_files
1062 rmills@rmillsmm-local:~/gnu/github/exiv2/chasse_regression_test $ 

About 2013, we had complaints from darktable when we added video and eps files. The reason for the complaint was because we had increased the repos from 25mb to 80mb. Their build scripts were pulling down everything and building the library. They didn't run any tests. The data/ directory was causing pain with no gain.

I dealt with that by putting the test files into a different repos and downloaded them on demand. I was able to decrease the repos from 80mb, to 25mb with no unpleasant side effects. The "use a different repos" works well for reference images because they seldom change. In your case, the whole point is to detect change, so having the reference output in the repos is the right thing to do.

If we use tar.gz on directory test_reference_files, the files would be 5mb instead of 25mb. I doubt if this helps because I suspect git uses compression to decrease the data "on the wire".

My conclusion: This is a good idea and the additional band-width required to clone the repos is justified.

@piponazo
Copy link
Collaborator

My concern is the increase in the size of the download when you clone the repos.

It looks like @hassec is not introducing new binary files, just txt files for the expectations in the tests right?

In that case we do not need to worry much baout the increase in size. The real issue with source code repositories regarding the size is when you start adding binary files (images, videos, etc.). For a long time I wanted to give it a try to Git LFS and move there all the binaries we have in exiv2. However I think it would be a large task ...

@clanmills
Copy link
Collaborator

@piponazo I think we're in agreement. At first sight, it appears as though we've added a 26mb sub directory to directory test/. However that compresses "on the wire" to 5mb. For that matter, the directory po/ is only 3mb when compressed.

Another thought I had was that we have a lot more bandwidth in 2022 than the darktable complaint in 2012. We're using it wisely.

You're doing a great job with the Greek paths. I got confused on Friday between Cygwin and MinGW. Cygwin is posix and handles greek paths effortlessly. Apologies for the noise.

MinGW is Windows and needs the powerful magic with which you are now working. The CI is probably hanging by displaying alerts about missing DLLs. Getting that fixed requires patience and determination. You've got both those qualities. Best of Luck.

@piponazo
Copy link
Collaborator

ey @hassec , why are you deleting the file tests/bugfixes/github/test_issue_1959.py here? I am wondering if you deleted it by mistake. I think that is the test that is failing at the moment locally due to the UTC (see #2083 )

@hassec
Copy link
Member Author

hassec commented Feb 13, 2022

good point @piponazo,
tests/bugfixes/github/test_issue_1959.py was only calling exiv2 and comparing the output to what was stored in a reference file.
That reference file collided with the one from the new test.
Given that the new test invokes exiv2 in a more verbose and complete way compared to that specific test, I figured I could just remove the old one and avoid having to have 2 reference files for one image.

There is of course the open question whether we still need many of the small old tests, which are by definition a subset of this new test. So right now we are doing redundant work, but I wanted to focus on introducing the new test in this PR and then discuss and start cleanup step by step in further PRs if that is ok?

@piponazo
Copy link
Collaborator

For me it is totally fine to take an incremental approach. I'll look with more detail to your changes once all the CI jobs are green 😉

@hassec
Copy link
Member Author

hassec commented Feb 13, 2022

@piponazo all tests are ✔️ 🎉

I've excluded 5 more files which are only fuzzing POCs and were failing differently in xmpsdk depending on the platform.
One other file, I'm not sure about why it fails on windows, but it's hard for me to test as I've not yet setup any windows env.
I've excluded the file for now with a comment to double check this later.

@clanmills
Copy link
Collaborator

@hassec. Can you take a cautious approach to bringing your new "hit everything test" into service. I think you should allow new test to settle before removing tests which have given good service for years.

How about opening a new issue report "Retire tests which are duplicated by test_christophs_magic.py". So, only record this as a task for the future (say in 3 months from now).

@hassec
Copy link
Member Author

hassec commented Feb 13, 2022

@clanmills yes I think that is a good idea. I wasn't planning on removing any of the duplicate tests in this MR.
It's O(10s) more runtime for the entire test suite, so I don't think there is a rush.
So I think the cautious approach you suggest, is for sure the way to go.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
testing Anything related to the tests and their infrastructure
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants