Use the zim files in zim-testing-suite for unit tests. #531

mgautierfr · 2021-04-12T15:36:21Z

No description provided.

codecov · 2021-04-12T17:26:00Z

Codecov Report

Merging #531 (732194a) into master (4e22ac3) will decrease coverage by 0.02%.
The diff coverage is n/a.

❗ Current head 732194a differs from pull request most recent head 5fb5f36. Consider uploading reports for the commit 5fb5f36 to get more accurate results

@@            Coverage Diff             @@
##           master     #531      +/-   ##
==========================================
- Coverage   76.39%   76.37%   -0.03%     
==========================================
  Files          89       89              
  Lines        3661     3661              
  Branches     1630     1630              
==========================================
- Hits         2797     2796       -1     
- Misses        863      864       +1     
  Partials        1        1

Impacted Files	Coverage Δ
src/archive.cpp	`55.86% <0.00%> (-0.56%)`	⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

mgautierfr · 2021-04-13T14:52:16Z

@legoktm The idea behind this PR is to move all the test zim files into a separate repository (https://github.com/openzim/zim-testing-suite).
So now, the unit tests launched during packaging are failing.
What is the best approach on the packaging side? Maybe just add the archive as a new source of the package?

legoktm · 2021-04-13T17:15:01Z

Can we do what we did in openzim/python-libzim@ff560ea and have the tests be skipped if the zim test files are missing?

And then, how many repositories do you expect will use the new zim-testing-suite repository? If it's just this one, then we can add it as an additional source, but if it'll be multiple we probably want to just package it separately and have libzim, etc. depend on it.

veloman-yunkan

Before this change we must agree on the new/proposed workflow for managing test ZIM archives and the unit tests depending on them. In particular:

- Under what circumstances, and how, can existing ZIM archives be updated in the broad sense (even if it means creating a new, manually versioned revision of that archive)?
- How is their versioning going to be handled?
- What is the envisioned approach to creating and maintaining backward compatibility tests (when the same test must run on various versions of the conceptually "same" ZIM archive)?
- How do you make sure that developers have the most up-to-date test data on their local development hosts? zim-testing-suite becomes a weak dependency of the source repositories since it is used only for unit tests. How do we avoid scenarios where unit tests would fail on fresher test data but keep passing because they use the old data? This now becomes a problem since zim-testing-suite must be updated separately.

kelson42 · 2021-04-14T06:32:22Z

@mgautierfr I have extracted the commit about the Brew update in the CI, to speed up the fix and make it independent. Here is the dedicated PR: #533

mgautierfr · 2021-04-14T08:52:03Z

> Can we do what we did in openzim/python-libzim@ff560ea and have the tests be skipped if the zim test files are missing?

I'd prefer not to: if we silently skip the tests, we can be pretty sure that we will not notice it.
But we can add an option (as for the tests using a lot of memory) to skip them.
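
To make the difference between a silent skip and an explicit opt-out concrete, here is a minimal sketch. It is not libzim code: the variable name `SKIP_ZIM_DATA_TESTS` and the helper `plan_data_tests` are hypothetical, chosen only for illustration. The point is that missing data fails loudly unless someone asked for the skip on purpose.

```python
import os
from pathlib import Path

# Hypothetical opt-out variable; not an existing libzim or meson option.
SKIP_ENV_VAR = "SKIP_ZIM_DATA_TESTS"


def plan_data_tests(data_dir, environ=None):
    """Return "skip" or "run"; raise if the data is missing without an opt-out."""
    env = os.environ if environ is None else environ
    # An explicit request to skip always wins, so packagers can opt out.
    if env.get(SKIP_ENV_VAR) == "1":
        return "skip"
    # No opt-out: missing data is an error, never a silent skip.
    if not Path(data_dir).is_dir():
        raise FileNotFoundError(
            f"test data not found in {data_dir!r}; "
            f"set {SKIP_ENV_VAR}=1 to skip these tests on purpose"
        )
    return "run"
```

With this shape, a packaging build without the data has to set the variable deliberately, which leaves a visible trace in the build recipe.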

> And then, how many repositories do you expect will use the new zim-testing-suite repository? If it's just this one, then we can add it as an additional source, but if it'll be multiple we probably want to just package it separately and have libzim, etc. depend on it.

This is an open question. For now, only one.
This may change in the future, but then we may also change the way it works (using git-lfs or something else).

mgautierfr · 2021-04-14T09:07:43Z

> Before this change we must agree on the new/proposed workflow for managing test ZIM archives and the unit tests depending on them. In particular:

This is something I didn't really want to discuss now.
The proposed solution is an easy, imperfect way to move the test files elsewhere and move ahead on the PRs that need new test files.
The priority is the release of libzim7. We need to test it (and only it for now).

> Under what circumstances, and how, can existing ZIM archives be updated in the broad sense (even if it means creating a new, manually versioned revision of that archive)?

For now, zim files in zim-testing-suite must not be changed.
We may add new zim files however.

> How is their versioning going to be handled?

Manually for now.

> What is the envisioned approach to creating and maintaining backward compatibility tests (when the same test must run on various versions of the conceptually "same" ZIM archive)?

The next step here is to move the existing zim files in zim-testing-suite/data into zim-testing-suite/data/with-ns (or any other name) and create another "clone" of with-ns with zim files with the "same" data but in the new format.

On libzim's test side, update getDataFilePath to return a list of files (a list of variants), and update the tests to do the same check on all variants.
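
As a rough illustration of that idea (sketched in Python for brevity, while the real helper lives in libzim's C++ tests), the helper could return one path per variant directory and the test could loop over them. The directory name `with-ns` comes from the comment above; `new-format`, `get_data_file_paths` and `check_all_variants` are hypothetical names, not the actual implementation.

```python
from pathlib import Path

# "with-ns" is the name proposed above; "new-format" is an assumed placeholder.
VARIANTS = ("with-ns", "new-format")


def get_data_file_paths(data_root, filename, variants=VARIANTS):
    """Return the path of `filename` in every variant directory that contains it."""
    root = Path(data_root)
    candidates = [root / variant / filename for variant in variants]
    return [path for path in candidates if path.is_file()]


def check_all_variants(data_root, filename, check):
    """Run the same `check` callable on every variant of `filename`."""
    paths = get_data_file_paths(data_root, filename)
    if not paths:
        # Refuse to pass vacuously when no variant of the file is available.
        raise FileNotFoundError(f"no variant of {filename!r} under {data_root!r}")
    for path in paths:
        check(path)
    return len(paths)
```

A test then states its assertion once and gets it applied to the old and the new format alike, for example `check_all_variants(root, "small.zim", my_check)`, where `my_check` is whatever assertion the test needs.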

> How do you make sure that developers have the most up-to-date test data on their local development hosts? zim-testing-suite becomes a weak dependency of the source repositories since it is used only for unit tests. How do we avoid scenarios where unit tests would fail on fresher test data but keep passing because they use the old data?

We can add a check in meson to verify the data version.
The CI will ensure that we test with the latest data.
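
One possible shape for such a check, as a sketch only: a small script that the build could run before the tests. It assumes the data directory carries a plain-text version file; the file name `VERSION` and the function `check_data_version` are hypothetical, and nothing here says zim-testing-suite ships such a file.

```python
from pathlib import Path


def check_data_version(data_dir, expected, version_file="VERSION"):
    """Fail unless the test data declares exactly the expected version.

    `version_file` is an assumed convention, not a known zim-testing-suite file.
    """
    found = (Path(data_dir) / version_file).read_text().strip()
    if found != expected:
        raise RuntimeError(
            f"test data version is {found!r}, but the tests expect {expected!r}"
        )
    return found
```

An exact match is the strictest choice; it forces developers to refresh their local copy whenever the expected version is bumped in the source repository.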

As I said at the beginning of my comment, this solution is not perfect.
I don't want to add 27 MB more binary data to the repository.
But at the same time we need reference test files.
And we cannot spend weeks designing a proper solution (see https://github.com/openzim/libzim/issues/469)

veloman-yunkan · 2021-04-14T10:24:05Z

> As I said at the beginning of my comment, this solution is not perfect.
> I don't want to add 27 MB more binary data to the repository.
> But at the same time we need reference test files.
> And we cannot spend weeks designing a proper solution (see #469)

Then I don't see why we need to switch existing tests to a new approach now. You could have created the zim-testing-suite repository with only the data required for the new test. Then, if that solution proves viable (and after any potential initial quirks with it are cleared), we would gradually migrate other tests to it as needed. I have dealt in an industrial setting with both approaches (1. keeping the test data with the test scripts, or 2. keeping them separate from each other), and I find the former approach easier if there is no need to share the data across multiple tests. Of course that doesn't work for large data, but in our case the majority of the test ZIM archives are small enough.

mgautierfr · 2021-04-14T13:12:30Z

> Then I don't see why we need to switch existing tests to a new approach now.

Because current tests must be run on current test zim files AND new ones.

veloman-yunkan · 2021-04-14T14:43:11Z

> > Then I don't see why we need to switch existing tests to a new approach now.
>
> Because current tests must be run on current test zim files AND new ones.

Still, we had better arrive at it by creating one such test first, ironing out any issues, and then migrating the existing tests that rely on large test data. I am going to defend my opinion of keeping small data with the tests.

mgautierfr · 2021-04-15T15:03:15Z

> Still, we had better arrive at it by creating one such test first, ironing out any issues, and then migrating the existing tests that rely on large test data. I am going to defend my opinion of keeping small data with the tests.

PR #535 adds the new test files and updates the tests.
I still prefer not to add the zim files to the git history: they will be part of the git history "forever", even if we remove them in the next PR.
But at least we have some concrete code to look at.

kelson42 · 2021-04-21T20:41:52Z

Superseded by #538

mgautierfr force-pushed the no_test_data branch 4 times, most recently from 7365992 to 7d72dc8 Compare April 12, 2021 16:27

mgautierfr added 2 commits April 12, 2021 19:19

[TEST] Use a helper function to get the path of a test zim file.

c662711

[TEST] Remove the test data and point to the new zim-testing-suite.

e3c59bd

mgautierfr force-pushed the no_test_data branch 2 times, most recently from 08c3eea to 00f04f3 Compare April 12, 2021 17:21

mgautierfr changed the title ~~[TEST] Use a helper function to get the path of a test zim file.~~ Use the zim files in zim-testing-suite for unit tests. Apr 12, 2021

mgautierfr force-pushed the no_test_data branch 5 times, most recently from 0f71799 to 732194a Compare April 12, 2021 18:37

mgautierfr added 2 commits April 12, 2021 20:52

[TEST][CI] Update the CI to download the zim-testing-suite.

b31c255

[CI] Update brew before installing packages.

5fb5f36

Brew changes its backend. We must update it before using it.

mgautierfr force-pushed the no_test_data branch from 732194a to 5fb5f36 Compare April 12, 2021 18:52

mgautierfr requested a review from veloman-yunkan April 13, 2021 15:17

veloman-yunkan reviewed Apr 13, 2021

mgautierfr mentioned this pull request Apr 15, 2021

Add zim files in the "new" format as testing data. #535

Merged

mgautierfr mentioned this pull request Apr 20, 2021

Remove test data from repository and use data from zim-testing-suite. #538

Merged

kelson42 closed this Apr 21, 2021

mgautierfr deleted the no_test_data branch May 12, 2021 13:17
