Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests/*.zim files not included in release tarball #68

Closed
legoktm opened this issue Jul 8, 2020 · 7 comments · Fixed by #76
Closed

tests/*.zim files not included in release tarball #68

legoktm opened this issue Jul 8, 2020 · 7 comments · Fixed by #76
Assignees
Labels
enhancement New feature or request

Comments

@legoktm
Copy link
Member

legoktm commented Jul 8, 2020

If you download https://files.pythonhosted.org/packages/26/8e/201f1ed560f83f5fd4a87fe3ec52960d078f1a31c8f1670ac16cb92fe3a3/libzim-0.0.3.post0.tar.gz it's missing the two zim files in tests/ so the tests fail. I believe we need a MANIFEST.in file for this.

Also, from a licensing standpoint, it would be nice if we could use zim files with fully public domain content so then I don't have to keep track of the specific copyright status for Debian.

@legoktm
Copy link
Member Author

legoktm commented Jul 8, 2020

Also if we used different zim files we could make them much smaller...they're 33M right now.

legoktm added a commit that referenced this issue Jul 8, 2020
This will allow tests to pass from people/distributions (e.g. Debian) using
the release tarball.

Fixes #68.
@kelson42 kelson42 added the enhancement New feature or request label Jul 8, 2020
@kelson42
Copy link
Contributor

kelson42 commented Jul 8, 2020

@rgaudin @legoktm Maybe we should publish the tarbal on download.openzim.org/release ?

@rgaudin
Copy link
Member

rgaudin commented Jul 8, 2020

I also think we should not include 33M of ZIM files for tests.

  • wikipedia_es_chemistry_mini.zim doesn't seem to be used (14M)
  • Do we need large ZIM files (wikipedia_es_physics_mini.zim 20M)?
  • Should we need it, should we download it it tests?

I think it's important we test with ZIM files not created with pylibzim but there are many small ones. We could have an longer test that downloads a larger ZIM file also if we think it's relevant. I think there were issues with large number of articles in the very early days of this.

@kelson42
Copy link
Contributor

kelson42 commented Jul 8, 2020

I'm in favour of downloading ZIM files if they are more than 100KB.

@legoktm
Copy link
Member Author

legoktm commented Jul 8, 2020

With my Debian hat on:

  • I would like all the tests in the tarball to pass without needing anything extra (tests could check if the extra file exists and skip if not).
  • If something does want to download stuff, it needs to be possible to disable that functionality (e.g. one other Python library I maintain will skip network tests if NO_NETWORK environment variable is set).

Avoiding large ZIM files in the tarball would be nice too, but the other two are more important.

@legoktm
Copy link
Member Author

legoktm commented Jul 9, 2020

Might also be worth looking at what ZIMs kiwix-lib is using: https://github.com/kiwix/kiwix-lib/tree/master/test/data. wikipedia_en_ray_charles_mini_2020-03.zim is ~550k which seems much more reasonable.

legoktm added a commit that referenced this issue Jul 25, 2020
The large ZIM files used for tests are not included in the release
tarball because of their size. But we still want the tests to pass
in that case, so skip if the file is missing.

Fixes #68.
@legoktm
Copy link
Member Author

legoktm commented Jul 25, 2020

I created #76 as an alternative solution, it just skips those tests if the file is missing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
3 participants