feat: add script to export products data and images for docker dev #6010

Merged
stephanegigandet merged 6 commits into main from export-products-data-and-images on Oct 21, 2021

Conversation

stephanegigandet
Contributor

New script to export product data and images, for instance to load into a development install of Product Opener.

Related to #6009

Can be run like this:

~/openfoodfacts-server$ docker-compose run --rm backend /opt/product-opener/scripts/export_products_data_and_images.pl --query ingredients_tags=en:salt --products-file /opt/product-opener/html/exports/test-products.tar.gz --images-file /opt/product-opener/html/exports/test-products-images.tar

stephanegigandet added the "Data export" label Oct 20, 2021
stephanegigandet requested a review from a team as a code owner October 20, 2021 17:47
@stephanegigandet
Contributor Author

Added an option to get a random sample of products, based on their creation date:

./export_products_data_and_images.pl --sample-mod 10000,0 --products-file /srv/off/html/exports/products.random-modulo-10000.tar.gz --images-file /srv/off/html/exports/products.random-modulo-10000.images.tar

This lets us build a random sample of products instead of using the most-scanned products, which typically have hundreds of images each and are therefore poorly suited to a dev environment.
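To illustrate what a modulo-based sample could look like, here is a minimal sketch. It assumes that `--sample-mod 10000,0` keeps a product when its creation timestamp modulo 10000 equals 0; the PR only says the sample is "based on their creation date", so the exact rule is an assumption. The `in_sample` helper and the timestamps are hypothetical and are not part of the script.

```shell
#!/bin/sh
# Hypothetical helper, not taken from export_products_data_and_images.pl.
# Assumption: a product is in the sample when
#   creation_timestamp % modulus == remainder
# $1 = creation timestamp (Unix seconds), $2 = modulus, $3 = remainder
in_sample() {
  [ $(( $1 % $2 )) -eq "$3" ]
}

# Made-up timestamps, for illustration only
for t in 1634750000 1634750001 1634760000; do
  if in_sample "$t" 10000 0; then
    echo "$t: in sample"
  else
    echo "$t: skipped"
  fi
done
```

Under this assumption, a modulus of 10000 would keep roughly one product in 10,000 if creation timestamps are spread evenly, and changing the remainder would select a different, non-overlapping sample.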

# e.g. if the tar command failed and the script was stopped
wget -O products.tar.gz https://static.openfoodfacts.org/exports/products.random-modulo-10000.tar.gz 2>&1
tar -xzvf products.tar.gz -C /mnt/podata/products
rm products.tar.gz
Member

@alexgarel alexgarel Oct 21, 2021


FYI we have /mnt/podata/mnt which, in docker, is a tmpfs mount. It may be a more appropriate location.

(This also applies to the following lines.)

Contributor Author


Ah, I was wondering why we had this empty /mnt/podata/mnt directory. Thanks for the explanation.

Member

@alexgarel alexgarel left a comment


LGTM apart from a few more comments ;-)


sonarcloud bot commented Oct 21, 2021

Kudos, SonarCloud Quality Gate passed!

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
0 Code Smells (rating A)

No Coverage information
No Duplication information

stephanegigandet merged commit a3d1a55 into main Oct 21, 2021
stephanegigandet deleted the export-products-data-and-images branch October 21, 2021 13:51
Labels
Data export We export data nightly as CSV, MongoDB… See: https://world.openfoodfacts.org/data