This repository holds the web scraping scripts for the Is-My-Food-Healthy repo! That app requires lots and lots of data, and entering it manually, even with a team of 20 people, is a mammoth task. These scripts automate it.
Why is the website url hidden?
That's a very intelligent question! Well, the reason is that I scraped this data from a site and I am not sure of the legality. So, it's better not to mention it. And yes, that site was really, really helpful.
Currently I am researching sites to scrape for data on ingredients, artificial colours, preservatives, etc. Once that is done, I will write scripts for those as well and add them to this repo.
- Python3
- requests library (Needs to be installed separately)
- BeautifulSoup from bs4 (Needs to be installed separately)
- time library (part of the Python standard library)
- os library (part of the Python standard library)
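
Here is a minimal sketch of how the libraries listed above could fit together in a scraper. It is not the actual script from this repo: the URL, the `table.food` selector, the two-column layout, and every function name are assumptions made up for illustration, since the real site is not disclosed.

```python
import os
import time

import requests
from bs4 import BeautifulSoup

# Hypothetical values: the real site URL is intentionally not disclosed.
BASE_URL = "https://example.com/foods"
OUTPUT_DIR = "data"
DELAY_SECONDS = 1.0  # pause between requests to avoid hammering the site


def fetch_page(url):
    """Download one page and return its HTML text."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.text


def parse_items(html):
    """Extract (name, value) pairs from an assumed two-column table."""
    soup = BeautifulSoup(html, "html.parser")
    items = []
    for row in soup.select("table.food tr"):
        cells = [cell.get_text(strip=True) for cell in row.find_all("td")]
        if len(cells) == 2:  # skips header rows, which use <th> cells
            items.append((cells[0], cells[1]))
    return items


def save_items(items, filename, output_dir=OUTPUT_DIR):
    """Write the pairs to a simple comma-separated file and return its path."""
    os.makedirs(output_dir, exist_ok=True)
    path = os.path.join(output_dir, filename)
    with open(path, "w", encoding="utf-8") as handle:
        for name, value in items:
            handle.write(f"{name},{value}\n")
    return path


def scrape(page_count):
    """Fetch, parse, and save each page, sleeping between requests."""
    for page in range(1, page_count + 1):
        html = fetch_page(f"{BASE_URL}?page={page}")
        save_items(parse_items(html), f"page_{page}.csv")
        time.sleep(DELAY_SECONDS)
```

Calling `scrape(3)` would fetch three pages from the placeholder URL and write one file per page into the `data` folder. The `time.sleep` call is there to stay polite to the site, and `os.makedirs(..., exist_ok=True)` makes the script safe to run more than once.
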
Install the required packages listed in requirements.txt
by running the following command:
uv pip install -r requirements.txt
When new packages are installed and requirements.txt
needs to be updated, simply run the following command:
uv pip freeze > requirements.txt
I am trying to document this journey and other cool tech stuff! Find it here:
- Twitter: @SurrealDotTxt
- Newsletter: https://tilincode.substack.com

Made with love by Surreal ^_^