surreal30/Food-Data-Scrapper

About

This repository holds the web scraping scripts for the Is-My-Food-Healthy repo. That app requires lots and lots of data, and entering it manually is a mammoth task even with a team of 20 people. These scripts automate it.

Why is the website URL hidden?

That's a very intelligent question! Well, the reason is that I scraped this data from a site and I am not sure of the legality, so it's better not to mention it. And yes, that site was really, really helpful.

What is missing?

Currently I am researching sites to scrape for data on ingredients, artificial colours, preservatives, etc. Once that is done, I will write scripts for those as well and add them to this repo.

Get started

Requirements

  • Python 3
  • requests library (needs to be installed separately)
  • BeautifulSoup from bs4 (needs to be installed separately)
  • time library (part of the Python standard library)
  • os library (part of the Python standard library)
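
To confirm that the requirements above are available before running a script, a small check like the following may help. This is only a sketch and not part of the repo: the function name missing_modules and the REQUIRED_MODULES list are made up for illustration, and it assumes the import names requests, bs4, time and os.

```python
import importlib.util

# Import names for the dependencies listed above (assumed, for illustration).
# BeautifulSoup is published on PyPI as "beautifulsoup4" but imported as "bs4".
REQUIRED_MODULES = ["requests", "bs4", "time", "os"]


def missing_modules(names):
    """Return the module names from `names` that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

If anything is reported as missing, the install step in the next section should take care of it.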

How to install

Install the required packages listed in requirements.txt by running the following command:

uv pip install -r requirements.txt

Generating requirements.txt

When new packages are installed and requirements.txt needs to be updated, simply run the following command:

uv pip freeze > requirements.txt

I am trying to document this journey and other cool tech stuff! Find it here:

  • Twitter: @SurrealDotTxt
  • Newsletter: https://tilincode.substack.com

Made with love by Surreal ^_^