My Application

This project was generated with LoopBack.

  • scrapper.js recursively scrapes URLs, following the links found on each page
  • URLs are stored in the Link collection; a uniqueness constraint on url prevents duplicate links from being inserted
  • cheerio is used to parse the fetched HTML so that links can be extracted from it
  • After the first URL is scraped, a concurrency limit of 5 is maintained for the remaining URLs
  • CSV output is generated by streaming the Link model data and processing it in batches
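
The behaviour listed above can be illustrated with a minimal in-memory sketch. It is not the code in scrapper.js: the names crawl, fetchLinks, escapeCsv and csvBatches are hypothetical, a Set stands in for the unique constraint on the Link collection, and a plain array stands in for the streamed Link model data. fetchLinks is assumed to be any function that takes a URL and returns a Promise of the URLs found on that page.

```javascript
// Hypothetical sketch: these names are illustrative, not taken from scrapper.js.
const CONCURRENCY = 5;

// Crawl from startUrl and resolve with every distinct URL discovered.
// fetchLinks(url) is assumed to return a Promise of an array of URLs.
async function crawl(startUrl, fetchLinks, limit = CONCURRENCY) {
  const seen = new Set([startUrl]); // stands in for the unique constraint on url
  const queue = [];

  // Record URLs that have not been seen yet and queue them for fetching.
  const enqueue = (urls) => {
    for (const url of urls) {
      if (!seen.has(url)) {
        seen.add(url);
        queue.push(url);
      }
    }
  };

  // The first URL is fetched on its own; its links seed the queue.
  enqueue(await fetchLinks(startUrl));

  let active = 0;
  await new Promise((resolve, reject) => {
    const pump = () => {
      if (queue.length === 0 && active === 0) return resolve();
      // Start new fetches until the concurrency limit is reached.
      while (active < limit && queue.length > 0) {
        const url = queue.shift();
        active += 1;
        fetchLinks(url)
          .then(enqueue)
          .then(() => {
            active -= 1;
            pump();
          })
          .catch(reject);
      }
    };
    pump();
  });

  return [...seen];
}

// Quote a CSV field when it contains a comma, a quote or a line break.
function escapeCsv(value) {
  const text = String(value);
  return /[",\n\r]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Yield CSV text one batch of rows at a time instead of building one big string.
function* csvBatches(rows, batchSize) {
  for (let i = 0; i < rows.length; i += batchSize) {
    const batch = rows.slice(i, i + batchSize);
    yield batch.map((row) => escapeCsv(row.url)).join('\n') + '\n';
  }
}
```

In a real run, fetchLinks would download the page and extract its links, and each chunk yielded by csvBatches would be written to a file stream rather than collected in memory.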

Steps to run the project

  • Clone the project repository
  • Inside the project directory, run npm install
  • Then run node server/scripts/scrapper.js