Online Twitter and Reddit topic classification with a real-time clustering dashboard.

baptiste-pasquier/trend-tracker

Trend-Tracker

1. Prerequisites

  • Sign up for a Twitter developer account on this link
  • Create a Bearer Token (documentation)
  • Fill in the field BEARER_TOKEN in the .env file
  • Create a Reddit developer application on this link (documentation)
  • Fill in the fields CLIENT_ID, SECRET_TOKEN, USERNAME and PASSWORD in the .env file
  • Install and run Kafka (documentation)
  • Create a MongoDB database in the cloud (free) or install the server (documentation)
  • Fill in the field CONNECTION_STRING in the .env file
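
Before starting any service, it can help to confirm that every field named above is present in the .env file. The following is a minimal sketch, not part of the project: it assumes the .env file uses plain KEY=VALUE lines and sits in the current directory, and the helper names `parse_env` and `missing_fields` are hypothetical.

```python
from pathlib import Path

# Field names taken from the prerequisites list above.
REQUIRED_FIELDS = [
    "BEARER_TOKEN",
    "CLIENT_ID",
    "SECRET_TOKEN",
    "USERNAME",
    "PASSWORD",
    "CONNECTION_STRING",
]


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and # comments (assumed format)."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_fields(text: str) -> list[str]:
    """Return the required fields that are absent or empty."""
    values = parse_env(text)
    return [name for name in REQUIRED_FIELDS if not values.get(name)]


if __name__ == "__main__":
    env_path = Path(".env")  # assumed location: current working directory
    text = env_path.read_text() if env_path.exists() else ""
    missing = missing_fields(text)
    if missing:
        print("Missing .env fields:", ", ".join(missing))
    else:
        print("All required .env fields are set.")
```

The check only verifies that each field has a non-empty value; it does not validate the credentials against Twitter, Reddit, or MongoDB.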

2. Usage with Docker

docker-compose -f docker-compose.yml up

3. Development

3.1. Installation

  1. Clone the repository:
git clone https://github.com/baptiste-pasquier/trend-tracker
  2. Install the project with Poetry:
poetry install
  • Alternatively, with pip:
pip install -e .
  3. Install the pre-commit hooks:
pre-commit install

3.2. Usage with CLI

Warning: Each script must be run in a separate console.

  1. Twitter streaming:
python all_services/ingest_tweets/app.py
  2. Reddit streaming:
python all_services/ingest_reddit/app.py
  3. Data preprocessing:
python all_services/tsf_data/app.py
  4. Data clustering:
python all_services/cluster_data/app.py
  5. Data storage on MongoDB:
python all_services/store_data/app.py
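
As an alternative to opening five consoles, the scripts above could be started as separate processes from a single launcher. This is only a sketch under assumptions: the launcher and its names (`SERVICES`, `build_commands`, `run_all`, the `--run` flag) are hypothetical and not part of the repository, it must be run from the repository root, and the output of all services ends up interleaved in one console.

```python
import subprocess
import sys

# Script paths exactly as listed in the steps above, in the same order.
SERVICES = [
    "all_services/ingest_tweets/app.py",
    "all_services/ingest_reddit/app.py",
    "all_services/tsf_data/app.py",
    "all_services/cluster_data/app.py",
    "all_services/store_data/app.py",
]


def build_commands(python: str = sys.executable) -> list[list[str]]:
    """Build one command line per service script."""
    return [[python, script] for script in SERVICES]


def run_all() -> None:
    """Start every service as its own process and wait until they exit."""
    procs = [subprocess.Popen(cmd) for cmd in build_commands()]
    try:
        for proc in procs:
            proc.wait()
    except KeyboardInterrupt:
        pass  # Ctrl+C: fall through and stop whatever is still running
    finally:
        for proc in procs:
            if proc.poll() is None:
                proc.terminate()


if __name__ == "__main__":
    if "--run" in sys.argv[1:]:
        run_all()
    else:
        # Dry run by default: only print the commands that would be started.
        for cmd in build_commands():
            print(" ".join(cmd))
```

Without `--run` the launcher only prints the commands, which makes it easy to check the paths before starting anything. Separate consoles remain the documented way to run the services and keep each log readable on its own.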

Real-time visualization:

streamlit run streamlit_app.py
