Document search engine project using TF-IDF and Google's Universal Sentence Encoder model
Updated May 1, 2023 - Jupyter Notebook
Tunable full-text search engine in JavaScript that (1) works natively in web apps such as those built on Express.js; (2) is easy to customize (via BM25) for specific types of documents (e.g. tweets, scientific journals); and (3) is deployable on either the client side or the server side.
Web app that matches a resume to a job type using an NLP SVM classifier model. Data is collected via web scraping. The uploaded resume is converted from PDF to text using OCR.
Implementation of a Vector Space Retrieval Model using TF-IDF and cosine similarity on the Cranfield document corpus
E-Commerce Recommendation System
Code for UCSD CSE 258 Web Mining and Recommender Systems
An NLP model that detects fake news by classifying a piece of news as REAL or FAKE, trained on a dataset provided by Kaggle.
This case study shows how to create a model for text analysis and classification and deploy it as a web service in the Azure cloud to automatically classify support tickets. This project is a proof of concept made by Microsoft (Commercial Software Engineering team) in collaboration with Endava http://endava.com/en
Extractive text summarizer based on a TF-IDF text representation (an example)
Web search engine that retrieves the most relevant web pages for a user's search query from pages crawled on the UIC domain
Twitter Sentiment Analysis
This project uses the TF-IDF (Term Frequency-Inverse Document Frequency) algorithm to measure the similarity between text documents.
Check out my adventures into NLP here.
Text classification using the Naive Bayes algorithm
Detect real or fake news. The goal is to build a model that accurately classifies a piece of news as REAL or FAKE. Using sklearn, we build a TfidfVectorizer on the provided dataset, then initialize a PassiveAggressiveClassifier and fit the model. Finally, the accuracy score and the confusion matrix tell us how well the model fares.
Recommendation systems for Yelp (collaborative filtering & content-based)
Recommends anime using content-based filtering (TF-IDF vectorization with a sigmoid kernel) and collaborative filtering (KNN)
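Several of the projects listed above rank or compare documents using TF-IDF weights and cosine similarity. A minimal pure-Python sketch of that idea is shown here. It is not taken from any of the listed repositories; the function names (`build_idf`, `vectorize`, `rank`), the whitespace tokenizer, and the smoothed IDF formula are assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_idf(docs):
    """Compute a smoothed inverse document frequency for every corpus term."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))  # count each term once per document
    # Assumed smoothing: ln((1 + N) / (1 + df)) + 1, so no weight is ever zero.
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def vectorize(text, idf):
    """Map text to a sparse {term: tf * idf} vector; unknown terms are dropped."""
    counts = Counter(tokenize(text))
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return (doc_index, score) pairs sorted from most to least similar."""
    idf = build_idf(docs)
    query_vec = vectorize(query, idf)
    scores = [(i, cosine_similarity(query_vec, vectorize(d, idf)))
              for i, d in enumerate(docs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    corpus = [
        "the cat sat on the mat",
        "dogs chase cats",
        "stock markets fell sharply",
    ]
    for index, score in rank("cat mat", corpus):
        print(index, round(score, 3), corpus[index])
```

Because the tokenizer does no stemming, "cat" and "cats" are treated as different terms; a real search engine would typically normalize tokens before weighting them.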