Java data streaming pipeline for Elasticsearch.
Enables NLP Naive Bayes classification with k-fold cross-validation on existing datasets.
Inspired by the following Python Kaggle notebooks:
- https://www.kaggle.com/jamesmcguigan/nlp-naive-bayes
- https://www.kaggle.com/jamesmcguigan/nlp-tf-idf-classifier
- https://www.kaggle.com/jamesmcguigan/nlp-logistic-regression
- https://www.kaggle.com/jamesmcguigan/nlp-laser-embeddings-keras
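
The k-fold cross-validation mentioned above can be illustrated with a minimal sketch of the index-splitting step. This is not the repository's implementation: the class `KFoldSplit`, its methods `folds` and `trainingIndices`, and the fixed seed are hypothetical names chosen for illustration, and the sketch assumes the dataset is addressed by integer row indices.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper for illustration only; not part of this repository's API.
class KFoldSplit {

    /** Shuffles the indices 0..n-1 and deals them round-robin into k folds of near-equal size. */
    static List<List<Integer>> folds(int n, int k, long seed) {
        if (k < 2 || k > n) {
            throw new IllegalArgumentException("need 2 <= k <= n");
        }
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        // A fixed seed makes the split reproducible between runs.
        Collections.shuffle(indices, new Random(seed));

        List<List<Integer>> result = new ArrayList<>();
        for (int f = 0; f < k; f++) {
            result.add(new ArrayList<>());
        }
        for (int i = 0; i < n; i++) {
            result.get(i % k).add(indices.get(i));
        }
        return result;
    }

    /** Training indices for one round: every fold except the held-out one. */
    static List<Integer> trainingIndices(List<List<Integer>> folds, int heldOut) {
        List<Integer> train = new ArrayList<>();
        for (int f = 0; f < folds.size(); f++) {
            if (f != heldOut) {
                train.addAll(folds.get(f));
            }
        }
        return train;
    }

    public static void main(String[] args) {
        List<List<Integer>> split = folds(10, 5, 42L);
        for (int f = 0; f < split.size(); f++) {
            System.out.println("round " + f + ": test=" + split.get(f)
                    + " train=" + trainingIndices(split, f));
        }
    }
}
```

In each of the k rounds, a classifier would be trained on `trainingIndices(split, f)` and scored on `split.get(f)`; averaging the k scores gives the cross-validated estimate.
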
Initialization of the ElasticSearch database is handled in this repository:
For local installation instructions see: