AI :: Artificial Intelligence, Cognitive Science, Machine Learning {(Un)Supervised/RL}, Neural Nets, NLP, etc.

§1. AI
§2. DATA SCIENCE
§3. MACHINE LEARNING
§4. NLP
§5. REINFORCEMENT LEARNING
§6. Reproducibility
§7. SUPERVISED LEARNING
§8. UNSUPERVISED LEARNING
- Neural Networks
  - Artificial Neural Network

§1. AI

simpleai :: Simple artificial intelligence utilities.

§2. DATA SCIENCE

engarde :: A library for defensive data analysis.
gqn-datasets :: Datasets used to train Generative Query Networks (GQNs) in the ‘Neural Scene Representation and Rendering’ paper.
python-seminar :: Python Computing for Data Science.

Resources

General Assembly's Data Science course in Washington, DC: Jupyter notebooks for DAT4 and DAT8.
Public repository of course materials for the Spring 2013 session of Introduction to Data Science, an online Coursera course.
General guidelines (table) for choosing a statistical analysis, adapted from Choosing the Correct Statistic, developed by James D. Leeper, Ph.D.
LearnDataScience :: A collection of Data Science Learning materials in the form of IPython Notebooks with associated data sets.

§3. MACHINE LEARNING.

ConfidenceWeighted :: Confidence weighted classifier.
- Papers : (http://www.cs.jhu.edu/~mdredze/publications/icml_variance.pdf) and (http://icml.cc/2012/papers/86.pdf)
Faceless :: A port of ICAAM library by Luca Vezzaro to Python for Face Tracking based on Active Appearance Models.
featureforge :: A set of tools for creating and testing machine learning features, with a scikit-learn compatible API.
Foxhound :: Scikit-learn inspired library for gpu-accelerated machine learning.
fuel :: A data pipeline framework for machine learning.
hips-lib :: Library of common tools for machine learning research.
MachineLearning :: Materials for the Wednesday Afternoon Machine Learning workshop.
Machine Learning Video Library.
Masque :: Experiments on Deep Learning and Emotion Classification.
MILK :: Machine Learning Toolkit.
MLOSS.org
MLTRP :: Machine Learning and the Traveling Repairman Problem.
Morris_counter :: A probabilistic Morris counter (approximately counts up to 2^n events while storing only the exponent n, e.g. in a single byte).
MLTP :: ML Timeseries Platform.
ProFET :: Protein Feature Engineering Toolkit for Machine Learning.
pyHANSO :: Python Implementation of Michael Overton's HANSO (Hybrid Algorithm for Non-Smooth Optimization).
pyklsh :: Python implementation of Kernelized Locality Sensitive Hashing
PyML :: An interactive, object-oriented framework for machine learning written in Python, with support for classification and regression, including Support Vector Machines (SVM), feature selection, model selection, syntax for combining classifiers, and methods for assessing classifier performance.
- PyML Tutorial
Rambutan :: A Python wrapper for caffe that aims to provide a simple, pythonic interface so that users can define, train, and evaluate deep models in only a few lines of code. It requires that caffe and pycaffe are both built properly.
RAMP :: Rapid Machine Learning Prototyping in Python.
python-recsys :: A python library for implementing a recommender system.
Sixpack :: A language-agnostic A/B testing framework. Documentation.
TPOT :: A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming. A blog post explaining the same: http://www.randalolson.com/2016/05/08/tpot-a-python-tool-for-automating-data-science/
PyCM :: A multi-class confusion matrix library written in Python that accepts both input data vectors and a direct matrix. It is a tool for post-classification model evaluation that supports most class and overall statistics parameters. PyCM is the swiss-army knife of confusion matrices, targeted mainly at data scientists who need a broad array of metrics for predictive models and an accurate evaluation of a large variety of classifiers.
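
The Morris_counter entry above refers to approximate counting. As a rough illustration of the idea only (this is a minimal sketch, not the code of the linked project; the class name `MorrisCounter` and its methods are hypothetical), the counter keeps just an exponent and bumps it with probability 2^-exponent:

```python
import random


class MorrisCounter:
    """Approximate counter that stores only a small exponent (hypothetical sketch)."""

    def __init__(self, rng=None):
        self.exponent = 0  # the only state kept; stays small even for huge counts
        self._rng = rng or random.Random()

    def increment(self):
        # Bump the exponent with probability 2**-exponent.
        if self._rng.random() < 2.0 ** -self.exponent:
            self.exponent += 1

    def estimate(self):
        # 2**exponent - 1 is an unbiased estimate of the number of increments.
        return 2 ** self.exponent - 1
```

Calling `increment()` once per event and reading `estimate()` gives a noisy count; the trade-off is a large variance in exchange for a state that fits in a few bits.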

Resources

What I Learned Implementing a Classifier from Scratch in Python.
DataSciencePython :: common data analysis and machine learning tasks using python.
Examples from "Thoughtful Machine Learning".
CIML :: A Course in Machine Learning. This repository contains the source code for the CIML book (see http://ciml.info/) as well as any course materials that seem useful (slides, documents, labs, etc.).
deepframeworks :: An evaluation of Deep Learning Frameworks.
A Machine Learning course by Prof. Yaser Abu-Mostafa with videos on Youtube.
study :: A study of interesting algorithms.
Machine Learning Algorithm Cheat Sheet by Laura D Hamilton.
machine-learning-cheat-sheet :: Classical equations and diagrams in machine learning by @soulmachine.
Cheatsheet for choosing the right estimator.
Machine Learning cheatsheet.
Big Data Machine Learning Patterns for Predictive Analytics By Ricky Ho.
A HN site for ML.
Source Code for the book Building Machine Learning Systems with Python.

§3.1. Deep Learning.

https://en.wikipedia.org/wiki/Comparison_of_deep_learning_software
TRAINS :: Auto-Magical Experiment Manager & Version Control for AI. Training production-grade deep learning models is a glorious but messy process. TRAINS tracks and controls the process by associating code version control, research projects, performance metrics, and model provenance.

Resources

DeepLearningTutorials :: Deep Learning Tutorial notes and code. See the wiki for more info.
Deep Learning Part 1: Comparison of Symbolic Deep Learning Frameworks.
handson-ml :: A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in python using Scikit-Learn and TensorFlow.
handson-ml2 :: Version-2 of the series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.

§3.2. Classification Algorithms.

K-Nearest-Neighbors-with-Dynamic-Time-Warping :: Python implementation of KNN and DTW classification algorithm.
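
For readers unfamiliar with the combination named in the entry above, the following is a minimal sketch of nearest-neighbour classification with a dynamic time warping (DTW) distance. It is not taken from the linked repository; the function names `dtw_distance` and `knn_dtw_predict` are hypothetical, and the sketch assumes plain lists of numbers with an absolute-difference step cost.

```python
from collections import Counter


def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] also matches an earlier element of b
                cost[i][j - 1],      # b[j-1] also matches an earlier element of a
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def knn_dtw_predict(train, query, k=1):
    """Majority label among the k training sequences closest to query under DTW.

    train is a list of (sequence, label) pairs.
    """
    neighbours = sorted(train, key=lambda item: dtw_distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]
```

DTW lets sequences of different lengths or speeds be compared, which is why it is commonly paired with KNN for time series; the quadratic cost per comparison is the main drawback of this naive version.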

Resources

Naive Bayes

Blog on How To Implement Naive Bayes From Scratch in Python

§3.3. Graph Theory.

fluffy-graph :: NP-hard game where you find isomorphic graphs.
PyMarkovChain :: Simple Markov chain implementation in Python.
python-igraph :: Python interface for igraph. The code and issue tracker are on GitHub.

Resources

Amazon Machine Learning: use cases and a real example in Python.
Some machine learning libraries
Visualizing Algorithms
Machine Learning HowTo from HN.
Alternating Least Squares Method for Collaborative Filtering
Using Machine Learning To Pick Your Lottery Numbers
How a Russian mathematician constructed a decision tree - by hand - to solve a medical problem
MST → python algorithms for minimum spanning trees.
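
As a pointer to what a minimum spanning tree algorithm looks like, here is a minimal sketch of Kruskal's algorithm with a union-find structure. It is not the code behind the MST link above; the function name `kruskal_mst` and the edge format `(weight, u, v)` with nodes numbered from 0 are assumptions made for illustration.

```python
def kruskal_mst(num_nodes, edges):
    """Return the edges of a minimum spanning forest.

    edges is an iterable of (weight, u, v) tuples; nodes are 0..num_nodes-1.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for weight, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            # The edge joins two different components, so it cannot form a cycle.
            parent[root_u] = root_v
            tree.append((weight, u, v))
    return tree
```

Sorting the edges dominates the running time; on a disconnected graph the function returns one tree per component.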

§3.4. GPU.

cuML :: A suite of libraries that implement machine learning algorithms and mathematical primitive functions, sharing compatible APIs with other RAPIDS projects.

§4. NLP.

Broca :: Various useful NLP algos and utilities for rapid NLP prototyping.
commonast :: A common AST description for Python.
Fairseq :: A sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks.
Gensim :: A Python library for topic modelling, document indexing and similarity retrieval with large corpora, for natural language processing (NLP) and information retrieval (IR). Source Code.
Geiger :: An automated system for grouping similar comments and then identifying the best representative from each group.
Glove-python :: Toy Python implementation of http://www-nlp.stanford.edu/projects/glove/
Gramformer :: A framework for detecting, highlighting and correcting grammatical errors on natural language text.
IEPY :: An open source tool for Information Extraction focused on Relation Extraction.
JPKyteaTokenizer :: A Japanese tokenizer with KyTea for nltk.
Mykytea-python :: Python wrapper for KyTea.
NLTK :: Natural Language ToolKit to manipulate human language data. Source Code
nupic.fluent :: A platform for building language / NLP-based applications using NuPIC and CEPT.
Quepy :: A python framework to transform natural language questions to queries in a database query language.
Parrot_Paraphraser :: Parrot is a paraphrase based utterance augmentation framework purpose built to accelerate training NLU models.
PLY :: Python Lex-Yacc. http://www.dabeaz.com/ply/index.html
SAMR :: An entry to kaggle's 'Sentiment Analysis on Movie Reviews' competition.
Suggester :: The heart for full-text auto-complete web services.
TextGridTools :: Read, write, and manipulate Praat TextGrid files with Python.
txtai :: Builds an AI-powered index over sections of text; supports building text indices to perform similarity searches and to create extractive question-answering systems.
word_cloud :: A little word cloud generator in Python.

§4.1. Computational Linguistics

spaCy :: A library for advanced Natural Language Processing in Python and Cython. It comes with pretrained pipelines and currently supports tokenization and training for 60+ languages, with neural network models for tagging, parsing, named entity recognition, text classification and more.

§4.1.1. Named Entity Recognition.

CLNER :: The code is for the ACL-IJCNLP 2021 paper "Improving Named Entity Recognition by External Context Retrieving and Cooperative Learning".
mt-dnn :: This PyTorch package implements the Multi-Task Deep Neural Networks (MT-DNN) for Natural Language Understanding.

§4.2. Digital Humanities.

NLP-Notebooks-Newspaper-Collections :: NLP Notebooks for Newspaper Collections are aimed particularly at digital humanities scholars who use newspapers as a source.

§4.3. Screen Reading

wordgraph :: This project supports creating English-language text from a graph description for those doing screen reading for vision-impaired people, or just people who like to listen to graphs while jogging, or just to get a handle on what's going on.
Resources
- STT with HMM :: Single Speaker Speech Recognition with Hidden Markov Models.

§4.4. Speech Recognition

Speech recognition software for Linux
Dragonfly :: Dragonfly is a speech recognition framework. It is a Python package which offers a high-level object model and allows its users to easily write scripts, macros, and programs which use speech recognition. Documentation.
ParlAI :: A framework for training and evaluating AI models on a variety of openly available dialog datasets. http://parl.ai
speech-processing :: A Python framework for speech processing.

Resources

An Introduction to Natural Language Processing that introduces text-based machine learning techniques (e.g. N-grams, corpora) in order to do text classification and analysis.

§4.5. Transformers.

BERT :: TensorFlow code and pre-trained models for 24 smaller BERT models (English only, uncased, trained with WordPiece masking) referenced in Well-Read Students Learn Better: On the Importance of Pre-training Compact Models.
Transformers :: State-of-the-art Natural Language Processing for Jax, PyTorch and TensorFlow.

§5. REINFORCEMENT LEARNING

bsuite :: A collection of carefully-designed experiments that investigate core capabilities of a reinforcement learning (RL) agent.
Tensortrade :: An open source reinforcement learning framework for training, evaluating, and deploying robust trading agents.

§6. Reproducibility

AIQC :: An open source framework for rapid & reproducible deep learning.

§7. SUPERVISED LEARNING

tensor2tensor :: Tensor2Tensor (T2T) Transformers is a modular and extensible library, with accompanying binaries, for supervised learning with TensorFlow, including support for sequence tasks. It is actively used and maintained by researchers and engineers within the Google Brain team.

Resources

ml_cheat_sheet :: Supervised learning superstitions cheat sheet.

§8. UNSUPERVISED LEARNING

GAN

Jokerise :: Jokeriser with CycleGAN.

Neural Networks

BinaryConnect :: Training Deep Neural Networks with binary weights during propagations.
BinaryNet :: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1.
NAMAS :: Neural Attention Model for Abstractive Summarization.
SparkNet :: Distributed Neural Networks for Spark.

Artificial Neural Network

pylearn2 :: A Machine Learning library based on Theano.
TensorFlow :: Open source software library for numerical computation using data flow graphs. Source code on GH.
- models :: Models built with TensorFlow.
- Resources: TensorFlow-Tutorials :: Simple tutorials using Google's TensorFlow Framework.
theano-nlp :: Tools and datasets for NLP in Theano.

Pre-Trained Models

Spiral :: A pre-trained model for unconditional 19-step generation of CelebA-HQ images.

Resources

An introduction to Recurrent Neural Networks.
TensorFlow-Book :: Accompanying source code for Machine Learning with TensorFlow. Refer to the book for step-by-step explanations. http://www.tensorflowbook.com
