This repository contains the source code for all projects developed for the ML4NLP class at the University of Zurich during the fall semester of 2021.
The projects cover the following topics:
- Language Identification on Tweets using sklearn.
- Building Word Embeddings with PyTorch or TensorFlow.
- Language Identification on Tweets with CNNs using PyTorch and TensorFlow.
- Paper dissection of "Improving Language Understanding by Generative Pre-Training" (the GPT-1 paper).
- Named Entity Recognition using and fine-tuning pretrained language models.
- Topic Modeling based on Latent Dirichlet Allocation.
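
To build a PDF report from a README with pandoc (here `ex01_report.pdf`, with 2 cm margins), run: `pandoc -V geometry:margin=2cm -o ex01_report.pdf README.md`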