A Play with Gestures

Sign Language Recognition and Production with TensorFlow and scikit-learn, developed as a final-year project for a Bachelor's degree in Computer Science.

Technologies

Prerequisites

  • Install Anaconda
  • Open a Jupyter Notebook

Sign Language Recognition

This module differentiates between various static hand gestures (i.e., images) using representative features, with the core objective of maximizing classification accuracy on the test dataset.

Datasets

Feature Extraction

  • Hand coordinates
  • Convolutional features
  • Convolutional features + finger angles
  • Convolutional features on image with only hand edges
  • Linear combination of BRISK keypoints on edge image
  • Convolutional features on BRISK keypoint-infused image

Take a look at `Creating_datasets.ipynb` and `feature_extraction.ipynb` for the code.
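
The convolutional features above are typically obtained from a pretrained CNN used as a fixed feature extractor. A minimal sketch of that idea, assuming a VGG16 backbone (the README does not name the backbone; the notebooks define the actual extractor):

```python
# Sketch: extract convolutional features with a pretrained CNN.
# VGG16 is an assumption here, not necessarily the project's backbone.
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input

# Backbone without the classification head; global average pooling
# collapses the final feature maps into one vector per image.
backbone = VGG16(weights="imagenet", include_top=False, pooling="avg")

def extract_features(images):
    """images: float array of shape (n, 224, 224, 3) with values in [0, 255]."""
    x = preprocess_input(images.copy())
    return backbone.predict(x)  # shape: (n, 512)

features = extract_features(np.random.rand(4, 224, 224, 3) * 255)
print(features.shape)
```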

Dimensionality Reduction

PCA is applied only to the convolution-based features, since they are high-dimensional.
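
A minimal scikit-learn sketch of this step, using random stand-ins for the feature matrices and an illustrative component count:

```python
# Sketch: PCA on the convolutional features. n_components=100 is
# illustrative, not the project's actual setting.
import numpy as np
from sklearn.decomposition import PCA

# Stand-ins for the real convolutional feature matrices.
conv_features_train = np.random.rand(500, 512)
conv_features_test = np.random.rand(100, 512)

pca = PCA(n_components=100)
X_train = pca.fit_transform(conv_features_train)  # fit on training data only
X_test = pca.transform(conv_features_test)        # reuse the same projection
print(pca.explained_variance_ratio_.sum())        # fraction of variance kept
```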

Classification

Ensemble methods

  • Random Forest - takes the aggregate result from a collection of decision trees
  • XGBoost - each new regression tree corrects the errors of the previous trees

Take a look at all `models-Ensemble methods` .ipynb files for the code.
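
A minimal sketch of both classifiers on stand-in features (hyperparameters are illustrative, not the project's settings; XGBoost requires the `xgboost` package):

```python
# Sketch: Random Forest and XGBoost on extracted feature vectors.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X = np.random.rand(1000, 100)             # e.g. PCA-reduced features
y = np.random.randint(0, 26, size=1000)   # e.g. 26 gesture classes
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Random Forest: aggregates votes from many independently trained trees.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# XGBoost: each boosted tree fits the residual errors of the previous ones.
xgb = XGBClassifier(n_estimators=200, eval_metric="mlogloss").fit(X_tr, y_tr)

for name, model in [("Random Forest", rf), ("XGBoost", xgb)]:
    print(name, accuracy_score(y_te, model.predict(X_te)))
```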

Neural Networks

  • Artificial Neural Networks

Take a look at `models-ANN.ipynb` for the code.
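
A minimal Keras sketch of such a feed-forward ANN (layer sizes and class count are illustrative assumptions, not the notebook's architecture):

```python
# Sketch: a small dense network on the feature vectors.
from tensorflow.keras import layers, models

def build_ann(input_dim, num_classes):
    model = models.Sequential([
        layers.Input(shape=(input_dim,)),
        layers.Dense(256, activation="relu"),
        layers.Dropout(0.3),                    # regularization
        layers.Dense(128, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

ann = build_ann(input_dim=100, num_classes=26)
ann.summary()
```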

  • Hybrid ANN - takes the original image and the edge image in separate neural blocks, concatenates their outputs, and passes the result through a final neural block (sketched below, after the code pointer)



Take a look at all `models-hybrid ANN` .ipynb files for the code.
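
A minimal functional-API sketch of this two-branch design, with illustrative input shapes, layer sizes, and class count:

```python
# Sketch: hybrid ANN with separate blocks for the original image and
# the edge image, merged by concatenation before the final block.
from tensorflow.keras import layers, models

def conv_block(inp):
    x = layers.Conv2D(32, 3, activation="relu")(inp)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3, activation="relu")(x)
    x = layers.MaxPooling2D()(x)
    return layers.Flatten()(x)

img_in = layers.Input(shape=(64, 64, 3), name="original_image")
edge_in = layers.Input(shape=(64, 64, 1), name="edge_image")

merged = layers.concatenate([conv_block(img_in), conv_block(edge_in)])
x = layers.Dense(128, activation="relu")(merged)   # final neural block
out = layers.Dense(26, activation="softmax")(x)    # e.g. 26 gesture classes

hybrid = models.Model(inputs=[img_in, edge_in], outputs=out)
hybrid.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
               metrics=["accuracy"])
hybrid.summary()
```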

Sign Language Production

This module generates interpretable sign images by training a generative model on the image dataset itself. The main aim is to generate as many image classes as possible while maintaining image quality.

Datasets

Since the above dataset consists of videos, `Extracting video frames.ipynb` is used to extract still images from it.
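
A minimal OpenCV sketch of frame extraction (the video path and sampling stride are placeholders; the notebook holds the project's actual code):

```python
# Sketch: save every n-th frame of a video as a PNG image.
import os
import cv2

def extract_frames(video_path, out_dir, every_n=10):
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    idx = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:                  # end of video
            break
        if idx % every_n == 0:      # keep every n-th frame
            cv2.imwrite(f"{out_dir}/frame_{saved:05d}.png", frame)
            saved += 1
        idx += 1
    cap.release()
    return saved

print(extract_frames("sign_video.mp4", "frames"))  # placeholder paths
```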

Latent Vector

This is the input from which the generative model produces output images. It can be either random noise or noise generated by an encoder (`Autoencoder_generated_noise.ipynb`).
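
A minimal sketch of both options; the encoder below is an illustrative stand-in, not the architecture from `Autoencoder_generated_noise.ipynb`:

```python
# Sketch: two ways to produce the latent vectors fed to the generator.
import numpy as np
from tensorflow.keras import layers, models

latent_dim = 100  # illustrative latent size

# Option 1: plain Gaussian noise.
z_random = np.random.normal(size=(16, latent_dim))

# Option 2: noise produced by an encoder applied to real images
# (in the project, the encoder comes from a trained autoencoder).
encoder = models.Sequential([
    layers.Input(shape=(64, 64, 3)),
    layers.Conv2D(32, 3, strides=2, activation="relu", padding="same"),
    layers.Conv2D(64, 3, strides=2, activation="relu", padding="same"),
    layers.Flatten(),
    layers.Dense(latent_dim),
])
z_encoded = encoder.predict(np.random.rand(16, 64, 64, 3))
print(z_random.shape, z_encoded.shape)
```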

The following variants of the GAN are used in this project:

  • Deep Convolutional GAN - uses convolutional layers in the generator and discriminator

Take a look at all `DCGAN` .ipynb files for the code.
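
A minimal sketch of a DCGAN-style generator, which upsamples the latent vector into an image with transposed convolutions (sizes are illustrative; the notebooks define the real models and training loop):

```python
# Sketch: DCGAN generator mapping a latent vector to a 64x64 RGB image.
from tensorflow.keras import layers, models

def build_generator(latent_dim=100):
    return models.Sequential([
        layers.Input(shape=(latent_dim,)),
        layers.Dense(8 * 8 * 128),
        layers.Reshape((8, 8, 128)),
        layers.Conv2DTranspose(64, 4, strides=2, padding="same"),  # 16x16
        layers.BatchNormalization(),
        layers.LeakyReLU(0.2),
        layers.Conv2DTranspose(32, 4, strides=2, padding="same"),  # 32x32
        layers.BatchNormalization(),
        layers.LeakyReLU(0.2),
        layers.Conv2DTranspose(3, 4, strides=2, padding="same",
                               activation="tanh"),  # 64x64x3 in [-1, 1]
    ])

print(build_generator().output_shape)
```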

  • Wasserstein GAN - uses the Wasserstein loss, under which the critic outputs unbounded image scores rather than probabilities, which helps avoid mode collapse

Take a look at all `WGAN` .ipynb files for the code.
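
A minimal sketch of the Wasserstein losses, including the weight clipping the original WGAN uses to keep the critic Lipschitz:

```python
# Sketch: WGAN losses. The losses are simple differences of critic scores.
import tensorflow as tf

def critic_loss(real_scores, fake_scores):
    # The critic pushes real scores up and fake scores down.
    return tf.reduce_mean(fake_scores) - tf.reduce_mean(real_scores)

def generator_loss(fake_scores):
    # The generator pushes fake scores up.
    return -tf.reduce_mean(fake_scores)

def clip_critic_weights(critic, clip_value=0.01):
    # Weight clipping: the original WGAN's Lipschitz constraint.
    for w in critic.trainable_weights:
        w.assign(tf.clip_by_value(w, -clip_value, clip_value))
```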

  • WGAN with Gradient Penalty - a modified WGAN that replaces weight clipping with a gradient penalty, allowing the critic to learn more complex functions for a better mapping between real and fake images

Take a look at all `WGAN_with_GP` .ipynb files for the code.
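
A minimal sketch of the gradient penalty term, which pushes the critic's gradient norm toward 1 on points interpolated between real and fake images:

```python
# Sketch: WGAN-GP gradient penalty.
import tensorflow as tf

def gradient_penalty(critic, real_images, fake_images):
    batch = tf.shape(real_images)[0]
    eps = tf.random.uniform([batch, 1, 1, 1], 0.0, 1.0)
    interp = eps * real_images + (1.0 - eps) * fake_images
    with tf.GradientTape() as tape:
        tape.watch(interp)
        scores = critic(interp, training=True)
    grads = tape.gradient(scores, interp)
    norms = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2, 3]))
    return tf.reduce_mean((norms - 1.0) ** 2)

# Total critic loss: critic_loss(...) + lambda_gp * gradient_penalty(...),
# with lambda_gp commonly set to 10.
```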

Note:

Lower FID (Fréchet Inception Distance) scores, computed in `Fretchet Inception Distance.ipynb`, indicate better quality in the images generated by the GAN models.
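
For reference, FID compares the mean and covariance of Inception feature vectors of real and generated images. A minimal sketch with stand-in feature arrays (the feature extraction itself lives in the notebook):

```python
# Sketch: FID from two sets of feature vectors.
# FID = ||mu_r - mu_f||^2 + Tr(S_r + S_f - 2 * sqrt(S_r @ S_f))
import numpy as np
from scipy.linalg import sqrtm

def fid(real_feats, fake_feats):
    mu_r, mu_f = real_feats.mean(axis=0), fake_feats.mean(axis=0)
    sigma_r = np.cov(real_feats, rowvar=False)
    sigma_f = np.cov(fake_feats, rowvar=False)
    covmean = sqrtm(sigma_r @ sigma_f)
    if np.iscomplexobj(covmean):  # numerical noise can add tiny imaginary parts
        covmean = covmean.real
    return np.sum((mu_r - mu_f) ** 2) + np.trace(sigma_r + sigma_f - 2 * covmean)

print(fid(np.random.rand(256, 64), np.random.rand(256, 64)))
```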

Publication

If you are referring to this project for academic purposes, please cite the following paper:

Jana A, Krishnakumar SS. Sign Language Gesture Recognition with Convolutional-Type Features on Ensemble Classifiers and Hybrid Artificial Neural Network. Applied Sciences. 2022; 12(14):7303. https://doi.org/10.3390/app12147303