Plume ASR

Generates text from audio containing speech.

Prerequisites

# apt install libsndfile-dev ffmpeg

Features

ASR using Jasper (from NemoToolkit)
ASR using Wav2Vec2 (from fairseq)

Installation

To install the package and its dependencies, run:

python setup.py install

or, with pip:

pip install .[all]

The installation should work on Python 3.6 or newer. It is untested on Python 2.7.

Usage

Library

Jasper

from plume.models.jasper_nemo.asr import JasperASR
# Load the model from its config and the encoder/decoder checkpoints
asr_model = JasperASR("/path/to/model_config_yaml", "/path/to/encoder_checkpoint", "/path/to/decoder_checkpoint")
# wav_data is the WAV audio to transcribe; the call returns the text spoken in it
TEXT = asr_model.transcribe(wav_data)

Wav2Vec2

from plume.models.wav2vec2.asr import Wav2Vec2ASR
# Load the model from the CTC checkpoint, the wav2vec checkpoint and the target dictionary
asr_model = Wav2Vec2ASR("/path/to/ctc_checkpoint", "/path/to/w2v_checkpoint", "/path/to/target_dictionary")
# wav_data is the WAV audio to transcribe; the call returns the text spoken in it
TEXT = asr_model.transcribe(wav_data)
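
Both snippets take a wav_data argument without showing where it comes from. The sketch below is one possible way to produce it with ffmpeg, which is already listed under Prerequisites. It rests on assumptions this README does not confirm: that transcribe() accepts the raw bytes of a WAV file, and that the models expect 16 kHz mono 16-bit PCM (typical for pretrained Jasper and Wav2Vec2 checkpoints). The helper names build_ffmpeg_command and load_wav_bytes are hypothetical and not part of plume.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, dst, sample_rate=16000):
    """Return the ffmpeg argument list that converts src to mono 16-bit PCM WAV."""
    return [
        "ffmpeg",
        "-y",                      # overwrite dst if it already exists
        "-i", str(src),            # input in any format ffmpeg can decode
        "-ac", "1",                # downmix to a single channel
        "-ar", str(sample_rate),   # resample (16 kHz is an assumption)
        "-acodec", "pcm_s16le",    # 16-bit little-endian PCM samples
        str(dst),
    ]


def load_wav_bytes(src, dst="converted.wav", sample_rate=16000):
    """Convert src with ffmpeg and return the bytes of the resulting WAV file.

    Assumption: asr_model.transcribe() accepts these bytes as wav_data.
    """
    subprocess.run(build_ffmpeg_command(src, dst, sample_rate), check=True)
    return Path(dst).read_bytes()
```

With such a helper, wav_data = load_wav_bytes("/path/to/recording.mp3") could be passed to asr_model.transcribe(wav_data), provided the assumptions above hold for your model.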

Command Line

$ plume

Pretrained Models

Jasper: https://ngc.nvidia.com/catalog/models/nvidia:multidataset_jasper10x5dr/files?version=3
Wav2Vec2: https://github.com/pytorch/fairseq/blob/master/examples/wav2vec/README.md
