Wav2Vec2Bert ASR Inference Support #1778

Merged
merged 30 commits into from
Sep 13, 2024

Conversation

homink
Contributor

@homink homink commented Sep 11, 2024

This PR adds Wav2Vec2Bert ASR inference to the CTranslate2 framework, improving both speed and memory usage. To support inference, a Sigmoid activation function is added to compute the GLU activation, and asymmetric relative positional embedding logic is added to the Attention class. Compared to the HuggingFace implementation, the int8 quantized model shows a 12% increase in speed, a 61% reduction in GPU memory usage, and a 72% reduction in CPU memory usage when processing 300 audio files. Additionally, using an N-gram language model with pyctcdecode can further improve speech recognition accuracy. My environment includes an NVIDIA GeForce RTX 2080 11GB with CUDA 12.4, torch==2.12+cu12.1, and transformers==4.42.0.
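For readers unfamiliar with why GLU needs a Sigmoid op, here is a minimal, dependency-free sketch of the gated linear unit on a single feature vector. It is an illustration only and not the CTranslate2 implementation; the function names `sigmoid` and `glu` are assumptions made for this example.

```python
import math


def sigmoid(x):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def glu(features):
    """Gated linear unit over one feature vector (illustrative sketch).

    The vector is split into two halves: the first half carries the
    values, the second half is passed through a sigmoid and used as a
    gate that scales each value.
    """
    if len(features) % 2 != 0:
        raise ValueError("GLU expects an even number of features")
    half = len(features) // 2
    values, gates = features[:half], features[half:]
    return [v * sigmoid(g) for v, g in zip(values, gates)]


# A gate input of 0.0 gives sigmoid(0.0) == 0.5, so each value is halved.
halved = glu([2.0, 4.0, 0.0, 0.0])
```

The output has half as many features as the input, which is why the sigmoid is the only new elementwise operation the gate requires.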

@BBC-Esq

BBC-Esq commented Sep 11, 2024

Looking forward to this if it's implemented.

@homink
Contributor Author

homink commented Sep 11, 2024

This PR requires transformers>=4.41.0, which is not yet available in the test environment, so TestWav2Vec2Bert is skipped based on a version check. It will run once the installed version meets the requirement. @minhthuc2502, may I upgrade transformers to 4.41.0 in the Python test requirements? Could you please let me know?
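The version check mentioned above could look roughly like the following sketch. It is not the code from this PR; `parse_version` and `version_at_least` are hypothetical helpers written for illustration, and they compare only the leading numeric release components.

```python
import re


def parse_version(version):
    """Return the leading numeric components, e.g. "4.42.0.dev0" -> (4, 42, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def version_at_least(installed, required):
    """True if `installed` is the same as or newer than `required`."""
    a, b = parse_version(installed), parse_version(required)
    # Pad the shorter tuple with zeros so "4.41" compares equal to "4.41.0".
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Hypothetical check: an environment with 4.40.2 would skip the test.
should_skip = not version_at_least("4.40.2", "4.41.0")
```

In a pytest suite, a condition like this can be passed to `pytest.mark.skipif` together with `transformers.__version__`, so the test is reported as skipped rather than failed on older installations.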

@minhthuc2502
Collaborator

minhthuc2502 commented Sep 12, 2024

Thank you for your PR. It looks good to me.
I think you can upgrade transformers to 4.41.0 for the Python tests. It won't affect anything else.

@homink
Contributor Author

homink commented Sep 12, 2024

@minhthuc2502 it looks like transformers 4.41.0 breaks test_transformers_translation: the tests fail with an AttributeError on embed_scale, even though embed_scale appears to be available. Any ideas?

python/tests/test_transformers.py::TestWav2Vec2::test_transformers_wav2vec2[facebook/wav2vec2-large-robust-ft-swbd-300h-expected_transcription0-cpu] PASSED [ 69%]
python/tests/test_transformers.py::TestWav2Vec2Bert::test_transformers_wav2vec2bert[hf-audio/wav2vec2-bert-CV16-en-expected_transcription0-cpu] PASSED [ 70%]
python/tests/test_translator.py::test_invalid_model_path PASSED          [ 71%]
python/tests/test_translator.py::test_unicode_path PASSED                [ 71%]
python/tests/test_translator.py::test_invalid_model_type PASSED          [ 72%]
python/tests/test_translator.py::test_invalid_device_settings PASSED     [ 72%]
python/tests/test_translator.py::test_contains_model PASSED              [ 73%]
python/tests/test_translator.py::test_get_supported_compute_types PASSED [ 73%]
python/tests/test_translator.py::test_translator_properties PASSED       [ 74%]
python/tests/test_translator.py::test_compute_type PASSED                [ 74%]
python/tests/test_translator.py::test_batch_translation[0] PASSED        [ 75%]
python/tests/test_translator.py::test_batch_translation[1] PASSED        [ 75%]
python/tests/test_translator.py::test_batch_translation_async PASSED     [ 76%]
python/tests/test_translator.py::test_iterable_translation PASSED        [ 77%]
python/tests/test_translator.py::test_token_streaming[True] PASSED       [ 77%]
python/tests/test_translator.py::test_token_streaming[False] PASSED      [ 78%]
python/tests/test_translator.py::test_token_streaming_exception PASSED   [ 78%]
python/tests/test_translator.py::test_callback_hypothesis_id PASSED      [ 79%]
python/tests/test_translator.py::test_callback_batch_id PASSED           [ 79%]
python/tests/test_translator.py::test_file_translation PASSED            [ 80%]
python/tests/test_translator.py::test_raw_file_translation PASSED        [ 80%]
python/tests/test_translator.py::test_file_translation_with_prefix PASSED [ 81%]
python/tests/test_translator.py::test_raw_file_translation_with_prefix PASSED [ 81%]
python/tests/test_translator.py::test_empty_translation PASSED           [ 82%]
python/tests/test_translator.py::test_invalid_translation_options PASSED [ 83%]
python/tests/test_translator.py::test_invalid_translation_options_async PASSED [ 83%]
python/tests/test_translator.py::test_hard_target_prefix PASSED          [ 84%]
python/tests/test_translator.py::test_hard_target_prefix_with_vmap[1] PASSED [ 84%]
python/tests/test_translator.py::test_hard_target_prefix_with_vmap[2] PASSED [ 85%]
python/tests/test_translator.py::test_strongly_biased_target_prefix[1] PASSED [ 85%]
python/tests/test_translator.py::test_strongly_biased_target_prefix[2] PASSED [ 86%]
python/tests/test_translator.py::test_weakly_biased_target_prefix[1] PASSED [ 86%]
python/tests/test_translator.py::test_weakly_biased_target_prefix[2] PASSED [ 87%]
python/tests/test_translator.py::test_repetition_penalty_with_vmap[1] PASSED [ 87%]
python/tests/test_translator.py::test_repetition_penalty_with_vmap[2] PASSED [ 88%]
python/tests/test_translator.py::test_no_repeat_ngram_size_with_vmap[1] PASSED [ 89%]
python/tests/test_translator.py::test_no_repeat_ngram_size_with_vmap[2] PASSED [ 89%]
python/tests/test_translator.py::test_suppress_sequences_with_vmap[1] PASSED [ 90%]
python/tests/test_translator.py::test_suppress_sequences_with_vmap[2] PASSED [ 90%]
python/tests/test_translator.py::test_num_hypotheses PASSED              [ 91%]
python/tests/test_translator.py::test_max_decoding_length PASSED         [ 91%]
python/tests/test_translator.py::test_min_decoding_length PASSED         [ 92%]
python/tests/test_translator.py::test_min_decoding_length_with_vmap[1] PASSED [ 92%]
python/tests/test_translator.py::test_min_decoding_length_with_vmap[2] PASSED [ 93%]
python/tests/test_translator.py::test_return_attention PASSED            [ 93%]
python/tests/test_translator.py::test_ignore_scores PASSED               [ 94%]
python/tests/test_translator.py::test_return_alternatives PASSED         [ 95%]
python/tests/test_translator.py::test_return_alternatives_with_vmap PASSED [ 95%]
python/tests/test_translator.py::test_random_sampling PASSED             [ 96%]
python/tests/test_translator.py::test_score_api PASSED                   [ 96%]
python/tests/test_translator.py::test_model_unload[False] PASSED         [ 97%]
python/tests/test_translator.py::test_model_unload[True] PASSED          [ 97%]
python/tests/test_translator.py::test_model_unload_while_async_translation PASSED [ 98%]
python/tests/test_translator.py::test_load_model_from_memory[True] PASSED [ 98%]
python/tests/test_translator.py::test_load_model_from_memory[False] PASSED [ 99%]
python/tests/test_translator.py::test_logging PASSED                     [100%]

FAILED python/tests/test_transformers.py::test_transformers_translation[facebook/m2m100_418M] - AttributeError: 'M2M100Encoder' object has no attribute 'embed_scale'
FAILED python/tests/test_transformers.py::test_transformers_translation[facebook/mbart-large-50-many-to-many-mmt] - AttributeError: 'MBartEncoder' object has no attribute 'embed_scale'
FAILED python/tests/test_transformers.py::test_transformers_translation[facebook/mbart-large-en-ro] - AttributeError: 'MBartEncoder' object has no attribute 'embed_scale'
FAILED python/tests/test_transformers.py::test_transformers_translation[facebook/bart-base] - AttributeError: 'BartEncoder' object has no attribute 'embed_scale'
FAILED python/tests/test_transformers.py::test_transformers_translation[facebook/nllb-200-distilled-600M] - AttributeError: 'M2M100Encoder' object has no attribute 'embed_scale'

@minhthuc2502
Collaborator

minhthuc2502 commented Sep 13, 2024

I think this is because of some changes in transformers described in PR #1760. I see you have already applied the patch. Thanks! I'll merge this.
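As an illustration of the kind of fix such an AttributeError usually calls for, here is a hedged sketch of a lookup that tolerates both attribute layouts. It is not the patch from #1760. The helper name `resolve_embed_scale` is hypothetical, and the assumption that newer transformers releases keep the scale on the embedding layer (`embed_tokens`) rather than on the encoder should be verified against the installed version.

```python
from types import SimpleNamespace


def resolve_embed_scale(module, default=1.0):
    """Find an embedding scale on `module`, tolerating two layouts (sketch).

    Layout A: the scale is stored directly on the encoder or decoder.
    Layout B (assumed): the scale is stored on the `embed_tokens` layer.
    """
    scale = getattr(module, "embed_scale", None)
    if scale is None:
        embeddings = getattr(module, "embed_tokens", None)
        # getattr on None simply falls through to the default of None.
        scale = getattr(embeddings, "embed_scale", None)
    return default if scale is None else scale


# Stand-in objects for illustration; these are not real transformers classes.
old_style = SimpleNamespace(embed_scale=32.0)
new_style = SimpleNamespace(embed_tokens=SimpleNamespace(embed_scale=32.0))
unscaled = SimpleNamespace(embed_tokens=SimpleNamespace())

scales = [resolve_embed_scale(m) for m in (old_style, new_style, unscaled)]
```

Falling back to a default of 1.0 treats a missing scale as "no scaling", which is a design choice of this sketch; a converter may prefer to raise an error instead.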

@minhthuc2502 minhthuc2502 merged commit cb16c8e into OpenNMT:master Sep 13, 2024
13 checks passed