LID: several random samples for long file #6853

Merged on Nov 6, 2023 (32 commits)

Collaborator

@karpnv karpnv commented Jun 12, 2023

What does this PR do?

Use several random samples of a long audio file for language identification (LID) and choose the label by majority vote.

Collection: ASR

Changelog

Added parameters:

  • segment_duration (float): random sample duration in seconds
  • num_segments (int): number of segments of file to use for majority vote

Usage

import numpy as np
import nemo.collections.asr as nemo_asr
lang_model = nemo_asr.models.EncDecSpeakerLabelModel.from_pretrained(model_name="langid_ambernet")
lang = lang_model.get_label(filename, segment_duration=np.inf, num_segments=1, random_seed=None)
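
As a rough illustration of how these parameters could interact, the sketch below picks random segment start offsets for a file of known duration. The helper name `sample_segment_starts` and its arguments are hypothetical and this is not the NeMo implementation. It assumes that a `segment_duration` at least as long as the file (including the default of infinity) means the whole file is used once.

```python
import math
import random


def sample_segment_starts(file_duration, segment_duration=math.inf,
                          num_segments=1, random_seed=None):
    """Return (start, length) pairs in seconds for randomly placed segments.

    Hypothetical helper for illustration; not the NeMo implementation.
    """
    # Assumption: a segment that covers the whole file needs no sampling.
    if segment_duration >= file_duration:
        return [(0.0, file_duration)]
    rng = random.Random(random_seed)  # local RNG, global state untouched
    latest_start = file_duration - segment_duration
    return [(rng.uniform(0.0, latest_start), segment_duration)
            for _ in range(num_segments)]
```

With a fixed `random_seed` the same offsets come back on every call, which would keep the predicted label reproducible across runs.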

PR Type:

  • [V] New Feature

Who can review?

@fayejf

Signed-off-by: Nikolay Karpov <[email protected]>
@github-actions github-actions bot added the ASR label Jun 12, 2023
@github-actions
Contributor

This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.

@github-actions github-actions bot added the stale label Jun 29, 2023
@github-actions
Contributor

github-actions bot commented Jul 6, 2023

This PR was closed because it has been inactive for 7 days since being marked as stale.

@github-actions github-actions bot closed this Jul 6, 2023
@karpnv karpnv reopened this Oct 24, 2023
@karpnv karpnv marked this pull request as ready for review October 24, 2023 17:52
@karpnv karpnv changed the title from "duration limit" to "LID: several random samples for long file" Oct 24, 2023

Returns:
label: label corresponding to the trained model
"""
_, logits = self.infer_file(path2audio_file=path2audio_file)
audio, sr = librosa.load(path2audio_file, sr=None)
Collaborator

Can you replace this with sf.read? It is much faster.

Collaborator Author

done

I know this may not have been originally designed to read MP3 or multi-channel (stereo) WAV files, but it was able to do so in the past with librosa.load. Switching to sf.read may result in errors for such files. Should we consider adding support for more formats, or stick with librosa.load for consistency?
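
One possible compromise between the two loaders is to try the fast one first and fall back to the slower one only when it fails. The sketch below is a hypothetical helper, not part of this PR. It assumes both loaders return an `(audio, sample_rate)` tuple, which is the case for `sf.read` and `librosa.load`.

```python
def load_with_fallback(path, primary_loader, fallback_loader):
    """Try the fast loader first; use the slower one if it fails.

    Both loaders are assumed to return an (audio, sample_rate) tuple.
    """
    try:
        return primary_loader(path)
    except Exception:  # broad on purpose in this sketch; narrow it in real code
        return fallback_loader(path)


# Possible wiring (assumption, not part of this PR):
#   audio, sr = load_with_fallback(
#       path, sf.read, lambda p: librosa.load(p, sr=None))
```

This only covers files the first loader rejects outright. A stereo file that `sf.read` decodes successfully would still come back as a two-dimensional array and need a separate downmix step.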

path2audio_file: path to audio wav file
path2audio_file (str): path to audio wav file
segment_duration (float): random sample duration in seconds
num_segments (int): number of segments of file to use for majority vote
Collaborator

Instead of num_segments, why not just use non-overlapping segments from start to end, based on 5-second audio samples? Have you done an ablation study on what works best?
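
For comparison, the alternative suggested here could look roughly like the sketch below, which splits a file into back-to-back windows. The function name is hypothetical, and the 5-second default is taken from the comment above rather than from any measured result.

```python
def non_overlapping_segments(file_duration, segment_duration=5.0):
    """Return (start, end) pairs in seconds covering the file back to back."""
    if segment_duration <= 0:
        raise ValueError("segment_duration must be positive")
    segments = []
    start = 0.0
    while start < file_duration:
        # The last window is shortened so it never runs past the end.
        end = min(start + segment_duration, file_duration)
        segments.append((start, end))
        start = end
    return segments
```

A one-hour file yields 720 such windows, so inference cost grows linearly with file length, whereas a fixed `num_segments` keeps it constant.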

Collaborator Author

I personally didn't, but this approach was suggested by Fai. It is meant for very long audio (several hours): we take several segments and get the result by majority vote.
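
The majority vote over per-segment predictions can be sketched as follows. The function name and the tie-breaking rule are assumptions made for illustration; the PR does not say how ties are resolved.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    if not labels:
        raise ValueError("need at least one label")
    return Counter(labels).most_common(1)[0][0]
```

For example, per-segment predictions of `["en", "de", "en"]` give `"en"` for the whole file.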

@github-actions github-actions bot removed the stale label Oct 25, 2023
nithinraok previously approved these changes Oct 25, 2023
Collaborator

@nithinraok nithinraok left a comment

LGTM. Please add a random seed for the selection of random segments.

@karpnv
Collaborator Author

karpnv commented Oct 26, 2023

Added the random_seed parameter.

nithinraok previously approved these changes Oct 26, 2023
Collaborator

@nithinraok nithinraok left a comment

Thanks, LGTM

@karpnv
Collaborator Author

karpnv commented Oct 30, 2023

jenkins

@nithinraok
Collaborator

jenkins

Signed-off-by: Nikolay Karpov <[email protected]>
@nithinraok
Collaborator

jenkins

@karpnv karpnv merged commit 286e84e into main Nov 6, 2023
15 checks passed
@karpnv karpnv deleted the karpnv/duration_limit branch November 6, 2023 07:51
pzelasko pushed a commit to pzelasko/NeMo that referenced this pull request Jan 3, 2024
* add random samples (num_segments) with segment_duration for get_label(filename, segment_duration = 60*6, num_segments)

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Piotr Żelasko <[email protected]>
rohitrango pushed a commit to rohitrango/NeMo that referenced this pull request Jun 25, 2024
* add random samples (num_segments) with segment_duration for get_label(filename, segment_duration = 60*6, num_segments)

---------

Signed-off-by: Nikolay Karpov <[email protected]>
3 participants