LID: several random samples for long file #6853
Signed-off-by: Nikolay Karpov <[email protected]>
This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.
This PR was closed because it has been inactive for 7 days since being marked as stale.
    Returns:
        label: label corresponding to the trained model
    """
    _, logits = self.infer_file(path2audio_file=path2audio_file)
    audio, sr = librosa.load(path2audio_file, sr=None)
Can you replace this with sf.read? It is much faster.
done
I know it might not have been originally designed to support reading mp3 or multi-channel (stereo) wav files, but it was able to do so in the past with librosa.load. After switching to sf.read, such files may result in errors. Should we consider adding support for more formats, or stick with librosa.load for consistency?
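One way to address the multi-channel part of this concern is to downmix explicitly after reading. By default, librosa.load returns mono audio, while soundfile.read returns a 2-D array of shape (frames, channels) for multi-channel files. The sketch below is not taken from this PR; the helper name `to_mono` is hypothetical, and it only handles the channel layout, not mp3 decoding.

```python
import numpy as np


def to_mono(audio):
    """Average the channels of a (frames, channels) array; pass 1-D audio through.

    Hypothetical helper: assumes the layout soundfile.read uses for
    multi-channel files, i.e. one row per frame and one column per channel.
    """
    audio = np.asarray(audio, dtype=np.float32)
    if audio.ndim == 1:
        # Already mono, nothing to do.
        return audio
    # Collapse the channel axis by averaging, giving one value per frame.
    return audio.mean(axis=1)
```

With something like this applied to the array returned by sf.read, stereo input would reach the model as a 1-D signal, as it did with librosa.load.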
        path2audio_file: path to audio wav file
        path2audio_file (str): path to audio wav file
        segment_duration (float): random sample duration in seconds
        num_segments (int): number of segments of file to use for majority vote
Instead of num_segments, why not just use non-overlapping segments from start to end, based on 5-second audio samples? Have you done an ablation study on which is best?
I personally didn't, but it was suggested by Fai. This is for very long audio (several hours). We take several segments and get the result by majority vote.
LGTM. Please add a random seed for the selection of random segments.
Added random_seed parameter.
Thanks, LGTM
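The approach discussed in this thread (several random segments of a long file, a majority vote over the per-segment predictions, and a seed so the selection is reproducible) can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function names `sample_segment_starts` and `majority_vote` are hypothetical, and durations are assumed to be in seconds.

```python
import random
from collections import Counter


def sample_segment_starts(total_duration, segment_duration, num_segments, random_seed=None):
    """Pick start times (in seconds) of random segments that fit inside the file.

    Hypothetical helper. Segments are drawn independently, so they may overlap.
    """
    if total_duration <= segment_duration:
        # Short file: one segment starting at 0 already covers everything.
        return [0.0]
    # Local generator: the same seed gives the same segments, and the
    # global random state is left untouched.
    rng = random.Random(random_seed)
    latest_start = total_duration - segment_duration
    return [rng.uniform(0.0, latest_start) for _ in range(num_segments)]


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


# A three-hour file, five six-minute segments, fixed seed.
starts = sample_segment_starts(3 * 3600, 60 * 6, num_segments=5, random_seed=0)
# Per-segment predictions would come from the model; these labels are made up.
print(majority_vote(["en", "de", "en", "en", "fr"]))  # en
```

Sampling a fixed number of segments keeps the cost independent of file length, which is the stated motivation for files that run several hours; the non-overlapping alternative raised in review would instead scale with duration.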
* add random samples (num_segments) with segment_duration for get_label(filename, segment_duration = 60*6, num_segments) --------- Signed-off-by: Nikolay Karpov <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]>
What does this PR do?
Use several random samples for long files
Collection: ASR
Changelog
Added parameters:
Usage
PR Type:
Who can review?
@fayejf