-
Notifications
You must be signed in to change notification settings - Fork 19.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add MelSpectrogram
layer
#17717
Add MelSpectrogram
layer
#17717
Conversation
Imports are incorrectly sorted and/or formatted.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! This is good work. There's a lot of API design work to be done here to make the API as intuitive as possible.
@fchollet Thanks for your valuable feedback. While I understand your concern about the potential confusion caused by non-intuitive names, I would like to point out that the current argument names are widely used and recognized in the community, as demonstrated by popular libraries such as for librosa.feature.melspectrogram(y=None, sr=22050, n_fft=2048, n_mels=128, hop_length=512, win_length=None,
window='hann', power=2.0, fmin=0.0, fmax=None) for torchaudio.transforms.MelSpectrogram(sample_rate=16000, n_fft=400, win_length=None, hop_length=None,
f_min=0.0, f_max= None, n_mels = 128, window_fn = <fn>, power = 2.0) for nnAudio.Spectrogram.MelSpectrogram(sr=22050, n_fft=2048, n_mels=128, hop_length=512,
window='hann', power=2.0,fmin=0.0, fmax=None) While I am open to alternative names, I believe that changing them could create confusion for users who are already familiar with these naming conventions. Therefore, it seems like a trade-off between renowned names and more intuitive names. Any thoughts what should we do? |
--line-length 80
1. add reference link 2. explanation 3. use case
We should use more intuitive names.
|
@fchollet I've updated the names according to your suggestions and replaced the rest with better intuitive names. Let me know if they meet the requirements. By the way just noticed, this PR keras-team/keras-hub/pull/847 in Keras-NLP is using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update -- the API is looking good (just one comment)! Please add unit tests.
@fchollet could you please check? |
@gbaned any update? |
@gbaned could you approve the workflow for unit-test ?? |
@mihirparadkar Hi, I just noticed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Imported from GitHub PR #17717 This PR will add the `MelSpecrtoram` layer as an `audio_preprocessing` layer as mentioned in #17657. I have added the backbone of this layer. This layer will convert raw audio signals to Mel spectrograms. This layer is compatible with both GPU & TPU. Need to add some tests to ensure everything is okay. ## Todo - [x] unbatched audio test - [x] batched audio test - [x] zero values audio test - [ ] serialize callable `ref` cc: @fchollet , @mattdangerw Copybara import of the project: -- d1d8175 by Awsaf <[email protected]>: Add `MelSpectrogram` layer -- afa9e88 by Awsaf <[email protected]>: Fix for isort Imports are incorrectly sorted and/or formatted. -- ae6d109 by Awsaf <[email protected]>: reorder `super.__init__` -- d4a8daf by Awsaf <[email protected]>: Fix output_shape for 1D input -- 914e75d by Awsaf <[email protected]>: Export to only `experimental` layers -- ba1f18e by Awsaf <[email protected]>: Make inline -- 0fda055 by Awsaf <[email protected]>: Remove outline -- afdf73c by Awsaf <[email protected]>: Reformat with black --line-length 80 -- adb9477 by Awsaf <[email protected]>: Update: docstring 1. add reference link 2. explanation 3. use case -- f3e0fe9 by Awsaf <[email protected]>: Example added -- 6181518 by Awsaf <[email protected]>: Update arg names -- 4273fae by Awsaf <[email protected]>: test added -- 865af24 by Awsaf <[email protected]>: melspec test added Merging this change closes #17717 FUTURE_COPYBARA_INTEGRATE_REVIEW=#17717 from awsaf49:melspec 865af24 PiperOrigin-RevId: 547546687
Imported from GitHub PR #17717 This PR will add the `MelSpecrtoram` layer as an `audio_preprocessing` layer as mentioned in #17657. I have added the backbone of this layer. This layer will convert raw audio signals to Mel spectrograms. This layer is compatible with both GPU & TPU. Need to add some tests to ensure everything is okay. ## Todo - [x] unbatched audio test - [x] batched audio test - [x] zero values audio test - [ ] serialize callable `ref` cc: @fchollet , @mattdangerw Copybara import of the project: -- d1d8175 by Awsaf <[email protected]>: Add `MelSpectrogram` layer -- afa9e88 by Awsaf <[email protected]>: Fix for isort Imports are incorrectly sorted and/or formatted. -- ae6d109 by Awsaf <[email protected]>: reorder `super.__init__` -- d4a8daf by Awsaf <[email protected]>: Fix output_shape for 1D input -- 914e75d by Awsaf <[email protected]>: Export to only `experimental` layers -- ba1f18e by Awsaf <[email protected]>: Make inline -- 0fda055 by Awsaf <[email protected]>: Remove outline -- afdf73c by Awsaf <[email protected]>: Reformat with black --line-length 80 -- adb9477 by Awsaf <[email protected]>: Update: docstring 1. add reference link 2. explanation 3. use case -- f3e0fe9 by Awsaf <[email protected]>: Example added -- 6181518 by Awsaf <[email protected]>: Update arg names -- 4273fae by Awsaf <[email protected]>: test added -- 865af24 by Awsaf <[email protected]>: melspec test added Merging this change closes #17717 FUTURE_COPYBARA_INTEGRATE_REVIEW=#17717 from awsaf49:melspec 865af24 PiperOrigin-RevId: 548211937
Currently the PR is on hold due to the following error, /keras/distribute/ctl_correctness_test.runfiles/org_keras/keras/metrics/confusion_metrics.py", line 22, in <module>
from keras import activations
File "/home/kbuilder/.cache/bazel/_bazel_kbuilder/31d6f47147b75c35404d734345be7323/execroot/org_keras/bazel-out/k8-opt/bin/keras/distribute/ctl_correctness_test.runfiles/org_keras/keras/activations.py", line 22, in <module>
import keras.layers.activation as activation_layers
ImportError: cannot import name 'layers' from partially initialized module 'keras' (most likely due to a circular import) (/home/kbuilder/.cache/bazel/_bazel_kbuilder/31d6f47147b75c35404d734345be7323/execroot/org_keras/bazel-out/k8-opt/bin/keras/distribute/ctl_correctness_test.runfiles/org_keras/keras/__init__.py) I can't seem to find a fix for it as it relates to |
Hello, Thank you for submitting a pull request. We're currently in the process of migrating the new |
This PR will add the
MelSpecrtoram
layer as anaudio_preprocessing
layer as mentioned in keras-team/tf-keras#55. I have added the backbone of this layer. This layer will convert raw audio signals to Mel spectrograms. This layer is compatible with both GPU & TPU.Need to add some tests to ensure everything is okay.
Todo
ref
cc: @fchollet , @mattdangerw