Add MelSpectrogram layer #17717

Closed
wants to merge 23 commits into from

@awsaf49 awsaf49 commented Mar 24, 2023

This PR adds the MelSpectrogram layer as an audio_preprocessing layer, as mentioned in keras-team/tf-keras#55. I have added the backbone of this layer. The layer converts raw audio signals to Mel spectrograms and is compatible with both GPU and TPU.

Need to add some tests to ensure everything is okay.

Todo

  • unbatched audio test
  • batched audio test
  • zero values audio test
  • serialize callable ref
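
As a rough illustration of the conversion described above (raw audio in, Mel spectrogram out), here is a minimal pure-Python sketch. It is not the code in this PR: the function names, the `fft_length` and `num_mel_bins` parameters, the HTK-style mel formula, and the naive DFT are all assumptions chosen for readability. Only `sampling_rate` and `fft_stride` are names that come up in this thread.

```python
import cmath
import math


def hz_to_mel(hz):
    # HTK-style mel formula; libraries differ in which variant they default to.
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(num_mel_bins, fft_length, sampling_rate):
    """Triangular filters over the fft_length // 2 + 1 one-sided FFT bins."""
    bin_hz = [
        k * sampling_rate / fft_length for k in range(fft_length // 2 + 1)
    ]
    top = hz_to_mel(sampling_rate / 2.0)
    # num_mel_bins + 2 edges, evenly spaced on the mel scale from 0 to Nyquist.
    edges = [
        mel_to_hz(top * i / (num_mel_bins + 1))
        for i in range(num_mel_bins + 2)
    ]
    bank = []
    for m in range(num_mel_bins):
        left, center, right = edges[m], edges[m + 1], edges[m + 2]
        bank.append(
            [
                max(
                    0.0,
                    min((f - left) / (center - left),
                        (right - f) / (right - center)),
                )
                for f in bin_hz
            ]
        )
    return bank


def mel_spectrogram(audio, sampling_rate, fft_length, fft_stride, num_mel_bins):
    """Return a list of frames, each a list of num_mel_bins power values."""
    # Periodic Hann window applied to every frame before the DFT.
    window = [
        0.5 - 0.5 * math.cos(2.0 * math.pi * n / fft_length)
        for n in range(fft_length)
    ]
    bank = mel_filterbank(num_mel_bins, fft_length, sampling_rate)
    frames = []
    for start in range(0, len(audio) - fft_length + 1, fft_stride):
        frame = [audio[start + n] * window[n] for n in range(fft_length)]
        # Naive O(N^2) DFT, kept for readability; a real layer would use an FFT.
        power = [
            abs(sum(
                x * cmath.exp(-2j * math.pi * k * n / fft_length)
                for n, x in enumerate(frame)
            )) ** 2
            for k in range(fft_length // 2 + 1)
        ]
        frames.append([sum(w * p for w, p in zip(row, power)) for row in bank])
    return frames


# Unbatched example: 256 samples of a 1 kHz tone sampled at 8 kHz.
tone = [math.sin(2.0 * math.pi * 1000.0 * n / 8000.0) for n in range(256)]
mel = mel_spectrogram(tone, sampling_rate=8000, fft_length=64,
                      fft_stride=32, num_mel_bins=8)
print(len(mel), len(mel[0]))  # 7 frames, 8 mel bins each
```

An all-zero input yields an all-zero output in this sketch, which is the kind of property the zero-values test in the list above would check.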

cc: @fchollet , @mattdangerw

awsaf49 commented Mar 24, 2023

It turns out linting was failing because of a conflict between flake8 and black. Black reformatted some lines to be wider than flake8 allows, and when those lines were shortened by hand to satisfy flake8, black flagged them in turn.
This is resolved by setting --line-length 80 in black.

@fchollet fchollet left a comment

Thanks for the PR! This is good work. There's a lot of API design work to be done here to make the API as intuitive as possible.

awsaf49 commented Mar 25, 2023

@fchollet Thanks for your valuable feedback. While I understand your concern about the potential confusion caused by non-intuitive names, I would like to point out that the current argument names are widely used and recognized in the community, as demonstrated by popular libraries such as librosa, torchaudio, and nnAudio. For example:

for librosa

librosa.feature.melspectrogram(y=None, sr=22050, n_fft=2048, n_mels=128, hop_length=512, win_length=None, 
                               window='hann', power=2.0, fmin=0.0, fmax=None)

for torchaudio

torchaudio.transforms.MelSpectrogram(sample_rate=16000, n_fft=400, win_length=None, hop_length=None,
                                     f_min=0.0, f_max=None, n_mels=128, window_fn=<fn>, power=2.0)

for nnaudio,

nnAudio.Spectrogram.MelSpectrogram(sr=22050, n_fft=2048, n_mels=128, hop_length=512,
                                   window='hann', power=2.0, fmin=0.0, fmax=None)

While I am open to alternative names, I believe that changing them could create confusion for users who are already familiar with these naming conventions. Therefore, it seems like a trade-off between well-known names and more intuitive names. Any thoughts on what we should do?

@awsaf49 awsaf49 requested a review from fchollet March 26, 2023 09:23
@google-ml-butler google-ml-butler bot added the keras-team-review-pending Pending review by a Keras team member. label Mar 26, 2023
@fchollet
Member

> Therefore, it seems like a trade-off between well-known names and more intuitive names. Any thoughts on what we should do?

We should use more intuitive names.

  • There are more people who will use these APIs in the future than there are people using these APIs today. We're doing them a service by adopting better naming conventions.
  • If the names are intuitive, then they will be intuitive / easy to understand for people already familiar with the current APIs.
  • Keras APIs must be consistent with the Keras API. It would be annoying and surprising if something called "sampling_rate" in several other places of the API was called "sample_rate" here.

awsaf49 commented Mar 27, 2023

@fchollet I've updated the names according to your suggestions and replaced the rest with more intuitive names. Let me know if they meet the requirements.

By the way, I just noticed that PR keras-team/keras-hub/pull/847 in Keras-NLP uses sample_rate instead of sampling_rate and stride instead of fft_stride.

@gbaned gbaned removed the keras-team-review-pending Pending review by a Keras team member. label Mar 28, 2023

@fchollet fchollet left a comment

Thanks for the update -- the API is looking good (just one comment)! Please add unit tests.

@awsaf49 awsaf49 requested a review from fchollet May 3, 2023 04:00

awsaf49 commented May 19, 2023

@fchollet could you please check?

awsaf49 commented May 27, 2023

@gbaned any update?

@awsaf49 awsaf49 marked this pull request as ready for review May 31, 2023 03:42

awsaf49 commented Jul 1, 2023

@gbaned could you approve the workflow for the unit tests?

@gbaned gbaned added the keras-team-review-pending Pending review by a Keras team member. label Jul 4, 2023
@mihirparadkar mihirparadkar removed the keras-team-review-pending Pending review by a Keras team member. label Jul 6, 2023

awsaf49 commented Jul 7, 2023

@mihirparadkar Hi, I just noticed that the keras-team-review-pending label has been removed. Is there any update on this?

@fchollet fchollet left a comment

LGTM

@google-ml-butler google-ml-butler bot added kokoro:force-run ready to pull Ready to be merged into the codebase labels Jul 12, 2023
copybara-service bot pushed a commit that referenced this pull request Jul 12, 2023
Imported from GitHub PR #17717

This PR will add the `MelSpectrogram` layer as an `audio_preprocessing` layer as mentioned in #17657. I have added the backbone of this layer. This layer will convert raw audio signals to Mel spectrograms. This layer is compatible with both GPU & TPU.

Need to add some tests to ensure everything is okay.

## Todo
- [x] unbatched audio test
- [x] batched audio test
- [x] zero values audio test
- [ ] serialize callable `ref`

cc: @fchollet , @mattdangerw
Copybara import of the project:

--
d1d8175 by Awsaf <[email protected]>:

Add `MelSpectrogram` layer

--
afa9e88 by Awsaf <[email protected]>:

Fix for isort

Imports are incorrectly sorted and/or formatted.
--
ae6d109 by Awsaf <[email protected]>:

reorder `super.__init__`

--
d4a8daf by Awsaf <[email protected]>:

Fix output_shape for 1D input

--
914e75d by Awsaf <[email protected]>:

Export to only `experimental` layers

--
ba1f18e by Awsaf <[email protected]>:

Make inline

--
0fda055 by Awsaf <[email protected]>:

Remove outline

--
afdf73c by Awsaf <[email protected]>:

Reformat with black

--line-length 80
--
adb9477 by Awsaf <[email protected]>:

Update: docstring

1. add reference link
2. explanation
3. use case
--
f3e0fe9 by Awsaf <[email protected]>:

Example added

--
6181518 by Awsaf <[email protected]>:

Update arg names

--
4273fae by Awsaf <[email protected]>:

test added

--
865af24 by Awsaf <[email protected]>:

melspec test added

Merging this change closes #17717

FUTURE_COPYBARA_INTEGRATE_REVIEW=#17717 from awsaf49:melspec 865af24
PiperOrigin-RevId: 547546687
@google-ml-butler google-ml-butler bot removed the ready to pull Ready to be merged into the codebase label Jul 13, 2023
@gbaned gbaned requested a review from fchollet July 14, 2023 07:39
copybara-service bot pushed a commit that referenced this pull request Jul 14, 2023
PiperOrigin-RevId: 548211937
@gbaned gbaned requested review from fchollet and removed request for fchollet July 17, 2023 07:54

awsaf49 commented Sep 15, 2023

Currently the PR is on hold due to the following error:

/keras/distribute/ctl_correctness_test.runfiles/org_keras/keras/metrics/confusion_metrics.py", line 22, in <module>
    from keras import activations
  File "/home/kbuilder/.cache/bazel/_bazel_kbuilder/31d6f47147b75c35404d734345be7323/execroot/org_keras/bazel-out/k8-opt/bin/keras/distribute/ctl_correctness_test.runfiles/org_keras/keras/activations.py", line 22, in <module>
    import keras.layers.activation as activation_layers
ImportError: cannot import name 'layers' from partially initialized module 'keras' (most likely due to a circular import) (/home/kbuilder/.cache/bazel/_bazel_kbuilder/31d6f47147b75c35404d734345be7323/execroot/org_keras/bazel-out/k8-opt/bin/keras/distribute/ctl_correctness_test.runfiles/org_keras/keras/__init__.py)

I can't find a fix for it. The error points to a circular import inside keras, but the code is structured the same way as the image_preprocessing layer, which works just fine. Any suggestions would be really helpful.
@fchollet @mattdangerw
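
For readers unfamiliar with this class of error, the following is a generic, self-contained reproduction. It is not a diagnosis of the failure above: the module names `demo_a` and `demo_b` are hypothetical and unrelated to the Keras source tree. It shows how a "partially initialized module" ImportError arises when two modules import names from each other at import time, and how deferring one of the imports to call time avoids it.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical two-module layout; the names are illustrative only.
EAGER_A = "from demo_b import helper\n\ndef base():\n    return 'base'\n"
EAGER_B = (
    "from demo_a import base\n\n"
    "def helper():\n"
    "    return base() + '+helper'\n"
)
# Same demo_b, but the import is deferred until helper() is called.
LAZY_B = (
    "def helper():\n"
    "    from demo_a import base  # resolved at call time\n"
    "    return base() + '+helper'\n"
)


def try_import(source_a, source_b):
    """Write both modules to a temp dir, import demo_a, report the outcome."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "demo_a.py").write_text(source_a)
        Path(tmp, "demo_b.py").write_text(source_b)
        sys.path.insert(0, tmp)
        try:
            # Drop any cached copies so each call starts from a clean state.
            for name in ("demo_a", "demo_b"):
                sys.modules.pop(name, None)
            importlib.invalidate_caches()
            module = importlib.import_module("demo_a")
            return module.helper()
        except ImportError as err:
            return f"ImportError: {err}"
        finally:
            sys.path.remove(tmp)
            for name in ("demo_a", "demo_b"):
                sys.modules.pop(name, None)


print(try_import(EAGER_A, EAGER_B))  # reports the circular-import error
print(try_import(EAGER_A, LAZY_B))   # import succeeds with the deferred import
```

In the eager variant, `demo_b` asks for `base` while `demo_a` is still executing its first line, so the name does not exist yet. Whether a deferred import is appropriate in any particular code base is a separate design question.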

@sachinprasadhs
Collaborator

Hello, thank you for submitting a pull request.

We're currently in the process of migrating the new Keras 3 code base from keras-team/keras-core to keras-team/keras.
Consequently, merging this PR is not possible at the moment. After the migration is successfully completed, feel free to reopen this PR at keras-team/keras if you believe it remains relevant to the Keras 3 code base. If this PR instead fixes a bug or security issue in legacy tf.keras, you can reopen it at keras-team/tf-keras, which hosts the TensorFlow-only, legacy version of Keras.
