Develop Audio Nodes #1115

TheMutta · 2024-08-26T15:03:39Z

Adds audio management capabilities to depthai through new audio nodes for capture, output, mixing, encoding and replay.

AudioIn and AudioOut, endpoints for raw audio passing through ALSA devices.
AudioEncoder, which receives data from one endpoint and converts it to another format.
AudioReplay, which plays a specific audio file on the host.
AudioMixer, tasked with multiplexing audio inputs and outputs.

asahtik · 2024-09-04T11:34:30Z

CMakeLists.txt

@@ -647,6 +654,9 @@ target_link_libraries(${TARGET_CORE_NAME}
    INTERFACE
        XLinkPublic
    PRIVATE
+        sndfile
+	samplerate
+	asound


Please fix indentation

asahtik · 2024-09-04T11:46:02Z

It'd probably be good to add comments to node / datatype getters and setters for docs generation before merging.

Also might be good to run clangformat, but I'm not sure if v3_develop is properly formatted. If clangformat causes a lot of new changes, skip - we'll make a separate merge just for formatting.

moratom

Thanks Filippo!

Here is a brief TODO list as a summary:

Address comments from the PR
Try to have all required libraries pulled in by Hunter and/or make them optional, so DAI can be compiled without audio support
Regardless of the option chosen above, they should be included through CMake
Make sure the PR builds successfully on all platforms in CI
TBD on the AudioMixer node

moratom · 2024-09-18T09:41:27Z

bindings/python/src/DeviceBindings.cpp

+        .def("getAlsaDevices", [](DeviceBase& d) { py::gil_scoped_release release; return d.getAlsaDevices(); }, DOC(dai, DeviceBase, getAlsaDevices))
+        .def("getAlsaPCMs", [](DeviceBase& d) { py::gil_scoped_release release; return d.getAlsaPCMs(); }, DOC(dai, DeviceBase, getAlsaPCMs))


TODO:

Add an implementation on the RVC2 side as well that returns empty arrays
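
A minimal sketch of why empty arrays are a convenient contract for callers: the same code path works on a platform with no ALSA support without any special casing. It assumes only the two binding names shown in the diff above, `getAlsaDevices` and `getAlsaPCMs`. The helper `summarize_audio_hardware` and its `device` argument are hypothetical, and the list elements are not inspected because their fields are not shown in this thread.

```python
def summarize_audio_hardware(device):
    """Count the ALSA devices and PCMs reported by a device-like object.

    `device` is any object exposing getAlsaDevices() and getAlsaPCMs(),
    the two methods bound in the diff above. Both are expected to return
    a (possibly empty) list, so no platform check is needed here.
    """
    devices = device.getAlsaDevices()
    pcms = device.getAlsaPCMs()
    return {
        "devices": len(devices),
        "pcms": len(pcms),
        # True as soon as either list is non-empty.
        "has_audio": bool(devices) or bool(pcms),
    }
```

If the RVC2 implementation returned an error or `None` instead, every caller would need a guard before calling `len()`.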

moratom · 2024-09-18T10:00:41Z

src/utility/AudioHelpers.cpp

+namespace dai {
+namespace audio {
+
+std::vector<AudioDevice> GetAlsaDevices() {


Suggested change

-std::vector<AudioDevice> GetAlsaDevices() {
+std::vector<AudioDevice> getAlsaDevices() {

moratom · 2024-09-18T10:02:53Z

src/utility/AudioHelpers.cpp

+    return vec;
+}
+
+std::vector<AudioPCM> GetAlsaPCMs() {


Suggested change

-std::vector<AudioPCM> GetAlsaPCMs() {
+std::vector<AudioPCM> getAlsaPCMs() {

Same comment for other functions

moratom · 2024-09-18T10:04:27Z

src/utility/AudioHelpers.cpp

@@ -0,0 +1,234 @@
+#include "depthai/utility/AudioHelpers.hpp"
+
+#include <alsa/asoundlib.h>


Ideally we'll include this with Hunter, or at least make it optional at compile time if we rely on the system-installed library.

moratom · 2024-09-18T10:05:01Z

src/pipeline/node/AudioReplay.cpp

+            break;
+    }
+
+    // std::cout << "Duration frames: " << durationFrames << std::endl;


We can use a logger here

moratom · 2024-09-18T10:21:43Z

include/depthai/pipeline/datatype/AudioFrame.hpp

+   public:
+    sf_count_t frames;
+    unsigned int bitrate;
+    unsigned int channels;
+    int format;
+


Can these be made private?

moratom · 2024-09-18T10:23:01Z

include/depthai/properties/AudioMixerProperties.hpp

+struct AudioMixerProperties : PropertiesSerializable<Properties, AudioMixerProperties> {
+    static constexpr int AUTO = -1;
+
+    bool ready = false;


Does it make sense for this to be a serializable property?

moratom · 2024-09-18T11:09:31Z

examples/python/Audio/audio_encoder.py

+#!/usr/bin/env python3
+
+import depthai as dai
+
+# Create pipeline
+device = dai.Device()
+with dai.Pipeline(device) as pipeline:
+    in = pipeline.create(dai.node.AudioIn);
+    in.setRunOnHost(True);
+    in.setDeviceName("microphone");
+    in.setDevicePath("default");
+    in.setBitrate(48000);
+    in.setFps(16);
+    in.setChannels(2);
+    in.setFormat(dai.AudioFrame.AUDIO_FORMAT_PCM_32);
+
+    out = pipeline.create(dai.node.AudioOut);
+    out.setRunOnHost(True);
+    out.setDeviceName("speaker");
+    out.setDevicePath("default");
+    out.setBitrate(44100);
+    out.setFps(16);
+    out.setChannels(2);
+    out.setFormat(dai.AudioFrame.AUDIO_FORMAT_PCM_16);
+
+    encoder = pipeline.create(dai.node.AudioOut);
+    out.setRunOnHost(False);
+    out.setBitrate(44100);
+    out.setChannels(2);
+    out.setFormat(dai.AudioFrame.AUDIO_FORMAT_PCM_16);
+
+    in.out.link(encoder.input);
+    encoder.out.link(out.input);


Let's skip the semicolons
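
A possible cleaned-up version of the wiring, offered as a sketch rather than the author's final code. Beyond dropping the semicolons it makes two changes. First, `in` is a reserved word in Python and cannot be used as a variable name, so the node is called `mic` here. Second, the diff appears to create the encoder as `dai.node.AudioOut` and then apply the encoder settings to `out`. The sketch assumes the intent was a `dai.node.AudioEncoder` exposing the same setters, which this thread does not confirm. The function `build_audio_pipeline` is hypothetical and takes the `dai` module and the pipeline as arguments so it can be exercised without hardware.

```python
def build_audio_pipeline(dai, pipeline):
    """Wire AudioIn -> AudioEncoder -> AudioOut with the settings from the diff."""
    # `in` is a Python keyword, so the capture node is named `mic`.
    mic = pipeline.create(dai.node.AudioIn)
    mic.setRunOnHost(True)
    mic.setDeviceName("microphone")
    mic.setDevicePath("default")
    mic.setBitrate(48000)
    mic.setFps(16)
    mic.setChannels(2)
    mic.setFormat(dai.AudioFrame.AUDIO_FORMAT_PCM_32)

    speaker = pipeline.create(dai.node.AudioOut)
    speaker.setRunOnHost(True)
    speaker.setDeviceName("speaker")
    speaker.setDevicePath("default")
    speaker.setBitrate(44100)
    speaker.setFps(16)
    speaker.setChannels(2)
    speaker.setFormat(dai.AudioFrame.AUDIO_FORMAT_PCM_16)

    # Assumption: the encoder node type is AudioEncoder and it accepts the
    # setters that the diff applied to `out` by mistake.
    encoder = pipeline.create(dai.node.AudioEncoder)
    encoder.setRunOnHost(False)
    encoder.setBitrate(44100)
    encoder.setChannels(2)
    encoder.setFormat(dai.AudioFrame.AUDIO_FORMAT_PCM_16)

    # Capture -> encode -> playback.
    mic.out.link(encoder.input)
    encoder.out.link(speaker.input)
    return mic, encoder, speaker
```

In the example script this would be called as `build_audio_pipeline(dai, pipeline)` inside the existing `with dai.Pipeline(device) as pipeline:` block.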

moratom · 2024-09-18T11:10:34Z

examples/python/Audio/AudioReplay.py

Rename the example to match others

moratom · 2024-09-18T11:12:15Z

examples/cpp/RVC4/Audio/audio_replay.cpp

+    dai::Pipeline pipeline;
+
+    auto replay = pipeline.create<dai::node::AudioReplay>();
+    replay->setSourceFile("/tmp/test.wav");


Let's add all test files to be pulled by hunter so that the examples work out of the box.

Likewise for Python, so they're pulled in by the install_requirements.py script. (Check how this works for the videos used in the current replay examples.)

TheMutta added 20 commits August 12, 2024 15:44

Added AudioIn node.

216b064

Added audio to CMakeLists

d9cc8cc

Removed unused functions.

8f531f6

Audio nodes skeleton

0c8df80

Removed log files.

ee7a202

Added new audio helpers with the use of libsndfile.

29ec561

Fixed bugs in AudioReplay and made AudioIn/Out host runnable

ab482c4

Implemented basic mixer

9c5ac1d

Example for audio testing updated.

098c776

Added AudioFrame for future use.

fd2bd3e

Supporting all audio formats from 8 bit to double

7851fb6

Added audio frame to cmake.

3efa2bb

Added audio encoder for format/rate

a67e60d

Finalized design and layout of AudioFrame datatype.

a79ed6b

Usign AudioFrame in place of Buffer for audio nodes.

204d989

Finalized properties for audio encoder.

f40443f

Added libsamplerate

0be58ae

Audio example.

41a8e06

Mixer now supports fully AudioFrames

85afbef

Multithreaded mixer.

ae0ca2c

TheMutta requested a review from moratom August 26, 2024 15:03

TheMutta self-assigned this Aug 26, 2024

TheMutta and others added 8 commits August 27, 2024 12:15

Removed debug messages.

c8457cb

AudioFrame to StreamMessageParser

3406ece

Added examples.

f860250

Transformed AudioNodes private fields into protected.

97a640a

Removed eccess function from audio nodes.

a16bd47

Added helper constructors to AudioFrame.

307aced

Added missing frames metadata from AudioFrame

51eda95

TODO: mixer only runs on host

ff86777

Filippo Mutta added 15 commits August 29, 2024 16:24

Running encoder not on host in examples.

ee0da3f

Clangformat

fb6b3ee

Added test for encoder.

9f7427c

Fixed bug in frame calculation in AudioMixer

b4df862

Encoder and mixer test.

789b7ae

Added RPC getAlsaDevices

bd40c69

Fixed audio list example.

c8228b0

Audio nodes python bindings.

c2329d2

Finalised python bindings for Audio nodes.

f130cb7

Added AudioFormat enum to be compatible with SF_FORMAT

5119284

Added AudioFormat binding.

e616c8d

Added missing running host function on bindings for AudioOut

6589a73

Added python Audio examples.

70b2bea

Added GetAlsaPCMs function

b2169ef

Added GetAlsaPCMs callback

8c14ad5

asahtik reviewed Sep 4, 2024

moratom reviewed Sep 18, 2024

Filippo Mutta added 6 commits October 11, 2024 12:13

Added verbose error messages to AudioIn and AudioOut nodes.

a41f224

Moved snd_pcm_t *captureHandle to the run() function in AudioIn and AudioOut

b5b8aa3

Added additional tests in configuration stage of AudioIn

d4119af

Added additional tests AudioOut

ad3852d

Added more explicative error messages + added snd_pcm_wait()

ffaee85

Fixed typo

d1816e1
