Multiple calls to audio.play(wait=False) and microphone.record_into(wait=False) #198

Open
microbit-carlos opened this issue Apr 12, 2024 · 6 comments
Labels
enhancement New feature or request

@microbit-carlos
Contributor

Multiple subsequent calls to audio.play(buffer, wait=False) and microphone.record_into(buffer, wait=False) will cancel the current playback/recording and start the new one immediately.

This is very likely the user expectation, and if we want to block until the previous call finishes we can always poll the audio.is_playing() and microphone.is_recording() functions.

However, when first building programmes using wait=False, we found ourselves in situations where "blocking with a queue of 1" was useful. This is the approach we ended up following in CODAL as well for some of the async audio functionality.

To illustrate what I mean, I'll give the wait parameter a new value:

audio.play(audioframe_1, wait="queue")   # starts playing immediately
display.show(Image.HAPPY)                # It's shown almost instantly
audio.play(audioframe_2, wait="queue")   # waits until the previous playback ended before playing audioframe_2
display.show(Image.SAD)                  # It's only shown when the second audioframe_2 starts playing

So, for a loop like this one:

buffers = [...]
next_buffer = 0
while True:
    microphone.record_into(buffers[next_buffer], wait="queue")
    # Do something else
    next_buffer = (next_buffer + 1) % len(buffers)

We might end up having to do something like:

buffers = [...]
next_buffer = 0
while True:
    while microphone.is_recording():
        pass
    microphone.record_into(buffers[next_buffer], wait=False)
    # Do something else
    next_buffer = (next_buffer + 1) % len(buffers)

And we think that some of the audio clicks we hear when trying to constantly transmit and play audio data via radio (walkie-talkie projects) might be produced between the while is_recording() and the record_into(), or in the case of playback, between while is_playing() and audio.play().

@dpgeorge
Collaborator

Doing continuous and smooth playback/recording is a good goal to achieve. And having a queue as suggested above is a neat way to do it. There could be instead a method like audio.wait_finished_playing(), but calling that before the next audio.play() will probably lead to small gaps in the output. So a queue is a good idea.

But actually I think the feature is orthogonal to the existing wait keyword argument, because you may want to queue and block. So consider instead a new keyword argument called queue:

audio.play(frame1, wait=False)             # start playing and return
audio.play(frame2, queue=True, wait=False) # wait for previous frame, then play
audio.play(frame3, queue=True)             # wait for previous frame, then play and block

The queue argument means "wait for any existing playback/recording to finish and then immediately start this next one".
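
To make the interaction between the two keyword arguments concrete, here is a minimal plain-Python sketch on a simulated clock. FakePlayer, its now attribute and the millisecond durations are hypothetical stand-ins and not part of the micro:bit API; the sketch only models when each call starts playback and when it returns to the caller.

```python
class FakePlayer:
    """Toy model of audio.play() timing on a virtual clock (milliseconds)."""

    def __init__(self):
        self.now = 0          # virtual time seen by the calling code
        self.busy_until = 0   # virtual time at which the current playback ends
        self.log = []         # (name, start_time) for each playback

    def play(self, name, duration, wait=True, queue=False):
        if queue and self.busy_until > self.now:
            # queue=True: block the caller until the previous playback ends
            self.now = self.busy_until
        # without queue, any playback still in progress is cancelled here
        start = self.now
        self.busy_until = start + duration
        self.log.append((name, start))
        if wait:
            # wait=True: also block the caller until this playback ends
            self.now = self.busy_until
        return start


player = FakePlayer()
player.play("frame1", 100, wait=False)              # start playing and return
player.play("frame2", 100, queue=True, wait=False)  # wait for frame1, then play
player.play("frame3", 100, queue=True)              # wait for frame2, play, block
print(player.log, player.now)
```

In this model the three frames start back to back, and only the last call keeps the caller blocked until its own playback has finished.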


But, you can already do gapless playback by passing in a generator to the play function. The audio subsystem will continue to get frames from the generator until the generator is exhausted. This is a very flexible way of generating audio frames. For example:

def generator():
    yield audioframe_1
    yield audioframe_2

audio.play(generator())

Or a continuous streaming example:

import collections

frames = collections.deque((), 4)  # holds a queue of frames to play

def generator():
    blank_frame = audio.AudioFrame()
    while True:
        if not frames:
            # no frames ready, just play a blank frame
            yield blank_frame
        else:
            # pop a frame and play it
            yield frames.popleft()

# start the playback
audio.play(generator(), wait=False)

# generate audio frames
while True:
    frames.append(get_frame_from_radio())

It might be a good idea to extend this generator functionality to microphone recording, where the generator yields the next frame to record into. When the generator is exhausted the recording stops.
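
As a rough illustration of that idea, the plain-Python sketch below models a recorder that pulls buffers from a generator and stops when the generator is exhausted. fake_record_into, two_buffers and counting_samples are hypothetical names, bytearrays stand in for AudioFrame objects, and the sample data is faked, so this only shows the control flow and not real microphone behaviour.

```python
def fake_record_into(buffer_source, sample_source):
    """Toy stand-in for a generator-driven microphone.record_into().

    Pulls one buffer at a time from buffer_source, fills it with samples,
    and stops as soon as the generator is exhausted.
    """
    filled = 0
    for buf in buffer_source:
        for i in range(len(buf)):
            buf[i] = next(sample_source)
        filled += 1
    return filled


def two_buffers(buffers):
    # Yields each buffer to record into; recording ends after the last one.
    for buf in buffers:
        yield buf


def counting_samples():
    # Fake microphone data: 0, 1, 2, ... wrapped to fit in one byte.
    value = 0
    while True:
        yield value % 256
        value += 1


buffers = [bytearray(4), bytearray(4)]
count = fake_record_into(two_buffers(buffers), counting_samples())
print(count, [list(b) for b in buffers])
```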

@dpgeorge
Collaborator

After more thought on this it might not be as simple as extending the microphone recording function to support a generator.

First of all, I think it's important to have a nice symmetry between playback and recording. So if you stream playback with a generator yielding frames, then you should be able to stream recording using a generator that consumes frames.

But that brings us to the main issue with recording, that you need to have two points in the code where recording is managed:

  • one point in the code that feeds in new frames to use for recording, and maintains a queue so there's at least one ready to record into
  • another point in the code that consumes frames that have just been recorded into (otherwise how does the code know when to use the frame, eg send it over radio)

This is different to playback, where there is only one point needed in the code, and that is to queue frames into the play pipeline (for full symmetry with recording, you could be notified when each frame has finished playing, but I don't think that's useful functionality).

Let's say we tried to use a generator to record a stream of frames. The generator would need to both yield frames to record into, and receive frames that have been recorded. But you want to be able to queue a bunch of frames before the first one has finished recording, so it's not as simple as new_frame = yield empty_frame. It needs to be, for example:

def recording_generator():
    frames = [AudioFrame(), AudioFrame(), AudioFrame()]
    yield frames  # yield the empty frames to record into
    while True:
        finished_frame = yield  # wait for the next frame that is finished recording
        radio.send(finished_frame)  # do something with the frame

microphone.record_into(recording_generator)

This is pretty difficult to understand. It's probably better to separate the queue of frames from the consumption:

def recording_generator():
    while True:
        finished_frame = yield  # block until a frame is ready
        radio.send(finished_frame)

microphone.record_into([AudioFrame(), AudioFrame()], continuous=True, when_complete=recording_generator)

That could work. But we've seen that having generators "run in the background" can be hard to debug. For example it's not possible to stop the generator with ctrl-C. You'd need to use microphone.stop_recording() to stop it.

An alternative would be a callback instead of a generator, eg:

def frame_ready(frame):
    radio.send(frame)

microphone.record_into([AudioFrame(), AudioFrame()], continuous=True, when_complete=frame_ready)

I think that's a bit more understandable than the generator case.
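
A minimal plain-Python sketch of how the callback variant might behave is shown here. fake_record_continuous is a hypothetical model and not an existing function, bytearrays stand in for AudioFrame objects, and the loop is capped at total_frames only so that the example terminates.

```python
def fake_record_continuous(buffers, when_complete, sample_source, total_frames):
    """Toy model of record_into(buffers, continuous=True, when_complete=...).

    Cycles through the preloaded buffers, fills each one, then hands it to
    the callback. A real implementation would run until stopped; this sketch
    stops after total_frames.
    """
    for n in range(total_frames):
        buf = buffers[n % len(buffers)]
        for i in range(len(buf)):
            buf[i] = next(sample_source)
        when_complete(buf)


sent = []


def frame_ready(frame):
    # Copy the data out: the same buffer object is reused for a later frame.
    sent.append(bytes(frame))


samples = iter(range(256))  # fake microphone data
fake_record_continuous([bytearray(2), bytearray(2)], frame_ready, samples, 3)
print(sent)
```

One consequence this makes visible: with a fixed list of buffers, the callback has to finish using (or copy) a frame before the recorder cycles back to the same buffer.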

Alternatively, instead of callbacks, there could be a function to wait and then retrieve the most recent frame:

microphone.record_continuous_stream([AudioFrame(), AudioFrame()])
while True:
    frame = microphone.get_next_frame()  # will block until the next frame is ready
    radio.send(frame)

That could actually be turned into a generator, where the microphone is the generator:

microphone.record_continuous_stream([AudioFrame(), AudioFrame()])
while True:
    frame = next(microphone)  # will block until the next frame is ready
    radio.send(frame)

That would allow:

microphone.record_continuous_stream([AudioFrame(), AudioFrame()])
# this loop will never end, frames will keep being recorded
for frame in microphone:
    radio.send(frame)

Note that all these microphone stream recording schemes require two parts: preload a queue of frames, then consume frames as they become ready.

Whatever scheme is chosen, I think playback should have a similar scheme implemented.

@microbit-carlos
Contributor Author

Thanks Damien for the detailed proposal!

I agree that using a generator with a yield assignment can be quite difficult, and we should separate the streams, so I'd discard that option.


I quite like the callback option, and on first impression it was my preferred option (although after further consideration that has now changed).

It is similar to the generator version, but I agree that it is a lot easier to understand, as it's clear how to use the frame argument.

def frame_ready(frame):
    radio.send(frame)

microphone.record_into([AudioFrame(), AudioFrame()], continuous=True, when_complete=frame_ready)

I'm not 100% convinced about the [AudioFrame(), AudioFrame()], continuous=True arguments. I can definitely see their value, but it doesn't feel immediately obvious what they are and how they can be used.

Alternatively, if we wanted to use a generator instead of the continuous flag, we could end up with something along these lines:

def frame_generator():
    audio_buffers = (AudioRecording(), AudioRecording())
    i = 0
    while True:
        yield audio_buffers[i]
        i = (i + 1) % len(audio_buffers)

def frame_ready(frame):
    radio.send(frame)

microphone.record_into(frame_generator(), when_complete=frame_ready)

This is definitely more code and looks more complex than the continuous flag.


Having a microphone.record_continuous_stream(..) with an extra generator to retrieve the filled buffers is very attractive.

I think perhaps for this proposal I would go a step further and rather than having to provide a list of inputs, we can instead indicate the size of the buffer and then return a generator directly:

for frame in microphone.record_stream(duration=25, rate=7812):
    radio.send(frame)

I prefer this approach because:

  • It's easier to understand the relationship of a generator returned from microphone.record_stream() than to have to pair it with a different function/generator from the microphone module
  • With record_continuous_stream() it's hard to tell what exactly needs to be the input. Why 2 and not 3 buffers?
    • Is there anything to do with these input buffers before they are returned by the generator? Having access to these buffers declared in the user code doesn't really offer that much to the user?
  • In my opinion it's slightly more intuitive that for x in microphone.record_stream() can be an infinite loop, vs for x in microphone.
    • Using microphone.stop_recording() inside the loop feels obvious enough with either option.

One open question would be the function signature, having it mirror microphone.record(duration, rate) is nice, but we are presented once more with the question of how to allow a buffer size in bytes without having incompatible arguments.
Based on that we'd also have to choose if the generator returns AudioRecordings or AudioTracks.

Another question: if the function signature is very similar to microphone.record(), perhaps we could instead add a flag to record() so that it returns a generator instead of an AudioRecording. My preference would be to keep the two separate: they are used differently, a flag adds complexity to microphone.record() (which should be the entry point for the recording feature, and therefore as simple as possible), and the difference is clearer when it is in the function name (it's also easier to configure for the type checker, but that shouldn't be a driver).

Disadvantages:

  • Memory fragmentation, if we need to create a new AudioRecording/Track each time
  • Anything else?
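
On the fragmentation concern, one possible mitigation is for the stream to recycle a small pool of preallocated buffers instead of allocating a new object per chunk. The sketch below is plain Python with hypothetical names (chunk_samples, fake_record_stream, pool_size, max_chunks); it assumes duration is in milliseconds and one byte per sample, and it fakes the microphone data.

```python
def chunk_samples(duration_ms, rate):
    # Number of samples in one chunk, assuming duration is in milliseconds.
    return duration_ms * rate // 1000


def fake_record_stream(duration_ms, rate, pool_size=2, max_chunks=4):
    """Yield chunks from a fixed pool of buffers allocated once, up front."""
    size = chunk_samples(duration_ms, rate)
    pool = [bytearray(size) for _ in range(pool_size)]
    for n in range(max_chunks):
        buf = pool[n % pool_size]
        for i in range(size):
            buf[i] = n  # fake data: every sample holds the chunk number
        yield buf


print(chunk_samples(25, 7812))
ids = [id(frame) for frame in fake_record_stream(25, 7812)]
print(len(set(ids)))  # distinct buffer objects handed out
```

The trade-off in this design is that a yielded chunk is only valid until the pool wraps around, so the loop body would need to send or copy it before asking for more.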

@dpgeorge
Collaborator

dpgeorge commented Sep 4, 2024

for frame in microphone.record_stream(duration=25, rate=7812):
    radio.send(frame)

What exactly is the duration parameter here used for? Is this doing a continuous record or only a record for a fixed duration?

I think one of the use cases we should support here is some kind of sound level meter / real-time audio display. For that you want to be able to continuously record short samples of audio, and at the same time update the display. That means you don't really want the microphone generator because it blocks you from updating an animation.

Eg you want to be able to do something like this:

level = 0
stream = microphone.record_continuous_stream(rate=5000)
while not button_a.was_pressed():  # wait until user presses A
    frame = stream.get_new_frame()  # returns None if nothing ready yet
    if frame:
        # do some maths on the frame data
        level = sum(frame)
    else:
        # decay the level down to 0
        level *= 0.95
    display.show(Image.DIAMOND * min(1, level / 100))
    microbit.sleep(10)

This non-blocking behaviour of the microphone recording would also be useful, eg, if you were at the same time waiting for incoming data on the radio. Basically anytime you want to do more than just record.

There's already wait=False for quite a few functions (eg display animation) and I think the microphone stream recording needs that "background" mode as well.
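
The polling pattern can be modelled in plain Python as follows. FakeStream is a hypothetical class, and its get_new_frame() takes the current time as an argument only so that the sketch can run on a simulated clock; the integers it returns are stand-ins for recorded frames.

```python
class FakeStream:
    """Toy non-blocking stream: a frame becomes ready every frame_ms."""

    def __init__(self, frame_ms):
        self.frame_ms = frame_ms
        self.next_ready = frame_ms  # time at which the next frame completes
        self.count = 0

    def get_new_frame(self, now_ms):
        # Return a frame if one has completed since the last call, else None.
        if now_ms < self.next_ready:
            return None
        self.next_ready += self.frame_ms
        self.count += 1
        return self.count


stream = FakeStream(frame_ms=25)
results = []
for now in range(0, 100, 10):  # main loop ticks every 10 ms
    results.append(stream.get_new_frame(now))
print(results)
```

Most iterations get None and are free to update the display or check the radio; a frame is only handed over on the tick after it completes.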

@microbit-carlos
Contributor Author

for frame in microphone.record_stream(duration=25, rate=7812):
    radio.send(frame)

What exactly is the duration parameter here used for? Is this doing a continuous record or only a record for a fixed duration?

That's a good point, the goal is to define the size of the "chunks" that the generator will return, and not the total length, which is not 100% clear.

Duration in milliseconds felt like a more natural unit, but it's possible that bytes is a better unit for this kind of usage.

Perhaps having a length parameter and total_length could help here (we might need better names, in case we've used a different word somewhere else in the micro:bit API), although in essence it'd be no different than:

total_length = 0
for frame in microphone.record_stream(length=128, rate=7812):
    radio.send(frame)
    total_length += len(frame)
    if total_length >= MAX_LEN:
        microphone.stop_recording()

I think one of the use cases we should support here is some kind of sound level meter / real-time audio display. For that you want to be able to continuously record short samples of audio, and at the same time update the display.

Yes, I agree something like that is a great example to consider.

That means you don't really want the microphone generator because it blocks you from updating an animation.

Eg you want to be able to do something like this:

level = 0
stream = microphone.record_continuous_stream(rate=5000)
while not button_a.was_pressed():  # wait until user presses A
    frame = stream.get_new_frame()  # returns None if nothing ready yet
    if frame:
        # do some maths on the frame data
        level = sum(frame)
    else:
        # decay the level down to 0
        level *= 0.95
    display.show(Image.DIAMOND * min(1, level / 100))
    microbit.sleep(10)

Right, but the microphone data is not really an event that might or might not have happened, like receiving a radio packet. The generator will always return a microphone frame at the same interval (depending on the rate and the frame size). The frame might be almost silence (there is always a bit of noise), but it will always contain sound data.

So that example could end up being something more or less like this:

level = 0
for frame in microphone.record_continuous_stream(rate=5000):
    # do some maths on the frame data
    frame_level = sum(frame)
    if frame_level > THRESHOLD:
        level *= 1.1
    else:
        level *= 0.95
    display.show(Image.DIAMOND * min(1, level / 100))
    if button_a.is_pressed():
        microphone.stop_recording()

This non-blocking behaviour of the microphone recording would also be useful, eg, if you were at the same time waiting for incoming data on the radio. Basically anytime you want to do more than just record.

There's already wait=False for quite a few functions (eg display animation) and I think the microphone stream recording needs that "background" mode as well.

Ah, perhaps I didn't quite understand the blocking or non-blocking behaviour. I assumed that the microphone is constantly recording in the background, and every time next() is called on the generator it would immediately return the last recorded frame, or block until the current frame has finished recording.

So in that case, one could add a radio.received() call at the end of the loop and immediately process any queued radio packet while the microphone is recording in the background.

The question at that point would also be what happens if the loop iteration takes longer than a frame takes to record. Would the record_continuous_stream() background task create a queue and the generator pops them, or does it drop frames and always return the latest one?
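
The two policies in that question can be compared with a small desktop-Python model. consume is a hypothetical helper, the integers stand in for recorded frames, and collections.deque is used here with CPython's maxlen keyword (the MicroPython deque constructor takes different arguments).

```python
from collections import deque


def consume(recorded, capacity, frames_per_iteration):
    """Model a slow consumer: several frames arrive per loop iteration.

    capacity=None keeps every pending frame (unbounded queue);
    capacity=1 keeps only the most recent frame and drops older ones.
    """
    pending = deque(maxlen=capacity)
    seen = []
    for start in range(0, len(recorded), frames_per_iteration):
        # Frames recorded in the background while the loop body was busy.
        pending.extend(recorded[start:start + frames_per_iteration])
        # The loop body then takes exactly one frame.
        seen.append(pending.popleft())
    return seen, list(pending)


frames = list(range(6))
queued = consume(frames, None, 2)
latest = consume(frames, 1, 2)
print(queued)
print(latest)
```

With a queue the consumer sees every frame in order but falls further behind (and the backlog grows), while the drop policy stays current at the cost of losing frames.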

@microbit-carlos
Contributor Author

It's also worth mentioning that perhaps I was picturing the frames for record_continuous_stream() to be relatively small (size of a radio packet), and the blocking behaviour is more prominent when using larger size frames, like in the hundreds of milliseconds range.
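
For a sense of scale, the time a frame takes to record is its sample count divided by the sampling rate. The helper below is a hypothetical plain-Python calculation, and the frame sizes are illustrative values rather than sizes taken from the micro:bit API.

```python
def frame_interval_ms(frame_samples, rate):
    # Time needed to record one frame, in milliseconds.
    return frame_samples * 1000 / rate


for samples in (32, 128, 1024):
    print(samples, round(frame_interval_ms(samples, 7812), 1))
```

At 7812 samples per second a frame of a few tens of samples fills in a few milliseconds, whereas a frame of around a thousand samples takes over a hundred milliseconds, which is where a blocking next() would become noticeable.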

@dpgeorge dpgeorge self-assigned this Sep 10, 2024

2 participants