How to use a buffer with StartStreamTranscriptionAsync ? #2914

ricardogreis · 2024-04-09T19:19:05Z

ricardogreis
Apr 9, 2024

Hi all.
I am trying to use the Transcribe service.
I already have a unsigned char buffer with the audio in my server ...

I didn't find any examples of TranscribeStreamingServiceClient::StartStreamTranscriptionAsync using a buffer.

Is there any ways to do that?

Answered by jmklix

Apr 9, 2024

Have you taken a look at our samples repo? This transcribe sample might be what you're looking for:

// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
// SPDX-License-Identifier: Apache-2.0

#include <aws/core/Aws.h>
#include <aws/core/utils/threading/Semaphore.h>
#include <aws/core/auth/AWSCredentialsProviderChain.h>
#include <aws/transcribestreaming/TranscribeStreamingServiceClient.h>
#include <aws/transcribestreaming/model/StartStreamTranscriptionHandler.h>
#include <aws/transcribestreaming/model/StartStreamTranscriptionRequest.h>
#include <aws/core/platform/FileSystem.h>
#include <fstream>
#include <chrono>
#include <thread>
#include <array>

using namespace Aws;
using

View full answer

jmklix · 2024-04-09T20:51:31Z

jmklix
Apr 9, 2024
Maintainer

Have you taken a look at our samples repo? This transcribe sample might be what you're looking for:

// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
// SPDX-License-Identifier: Apache-2.0

#include <aws/core/Aws.h>
#include <aws/core/utils/threading/Semaphore.h>
#include <aws/core/auth/AWSCredentialsProviderChain.h>
#include <aws/transcribestreaming/TranscribeStreamingServiceClient.h>
#include <aws/transcribestreaming/model/StartStreamTranscriptionHandler.h>
#include <aws/transcribestreaming/model/StartStreamTranscriptionRequest.h>
#include <aws/core/platform/FileSystem.h>
#include <fstream>
#include <chrono>
#include <thread>
#include <array>

using namespace Aws;
using namespace Aws::TranscribeStreamingService;
using namespace Aws::TranscribeStreamingService::Model;

//TODO(User): Update path to location of local .wav test file, if necessary.
static const Aws::String FILE_NAME{MEDIA_DIR "/transcribe-test-file.wav"};
static const int SAMPLE_RATE = 8000; // for the file above
// If you're able to specify chunk size with your audio type (such as with PCM), set each chunk to between 50 ms and 200 ms
static const int CHUNK_LENGTH = 125;
// chunk_size_in_bytes = chunk_duration_in_millisecond / 1000 * audio_sample_rate * 2
static const int BUFFER_SIZE = CHUNK_LENGTH * SAMPLE_RATE / 1000 * 2;

// snippet-start:[transcribe.cpp.stream_transcription_async.code]
int main() {
    Aws::SDKOptions options;

    Aws::InitAPI(options);
    {
        //TODO(User): Set to the region of your AWS account.
        const Aws::String region = Aws::Region::US_WEST_2;

        //Load a profile that has been granted AmazonTranscribeFullAccess AWS managed permission policy.
        Aws::Client::ClientConfiguration config;
#ifdef _WIN32
        // ATTENTION: On Windows with the AWS C++ SDK, this example only runs if the SDK is built
        // with the curl library. 
        // For more information, see the accompanying ReadMe.
        // For more information, see "Building the SDK for Windows with curl".
        // https://docs.aws.amazon.com/sdk-for-cpp/v1/developer-guide/setup-windows.html
        //TODO(User): Update to the location of your .crt file.
        config.caFile = "C:/curl/bin/curl-ca-bundle.crt";
#endif
        config.region = region;

        TranscribeStreamingServiceClient client(config);
        StartStreamTranscriptionHandler handler;
        handler.SetOnErrorCallback(
                [](const Aws::Client::AWSError<TranscribeStreamingServiceErrors> &error) {
                        std::cerr << "ERROR: " + error.GetMessage() << std::endl;
                });
        //SetTranscriptEventCallback called for every 'chunk' of file transcripted.
        // Partial results are returned in real time.
        handler.SetTranscriptEventCallback([](const TranscriptEvent &ev) {
                for (auto &&r: ev.GetTranscript().GetResults()) {
                    if (r.GetIsPartial()) {
                        std::cout << "[partial] ";
                    }
                    else {
                        std::cout << "[Final] ";
                    }
                    for (auto &&alt: r.GetAlternatives()) {
                        std::cout << alt.GetTranscript() << std::endl;
                    }
                }
        });

        StartStreamTranscriptionRequest request;
        request.SetMediaSampleRateHertz(SAMPLE_RATE);
        request.SetLanguageCode(LanguageCode::en_US);
        request.SetMediaEncoding(
                MediaEncoding::pcm); // wav and aiff files are PCM formats.
        request.SetEventStreamHandler(handler);

        auto OnStreamReady = [](AudioStream &stream) {
                Aws::FStream file(FILE_NAME, std::ios_base::in | std::ios_base::binary);
                if (!file.is_open()) {
                    std::cerr << "Failed to open " << FILE_NAME << '\n';
                }
                std::array<char, BUFFER_SIZE> buf;
                int i = 0;
                while (file) {
                    file.read(&buf[0], buf.size());

                    if (!file)
                        std::cout << "File: only " << file.gcount() << " could be read"
                                  << std::endl;

                    Aws::Vector<unsigned char> bits{buf.begin(), buf.end()};
                    AudioEvent event(std::move(bits));
                    if (!stream) {
                        std::cerr << "Failed to create a stream" << std::endl;
                        break;
                    }
                    //The std::basic_istream::gcount() is used to count the characters in the given string. It returns
                    //the number of characters extracted by the last read() operation.
                    if (file.gcount() > 0) {
                        if (!stream.WriteAudioEvent(event)) {
                            std::cerr << "Failed to write an audio event" << std::endl;
                            break;
                        }
                    }
                    else {
                        break;
                    }
                    std::this_thread::sleep_for(std::chrono::milliseconds(
                            25)); // Slow down because we are streaming from a file.
                }
                if (!stream.WriteAudioEvent(
                        AudioEvent())) {
                    // Per the spec, we have to send an empty event (an event without a payload) at the end.
                    std::cerr << "Failed to send an empty frame" << std::endl;
                }
                else {
                    std::cout << "Successfully sent the empty frame" << std::endl;
                }
                stream.flush();
                stream.Close();
        };

        Aws::Utils::Threading::Semaphore signaling(0 /*initialCount*/, 1 /*maxCount*/);
        auto OnResponseCallback = [&signaling](
                const TranscribeStreamingServiceClient * /*unused*/,
                const Model::StartStreamTranscriptionRequest & /*unused*/,
                const Model::StartStreamTranscriptionOutcome &outcome,
                const std::shared_ptr<const Aws::Client::AsyncCallerContext> & /*unused*/) {

                if (!outcome.IsSuccess()) {
                    std::cerr << "Transcribe streaming error "
                              << outcome.GetError().GetMessage() << std::endl;
                }

                signaling.Release();
        };

        std::cout << "Starting..." << std::endl;
        client.StartStreamTranscriptionAsync(request, OnStreamReady, OnResponseCallback,
                                             nullptr /*context*/);
        signaling.WaitOne(); // Prevent the application from exiting until we're done.
        std::cout << "Done" << std::endl;
    }

    Aws::ShutdownAPI(options);

    return 0;
}
// snippet-end:[transcribe.cpp.stream_transcription_async.code]

2 replies

ricardogreis Apr 10, 2024
Author

I am new to the sdk and it was the first thing I did. I already check the tests ..

OnStreamReady is a callback without a context pointer with user data ...

What is the option to write my buffer to the AudioStream parameter ?

jmklix Apr 11, 2024
Maintainer

This is dependent on your OS and audio stream format. This stackoverflow question has an answer that might help you with what you're looking for.

Please let me know if this doesn't answer your question

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How to use a buffer with StartStreamTranscriptionAsync ? #2914

{{title}}

Replies: 1 comment 2 replies

{{title}}

{{title}}

{{title}}

Select a reply

How to use a buffer with StartStreamTranscriptionAsync ? #2914

ricardogreis Apr 9, 2024

Replies: 1 comment · 2 replies

jmklix Apr 9, 2024 Maintainer

ricardogreis Apr 10, 2024 Author

jmklix Apr 11, 2024 Maintainer

ricardogreis
Apr 9, 2024

Replies: 1 comment 2 replies

jmklix
Apr 9, 2024
Maintainer

ricardogreis Apr 10, 2024
Author

jmklix Apr 11, 2024
Maintainer