When using yield, automatic audio playback through web pages does not work properly #7972

Closed
1 task done
AnitaSherry opened this issue Apr 9, 2024 · 3 comments
Labels
bug Something isn't working


AnitaSherry commented Apr 9, 2024

  • I have searched to see if a similar issue already exists.
    yes

Is your feature request related to a problem? Please describe.
I would like the audio component to be able to play multiple audio clips in sequence.

Describe the solution you'd like
When I stream output from a large language model, I also want to play the speech synthesized from that text as it is generated.

Additional context
Code that generates the audio:

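import edge_tts
import gradio as gr
import numpy as np

async def generate_audio_from_text(text):
    tts = edge_tts.Communicate(text=text, voice='zh-CN-YunxiNeural', rate='-4%', volume='+0%', pitch='+0Hz')
    audio_bytes = b''
    sample_rate = 24000  # set the sample rate
    async for message in tts.stream():
        if message['type'] == 'audio':
            audio_bytes += message['data']

    # Convert the byte string to a NumPy array.
    # Note: edge-tts appears to stream compressed (MP3) audio by default, so
    # these bytes may need decoding before being read as raw int16 PCM.
    audio_data = np.frombuffer(audio_bytes, dtype=np.int16)

    return gr.Audio(value=(sample_rate, audio_data))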

Code that streams the output to the web page:

    messages = history[-1][0]
    past_key_values = None
    current_length = 0
    current_sentence = ""
    
    for response, historys in model.stream_chat(tokenizer, messages, history=historys):
        new_token = response[current_length:]
        if new_token != '':
            history[-1][-1] += new_token
            current_sentence += new_token
            if contains_punctuation(new_token):
                yield history[-1][-1], await generate_audio_from_text(current_sentence)
                current_sentence = ""
            else:
                yield history[-1][-1], None
            current_length = len(response)

The streamed text is displayed normally, but the audio is not.
Usually only the last audio clip is played.
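
For readers who want to reproduce the sentence-splitting step in isolation, here is a minimal sketch of the buffering logic from the loop above, with no model, TTS, or Gradio dependency. The helper `contains_punctuation` is not shown in the issue, so the version here (and its punctuation set) is an assumption, and `split_stream` is a hypothetical name. The sketch also flushes any trailing text that never reached a punctuation mark, which the original loop does not do.

```python
from typing import Iterable, Iterator, Optional, Tuple

# Assumed set of sentence-ending characters; the issue does not show the real one.
SENTENCE_END = set("。！？；.!?;\n")


def contains_punctuation(text: str) -> bool:
    """Return True if the text contains any assumed sentence-ending character."""
    return any(ch in SENTENCE_END for ch in text)


def split_stream(responses: Iterable[str]) -> Iterator[Tuple[str, Optional[str]]]:
    """Yield (full_text, sentence_or_None) pairs from cumulative responses.

    Each item in `responses` is the whole response so far, as in the
    `model.stream_chat` loop above. A sentence is emitted only when the
    newly arrived text contains punctuation; otherwise the second item is None.
    """
    current_length = 0
    current_sentence = ""
    full_text = ""
    for response in responses:
        new_token = response[current_length:]
        current_length = len(response)
        if not new_token:
            continue
        full_text += new_token
        current_sentence += new_token
        if contains_punctuation(new_token):
            yield full_text, current_sentence
            current_sentence = ""
        else:
            yield full_text, None
    # Addition to the original logic: flush text left over without punctuation.
    if current_sentence:
        yield full_text, current_sentence


if __name__ == "__main__":
    stream = ["Hi", "Hi there.", "Hi there. How", "Hi there. How are you?"]
    for text, sentence in split_stream(stream):
        print(repr(text), "->", repr(sentence))
```

Each non-None sentence is the unit that would be passed to the text-to-speech step. Note that in the original loop the updates in between yield `None` for the audio output.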

@abidlabs added the bug label Apr 9, 2024
@freddyaboulton
Collaborator

Do you have autoplay=True set in the output audio? If you could share the full reproduction code, someone can take a closer look.

@AnitaSherry
Author

> Do you have autoplay=True set in the output audio? If you could share the full reproduction code, someone can take a closer look.

Yes, I set autoplay=True:

with gr.Interface(
    predict,
    inputs=gr.Textbox(label="Input Text", lines=10),
    outputs=[
        gr.Textbox(label="Output Text"),
        gr.Audio(label="Output Audio", type="numpy", autoplay=True)
    ]
) as interface:
    
    interface.launch(server_name="xxx.xxx.xx.xx", server_port=xxxx, inbrowser=True, share=True)

I don't think the complete code is needed to explain my problem. I just want support for continuous playback of multiple audio clips in a list.
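
As a possible non-streaming fallback for playing several clips in order, the clips can be joined into a single audio value before being handed to the player. The sketch below uses only the Python standard library. It assumes every clip is already raw mono 16-bit PCM at the same sample rate (compressed audio such as MP3 would need decoding first), and `join_pcm_clips` and `gap_ms` are hypothetical names, not part of Gradio or edge-tts.

```python
import io
import wave
from typing import List


def join_pcm_clips(clips: List[bytes], sample_rate: int = 24000, gap_ms: int = 0) -> bytes:
    """Concatenate mono 16-bit PCM clips into one WAV file, returned as bytes.

    Assumes each clip is raw little-endian int16 PCM at `sample_rate`.
    `gap_ms` inserts that many milliseconds of silence between clips.
    """
    # Two zero bytes per frame of silence (16-bit mono).
    silence = b"\x00\x00" * (sample_rate * gap_ms // 1000)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        for index, clip in enumerate(clips):
            if index and silence:
                wav.writeframes(silence)
            wav.writeframes(clip)
    return buffer.getvalue()


if __name__ == "__main__":
    first = b"\x01\x00" * 100   # 100 frames of dummy audio
    second = b"\x02\x00" * 50   # 50 frames of dummy audio
    wav_bytes = join_pcm_clips([first, second], sample_rate=1000, gap_ms=10)
    with wave.open(io.BytesIO(wav_bytes), "rb") as reader:
        print(reader.getnframes(), reader.getframerate())
```

This trades latency for simplicity: playback can only start once all clips exist, so it does not replace the sequential streaming playback requested here.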

@abidlabs
Member

Closed via #8906. If you'd like to try it out, you can install gradio from this branch: #8843
