Web Speech API does not define the capability to capture the audio output of window.speechSynthesis.speak() to a MediaStream or ArrayBuffer (MediaStream, ArrayBuffer, Blob audio result from speak() for recording?), and is not integrated with Web Audio API (web audio api connected to speech api #1764).
Use Transferable Streams (Transferable Streams Explained, Transferable objects, Feature: Streams API: transferable streams), MediaStream (Media Capture and Streams), Insertable Streams (MediaStreamTrack Insertable Media Processing using Streams, Insertable streams for MediaStreamTrack), Web Audio API (BaseAudioContext, MediaStreamAudioDestinationNode, MediaStreamAudioSourceNode, OscillatorNode), byte streams (Streams Standard - WHATWG, Using readable byte streams), and subprocess streams from Node.js (Child process), Deno (Deno.Command), or Bun (Child processes) to execute piper (rhasspy/piper) with the --output_raw option and stream raw 1 channel S16 PCM to the browser over Native Messaging (Chrome Developers, MDN Web Docs, Microsoft Edge Developer documentation, Messaging between the app and JavaScript in a Safari web extension). Write the data to a MediaStreamTrackGenerator to play the stream back through speakers or headphones, record it, and share it with peers over a WebRTC RTCPeerConnection or RTCDataChannel (WebRTC: Real-Time Communication in Browsers).
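Web Audio API works with 32-bit float samples, so a pipeline like the one described above generally needs a conversion step from raw S16 PCM bytes. The following is a minimal sketch of that step, not the code used in this repository. The function name s16leToFloat32 is hypothetical, and little-endian byte order is an assumption about the platform piper runs on.

```javascript
// Hypothetical helper: convert raw signed 16-bit PCM bytes to Float32 samples.
// Assumes little-endian byte order and a single channel.
function s16leToFloat32(bytes) {
  // A trailing odd byte is ignored here; a real stream would carry it over
  // to the next chunk instead of dropping it.
  const sampleCount = Math.floor(bytes.byteLength / 2);
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const floats = new Float32Array(sampleCount);
  for (let i = 0; i < sampleCount; i++) {
    // Divide by 32768 so the result falls in the range [-1, 1).
    floats[i] = view.getInt16(i * 2, true) / 32768;
  }
  return floats;
}

// Three samples: 0, the maximum positive value, and the minimum negative value.
const pcm = new Uint8Array([0x00, 0x00, 0xff, 0x7f, 0x00, 0x80]);
const samples = s16leToFloat32(pcm);
```

The resulting Float32Array is the shape of data that an AudioBuffer channel or an AudioWorklet output expects.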
background-aw.js is a Web Audio API AudioWorklet version of background.js, the Media Capture Transform MediaStreamTrackGenerator version, in which Web Audio API is also used.
Clone the repository, then fetch the piper .tar.gz release; extract the contents with UntarFileStream.js, then write the extracted contents of piper to the repository folder; install the Native Messaging host manifest (Native manifests) to the Chromium or Chrome user data directory.
git clone https://github.com/guest271314/native-messaging-piper
cd native-messaging-piper
deno -A install_piper.js # Or node install_piper.js, bun does not support DecompressionStream
deno -A install_host.js # Or node install_host.js, bun run install_host.js
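For orientation, extracting a tar archive comes down to reading 512-byte header blocks that carry each entry's name and size. The sketch below parses those two fields from one header. It is an illustration of the tar format only, not the UntarFileStream.js implementation; the function name parseTarHeader and the entry name "piper/piper" are hypothetical.

```javascript
// Hypothetical helper: read the name and size fields of one 512-byte tar header.
// In the tar format the name occupies bytes 0-99 and the size is an octal
// string in bytes 124-135.
function parseTarHeader(block) {
  const decoder = new TextDecoder();
  const field = (offset, length) => {
    const text = decoder.decode(block.subarray(offset, offset + length));
    const end = text.indexOf("\0");
    return end === -1 ? text : text.slice(0, end);
  };
  const name = field(0, 100);
  const size = parseInt(field(124, 12).trim(), 8) || 0;
  // File data follows the header, padded to a multiple of 512 bytes.
  return { name, size, dataBlocks: Math.ceil(size / 512) };
}

// Build a fake header for a 1000-byte file to exercise the parser.
const header = new Uint8Array(512);
const encoder = new TextEncoder();
header.set(encoder.encode("piper/piper"), 0);
header.set(encoder.encode("00000001750\0"), 124); // 1750 octal is 1000 decimal
const entry = parseTarHeader(header);
```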
To programmatically install the Web extension, launch Chrome with
chrome --load-extension=/absolute/path/to/native-messaging-piper
To install the extension manually:
1. Navigate to chrome://extensions.
2. Toggle Developer mode.
3. Click Load unpacked.
4. Select the native-messaging-piper folder.
5. Note the generated extension ID.
6. Open nm_piper.json in a text editor, set "path" to the absolute path of nm_piper.js, and set chrome-extension://<ID>/ in the "allowed_origins" array, using the ID from step 5.
7. Copy the nm_piper.json file to the Chrome or Chromium configuration folder, e.g., Chromium on *nix: ~/.config/chromium/NativeMessagingHosts; Chrome dev channel on *nix: ~/.config/google-chrome-unstable/NativeMessagingHosts (User Data Directory - Default Location).
8. Modify the shebang line of nm_piper.js to use node, deno, or bun, and set the file permission to executable.
9. Reload the extension.
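The manifest edited in the steps above can be sketched as a small builder. The field names (name, description, path, type, allowed_origins) are the ones Chrome documents for Native Messaging host manifests; the host name "nm_piper", the description text, and the function name buildHostManifest are assumptions for illustration and may differ from the nm_piper.json shipped in the repository.

```javascript
// Hypothetical helper: build the object that a Native Messaging host manifest
// such as nm_piper.json contains.
function buildHostManifest(hostPath, extensionId) {
  return {
    name: "nm_piper", // assumed host name
    description: "Native Messaging host for piper", // assumed description
    path: hostPath, // absolute path to the executable host script
    type: "stdio",
    // The trailing slash is part of the origin format.
    allowed_origins: [`chrome-extension://${extensionId}/`],
  };
}

const manifest = buildHostManifest(
  "/absolute/path/to/native-messaging-piper/nm_piper.js",
  "<ID>", // replace with the generated extension ID
);
const manifestJson = JSON.stringify(manifest, null, 2);
```

Writing manifestJson to the NativeMessagingHosts folder would take the place of the manual edit and copy steps.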
The default voices that install_piper.js fetches from diffusionstudio/piper-voices, where additional voices are listed, are en_US-hfc_female-medium and en_US-hfc_male-medium, corresponding to "female" and "male" passed to the Piper constructor.
To change the voices downloaded from Hugging Face, adjust install_piper.js at
const voices = Array.of("en_US-hfc_female-medium", "en_US-hfc_male-medium");
and the template literal in nm_piper.js at
const voiceuri = new URL(`./en_US-hfc_${voice}-medium.onnx`, import.meta.url).pathname;
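To make the mapping concrete, the sketch below resolves a voice name to a model path in the same way as the template literal above, but takes the base URL as a parameter so that it runs outside an ES module. The function name voiceModelPath, the allow-list check, and the example base URL are assumptions for illustration.

```javascript
// Hypothetical helper mirroring the template literal above. The base URL is
// passed in explicitly instead of using import.meta.url.
function voiceModelPath(voice, baseUrl) {
  const known = ["female", "male"]; // the two default voices
  if (!known.includes(voice)) {
    throw new RangeError(`Unknown voice: ${voice}`);
  }
  return new URL(`./en_US-hfc_${voice}-medium.onnx`, baseUrl).pathname;
}

const modelPath = voiceModelPath(
  "male",
  "file:///home/user/native-messaging-piper/nm_piper.js", // example location
);
// modelPath is "/home/user/native-messaging-piper/en_US-hfc_male-medium.onnx"
```

Adding a voice means downloading its model in install_piper.js and making sure the name passed to the constructor still resolves to that file.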
The Web extension injects the Piper class into all Web pages whose URLs do not use the chrome: or chrome-extension: protocols.
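The protocol check described above can be sketched as a small predicate. This is an illustration of the rule only; the function name isInjectable is hypothetical and the extension's own check may be written differently.

```javascript
// Hypothetical predicate: true when a page URL is eligible for injection,
// that is, when it does not use the chrome: or chrome-extension: protocol.
function isInjectable(url) {
  const { protocol } = new URL(url);
  return protocol !== "chrome:" && protocol !== "chrome-extension:";
}

const checks = [
  isInjectable("https://example.com/"),
  isInjectable("chrome://extensions"),
  isInjectable("chrome-extension://abcdefghijklmnop/index.html"),
];
```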
In the DevTools console, in Snippets in the Sources panel, or in other code executed in the Web page, create a Piper instance.
The double quotes inside the template literal are necessary; the Native Messaging protocol uses JSON over IPC between the spawned host (nm_piper.js) and the browser.
var piper = new Piper({
voice: "male",
text: `"Now watch. ..., this how science works.
One researcher comes up with a result.
And that is not the truth. No, no.
A scientific emergent truth is not the
result of one experiment. What has to
happen is somebody else has to verify
it. Preferably a competitor. Preferably
someone who doesn't want you to be correct.
- Neil deGrasse Tyson, May 3, 2017 at 92nd Street Y"`.replace(/\n/g, " "),
});
piper.stream().then(console.log).catch(console.error);
var piper = new Piper({
voice: "female",
text: `"So we need people to have weird new
ideas. We need more ideas to break it
and make it better.
Use it. Break it. File bugs. Request features.
- Soledad Penadés, Real time front-end alchemy, or: capturing, playing,
altering and encoding video and audio streams, without
servers or plugins!"`.replace(/\n/g, " "),
});
piper.stream().then(console.log).catch(console.error);
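As background for the quoting requirement in the examples above: Native Messaging frames each message as UTF-8 encoded JSON preceded by a 32-bit length in native byte order, and a bare string is valid JSON only when it is wrapped in double quotes. The sketch below illustrates both points. The function names are hypothetical, little-endian byte order is an assumption about the platform, and this is not the code in nm_piper.js.

```javascript
// Hypothetical helpers showing the Native Messaging wire format.
function encodeNativeMessage(message) {
  const body = new TextEncoder().encode(JSON.stringify(message));
  const framed = new Uint8Array(4 + body.length);
  // 32-bit length prefix; little-endian is assumed here.
  new DataView(framed.buffer).setUint32(0, body.length, true);
  framed.set(body, 4);
  return framed;
}

function decodeNativeMessage(framed) {
  const view = new DataView(framed.buffer, framed.byteOffset, framed.byteLength);
  const length = view.getUint32(0, true);
  const body = framed.subarray(4, 4 + length);
  return JSON.parse(new TextDecoder().decode(body));
}

// A quoted string parses as JSON; the same text without quotes does not.
function isValidJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

const roundTrip = decodeNativeMessage(encodeNativeMessage("Use it. Break it."));
const quotedOk = isValidJson('"Use it. Break it."');
const bareOk = isValidJson("Use it. Break it.");
```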
The Piper instance exposes a mediaStream property that is a live MediaStream of the 1 channel S16 PCM output by piper.
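One possible use of the mediaStream property is recording with MediaRecorder. The sketch below is an assumption about how that could look, not code from this repository. The function name recordStream is hypothetical, and the recorder constructor is passed in as a parameter so that the function can also be exercised outside a browser.

```javascript
// Hypothetical helper: record a live MediaStream and collect the chunks.
// RecorderCtor defaults to the browser's MediaRecorder when it exists.
function recordStream(mediaStream, RecorderCtor = globalThis.MediaRecorder) {
  const recorder = new RecorderCtor(mediaStream);
  const chunks = [];
  // Each dataavailable event carries a Blob with encoded audio.
  recorder.ondataavailable = (event) => chunks.push(event.data);
  recorder.start();
  return { recorder, chunks };
}

// In a Web page, assuming piper.mediaStream is already available:
//   const { recorder, chunks } = recordStream(piper.mediaStream);
//   recorder.onstop = () => console.log(new Blob(chunks));
//   // later: recorder.stop();
```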
To abort the audio playback and the stream that sends data to the arbitrary Web page using Transferable Streams
piper.abort(); // Default parameter to abort is "Stream aborted."
Or explicitly set the reason
piper.abort("Cancel");
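The default-reason behavior shown above can be modeled with an AbortController. This is a sketch under the assumption that the abort is signalled this way; the class name AbortableStream is hypothetical and the Piper class may implement abort() differently.

```javascript
// Hypothetical class modelling abort() with a default reason.
class AbortableStream {
  constructor() {
    this.controller = new AbortController();
  }
  get signal() {
    return this.controller.signal;
  }
  // The reason is forwarded to the signal so that consumers can inspect it.
  abort(reason = "Stream aborted.") {
    this.controller.abort(reason);
    return this.signal.reason;
  }
}

const defaultReason = new AbortableStream().abort();
const explicitReason = new AbortableStream().abort("Cancel");
```

A consumer holding the signal, such as a pipeTo() call on a stream, would then be rejected with the same reason.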
Example rhasspy/piper and diffusion-studios TTS audio output files are located in this repository at en_US-hfc_male-medium.wav and en_US-hfc_female-medium.wav.
Do What the Fuck You Want to Public License WTFPLv2