native-messaging-piper

Synopsis

Web Speech API does not define the capability to capture the audio output of window.speechSynthesis.speak() to a MediaStream or ArrayBuffer (MediaStream, ArrayBuffer, Blob audio result from speak() for recording?), and is not integrated with Web Audio API (web audio api connected to speech api #1764).

Use Transferable Streams (Transferable Streams Explained, Transferable objects, Feature: Streams API: transferable streams), MediaStream (Media Capture and Streams), Insertable Streams (MediaStreamTrack Insertable Media Processing using Streams, Insertable streams for MediaStreamTrack), Web Audio API (BaseAudioContext, MediaStreamAudioDestinationNode, MediaStreamAudioSourceNode, OscillatorNode), byte streams (Streams Standard - WHATWG, Using readable byte streams), and subprocess streams from Node.js (Child process), Deno (Deno.Command), or Bun (Child processes) to execute piper (rhasspy/piper) with the --output_raw option, which streams raw 1 channel S16 PCM to the browser over Native Messaging (Chrome Developers, MDN Web Docs, Microsoft Edge Developer documentation, Messaging between the app and JavaScript in a Safari web extension). The data is written to a MediaStreamTrackGenerator, which makes it possible to play the stream back through speakers or headphones, record it, and share it with peers over a WebRTC RTCPeerConnection or RTCDataChannel (WebRTC: Real-Time Communication in Browsers).
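Web Audio API and AudioData with the "f32" sample format work with 32-bit floats, so the raw S16 PCM bytes have to be converted somewhere along the way. The following is a minimal sketch of that conversion, not the code used in this repository; the function name s16leToFloat32 is hypothetical, and it assumes the bytes are little-endian.

```javascript
// Hypothetical helper: convert raw little-endian signed 16-bit PCM bytes
// (1 channel) to Float32 samples in the range [-1, 1).
function s16leToFloat32(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  // An odd trailing byte is ignored here; a streaming implementation
  // would carry it over and prepend it to the next chunk.
  const sampleCount = Math.floor(bytes.byteLength / 2);
  const floats = new Float32Array(sampleCount);
  for (let i = 0; i < sampleCount; i++) {
    // `true` selects little-endian byte order.
    floats[i] = view.getInt16(i * 2, true) / 32768;
  }
  return floats;
}

// Three samples: silence, maximum positive value, minimum negative value.
const pcm = new Uint8Array([0x00, 0x00, 0xff, 0x7f, 0x00, 0x80]);
console.log(Array.from(s16leToFloat32(pcm)));
```

Dividing by 32768 maps the most negative sample to exactly -1 and keeps the positive peak just below 1.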

background-aw.js is a Web Audio API AudioWorklet version of background.js. background.js uses the Media Capture Transform MediaStreamTrackGenerator, and also uses Web Audio API.

Installation

Clone the repository, then fetch the piper .tar.gz release, extract its contents with UntarFileStream.js, and write the extracted piper files to the repository folder. Then install the Native Messaging host manifest (Native manifests) to the Chromium or Chrome user data directory.

git clone https://github.com/guest271314/native-messaging-piper
cd native-messaging-piper
deno -A install_piper.js # Or node install_piper.js, bun does not support DecompressionStream
deno -A install_host.js # Or node install_host.js, bun run install_host.js

To programmatically install the Web extension, launch Chrome with

chrome --load-extension=/absolute/path/to/native-messaging-piper

Manual installation

  1. Navigate to chrome://extensions.
  2. Toggle Developer mode.
  3. Click Load unpacked.
  4. Select native-messaging-piper folder.
  5. Note the generated extension ID.
  6. Open nm_piper.json in a text editor. Set "path" to the absolute path of nm_piper.js, and add chrome-extension://<ID>/ to the "allowed_origins" array, using the ID from step 5.
  7. Copy the nm_piper.json file to the Chrome or Chromium configuration folder (User Data Directory - Default Location), e.g., Chromium on *nix ~/.config/chromium/NativeMessagingHosts; Chrome dev channel on *nix ~/.config/google-chrome-unstable/NativeMessagingHosts.
  8. Modify the shebang line of nm_piper.js to use node, deno, or bun, and make the file executable.
  9. Reload the extension.
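
As an illustration of step 6, the sketch below builds a host manifest object with the fields Chrome's Native Messaging documentation describes (name, description, path, type, allowed_origins). The host name "nm_piper" and the description are assumptions inferred from the nm_piper.json file name, and buildHostManifest is a hypothetical helper; check the nm_piper.json shipped in the repository for the actual values.

```javascript
// Hypothetical helper: build a Native Messaging host manifest object.
function buildHostManifest({ name, description, hostPath, extensionId }) {
  return {
    name,
    description,
    // Must be an absolute path to the host executable on Linux and macOS.
    path: hostPath,
    // "stdio" is the only transport type Chrome currently defines.
    type: "stdio",
    // The trailing slash is part of the origin pattern.
    allowed_origins: [`chrome-extension://${extensionId}/`],
  };
}

const manifest = buildHostManifest({
  name: "nm_piper", // assumption, inferred from nm_piper.json
  description: "rhasspy/piper Native Messaging host", // assumption
  hostPath: "/absolute/path/to/native-messaging-piper/nm_piper.js",
  extensionId: "<ID>", // replace with the ID noted in step 5
});

console.log(JSON.stringify(manifest, null, 2));
```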

Usage

install_piper.js fetches the default voices from diffusionstudio/piper-voices, where additional voices are listed. The defaults are en_US-hfc_male-medium and en_US-hfc_female-medium, corresponding to "male" and "female" passed to the Piper constructor. To change the voices downloaded from Hugging Face, adjust install_piper.js at

const voices = Array.of("en_US-hfc_female-medium", "en_US-hfc_male-medium");

and the template literal in nm_piper.js at

const voiceuri = new URL(`./en_US-hfc_${voice}-medium.onnx`, import.meta.url).pathname;

The Web extension injects the Piper class into all Web URLs that do not start with the chrome: or chrome-extension: protocols.

In DevTools console or Snippets in Sources panel, or in other code executed in the Web page, create a Piper instance.

The double quotes inside the template literal are necessary; the Native Messaging protocol uses JSON over IPC between the spawned host (nm_piper.js) and the browser.
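
For context, each Native Messaging message is UTF-8 encoded JSON preceded by a 32-bit length in native byte order. The sketch below shows that framing, and why a quoted string parses as JSON while bare text does not. The function names encodeMessage and decodeMessage are hypothetical; this is not the code in nm_piper.js.

```javascript
// Hypothetical helper: frame a value as a Native Messaging message.
function encodeMessage(value) {
  const json = new TextEncoder().encode(JSON.stringify(value));
  const frame = new Uint8Array(4 + json.length);
  // Typed arrays use the platform's native byte order, as the protocol requires.
  new Uint32Array(frame.buffer, 0, 1)[0] = json.length;
  frame.set(json, 4);
  return frame;
}

// Hypothetical helper: parse one complete framed message.
function decodeMessage(frame) {
  // Copy the 4 header bytes so the Uint32Array view is always aligned.
  const header = frame.buffer.slice(frame.byteOffset, frame.byteOffset + 4);
  const length = new Uint32Array(header)[0];
  const json = new TextDecoder().decode(frame.subarray(4, 4 + length));
  return JSON.parse(json);
}

const framed = encodeMessage("Now watch.");
console.log(decodeMessage(framed));

// A quoted string is valid JSON; the same text without quotes is not.
console.log(JSON.parse('"Now watch."'));
try {
  JSON.parse("Now watch.");
} catch (e) {
  console.log(e.name);
}
```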

var piper = new Piper({
  voice: "male",
  text: `"Now watch. ..., this how science works.
One researcher comes up with a result.
And that is not the truth. No, no.
A scientific emergent truth is not the
result of one experiment. What has to
happen is somebody else has to verify
it. Preferably a competitor. Preferably
someone who doesn't want you to be correct.

- Neil deGrasse Tyson, May 3, 2017 at 92nd Street Y"`.replace(/\n/g, " "),
});

piper.stream().then(console.log).catch(console.error);

var piper = new Piper({
  voice: "female",
  text: `"So we need people to have weird new
ideas. We need more ideas to break it
and make it better.

Use it. Break it. File bugs. Request features.

- Soledad Penadés, Real time front-end alchemy, or: capturing, playing,
  altering and encoding video and audio streams, without
  servers or plugins!"`.replace(/\n/g, " "),
});

piper.stream().then(console.log).catch(console.error);

The Piper instance exposes a mediaStream property that is a live MediaStream of the 1 channel S16 PCM output by piper.

To abort the audio playback and the Transferable Stream that sends data to the arbitrary Web page

piper.abort(); // Default parameter to abort is "Stream aborted." 

Or explicitly set the reason

piper.abort("Cancel");
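
One way an abort() method with a default reason could be built is on top of AbortController, whose signal can also be passed to ReadableStream.pipeTo() to cancel a pipe. This is a sketch under that assumption, not the Piper implementation; the class name AbortableTask is hypothetical.

```javascript
// Hypothetical class: wraps an AbortController and supplies a default reason.
class AbortableTask {
  constructor() {
    this.controller = new AbortController();
  }
  // The signal could be handed to readable.pipeTo(writable, { signal }).
  get signal() {
    return this.controller.signal;
  }
  abort(reason = "Stream aborted.") {
    this.controller.abort(reason);
  }
}

const task = new AbortableTask();
task.signal.addEventListener("abort", () => {
  // signal.reason holds whatever was passed to abort().
  console.log(task.signal.reason);
});
task.abort("Cancel");
```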

Examples

Example rhasspy/piper and diffusionstudio/piper-voices TTS audio output files are located in this repository at en_US-hfc_male-medium.wav and en_US-hfc_female-medium.wav.

License

Do What the Fuck You Want to Public License WTFPLv2

About

rhasspy/piper Native Messaging host for TTS streaming
