The Persona Sprint #4
Labels: documentation (Improvements or additions to documentation)
JarbasAl added a commit to OpenVoiceOS/ovos-plugin-manager that referenced this issue on Oct 6, 2023:
The listener has AudioTransformers; core has MetadataTransformers and UtteranceTransformers. This PR extends the concept to ovos-audio: to the text before the TTS stage, and to the wav file after TTS but before playback. Use case demonstration: https://gist.github.com/JarbasAl/16788bdcff3fa5bfb3634d6f2116cc04. This ties into OpenVoiceOS/ovos-persona#4.
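The transformer idea in that commit can be sketched in a few lines. This is an illustrative standalone mock, not the real ovos-plugin-manager API: the `DialogTransformer` base class, its `transform` signature, and the toy subclass are all assumptions modeled on the AudioTransformer/UtteranceTransformer pattern described above.

```python
# Illustrative sketch only: class and method names are hypothetical,
# modeled on the AudioTransformer / UtteranceTransformer plugin pattern,
# NOT the actual ovos-plugin-manager API.

class DialogTransformer:
    """Mutates dialog text before it reaches the TTS stage."""

    def __init__(self, name, priority=50):
        self.name = name
        self.priority = priority  # lower priority runs first when chained

    def transform(self, dialog: str, context: dict) -> tuple:
        """Return (possibly modified dialog, updated context)."""
        return dialog, context


class UwuTransformer(DialogTransformer):
    """Toy example: give every reply a playful persona style."""

    def transform(self, dialog: str, context: dict) -> tuple:
        return dialog.replace("r", "w").replace("l", "w"), context


plugin = UwuTransformer("persona-uwu")
text, ctx = plugin.transform("hello world", {})
print(text)  # "hewwo wowwd"
```

A TTS transformer would look the same, except `transform` would receive and return a path to the wav file instead of the dialog text.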
JarbasAl added a commit to OpenVoiceOS/ovos-plugin-manager that referenced this issue on Oct 8, 2023:
Squashed commits: fix thread being started twice when using a fallback; fix tests; add missing import; Mycroft -> OpenVoiceOS; fix type hints; feat/tts_transformers; move PlaybackThread to ovos-audio; feat/ovos_dialog_tts_transformers (same PR description as the Oct 6 commit above).
It would be very cool to have support for TTS per persona, as well as wakeword per persona.
We hit our stretch goal for persona!
This issue will document the progress.
Framework
(during ovos-core 0.0.9 dev cycle)
Why
Voice interface
How
code lives here: https://github.com/OpenVoiceOS/ovos-persona
Session
(during ovos-core 0.0.8 dev cycle)
Why
consider the utterance `"chat with {persona_name}"` in a multi-user setup: this makes the assistant answer all questions using the requested persona, for every individual client
what happens when different users are accessing core? whenever someone changes the persona, it changes for everyone!
[image: a typical hivemind setup; we want each client to interact with a specific persona]
How
make the persona configuration come from `message.context` so it is per query, not global. This is defined via a Session; in the future this also allows OVOS to become user aware per query
the following properties in OVOSSkill should reflect session data live, so skills need no explicit handling of Sessions at all:
session implementation:
footnote: this is the reason ovos-bus-client was created instead of OVOS continuing to use mycroft-bus-client
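The per-query behavior above can be sketched in a few lines. This is a standalone illustration, assuming a minimal `Message` with a `context` dict; the real classes live in ovos-bus-client and their APIs differ:

```python
# Minimal sketch of per-query persona selection via message.context.
# The Message/Session names mirror ovos-bus-client concepts, but this
# standalone code is illustrative, not the real API.

DEFAULT_PERSONA = "hal9000"  # hypothetical global default

class Message:
    def __init__(self, msg_type, data=None, context=None):
        self.msg_type = msg_type
        self.data = data or {}
        self.context = context or {}

def get_persona(message):
    # persona comes from the Session in message.context, so two hivemind
    # clients can talk to different personas at the same time
    session = message.context.get("session", {})
    return session.get("persona", DEFAULT_PERSONA)

msg_a = Message("recognizer_loop:utterance",
                {"utterances": ["tell me a joke"]},
                {"session": {"session_id": "client_a", "persona": "marvin"}})
msg_b = Message("recognizer_loop:utterance",
                {"utterances": ["tell me a joke"]})

print(get_persona(msg_a))  # marvin  (per-client choice, from the Session)
print(get_persona(msg_b))  # hal9000 (falls back to the global default)
```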
Pipeline
(during ovos-core 0.0.8 dev cycle)
Why
consider LLMs and how they interact with skills/user commands: how do we know when to use a skill and when to ask an LLM? Factual info usually comes from the web, while "chatbot speech" should come from a persona. Do we just hardcode this in the utterance handling logic?
each persona handles questions via a selection of solver plugins, which can be directly implemented as a FallbackSkill, for example
we want to be able to define where in the intent stage this happens. This also allows the pluginification of the intent systems; eventually even those can be replaced with an LLM if desired, giving a persona bias even to intent selection
[image: intent service (in green) should be configurable, globally and per persona]
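A configurable intent pipeline with a persona stage can be sketched as an ordered list of matchers. Everything below is a hypothetical illustration, not real OVOS pipeline code; the stage names and return values are placeholders:

```python
# Hypothetical sketch of a pluggable intent pipeline: each stage gets a
# chance to claim the utterance, and a persona stage can be slotted
# anywhere in the ordered list.

def skills_stage(utt):
    # stand-in for the regular intent/skill matchers
    return "skill:weather" if "weather" in utt else None

def persona_stage(utt):
    # persona/LLM catch-all: always has an answer
    return "persona:answer"

PIPELINE = [skills_stage, persona_stage]  # the order is configurable

def match_intent(utterance):
    for stage in PIPELINE:
        match = stage(utterance)
        if match:  # first stage that claims the utterance wins
            return match
    return None

print(match_intent("what is the weather"))     # handled by a skill
print(match_intent("tell me about yourself"))  # falls through to the persona
```

Moving `persona_stage` earlier in the list is what gives a persona bias even to intent selection.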
How
Solvers
(during ovos-core 0.0.9 dev cycle)
Why
once added to the pipeline, personas need to be able to answer arbitrary questions; they also need to handle input in multiple languages
persona definitions include a list of solver plugins and their respective configs; an utterance is sent to the solvers in order until one can answer the question
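The "try solvers in order until one answers" behavior can be sketched like this. The solver names, configs, and loading function below are illustrative placeholders, not real plugins:

```python
# Sketch of the solver-chain behavior described above; solver names and
# configs are illustrative placeholders, not real OPM plugins.

persona = {
    "name": "my_persona",
    "solvers": [
        ("failing_solver", {}),   # e.g. a knowledge base with no match
        ("fallback_solver", {}),  # e.g. a chatbot that always answers
    ],
}

def load_solver(name, config):
    # stand-in for plugin loading; real code would load OPM solver plugins
    if name == "failing_solver":
        return lambda utterance: None  # can't answer anything
    return lambda utterance: f"answer to: {utterance}"

def ask(persona, utterance):
    for name, config in persona["solvers"]:
        solver = load_solver(name, config)
        answer = solver(utterance)
        if answer:  # first solver that can answer wins
            return answer
    return None

print(ask(persona, "what is the meaning of life"))
```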
How
solver plugins have automatic bidirectional translation, so they can understand and answer in any language even if the implementation is language specific
the persona definition specifies solver configs and the order in which they are tried
solvers of interest:
knowledge base solvers:
chatbot like solvers:
LLM solvers:
- [ ] subclass from the generic Hugging Face LLM solver + provide personas
Solver documentation
A plugin can define the language it works in; e.g., Wolfram Alpha only accepts English input at the time of this writing
Bidirectional translation is handled behind the scenes for other languages
**Developing a solver:**
Plugins are expected to implement the `get_xxx` methods and leave the user-facing equivalents alone
Using a solver:
solvers work with any language as long as you stick to the officially supported wrapper methods
Example Usage - DuckDuckGo plugin
- single answer
- chunked answer, for conversational dialogs, i.e. "tell me more"
- auto translation: pass the user language in the context
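The whole pattern, developing a solver via `get_xxx` methods and using it via the public wrappers, can be illustrated with a self-contained mock. The `QuestionSolver` base, the `get_spoken_answer`/`spoken_answer` names, and the fake translator below are assumptions modeled on the solver docs above, not the real plugin API:

```python
# Self-contained sketch: plugins implement the get_xxx methods, users call
# the public wrappers, and the base class handles translation behind the
# scenes. All names here are illustrative, including the fake translator.

def fake_translate(text, source, target):
    # stand-in for the real bidirectional translation plugin
    return text if source == target else f"[{target}] {text}"

class QuestionSolver:
    def __init__(self, internal_lang="en"):
        self.internal_lang = internal_lang  # language the plugin works in

    # --- plugin developers implement this ---
    def get_spoken_answer(self, query, context=None):
        raise NotImplementedError

    # --- users call this; translation is handled here ---
    def spoken_answer(self, query, context=None):
        lang = (context or {}).get("lang", self.internal_lang)
        query = fake_translate(query, lang, self.internal_lang)
        answer = self.get_spoken_answer(query)
        return fake_translate(answer, self.internal_lang, lang)

class EchoSolver(QuestionSolver):
    """Toy plugin: only implements get_spoken_answer."""
    def get_spoken_answer(self, query, context=None):
        return f"you asked: {query}"

solver = EchoSolver()
print(solver.spoken_answer("hello"))                # english, no translation
print(solver.spoken_answer("olá", {"lang": "pt"}))  # auto-translated both ways
```

Passing `{"lang": "pt"}` in the context is all a caller needs to do; the plugin itself never sees the user's language.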
Server
(during ovos-core 0.0.9 dev cycle)
Why
some personas won't be able to run fully on device due to hardware constraints, LLMs in particular
How
Marketplace
(during ovos-core 0.0.9 dev cycle)
Why
users should be able to one-click select a persona and have a nice UI
users should have a way to evaluate how useful each persona is before installing it
How
Skill Dialogs
(during ovos-core 0.0.9 dev cycle)
Why
Skills with personalities and flexible dialogs!
- when you say "you are beautiful" ("tu és lindo/linda" in Portuguese), the skill says "thank you" ("obrigado"/"obrigada", gendered in Portuguese)
- "increase sarcasm by 20%"
How
- new file format: `.jsonl` (format info: https://jsonlines.org/)
- use the `.jsonl` file if it exists, else the old `.dialog` file
- `mycroft.conf` / the current active persona determines which `.jsonl` file is used
- original issue: OpenVoiceOS/OVOS-workshop#56
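As an illustration only, a persona-aware dialog `.jsonl` file could look like the sketch below, one JSON object per line with metadata to pick variants. The key names (`utterance`, `lang`, `gender`, `persona`) are hypothetical, not the final schema:

```jsonl
{"utterance": "thank you"}
{"utterance": "why, thank you so much!", "persona": "flirty"}
{"utterance": "obrigado", "lang": "pt", "gender": "male"}
{"utterance": "obrigada", "lang": "pt", "gender": "female"}
```

The extra metadata is what a flat `.dialog` file of one-utterance-per-line cannot express.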
Dialog and TTS Transformers
Why
skills won't cover every personality; the previous technique works to change the dialog content, but as a persona we also want to change the dialog style
a persona should be able to mutate the text before TTS, and also to modify the audio after TTS
How
transformer plugins are configured under the persona json
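A hypothetical sketch of how such transformers could be declared in the persona json; every key and plugin name below is an assumption for illustration, not the final schema:

```json
{
  "name": "my_persona",
  "solvers": {"my-solver-plugin": {}},
  "dialog_transformers": {"my-dialog-transformer": {"rhyme": true}},
  "tts_transformers": {"my-tts-transformer": {"pitch": 0.8}}
}
```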