
Speech controller refactor #2348

Merged
34 commits merged on Jul 23, 2020

Conversation

Contributor

@Udumft Udumft commented Mar 9, 2020

Resolves #2268.

RouteVoiceController, together with its subclass MapboxVoiceController, had three goals: tracking navigation events, vocalizing SpokenInstructions, and handling the fallback mechanism between speech synthesizers.
In this PR these tasks are separated by introducing the SpeechSynthesizing protocol.
RouteVoiceController now tracks navigation events and triggers the configured speechSynthesizer to vocalize the instructions.
SpeechSynthesizersController manages an arbitrary array of SpeechSynthesizing implementations to provide the fallback feature.
SystemSpeechSynthesizer and MapboxSpeechSynthesizer provide complete, extensible implementations backed by the iOS built-in speech synthesizer and the MapboxSpeech framework, respectively.
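For orientation, here is a hedged sketch of the protocol's shape, reconstructed from the snippets reviewed below; SpokenInstruction is an SDK type, the completion signature was still under discussion at this point, and exact names may differ in the merged code.

import Foundation

// Completion signature is an assumption; see the error-handling discussion below.
public typealias SpeechSynthesizerCompletion = (Error?) -> Void

public protocol SpeechSynthesizing: AnyObject {
    var muted: Bool { get set }
    var volume: Float { get set }
    var isSpeaking: Bool { get }
    var locale: Locale? { get set }

    /// Lets the synthesizer pre-fetch or cache upcoming instructions.
    func changedIncomingSpokenInstructions(_ instructions: [SpokenInstruction])
    /// Vocalizes a single instruction.
    func speak(_ instruction: SpokenInstruction, completion: SpeechSynthesizerCompletion?)
    /// Finishes the current phrase gracefully.
    func stopSpeaking()
    /// Aborts the current phrase immediately.
    func interruptSpeaking()
}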

@Udumft Udumft added op-ex Refactoring, Tech Debt or any other operational excellence work. release blocker Needs to be resolved before the release. topic: voice backwards incompatible changes that break backwards compatibility of public API labels Mar 9, 2020
@Udumft Udumft added this to the v1.0.0 milestone Mar 9, 2020
@Udumft Udumft self-assigned this Mar 9, 2020
Comment on lines 111 to 116
self.delegate?.voiceController(self, spokenInstructionsDidFailWith: SpeechError.apiError(instruction: modifiedInstruction,
                                                                                         options: options,
                                                                                         underlying: error))
self.completion?(SpeechError.apiError(instruction: modifiedInstruction,
                                      options: options,
                                      underlying: error))
Contributor Author

Error propagation is still a TODO.

I feel the need to have an explicit completion for the speak method to allow it to be async, but at the same time we want a delegate to track various events. Also, some errors may block the Speech Synthesizer from actually speaking, while others may not.
What is the best way to handle errors in this situation? We can:

  1. report any errors that occur via the delegate, with the completion receiving an error only if it was blocking (or nil otherwise); a sketch of this option follows below
  2. report any errors that occur via the delegate, with the completion receiving just a boolean success flag, implying that the last reported error was the blocker
  3. get rid of the completion and make it part of the delegate protocol (combined with one of the two points above)
  4. something else?
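A minimal, self-contained sketch of option 1, using hypothetical names: every error reaches the delegate, while the completion only receives a blocking error (or nil on success).

import Foundation

protocol SketchSynthesizerDelegate: AnyObject {
    // Observational: fired for every error, blocking or not.
    func synthesizer(_ synthesizer: SketchSynthesizer, didEncounter error: Error)
}

final class SketchSynthesizer {
    weak var delegate: SketchSynthesizerDelegate?

    func speak(_ ssmlText: String, completion: ((Error?) -> Void)?) {
        do {
            try startPlayback(ssmlText)
            completion?(nil)                                 // spoke successfully
        } catch {
            delegate?.synthesizer(self, didEncounter: error) // delegate hears every error
            completion?(error)                               // blocking: speech never happened
        }
    }

    private func startPlayback(_ ssmlText: String) throws { /* synthesize and play the SSML */ }
}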

Contributor

If we have anything bubble up to the delegate, it should be after the speech controller has finished processing all the speech synthesizers. So if there’s an error, try falling back to another speech synthesizer, and only notify the delegate if all the speech synthesizers fail.

///
func stopSpeaking()
///
func interruptSpeaking() // ??
Contributor Author

Not sure if we need to, but it could be useful to have different ways to shut down an ongoing phrase. For example: one method to mark that the phrase should be stopped gracefully whenever appropriate, and another to abort it immediately.

///
func changedIncomingSpokenInstructions(_ instructions: [SpokenInstruction])
///
func speak(_ instruction: SpokenInstruction, completion: SpeechSynthesizerCompletion?)
Contributor Author

Following what was mentioned in the original ticket, I don't feel that such a protocol should have API to control ducking, because:

  1. It is the responsibility of a SpeechSynthesizer to know and control when it needs to duck, if it needs to at all.
  2. The ducking feature is performed via system methods and does not depend on a particular synthesizer implementation (see the sketch after this list).
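For reference, a minimal sketch of system-level ducking via AVAudioSession (iOS 12+), which any SpeechSynthesizing implementation could invoke on its own; the mode and options shown are assumptions about a reasonable configuration, not this PR's code.

import AVFoundation

func duckOtherAudio() throws {
    let session = AVAudioSession.sharedInstance()
    // .duckOthers lowers other apps' audio while an instruction plays.
    try session.setCategory(.playback, mode: .voicePrompt, options: [.duckOthers, .mixWithOthers])
    try session.setActive(true)
}

func unduckOtherAudio() throws {
    // Restores other apps' volume once speech finishes.
    try AVAudioSession.sharedInstance().setActive(false, options: [.notifyOthersOnDeactivation])
}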

Comment on lines 12 to 20
public var muted: Bool = false // ???
public var volume: Float {
    get {
        return NavigationSettings.shared.voiceVolume
    }
    set {
        // ?!?!?!
    }
}
Contributor Author

Handling volume and muting with respect to NavigationSettings.shared.voiceVolume is not implemented yet. A possible wiring is sketched below.
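A hedged sketch of one way the draft above could be wired up, assuming NavigationSettings.shared.voiceVolume is mutable; interruptSpeaking() is the protocol method quoted earlier, and none of this is the merged implementation.

// Inside the synthesizer class from the draft above:
public var muted: Bool = false {
    didSet {
        // Silence any in-flight instruction as soon as muting is switched on.
        if muted { interruptSpeaking() }
    }
}

public var volume: Float {
    get { return NavigationSettings.shared.voiceVolume }
    set { NavigationSettings.shared.voiceVolume = newValue }
}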

@@ -272,7 +191,7 @@ public protocol VoiceControllerDelegate: class, UnimplementedLogging {
- parameter synthesizer: the Speech engine that was used as the fallback.
- parameter error: An error explaining the failure and its cause.
*/
func voiceController(_ voiceController: RouteVoiceController, didFallBackTo synthesizer: AVSpeechSynthesizer, error: SpeechError)
Contributor Author

This method feels like a crutch that we needed in order to support MapboxVoiceController's 'backup' synthesizer feature.

Contributor

That’s true to some extent, though I think the other consideration was that we wanted to send a telemetry event whenever speech got interrupted. There are other ways of doing that, though. Do you think it would still be useful for the application to know when all the available speech synthesizers fail?

Contributor Author

For sure, knowing that the speech synthesizers have failed is important, but I thought that SpeechSynthesizerController describes a single speech synthesizer implementation, and thus it "doesn't know" about other implementations (so there is no such term as "fallback" in its context). At this point, the responsibility to manage multiple synthesizers and to notify when they have all failed should live somewhere else. Currently that is done by MapboxSpeechSynthesizerController, which tries each synthesizer in turn and, if all of them fail, reports the latest error. A sketch of that loop follows below.
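A minimal sketch of that fallback loop, reusing the protocol shape sketched earlier; the class and method names are illustrative, not the merged API.

import Foundation

final class FallbackSpeechControllerSketch {
    // Ordered from most to least preferred.
    var speechSynthesizers: [SpeechSynthesizing] = []

    func speak(_ instruction: SpokenInstruction, completion: SpeechSynthesizerCompletion?) {
        speak(instruction, startingAt: 0, completion: completion)
    }

    private func speak(_ instruction: SpokenInstruction, startingAt index: Int, completion: SpeechSynthesizerCompletion?) {
        guard index < speechSynthesizers.count else {
            completion?(nil) // an empty list is treated as a silent no-op
            return
        }
        speechSynthesizers[index].speak(instruction) { [weak self] error in
            guard let error = error else {
                completion?(nil) // this synthesizer succeeded
                return
            }
            guard let self = self, index + 1 < self.speechSynthesizers.count else {
                completion?(error) // all synthesizers failed; report the latest error
                return
            }
            // Fall back to the next synthesizer in the list.
            self.speak(instruction, startingAt: index + 1, completion: completion)
        }
    }
}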

import MapboxSpeech

///
open class MapboxSpeechSynthesizerController: NSObject, SpeechSynthesizerController {
Contributor

I think the separation of concerns would be clearer if we rename this class to SpeechSynthesizerController and rename the SpeechSynthesizerController protocol to SpeechSynthesizing. This class doesn’t need to conform to SpeechSynthesizing.

Contributor Author

I was thinking about this class as a kind of wrapper or proxy which manages multiple speech engines. At the same time, it is possible to use a single speech engine directly in RouteVoiceController if the user wants to, so MapboxSpeechSynthesizerController should adopt SpeechSynthesizerController too.

In this case, MapboxSpeechSynthesizerController does not play any crucial role in the mechanism and can be safely removed, or used with any custom set of speech synthesizers.

Yes, it looks like the naming could better convey the idea :)

@@ -39,7 +39,7 @@ class CustomViewController: UIViewController, MGLMapViewDelegate {

let locationManager = simulateLocation ? SimulatedLocationManager(route: userRoute!) : NavigationLocationManager()
navigationService = MapboxNavigationService(route: userRoute!, locationSource: locationManager, simulating: simulateLocation ? .always : .onPoorGPS)
voiceController = MapboxVoiceController(navigationService: navigationService)
// voiceController = MapboxVoiceController(navigationService: navigationService)
Contributor

I think we could reimplement this customization hook by exposing the navigation service’s (read-only) speech controller so that the application can append a SpeechSynthesizing instance to the speech controller’s speechSynthesizers array.
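A hedged sketch of that hook from the application's side; the speechController property, the speechSynthesizers array, and MyCustomSynthesizer are assumptions drawn from the suggestion above, not shipped API.

let navigationService = MapboxNavigationService(route: userRoute!,
                                                locationSource: locationManager,
                                                simulating: simulateLocation ? .always : .onPoorGPS)
// Append a custom SpeechSynthesizing implementation to the fallback chain.
navigationService.speechController.speechSynthesizers.append(MyCustomSynthesizer())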

@Udumft
Contributor Author

Udumft commented Mar 10, 2020

I feel the need to clarify my vision of the refactoring being performed, to make sure we are on the same page.

As I understand it, the original problem was that RouteVoiceController (together with its subclass MapboxVoiceController) did three tasks at the same time: monitoring the actual route progress events to check when voice instructions should be pronounced, handling the speech synthesis engine(s), and taking care of the fallback mechanism when multiple engines were used together.

(Below, all entity names are as they stand in the current draft, just for convenience.)

The solution is to separate these tasks to make the modules replaceable. What I am trying to achieve:

  • RouteVoiceController remains responsible for monitoring route progress and passing voice instructions to the speech engine
  • The speech engine is hidden behind a protocol (SpeechSynthesizerController at the moment) which exposes simple actions to actually "speak" an instruction. Each implementation is responsible only for its own integrity and is independent
  • To retain the "fallback" synthesizer functionality, MapboxSpeechSynthesizerController (current name) is introduced, which mimics the SpeechSynthesizerController interface but in fact just aggregates real implementations and thereby implements the "backup" speech synthesizer mechanism.

As a result, each element can be replaced by the user's own implementation. Users may implement their own RouteVoiceController to monitor events however they want and still use any SpeechSynthesizerController to vocalize instructions; they may implement their own SpeechSynthesizerController and pass it to MapboxSpeechSynthesizerController as part of a fallback sequence; or they may not use MapboxSpeechSynthesizerController at all and plug a single speech engine directly into RouteVoiceController.

With that said, when we talk about a SpeechSynthesizerController delegate, its methods describe state, errors, and any other events related to that particular implementation. In such a delegate, events like "fallback" make no sense, because one SpeechSynthesizerController is not related to other implementations.

Does it match with your vision? :)

@Udumft Udumft requested review from JThramer and 1ec5 March 11, 2020 15:34
@Udumft Udumft requested a review from chezzdev June 2, 2020 14:38
@1ec5 1ec5 modified the milestones: v0.40.0, v1.0.0 Jun 4, 2020
@Udumft Udumft changed the base branch from master to release-v1.0-pre-registry July 14, 2020 08:30
@Udumft Udumft requested a review from a team July 14, 2020 11:50
Contributor

@1ec5 1ec5 left a comment

This PR is mostly in great shape; however, there’s some missing functionality related to locale matching. Once we plumb through the locale, we can merge this PR.

@@ -18,7 +18,6 @@ class CustomViewController: UIViewController, MGLMapViewDelegate {
var userRouteOptions: RouteOptions?

// Start voice instructions
Contributor

This comment doesn’t make sense anymore; let’s remove it.

let modifiedInstruction = delegate?.speechSynthesizer(self, willSpeak: instruction) ?? instruction
let ssmlText = modifiedInstruction.ssmlText
let options = SpeechOptions(ssml: ssmlText)
options.locale = locale
Contributor

By default, this locale should be the speechLocale from the route response, if present, to ensure that we use the voice that the text was designed for. Otherwise, the fallback to autoupdatingCurrent is fine, but that shouldn’t be the default. I think this context can be plumbed through the speak(_:during:) call.
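A hedged sketch of the suggested default, assuming the route is plumbed through to the synthesizer (for example via the speak(_:during:) call mentioned above); Route.speechLocale comes from the Directions API response.

import MapboxDirections
import MapboxSpeech

func speechOptions(ssmlText: String, route: Route, fallback: Locale?) -> SpeechOptions {
    let options = SpeechOptions(ssml: ssmlText)
    // Prefer the voice the server-generated SSML was designed for.
    options.locale = route.speechLocale ?? fallback ?? .autoupdatingCurrent
    return options
}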

let modifiedInstruction = delegate?.speechSynthesizer(self, willSpeak: instruction) ?? instruction
let ssmlText = modifiedInstruction.ssmlText
let options = SpeechOptions(ssml: ssmlText)
options.locale = locale
Contributor

This also needs to set the locale to the route’s speechLocale. I think this context can be plumbed through the didPassSpokenInstructionPoint(notification:) call.

@@ -69,5 +68,12 @@ public enum SpeechError: LocalizedError {
- parameter instruction: The instruction that failed.
- parameter progress: the offending `RouteProgress` that omits the expected `SpeechLocale`.
*/
case undefinedSpeechLocale(instruction: SpokenInstruction, progress: RouteProgress)
case undefinedSpeechLocale(instruction: SpokenInstruction, progress: RouteProgress) // to be removed?
Contributor

Given the points about MapboxSpeechSynthesizer needing speechLocale, I’m not sure we can remove this case just yet.


// Only localized languages will have a proper fallback voice
if utterance?.voice == nil {
    utterance?.voice = AVSpeechSynthesisVoice(language: Locale.preferredLocalLanguageCountryCode)
}
Contributor

I think we should try to pass in the speechLocale here too. This code has used Locale.preferredLocalLanguageCountryCode since #187, even before we moved guidance instruction generation to the server side, so I think this is buggy behavior that we’ve overlooked for a long time.
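A hedged sketch of that fix; speechLocale is assumed to be plumbed in from the route response, and preferredLocalLanguageCountryCode is the SDK helper used in the snippet above.

import AVFoundation

func fallbackVoice(for speechLocale: Locale?) -> AVSpeechSynthesisVoice? {
    // Prefer the locale the instruction text was generated for, in the
    // BCP 47 form AVSpeechSynthesisVoice expects; only fall back to the
    // device's preferred language when the route omits it.
    if let locale = speechLocale {
        let code = [locale.languageCode, locale.regionCode].compactMap { $0 }.joined(separator: "-")
        if let voice = AVSpeechSynthesisVoice(language: code) {
            return voice
        }
    }
    return AVSpeechSynthesisVoice(language: Locale.preferredLocalLanguageCountryCode)
}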

…ing Locales directly when speaking. Updated corresponding logic
@@ -1,6 +1,6 @@
binary "https://mapbox-gl-native-ios.s3.amazonaws.com/public/internal/Mapbox-iOS-SDK.json" "5.9.1000"
binary "https://www.mapbox.com/ios-sdk/MapboxAccounts.json" "2.3.0"
binary "https://www.mapbox.com/ios-sdk/MapboxNavigationNative.json" "14.1.5"
binary "https://www.mapbox.com/ios-sdk/MapboxNavigationNative.json" "14.1.6"
Contributor

Not sure if the update to nav. native 14.1.6 should be part of this changelist.

Contributor Author

That's Cartfile.resolved; this PR does not bump any dependencies. In any case, after rebasing onto the latest commit this change should vanish.

Contributor

@MaximAlien MaximAlien left a comment

Looks good to me. Thank you.

MapboxNavigation/MultiplexedSpeechSynthesizer.swift (outdated review thread, resolved)
MapboxNavigation/SpeechSynthesizing.swift (outdated review thread, resolved)
Comment on lines +654 to +657
2B2B1EDD2424B95600FA18A6 /* ExampleUITests.xctest */ = {isa = PBXFileReference; explicitFileType = wrapper.cfbundle; includeInIndex = 0; path = ExampleUITests.xctest; sourceTree = BUILT_PRODUCTS_DIR; };
2B2B1EDF2424B95600FA18A6 /* ExampleUITests.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ExampleUITests.swift; sourceTree = "<group>"; };
2B2B1EE12424B95600FA18A6 /* Info.plist */ = {isa = PBXFileReference; lastKnownFileType = text.plist.xml; path = Info.plist; sourceTree = "<group>"; };
2B2B1EEA2424D0A700FA18A6 /* ViewController+UITests.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = "ViewController+UITests.swift"; sourceTree = "<group>"; };
Contributor

Do we really need ExampleUITests? I don't see that they're used in this changelist.

Contributor Author

That should be leftover code; it should disappear after rebasing.

MapboxNavigation/SystemSpeechSynthesizer.swift (outdated review thread, resolved)
MapboxNavigation/MapboxSpeechSynthesizer.swift (outdated review thread, resolved)
MapboxNavigation/SpeechSynthesizing.swift (outdated review thread, resolved)
MapboxNavigation/SpeechSynthesizing.swift (outdated review thread, resolved)
MapboxNavigation/SystemSpeechSynthesizer.swift (outdated review thread, resolved)
wait(for: [deinitExpectation], timeout: 3)
}

func testSystemSpeechSynthesizer() {
Contributor

Not critical, but maybe it would be good to also verify in the test whether properties like muted, isSpeaking, etc. were set correctly? A sketch of such assertions follows below.
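A hedged sketch of the kind of assertions meant here; the property names follow this PR's draft API, and SystemSpeechSynthesizer's default initializer is an assumption.

import XCTest

final class SynthesizerPropertiesTests: XCTestCase {
    func testSynthesizerProperties() {
        let synthesizer = SystemSpeechSynthesizer()
        synthesizer.muted = true
        XCTAssertTrue(synthesizer.muted)
        // Nothing has been spoken yet, so the synthesizer must be idle.
        XCTAssertFalse(synthesizer.isSpeaking)
    }
}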

Contributor Author

Added a few more tests for the parameters, but unfortunately I could not find a stable way to test all of them. For example, isSpeaking has no dedicated event after which speech is guaranteed to be in progress, and the volume value for SystemSpeechSynthesizer relies on system values and is not controlled by the multiplexed synthesizer.

CHANGELOG.md (outdated review thread, resolved)
@Udumft
Contributor Author

Udumft commented Jul 21, 2020

Corrections applied

@Udumft Udumft requested a review from 1ec5 July 21, 2020 13:26
@@ -25,9 +25,12 @@ open class MapboxSpeechSynthesizer: NSObject, SpeechSynthesizing {
public var isSpeaking: Bool {
    return audioPlayer?.isPlaying ?? false
}
public var locale: Locale = Locale.autoupdatingCurrent
public var locale: Locale? = Locale.autoupdatingCurrent
Contributor

What does it mean if the developer sets locale to nil, given that the default value is already autoupdatingCurrent? If we should retain this edge case instead of falling back to autoupdatingCurrent, then let’s document it.

Contributor Author

To vocalize an instruction, either the locale property or a locale argument to the corresponding speak method can be used.
I was thinking of the SpeechSynthesizing usage scenario as giving the user flexible ways to provide locales:

  • a default can be set for all instructions (property is set, method argument is nil)
  • it can be overridden for a specific instruction (property is set, method argument is set)
  • a Locale can be required for each instruction individually (property is nil, method argument is set)

The latter case requires the property to be optional.
For MapboxSpeechSynthesizer I decided to stick to the first variant, so I've added a default value. A sketch of this resolution order follows below.
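A minimal, self-contained sketch of that resolution order, with the instruction type stubbed for illustration; the per-call argument wins over the synthesizer-wide property.

import Foundation

struct StubInstruction { let text: String }

final class LocaleResolvingSynthesizerSketch {
    // Synthesizer-wide default; setting this to nil forces callers to
    // pass a locale with every call.
    var locale: Locale? = .autoupdatingCurrent

    func speak(_ instruction: StubInstruction, localeOverride: Locale? = nil) {
        // The per-call argument overrides the property default.
        guard let localeToUse = localeOverride ?? locale else {
            print("No locale for \(instruction.text); report an undefinedSpeechLocale error")
            return
        }
        print("Speaking \(instruction.text) in \(localeToUse.identifier)")
    }
}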

Comment on lines +70 to +72
let localeCode = [locale.languageCode, locale.regionCode].compactMap{$0}.joined(separator: "-")

if localeCode == "en-US" {
Contributor

It would be cleaner to check if locale.languageCode == "en" && locale.regionCode == "US".

@@ -5,7 +5,7 @@ import MapboxCoreNavigation
import MapboxSpeech

/**
`SpeechSynthesizing` implementation, using `AVSpeechSynthesizer`. Supports only english language.
`SpeechSynthesizing` implementation, using `AVSpeechSynthesizer`. Supports only English language.
Contributor

This implementation seems to support more than just English now. I don’t think it was ever the intention to have it support only English. Only the Alex voice is English-specific.

Development

Successfully merging this pull request may close these issues.

Replace voice controller inheritance with SpeechController registry
3 participants