SSML Tags and Functions

Fabian Celik edited this page Mar 22, 2020 · 9 revisions

Supported SSML Tags

Speak

The <speak> tag is the root element of all SSML text. All SSML-enhanced text must be enclosed within a pair of <speak> tags.

<speak></speak>

Break

To add a pause to your text, use the <break> tag. You can set a pause based on strength (equivalent to the pause after a comma, a sentence, or a paragraph).

<break strength="weak"/>

Option strength

  • weak: Sets a pause of the same duration as the pause after a comma.
  • strong: Sets a pause of the same duration as the pause after a sentence.
  • x-strong: Sets a pause of the same duration as the pause after a paragraph.
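
As a rough illustration of how the strength values above end up in a document, the following Python sketch builds a <break> tag and checks that the surrounding SSML is well-formed. The helper name break_tag and the sample sentence are hypothetical; they are not part of this tool.

```python
import xml.etree.ElementTree as ET

# Strength values documented on this page (hypothetical helper, not part of the editor).
STRENGTHS = {"weak", "strong", "x-strong"}

def break_tag(strength: str) -> str:
    """Return a self-closing <break> tag for one of the strengths listed above."""
    if strength not in STRENGTHS:
        raise ValueError(f"unsupported strength: {strength!r}")
    return f'<break strength="{strength}"/>'

ssml = f"<speak>Mary had a little lamb{break_tag('strong')} whose fleece was white as snow.</speak>"

# Parsing confirms the result is well-formed XML with <speak> as the root element.
root = ET.fromstring(ssml)
print(root.tag)  # speak
```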

Emphasize

To emphasize words, use the <emphasis> tag. Emphasizing words changes the speaking rate and volume.

<emphasis level="strong"></emphasis>

Whispering

The <amazon:effect> tag with name="whispered" indicates that the enclosed text should be spoken in a whispered voice rather than as normal speech.

<amazon:effect name="whispered"></amazon:effect>

Language

Specify another language for a specific word, phrase, or sentence with the <lang> tag. Foreign language words and phrases are generally spoken better when they are enclosed within a pair of <lang> tags. To specify the language, use the xml:lang attribute. These language codes are W3C language identification tags (ISO 639-3 for the language name and ISO 3166 for the country code).

<lang xml:lang="en-US"></lang>

Paragraph

To add a pause between paragraphs in your text, use the <p> tag. Using this tag provides a longer pause than native speakers usually place at commas or the end of a sentence. Use the <p> tag to enclose the paragraph.

<p></p>

Say-As

Use the <say-as> tag with the interpret-as attribute to specify how to say certain characters, words, and numbers. This enables you to provide additional context to eliminate any ambiguity on how TTS should render the text.

<say-as interpret-as="spell-out"></say-as>

Option interpret-as

  • spell-out: Spells out each letter of the text, as in a-b-c.
  • number: Interprets the numerical text as a cardinal number, as in 1,234.
  • ordinal: Interprets the numerical text as an ordinal number, as in 1,234th.
  • digits: Spells out each digit individually, as in 1-2-3-4.
  • fraction: Interprets the numerical text as a fraction. This works for both common fractions, such as 3/20, and mixed fractions, such as 2 ½.
  • expletive: "Beeps out" the content included within the tag.
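
The options above can be applied programmatically. The Python sketch below wraps a piece of text in <say-as> with one of the listed interpret-as values; the function name say_as and the sample sentence are assumptions made for illustration only.

```python
from xml.sax.saxutils import escape

# interpret-as values listed on this page.
INTERPRET_AS = {"spell-out", "number", "ordinal", "digits", "fraction", "expletive"}

def say_as(text: str, interpret_as: str) -> str:
    """Wrap text in a <say-as> tag (hypothetical helper, not part of the editor)."""
    if interpret_as not in INTERPRET_AS:
        raise ValueError(f"unsupported interpret-as value: {interpret_as!r}")
    # escape() protects &, < and > so the text cannot break the markup.
    return f'<say-as interpret-as="{interpret_as}">{escape(text)}</say-as>'

ssml = "<speak>Your code is " + say_as("1234", "digits") + ".</speak>"
print(ssml)
# <speak>Your code is <say-as interpret-as="digits">1234</say-as>.</speak>
```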

Say-As Date

Works like <say-as>, but with the interpret-as attribute set to date. You must also indicate the format of the date with the format attribute.

<say-as interpret-as="date" format="mdy"></say-as>

Option format

  • mdy: Month-day-year.
  • dmy: Day-month-year.
  • ymd: Year-month-day.
  • md: Month-day.
  • dm: Day-month.
  • ym: Year-month.
  • my: Month-year.
  • d: Day.
  • m: Month.
  • y: Year.
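
A minimal sketch of pairing a date string with one of the format values above, in Python. The helper say_as_date is hypothetical, and the hyphen-separated date string is only an assumed input shape; check which notations your TTS engine accepts.

```python
from xml.sax.saxutils import escape

# format values listed on this page.
DATE_FORMATS = {"mdy", "dmy", "ymd", "md", "dm", "ym", "my", "d", "m", "y"}

def say_as_date(text: str, fmt: str) -> str:
    """Wrap a date string in <say-as interpret-as="date"> (hypothetical helper)."""
    if fmt not in DATE_FORMATS:
        raise ValueError(f"unsupported date format: {fmt!r}")
    return f'<say-as interpret-as="date" format="{fmt}">{escape(text)}</say-as>'

# The order of the parts in the text should match the chosen format (here month-day-year).
print(say_as_date("03-22-2020", "mdy"))
# <say-as interpret-as="date" format="mdy">03-22-2020</say-as>
```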

Substitute

Use the <sub> tag with the alias attribute to substitute a different word (or pronunciation) for selected text such as an acronym or abbreviation.

<sub alias="Enter Substitute Text Here"></sub>
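
Because the alias is an attribute value, quotes and ampersands in it need escaping. The Python sketch below shows one way to build a <sub> tag safely; the helper name sub and the sample text are hypothetical.

```python
from xml.sax.saxutils import escape, quoteattr

def sub(text: str, alias: str) -> str:
    """Return a <sub> tag that speaks `alias` in place of `text` (hypothetical helper)."""
    # quoteattr() adds the surrounding quotes and escapes special characters in the alias.
    return f"<sub alias={quoteattr(alias)}>{escape(text)}</sub>"

ssml = f"<speak>My favorite element is {sub('Hg', 'mercury')}.</speak>"
print(ssml)
# <speak>My favorite element is <sub alias="mercury">Hg</sub>.</speak>
```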

Phonetic Pronunciation

Use the <phoneme> tag with the alphabet and ph attributes to make the selected text use the specified pronunciation instead of the standard pronunciation that the selected voice's language associates with it by default.

<phoneme alphabet="ipa" ph="pɪˈkɑːn"></phoneme>

Breathing

Natural-sounding speech includes both correctly spoken words and breathing sounds. By adding breathing sounds to synthesized speech, you can make it sound more natural.

<amazon:auto-breaths></amazon:auto-breaths>


Special Functions

Parse

Uses the formatted editor text as input and tries to interpret it as SSML.

  • Add and
  • Emphasize bold and underlined text
  • Add SSML breaks to line breaks
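
The conversion described above could look roughly like the Python sketch below. It is not the tool's implementation: the input model (lines made of text runs flagged as bold or underlined), the function name parse_to_ssml, the wrapping in <speak>, and the default break strength are all assumptions chosen for illustration.

```python
from typing import List, Tuple
from xml.sax.saxutils import escape

# Hypothetical input model: each line is a list of (text, is_bold_or_underlined) runs.
Line = List[Tuple[str, bool]]

def parse_to_ssml(lines: List[Line], break_strength: str = "strong") -> str:
    """Turn formatted lines into SSML: emphasis for marked runs, a break per line break."""
    rendered = []
    for line in lines:
        parts = []
        for text, emphasized in line:
            text = escape(text)  # keep &, < and > from breaking the markup
            if emphasized:
                parts.append(f'<emphasis level="strong">{text}</emphasis>')
            else:
                parts.append(text)
        rendered.append("".join(parts))
    # Each line break in the editor becomes one <break> tag (strength is an assumption).
    separator = f'<break strength="{break_strength}"/>'
    return "<speak>" + separator.join(rendered) + "</speak>"

lines = [
    [("This is ", False), ("important", True), (".", False)],
    [("Second line.", False)],
]
print(parse_to_ssml(lines))
# <speak>This is <emphasis level="strong">important</emphasis>.<break strength="strong"/>Second line.</speak>
```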

Validate

Performs a very basic validation of the SSML in the editor text.
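
What a basic validation might check is sketched below in Python: that the text is well-formed XML and that it is enclosed in a single pair of <speak> tags. This is an assumption about what "basic" could mean, not a description of the tool's actual checks; the function name validate_ssml and the placeholder namespace are hypothetical.

```python
import xml.etree.ElementTree as ET

def validate_ssml(ssml: str) -> list:
    """Return a list of problems; an empty list means the basic checks passed."""
    # The amazon: prefix is not declared in the editor text, so wrap the input in
    # an element that binds it to a placeholder namespace before parsing.
    wrapped = f'<wrapper xmlns:amazon="urn:example:amazon">{ssml}</wrapper>'
    try:
        wrapper = ET.fromstring(wrapped)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]
    children = list(wrapper)
    if len(children) != 1 or children[0].tag != "speak":
        return ["text must be enclosed in a single pair of <speak> tags"]
    if (wrapper.text or "").strip() or (children[0].tail or "").strip():
        return ["text found outside the <speak> element"]
    return []

good = '<speak>Hello <amazon:effect name="whispered">secret</amazon:effect></speak>'
print(validate_ssml(good))                # []
print(len(validate_ssml("<speak>Hello")))  # 1 (unclosed <speak>)
```

Returning a list of messages rather than raising keeps the sketch easy to extend with further checks, such as a list of allowed tag names.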