Understand Voice Modeling


Voice modeling helps Alexa recognize and understand voice requests that mention your skill (the invocation name) and your catalog content (artists, albums, songs, playlists, genres, and stations). Some examples of voice requests are "Alexa, play music on <skill name>" and "Alexa, stop." Voice modeling is an automated process managed by the Alexa service, but it's important to be familiar with the process so that you can build your skill and catalogs for success.

A music skill automatically takes advantage of Alexa's existing data sets for popular artists, albums, and so on. The voice modeling process takes incremental steps to incorporate additional catalog content that might be unique to your skill. The voice modeling process begins within two weeks of uploading your catalogs. As the process progresses, the scope of requests that Alexa recognizes for your skill grows.

Voice modeling best practices

To help your skill get the most benefit from the voice modeling process, follow these guidelines. Most importantly, catalogs must be voice friendly. Names, titles, and other text that work well in print, such as in magazines and on web pages, don't necessarily work well with a voice interface. As part of the catalog ingestion process, we attempt voice optimization in many cases, but in other cases it's up to you to provide the proper spelling, aliases, or alternate names.

The following guidelines apply to all catalog types.

Basic guidelines

To improve the voice modeling results for your skill name and catalog content:

  • Use lowercase alphabetic characters
  • Use spaces between words
  • Use possessive apostrophes
  • Use periods and spaces for abbreviations or spoken letter sequences
  • Use dictionary words
  • Spell out non-alphabetic characters (for example, numbers)

The following examples demonstrate how to use these guidelines to set a skill invocation name. The invocation name is the alias that you provide in the Skill Names section of the Alexa Skills Kit developer console.

  • Skill name: XYZ Tunes

    Alias: x. y. z. tunes

    Explanation: XYZ is a spoken letter sequence.

  • Skill name: John's #1 Music

    Alias: john's number one music

    Explanation: Spell out numbers and symbols.

  • Skill name: Musik for Phun

    Alias: music for fun

    Explanation: Musik and Phun are not dictionary words.

  • Skill name: Money Mu$ik

    Alias: money music

    Explanation: Mu$ik isn't a dictionary word.

These same guidelines apply to your catalog content.
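
The mechanical parts of these guidelines (lowercasing, spelling out digits and symbols, and writing letter sequences with periods) can be sketched in a small helper. This is a minimal illustration, not part of the Alexa tooling: the function name `voice_friendly`, the symbol table, and the rule that an all-caps word is a spoken letter sequence are all assumptions. Non-dictionary spellings such as "Musik" still need manual correction.

```python
import re

# Illustrative spellings only (assumption); extend these for your own names.
SYMBOL_WORDS = {"#": "number", "&": "and", "+": "plus"}
DIGIT_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}


def voice_friendly(name: str) -> str:
    """Return a lowercase, spelled-out candidate alias for a display name."""
    # Normalize curly apostrophes so possessives survive the cleanup below.
    name = name.replace("\u2019", "'")
    # Pad symbols and digits with spaces so each one becomes its own token.
    # Digits are spelled one at a time, so "12" becomes "one two".
    spaced = re.sub(r"([#&+]|[0-9])", r" \1 ", name)
    words = []
    for token in spaced.split():
        if token in SYMBOL_WORDS:
            words.append(SYMBOL_WORDS[token])
        elif token in DIGIT_WORDS:
            words.append(DIGIT_WORDS[token])
        elif len(token) > 1 and token.isalpha() and token.isupper():
            # Heuristic: treat an all-caps word as a spoken letter sequence.
            # This is wrong for names pronounced as words, so review the output.
            words.append(" ".join(f"{ch.lower()}." for ch in token))
        else:
            # Keep letters and possessive apostrophes; drop other punctuation.
            words.append(re.sub(r"[^a-z']", "", token.lower()))
    return " ".join(word for word in words if word)


print(voice_friendly("XYZ Tunes"))        # x. y. z. tunes
print(voice_friendly("John's #1 Music"))  # john's number one music
```

Treat the result as a starting point and review each candidate by ear before you use it as an alias or alternate name.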

Aliases and alternate names

If your skill invocation name or catalog content includes homophones or near homophones (words that sound the same or similar), you might see improved results by including an alternate name for these words. That way, Alexa understands the similar words to mean the same thing.

As described in the preceding section, you can create multiple aliases for the skill invocation name. You can also create aliases, known as alternate names, for the content in your catalog. For information about how to specify alternate names in a catalog, see the catalog reference.

The following examples demonstrate ways to use homophones for skill names.

  • Skill name: Beat Music

    First alias: beat music

    Second alias: beet music

    Explanation: Beat and beet are homophones.

  • Skill name: Music of Dissent

    First alias: music of dissent

    Second alias: music of descent

    Explanation: Dissent and descent are near homophones.

Using aliases for the skill name and alternate names for catalog content can help Alexa recognize your skill and content across a range of users and variations in pronunciation. You should spend time testing your skill to make sure it performs as expected. When you test with an Alexa device, you can see how Alexa interprets your voice requests by reviewing the request history in the Amazon Alexa app (in the app, navigate to Settings and then History). You can use this history to help you choose aliases and alternate names.
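
If you keep a list of homophones that you find during testing, you can generate candidate aliases from it. The following sketch assumes a hypothetical `HOMOPHONES` table that you maintain yourself; neither the table nor the `alias_variants` function is provided by Alexa.

```python
from itertools import product

# Hypothetical homophone table (assumption); build it from your own testing.
HOMOPHONES = {
    "beat": ["beet"],
    "dissent": ["descent"],
}


def alias_variants(alias: str) -> list[str]:
    """Return the alias plus every variant made by swapping in homophones."""
    # For each word, list the word itself followed by its known homophones.
    options = [[word] + HOMOPHONES.get(word, []) for word in alias.split()]
    # Combine one choice per word, keeping the original alias first.
    return [" ".join(combo) for combo in product(*options)]


print(alias_variants("beat music"))
# ['beat music', 'beet music']
print(alias_variants("music of dissent"))
# ['music of dissent', 'music of descent']
```

The number of variants grows quickly when several words have homophones, so keep only the variants that you actually see in the request history.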

Names for catalog content

For the names of items in your catalogs, we recommend that you use the official or legal name of the entity (the artist name, album title, and so on). Alexa uses this name in voice prompts when interacting with the user. However, it's also helpful to include alternate names for other ways that a user might refer to the item. See the following examples.

Album example #1: Bob Marley and the Wailers, 'Live!' (1975)

  • Album name: "Bob Marley and the Wailers Live 1975"

    Alternate names:

    • "Bob Marley Live 1975"
    • "Bob Marley Live in 1975"
    • "Bob Marley Live"

Album example #2: Abba Greatest Hits Vol 1

  • Album name: "Abba Greatest Hits Volume 1"

    Alternate names:

    • "Abba Greatest Hits Vol 1"
    • "Abba Greatest Hits 1"

Artist example: Eminem

For an artist name, we recommend using the artist's well-known name first, instead of the legal name. For example, in the case of Eminem, we recommend the following:

  • Artist name: "Eminem"

    Alternate names:

    • "Marshall Bruce Mathers"
    • "Marshall Bruce Mathers the third"
    • "Double M"
    • "M&M"

Provide additional variations as alternate names where it makes sense to do so. Each variation increases the probability of a match. For example, users might request the artist "Puff Daddy" as "Sean Combs", "P Diddy", "Diddy", "Puffy", and so on.
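
One way to keep a primary name and its alternate names together while you prepare a catalog is a small record type. The `CatalogItem` class below is a hypothetical in-memory structure for illustration only. It is not the catalog file format, and its `matches` method is a local sanity check rather than a model of how Alexa resolves requests. For the actual fields, see the catalog reference.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogItem:
    """Hypothetical working record (assumption), not the catalog schema."""
    name: str
    alternate_names: list[str] = field(default_factory=list)

    def add_alternate(self, candidate: str) -> None:
        """Add an alternate name unless it repeats a name already present."""
        known = {n.casefold() for n in [self.name, *self.alternate_names]}
        if candidate.casefold() not in known:
            self.alternate_names.append(candidate)

    def matches(self, spoken: str) -> bool:
        """Return True if the text equals any known name, ignoring case."""
        known = {n.casefold() for n in [self.name, *self.alternate_names]}
        return spoken.casefold() in known


artist = CatalogItem("Puff Daddy")
for alt in ["Sean Combs", "P Diddy", "Diddy", "Puffy", "puffy"]:
    artist.add_alternate(alt)

print(artist.alternate_names)  # ['Sean Combs', 'P Diddy', 'Diddy', 'Puffy']
print(artist.matches("diddy"))  # True
```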

Catalog content popularity

The popularity values for your catalog content determine what's included during voice modeling. Voice modeling is conducted for the most popular items in your catalogs. The best voice modeling results occur when the popularity values reflect the frequency or expected frequency of user requests for that content.

Follow these guidelines:

  • 95 to 100 = Extremely popular or important top entities, or new releases expected to be popular
  • 1 to 94 = Entities ranked according to their popularity and demand
  • 0 = Ignore
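
The ranges above can be checked in code before you upload a catalog. In the following sketch, `popularity_tier` follows the listed ranges directly. `scaled_popularity` is only one possible way to derive a value from request counts: the linear scale, the choice to give never-requested items a value of 0, and both function names are assumptions. Values from 95 to 100 are left for you to assign by hand.

```python
def popularity_tier(value: int) -> str:
    """Classify a popularity value using the ranges listed above."""
    if not 0 <= value <= 100:
        raise ValueError(f"popularity must be between 0 and 100, got {value}")
    if value == 0:
        return "ignored"
    if value >= 95:
        return "top"
    return "ranked"


def scaled_popularity(requests: int, max_requests: int) -> int:
    """Map a request count onto 1 to 94, or 0 for an item never requested.

    Assumption: a simple linear scale against the most requested item.
    """
    if max_requests <= 0 or not 0 <= requests <= max_requests:
        raise ValueError("requests must be between 0 and max_requests")
    if requests == 0:
        return 0
    # Clamp to 1 so that rarely requested items are still ranked, not ignored.
    return max(1, round(94 * requests / max_requests))


print(popularity_tier(97))            # top
print(scaled_popularity(50, 100))     # 47
print(scaled_popularity(1, 1000))     # 1
```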

Catalog updates and upload frequency

After the initial catalog upload, we recommend that you provide only new, updated, or removed catalog entries. Providing only the differences helps ensure that your latest catalog entries are available promptly.

We recommend that you upload no more than fifty files per day for each of your catalogs. That is, we recommend batching your catalog updates into fewer uploads of larger files rather than more uploads of smaller files. Higher numbers of catalog file uploads per day might result in throttling.
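
To work out which entries belong in an incremental upload, you can compare the previous catalog snapshot with the current one. The sketch below assumes that each snapshot is a dictionary keyed by your own item ID. The function names and the snapshot shape are hypothetical, and how a removed entry is expressed in the upload is defined by the catalog reference, not by this example.

```python
# Recommended daily maximum from the guidance above.
MAX_UPLOADS_PER_DAY = 50


def catalog_delta(previous: dict[str, dict], current: dict[str, dict]) -> dict[str, list[str]]:
    """Compare two snapshots keyed by item ID and list what changed."""
    added = sorted(current.keys() - previous.keys())
    removed = sorted(previous.keys() - current.keys())
    # An item is updated when it exists in both snapshots with different data.
    updated = sorted(
        item_id
        for item_id in current.keys() & previous.keys()
        if current[item_id] != previous[item_id]
    )
    return {"added": added, "updated": updated, "removed": removed}


def uploads_remaining(uploads_today: int) -> int:
    """Return how many uploads remain within the recommended daily maximum."""
    return max(0, MAX_UPLOADS_PER_DAY - uploads_today)


previous = {"a1": {"name": "x"}, "a2": {"name": "y"}, "a3": {"name": "z"}}
current = {"a1": {"name": "x"}, "a2": {"name": "y2"}, "a4": {"name": "w"}}
print(catalog_delta(previous, current))
# {'added': ['a4'], 'updated': ['a2'], 'removed': ['a3']}
```

Collecting a day's changes into one delta before uploading keeps the number of files low, in line with the batching recommendation.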

Supported user requests

Alexa music skills support several kinds of user requests. The simplest request is one that refers only to your skill invocation name (or aliases). For example:

  • "Alexa, play music on <skill name>."
  • "Alexa, play <skill name>." (For users who have enabled your skill.)

Users might also make requests to play specific catalog content from your skill. For example:

  • "Alexa, play music by <artist name> on <skill name>."

    Variations:

    • "Alexa, play songs by <artist name> on <skill name>."
    • "Alexa, I'd like to hear music by <artist name> from <skill name>."
    • "Alexa, let's hear songs by <artist name> from <skill name>."

    Example: "Alexa, play music by Fleet Foxes on <skill name>."

  • "Alexa, play the album <album name> on <skill name>."

    Variations:

    • "Alexa, play album <album name> by <artist name> on <skill name>."
    • "Alexa, play the <album name> album on <skill name>."
    • "Alexa, I'd like to listen to the album <album name> by <artist name> on <skill name>."

    Example: "Alexa, play the album Vitalogy on <skill name>."

  • "Alexa, play the song <track name> on <skill name>."

    Variations:

    • "Alexa, play track <track name> by <artist name> on <skill name>."
    • "Alexa, I'd like to hear the song <track name> on <skill name>."
    • "Alexa, play song <track name> from <album name> on <skill name>."

    Example: "Alexa, play the song Poker Face on <skill name>."

  • "Alexa, play the playlist <playlist name> on <skill name>."

    Variations:

    • "Alexa, play the <playlist name> playlist on <skill name>."
    • "Alexa, play the <skill name> playlist <playlist name>."
    • "Alexa, play my <playlist name> playlist on <skill name>."

    Example: "Alexa, play the All Time Hits playlist on <skill name>."

  • "Alexa, play the station <station name> on <skill name>."

    Variations:

    • "Alexa, play my <station name> station from <skill name>."
    • "Alexa, play <station name> radio on <skill name>."

    Example: "Alexa, play the station News Today on <skill name>."

  • "Alexa, play <genre name> music on <skill name>."

    Example: "Alexa, play ballroom dancing music on <skill name>."
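
When you test your skill, it can help to generate a list of phrases to say from the request patterns above and a few sample catalog entries. The following sketch uses only the first phrasing of each pattern. The `TEMPLATES` table and the `build_test_phrases` function are hypothetical helpers for producing a manual test script, not an Alexa API.

```python
# First phrasing of each request pattern listed above.
TEMPLATES = {
    "artist": "Alexa, play music by {artist} on {skill}.",
    "album": "Alexa, play the album {album} on {skill}.",
    "track": "Alexa, play the song {track} on {skill}.",
    "playlist": "Alexa, play the playlist {playlist} on {skill}.",
    "station": "Alexa, play the station {station} on {skill}.",
    "genre": "Alexa, play {genre} music on {skill}.",
}


def build_test_phrases(skill: str, samples: dict[str, list[str]]) -> list[str]:
    """Fill each template with sample names to produce phrases to speak."""
    phrases = []
    for kind, names in samples.items():
        template = TEMPLATES[kind]  # Raises KeyError for an unknown kind.
        for name in names:
            phrases.append(template.format(**{kind: name, "skill": skill}))
    return phrases


for phrase in build_test_phrases(
    "my skill",
    {"artist": ["Fleet Foxes"], "genre": ["ballroom dancing"]},
):
    print(phrase)
# Alexa, play music by Fleet Foxes on my skill.
# Alexa, play ballroom dancing music on my skill.
```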

Users might also make more ambiguous requests. For example, "Alexa, play Fun on <skill name>." In this case, "Fun" is ambiguous because it can refer to an artist, a genre, or an album. Alexa might determine that Fun is a genre and look for Fun in your genre catalog, even if you also have Fun in your artist catalog. To address these conflicts, users can use less ambiguous requests, as shown in the preceding examples.

User requests can also specify a device for playback by adding the following phrases to the request:

  • "… in the <device location>."
  • "… from my <device type>."
  • "… from the <device location> <device type>."

In these example phrases, <device type> is an Alexa-enabled device (Echo, Echo Dot, Echo Show, and so on) and <device location> is a room or group name (kitchen, living room, upstairs, and so on). For example: "Alexa, play songs by Lady Gaga from <skill name> in the kitchen."

Users can also request music alarms by modifying the request with phrases like the following:

  • "Alexa, set an alarm for <recurrence> at <time> with <content> on <skill name>."
  • "Alexa, play <content> from <skill name> as alarm <recurrence> at <time>."

In these example phrases, <time> identifies a time of day, and <recurrence> identifies a simple recurring pattern (Mondays, weekdays, weekends, and so on). For example: "Alexa, set an alarm for weekdays at seven a. m. with music by Pearl Jam on <skill name>."


Last updated: Nov 27, 2023