One of the first things to consider is where to host your skill’s code and resources.
When you create an Alexa skill, you can store your code and resources yourself. However, if you create an Alexa-hosted skill, then Alexa stores your code and resources on Amazon Web Services (AWS) for you.
If you choose to build an Alexa-hosted skill, your skill can scale as usage grows without you having to manage the underlying infrastructure.
At Alexa Live 2021, we announced the removal of the AWS Free Tier limits for skills on the service. You can learn about the extended free usage limits here.
Learn more about how to set up an Alexa-hosted skill.
More and more customers are choosing Alexa-enabled devices with a screen (in fact, Amazon Echo devices with visual experiences are the top-selling devices), and they expect skills to include multimodal experiences. Customers want immersive experiences that use rich sounds and visuals to convey information, and that let them use both voice and touch for input. The Alexa Presentation Language (APL) is the technology that enables you to build these experiences.
With APL, you can create visual experiences — including animations, graphics, images, slideshows, and video — to integrate with your skill content. Customers can see and interact with these visual experiences on supported devices such as the Echo Show, Fire TV, some Fire tablets, and others.
In this video, Steven Arkonovich, Alexa Champion and creator of the Big Sky skill, explains how he implemented APL to build a visually enticing experience for his weather skill:
Visual cues can complement your voice skill in many ways. This is one more opportunity to be creative.
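As an illustration, here is a minimal sketch of what rendering an APL document from a skill backend could look like with the ASK SDK for Python. The intent name, token, document, and data source values are hypothetical, and the handler sends the directive only when the customer’s device supports APL:

```python
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.handler_input import HandlerInput
from ask_sdk_core.utils import is_intent_name, get_supported_interfaces
from ask_sdk_model import Response
from ask_sdk_model.interfaces.alexa.presentation.apl import RenderDocumentDirective

# A minimal APL document with a single centered text block (illustrative only).
WEATHER_DOCUMENT = {
    "type": "APL",
    "version": "1.8",
    "mainTemplate": {
        "parameters": ["payload"],
        "items": [{
            "type": "Text",
            "text": "${payload.weatherData.summary}",
            "textAlign": "center",
            "textAlignVertical": "center"
        }]
    }
}


class WeatherIntentHandler(AbstractRequestHandler):
    def can_handle(self, handler_input: HandlerInput) -> bool:
        return is_intent_name("WeatherIntent")(handler_input)

    def handle(self, handler_input: HandlerInput) -> Response:
        summary = "Sunny and 72 degrees."
        builder = handler_input.response_builder.speak(summary)
        # Only attach the visual response on devices that can render APL.
        if get_supported_interfaces(handler_input).alexa_presentation_apl is not None:
            builder.add_directive(RenderDocumentDirective(
                token="weatherToken",
                document=WEATHER_DOCUMENT,
                datasources={"weatherData": {"summary": summary}}
            ))
        return builder.response
```

Voice-only devices simply receive the spoken response, so the same handler serves screened and screenless devices alike.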
Resources:
APL in the technical documentation
Sample code
A quality skill understands the intention behind a customer utterance and provides a satisfactory response. Under the hood, automatic speech recognition (ASR) understands what customers are saying, while natural language understanding (NLU) maps customer utterances to the right intents and slots.
Even when both the ASR and NLU components work perfectly, customers might still face friction: for instance, the skill doesn’t support a particular request, can’t handle variations in how the request is phrased, or the customer is simply unfamiliar with the skill.
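One common way to soften the first kind of friction in your code is to catch requests the model couldn’t map to any of your intents with AMAZON.FallbackIntent and steer the customer toward something the skill does support. Here is a small sketch with the ASK SDK for Python; the prompt wording is only an example:

```python
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.handler_input import HandlerInput
from ask_sdk_core.utils import is_intent_name
from ask_sdk_model import Response


class FallbackIntentHandler(AbstractRequestHandler):
    """Catches utterances the interaction model couldn't map to a real intent."""

    def can_handle(self, handler_input: HandlerInput) -> bool:
        return is_intent_name("AMAZON.FallbackIntent")(handler_input)

    def handle(self, handler_input: HandlerInput) -> Response:
        # Tell the customer what the skill *can* do instead of failing silently.
        speech = ("Sorry, I can't help with that yet. "
                  "You can ask me for today's forecast, for example.")
        return (handler_input.response_builder
                .speak(speech)
                .ask("What would you like to do?")
                .response)
```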
The following tools help you close these feedback loops by building quality into your skill and by detecting, diagnosing, and fixing quality issues:
Anticipating variations in customer interaction patterns can be challenging, especially for new skills. Oftentimes we hear developers say, “I don’t know if my skill has enough sample utterances to provide a good voice user interface (VUI)” or “When do I stop adding sample utterances to my skill?”
To address this problem, we’ve launched AI-SURE (AI-based Sample Utterance Recommendation Engine), a self-serve recommendation tool that uses machine learning to generate variations of the sample utterances you provide. The tool helps you quickly bootstrap new skills with improved VUI coverage and identify missing utterance variations in existing skills.
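To give a feel for what “variations” means here, the snippet below is purely illustrative: a couple of hypothetical sample utterances a developer might start with, and the kinds of additional phrasings a recommendation tool could surface. It is shown as a Python list rather than interaction model JSON, and the actual recommendations come from the AI-SURE tool in the developer console:

```python
# Hypothetical sample utterances a developer might provide for a weather intent
provided_utterances = [
    "what is the weather",
    "give me the forecast",
]

# Illustrative variations a recommendation engine could suggest,
# covering phrasings customers actually use
suggested_variations = [
    "how's the weather today",
    "what's it like outside",
    "is it going to rain",
    "tell me the forecast for tomorrow",
]
```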
With Alexa Entities, you no longer have to source, acquire, or manage your own catalogues of general knowledge information for use in your skills.
Alexa Entities makes Built-in Slot Types more useful by linking entities to Alexa’s Knowledge Graph. Knowledge is automatically updated as Alexa learns new facts — when a new movie is released, for example — reducing the effort it takes to keep skills up to date. As a result, you can build more engaging and intelligent experiences for customers.
Connections between entities can be used to create more natural dialogues.
You can also use Knowledge to help customers disambiguate between similarly named entities: “Did you mean Anne Hathaway, born in 1556, or the actress in films such as Les Misérables?” Or you can simply make skills more fun and engaging: “Did you know that tortoises can live to 120 years old!”
Slot values in your custom intents are automatically resolved to Alexa Entity IDs for supported Built-in Slot Types. These IDs can be used to make additional calls to fetch data via our Linked Data API from within your skill’s code.
For example, suppose you have defined a custom intent CelebrityBirthdayIntent with a sample utterance “When was {celebrity} born?”, where the slot {celebrity} is assigned to the Built-in Slot Type AMAZON.Person. When a customer asks your skill “When was Beyoncé born?”, you will receive the Alexa Entity ID corresponding to Beyoncé in your intent request. This ID can then be used to fetch additional knowledge (Beyoncé’s birthday).
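A rough sketch of that flow with the ASK SDK for Python and the requests library is below. The authority check, the request headers, and the “birthdate” property name are assumptions for illustration; consult the Alexa Entities documentation for the exact resolution authority and JSON-LD response shape.

```python
import requests
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.handler_input import HandlerInput
from ask_sdk_core.utils import is_intent_name
from ask_sdk_model import Response


def resolved_entity_id(slot):
    """Return the first entity ID resolved by the Alexa Entities authority, if any."""
    if not (slot and slot.resolutions and slot.resolutions.resolutions_per_authority):
        return None
    for resolution in slot.resolutions.resolutions_per_authority:
        # Assumption: the Alexa Entities authority name contains "AlexaEntities".
        if "AlexaEntities" in (resolution.authority or "") and resolution.values:
            return resolution.values[0].value.id  # a URL into the Linked Data API
    return None


class CelebrityBirthdayIntentHandler(AbstractRequestHandler):
    def can_handle(self, handler_input: HandlerInput) -> bool:
        return is_intent_name("CelebrityBirthdayIntent")(handler_input)

    def handle(self, handler_input: HandlerInput) -> Response:
        slot = handler_input.request_envelope.request.intent.slots.get("celebrity")
        entity_url = resolved_entity_id(slot)
        speech = "Sorry, I don't know that person."
        if entity_url:
            # The API access token included with the request authorizes the call.
            token = handler_input.request_envelope.context.system.api_access_token
            entity = requests.get(
                entity_url,
                headers={"Authorization": f"Bearer {token}",
                         "Accept-Language": "en-US"},
            ).json()
            # Hypothetical property name; the JSON-LD shape varies by entity type.
            birth_info = entity.get("birthdate")
            if isinstance(birth_info, dict) and "@value" in birth_info:
                speech = f"They were born on {birth_info['@value']}."
        return handler_input.response_builder.speak(speech).response
```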
Furthermore, the new Custom Pronunciation feature allows you to ingest your own pronunciations into Alexa to improve speech recognition of rare or custom words that are unique to your skill (character names, game titles, pharmaceutical drugs, and so on). Before this feature, developers had to build additional wrappers or add intentionally incorrect spellings of entities to work around speech recognition issues.
You can identify common failure cases through the quality dashboards. After you’ve done that, we recommend using the Evaluation tools to continually monitor ASR and NLU performance.
The ASR Evaluation tool allows you to evaluate the ASR performance of your model. We’ve made it even simpler by launching 1-click testing frameworks that automatically create NLU and ASR test sets for batch testing. These frameworks learn from your interaction model, frequent utterances to your skill, and high-friction patterns identified in the quality dashboards.
1-click testing reduces the time to get started by over 50% and generates a wide range of utterance variations with 100% intent coverage. In other words, it makes your testing more comprehensive and diverse.
To continuously iterate on skill models, it’s critical to keep track of flawed interaction patterns. Typically, skill builders use skill ratings and reviews as a proxy for learning about incorrect skill behavior. To give you more direct signals, our quality dashboard consists of Skill Performance and Customer Friction metrics.
Skill Performance provides an end-to-end view of accuracy for eligible skills, including Estimated System Error, insights into Intent Confidence, and endpoint health (Endpoint Latency and Endpoint Response).
If you have a high-traffic skill, you can also review the Customer Friction dashboard, which includes a Friction Summary that reflects overall customer friction.
Alexa uses various implicit and explicit customer signals to identify friction utterances, much as a human would when interpreting an interaction.
Each dashboard includes a frequent utterance table that points to specific utterance patterns resulting in high friction.
Learn more about these new skill metrics here.
Check out the Developer console to ensure your skill is high quality.
Sometimes, customers don’t remember a skill’s name. The Name-Free Interaction (NFI) toolkit, a self-service method, makes it easier for customers to open your skill without having to remember and say its name. When you add NFI launch phrases and intents to your skill, Alexa learns how to dynamically route customers’ name-free utterances to your skill, so customers can launch your skill and its intents without knowing the skill’s name. Engagement is different for every skill, but after implementing the NFI toolkit, some of the skills participating in the preview have seen significant increases in dialogs.
To learn more about implementing this functionality, check out the NFI step-by-step guide and technical documentation. If you have NFI toolkit questions or issues, please post them to the NFI Developer Forum.