APIs are designed for communication between computer systems, which means they don't often translate into a natural conversational interface. Simply put, it's not enough to wrap voice around your API calls.
For example, let's say you're using a third-party API to access a database of actors and actresses and it only allows you to look them up by an ID. ID 1 returns information about Brad Pitt and ID 2 returns information about Johnny Depp. If you simply wrapped voice around the API, your user would have to say something like "Give me information about number 1." From the user's perspective, the number 1 is arbitrary. They are most likely going to say, "Give me information about Johnny Depp." or "Tell me about Brad Pitt."
If you owned the API, you could update it to support lookup by name, but you can't edit a third-party API. So how do you make the experience more natural?
You can use entity resolution to map synonyms to the IDs. For example, if you're using the AMAZON.Actor slot, you can extend it by adding IDs for each actor in the third-party API's database.
Extending AMAZON.Actor:
If you're using a custom slot, you can set each value to the actor's name and its ID to the ID the API uses. In either case, you can use synonyms to support nicknames and improve accuracy.
Defining a custom slot type called actorType:
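Slot values like these could also be generated rather than typed by hand. The following is a minimal Python sketch under stated assumptions, not the skill's actual definition: the ACTORS table, its nicknames, and the build_slot_type helper are made up for illustration. It emits a slot type in the interaction model's JSON shape, with each value's id set to the third-party API's ID.

```python
import json

# Hypothetical sample of the third-party API's records: ID -> (name, nicknames).
# The nicknames are illustrative assumptions, not taken from any real database.
ACTORS = {
    "1": ("Brad Pitt", ["Brad"]),
    "2": ("Johnny Depp", ["Johnny"]),
}


def build_slot_type(type_name, actors):
    """Return a slot type dict whose value IDs are the third-party API's IDs."""
    values = []
    for actor_id, (name, synonyms) in actors.items():
        values.append({
            "id": actor_id,
            "name": {"value": name, "synonyms": list(synonyms)},
        })
    return {"name": type_name, "values": values}


if __name__ == "__main__":
    # Print the JSON fragment for the "types" section of the interaction model.
    print(json.dumps(build_slot_type("actorType", ACTORS), indent=2))
```

Passing "AMAZON.Actor" as the type name instead of "actorType" should produce the corresponding entry for extending the built-in slot.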
When the user says "Brad Pitt," the JSON that is sent to your skill will then include the ID.
...
"actor": {
  "name": "actor",
  "value": "brad pitt",
  "resolutions": {
    "resolutionsPerAuthority": [
      {
        "authority": "amzn1.er-authority.echo-sdk.amzn1.ask.skill.xxxx.AMAZON.Actor",
        "status": {
          "code": "ER_SUCCESS_MATCH"
        },
        "values": [
          {
            "value": {
              "name": "Brad Pitt",
              "id": "1"
            }
          }
        ]
      }
    ]
  }
}
...
From your skill's backend, you can read the ID from the slot's resolutions and pass it to the API in your request. The API then returns the detailed information about the actor whose ID entity resolution resolved for you.
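Unpacking the ID might look something like the following Python sketch. It assumes the slot arrives as a plain dictionary shaped like the JSON above; resolved_slot_id, describe_actor, and the fetch_actor callable (a stand-in for the third-party API call) are hypothetical names chosen for illustration.

```python
# Sample slot, copied from the request JSON shown earlier.
SAMPLE_SLOT = {
    "name": "actor",
    "value": "brad pitt",
    "resolutions": {
        "resolutionsPerAuthority": [{
            "authority": "amzn1.er-authority.echo-sdk.amzn1.ask.skill.xxxx.AMAZON.Actor",
            "status": {"code": "ER_SUCCESS_MATCH"},
            "values": [{"value": {"name": "Brad Pitt", "id": "1"}}],
        }]
    },
}


def resolved_slot_id(slot):
    """Return the ID of the first successful entity-resolution match, or None."""
    resolutions = slot.get("resolutions") or {}
    for authority in resolutions.get("resolutionsPerAuthority", []):
        # Skip authorities that did not match (for example ER_SUCCESS_NO_MATCH).
        if authority.get("status", {}).get("code") != "ER_SUCCESS_MATCH":
            continue
        for match in authority.get("values", []):
            actor_id = match.get("value", {}).get("id")
            if actor_id is not None:
                return actor_id
    return None


def describe_actor(slot, fetch_actor):
    """Look up the actor by resolved ID; fetch_actor is a hypothetical API client."""
    actor_id = resolved_slot_id(slot)
    if actor_id is None:
        return "Sorry, I don't know that actor."
    return fetch_actor(actor_id)


if __name__ == "__main__":
    print(resolved_slot_id(SAMPLE_SLOT))  # prints 1
```

Returning None when nothing resolves lets the skill fall back to a reprompt instead of sending an empty ID to the API.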
With minimal effort, this approach works around the third-party API's lack of lookup by name. The user no longer has to say, “Tell me about number one.” They can simply say, “Tell me about Brad Pitt,” which is far more natural.
The great thing about this approach is that most of the work is done for you. Through your interaction model, you are providing training data that dictates how the synonyms are resolved, and the Alexa cloud takes care of it for you. Your skill only needs to unpack the data and pass it to the third-party API.
In the end, your skill's interaction is more natural and it works with a third-party API.