Today's guest post comes from Jim Kresge from Capital One Engineering.
In March 2016, Capital One became the first company to offer its customers a way to interact with their financial accounts through Alexa devices. With the Capital One skill for Alexa, customers can access all of their Capital One accounts in real time, from credit cards and bank accounts to home and auto loans. The skill is highly rated on the Alexa app, with 4 out of 5 stars.
The Capital One team has continued to update the skill since launch, including a recent update called "How much did I spend?" With the update, Capital One customers can access their recent spending history at more than 2,000 merchants. Customers who have enabled the skill can now ask Alexa about their spending for the past six months, by day, by month, or over a specific date range, through questions posed in natural language such as:
Q: Alexa, ask Capital One, how much did I spend last weekend?
A: Between December 9th and December 11th, you spent a total of $90.25 on your Venture Card.
Q: Alexa, ask Capital One, how much did I spend at Starbucks last month?
A: Between November 1st and November 30th, you spent a total of $43.00 at Starbucks on your Quicksilver Card.
Q: Alexa, ask Capital One, how much did I spend at Amazon between December 1 and December 15?
A: Between December 1st and December 15th, you spent a total of $463.00 at Amazon on your Quicksilver Card.
Building the skill was a collaborative effort among the product development, engineering, and design teams at Capital One. I have the privilege of representing the great work of the entire team in this post, which gives a behind-the-scenes look at how the Capital One skill was built.
In summer 2015, a group of engineers at Capital One recognized the potential to develop a skill for accessing financial accounts using Amazon Echo. We got together for a hackathon, worked our way through several possibilities, and began building the skill. The Beta version included a server-side account linking mechanism that we built ourselves. We were able to use an enhanced beta version of the Capital One mobile app to provide the account linking interface and created some AWS infrastructure to support it. We then demoed the Beta at the AWS re:Invent conference in October 2015.
Having proved out the Beta version of the skill, we focused on building the first Alexa skill that would enable people to interact with their financial accounts.
We began working on a production version in December 2015, with the goal of delivering a product by March 2016. Working in an iterative design model, we found that coding the skill for Capital One financial accounts was relatively straightforward. But, as with anything game-changing, we realized that what we were attempting involved some things no one had done before. First, we were integrating sensitive data with Alexa, which no company with a skill on Alexa had done yet. It was also the first time we had built a conversational UI. And the Alexa Skills Kit was still maturing and evolving as we built the skill, which meant we needed to be flexible and make adjustments to the code quickly.
We started with the premise that in the first iteration, Capital One credit card and bank customers can ask Alexa things like their current account balance, their recent transactions, and when their next bill is due.
Data security is always top of mind for us, as was creating an experience for customers that was friction-free and simple.
With Amazon, we worked through possible solutions within the Alexa infrastructure to build in a security layer that ensures data integrity while still providing a simple, hands-free experience. In addition to using OAuth to securely link accounts, we added a security solution that involves an in-channel spoken “personal key.” As users set up the Capital One skill and pair their accounts using OAuth, Alexa asks them whether they would like to add a “personal key,” a four-digit personal identification code.
In addition to letting users control access to their account information, we wanted the language Alexa uses in her conversations with customers to be warm and, at times, humorous. We learned a lot through testing and are using that feedback to fine-tune tone and wording along the way.
We built the Capital One skill using Node.js. We host the skill on AWS and call internal APIs to retrieve customer account information. The basic engineering work is straightforward, and the Amazon developer portal documentation makes it easy to learn. Here are a few of the creative technical solutions we added on top of that basic engineering work to help us move fast with high quality:
The Capital One utterance compiler
We created a tool that automatically generates an expansive set of utterances from just a few input parameters, which lets us avoid maintaining a huge list of individual utterances for our skill. For example, in our "AccountBalance" intent, there are many ways of asking for the balance on an account. To this already long list we added account types (e.g., checking, savings) and then product names (e.g., Venture credit card, Quicksilver credit card). Once you incorporate all the different ways customers can ask for their balance across account types and product names, the list of utterances for that intent is huge. Our utterance compiler makes it simple to generate and maintain all of them.
For example:
AccountBalance Intent
{
  "intent": "AccountBalance",
  "slots": [
    { "name": "LastFour", "type": "AMAZON.FOUR_DIGIT_NUMBER" },
    { "name": "AccountType", "type": "AccountType" },
    { "name": "ProductType", "type": "ProductType" }
  ]
},
Sample Utterance Compiler Input
how much [is|do I have] in {My} ([{AccountType}|{ProductType}])? Account
Sample Utterance Compiler Output
AccountBalance how much is in my {AccountType} account
AccountBalance how much is in my {ProductType} account
AccountBalance how much is in my account
AccountBalance how much do I have in my {AccountType} account
AccountBalance how much do I have in my {ProductType} account
AccountBalance how much do I have in my account
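As a rough illustration of how such a compiler might work (this is our own sketch, not Capital One's tool, and it assumes a simplified grammar with single-level groups only), a small recursive expander can generate output like the sample above:

```javascript
// Our own sketch of an utterance expander, not Capital One's actual tool.
// Simplified grammar (single-level groups only):
//   [a|b]  -> required: exactly one alternative is used
//   (a|b)? -> optional: one alternative, or nothing

function expand(pattern) {
  // Find the first group, expand each alternative, and recurse on the rest.
  const m = pattern.match(/\[([^\]]*)\]|\(([^)]*)\)\?/);
  if (!m) return [pattern.replace(/\s+/g, " ").trim()];
  const [token, required, optional] = m;
  const alternatives = (required ?? optional).split("|");
  if (optional !== undefined) alternatives.push(""); // the group may be omitted
  const results = [];
  for (const alt of alternatives) {
    results.push(...expand(pattern.replace(token, alt)));
  }
  return results;
}

const utterances = expand(
  "how much [is|do I have] in my ({AccountType}|{ProductType})? account"
).map((u) => `AccountBalance ${u}`);

utterances.forEach((u) => console.log(u)); // prints six utterances
```

Expanding the two required alternatives against the three optional outcomes yields the six sample utterances shown above.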
Our Abstraction Layer
We built a translation layer that separates our interface to third-party Natural Language Processing (NLP) services like Alexa from our back-end interaction model and API orchestration. The interaction model and API orchestration convert intents and their associated slot values into orchestrated API calls, then generate and send the response once the data is received from the API. The translation layer allows us to format the incoming and outgoing data for any number of third-party NLP solutions. This lets us use a common code base for all voice implementations that leverage the utterance/intent/slot model (which many do) and lets us move extremely quickly. For any new third-party NLP solution that follows the utterance/intent/slot pattern, we simply add a translation so that the incoming and outgoing messages match that party's structure. This modularity is a key element of our architecture and a big part of what lets us move fast.
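To make the adapter idea concrete, here is a minimal sketch of what such a translation layer might look like in Node.js. All names and data shapes are illustrative assumptions, not Capital One's code, and the Alexa request body is abbreviated to the fields the sketch needs:

```javascript
// Minimal sketch of the translation-layer idea. Names and request shapes
// are illustrative assumptions; the Alexa request body is abbreviated.

// Adapter for one NLP provider: maps its wire format to a neutral
// { intent, slots } model and back.
const alexaAdapter = {
  // Alexa intent request -> provider-neutral request.
  parseRequest(body) {
    const { name, slots = {} } = body.request.intent;
    const values = {};
    for (const key of Object.keys(slots)) values[key] = slots[key].value;
    return { intent: name, slots: values };
  },
  // Provider-neutral speech string -> Alexa response JSON.
  formatResponse(speech) {
    return {
      version: "1.0",
      response: {
        outputSpeech: { type: "PlainText", text: speech },
        shouldEndSession: true,
      },
    };
  },
};

// Provider-agnostic handler: the orchestrate callback stands in for the
// interaction model and API orchestration layer.
function handle(adapter, rawRequest, orchestrate) {
  const { intent, slots } = adapter.parseRequest(rawRequest);
  return adapter.formatResponse(orchestrate(intent, slots));
}

// Example: an AccountBalance request flowing through the layer.
const sampleRequest = {
  request: {
    intent: {
      name: "AccountBalance",
      slots: { AccountType: { name: "AccountType", value: "checking" } },
    },
  },
};
const response = handle(alexaAdapter, sampleRequest, (intent, slots) =>
  `Your ${slots.AccountType} balance is $1,200.00`
);
console.log(response.response.outputSpeech.text);
```

Supporting a new provider would then mean writing one more adapter with the same `parseRequest`/`formatResponse` shape, while `handle` and the orchestration code stay untouched.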
Our Pre-Production Testing Tool
We built a tool that allows non-developers to easily engage in pre-production testing and validation. Anyone can test the skill without special knowledge or tools, and without the hassle of manually changing account linking to test across multiple test accounts with different data conditions. It is extremely useful and has expanded our capacity, speed, and quality all at the same time.
Test Interface Screen:
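The core idea behind such a harness can be sketched as follows; the account names, data shapes, and helper are our own invention for illustration, not the actual tool:

```javascript
// Hypothetical sketch of a pre-production test harness: canned test
// accounts representing different data conditions, and a helper that
// builds an Alexa-style request for any of them, so testers never have
// to re-link accounts by hand. Names and shapes are our own invention.

const testAccounts = {
  "new-customer": { accessToken: "token-new" },
  "heavy-spender": { accessToken: "token-heavy" },
};

// Build an intent request as if the chosen test account were linked.
function buildRequest(accountName, intentName, slots = {}) {
  const account = testAccounts[accountName];
  if (!account) throw new Error(`unknown test account: ${accountName}`);
  return {
    session: { user: { accessToken: account.accessToken } },
    request: { type: "IntentRequest", intent: { name: intentName, slots } },
  };
}

// A tester picks an account and intent from a simple UI; the harness sends
// the generated request to the pre-production skill endpoint.
const req = buildRequest("heavy-spender", "AccountBalance");
console.log(req.session.user.accessToken); // -> token-heavy
```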
The goal was to enable customers to easily locate what they need, get help when necessary, and genuinely enjoy using the skill. That means focusing on really great design. And, in the case of Alexa, it meant designing an experience that relies on voice cues, not visual cues.
An Alexa user who also has a Capital One credit card, bank, auto, home equity, or mortgage account, and who has registered for the Capital One skill and paired it with their Echo or other Alexa device, can start the process by addressing Alexa:
“Alexa, open Capital One.” Or, they could say, “Alexa, ask Capital One what my account balance is.” Alexa will respond with the most current account information. Simple as that.
The skill officially launched for Alexa in March 2016 at SXSW, where we demoed it to an enthusiastic audience.
We found the publishing process with Amazon to be a straightforward two-step process: 1) submit the skill to Amazon for review, and 2) publish the skill once Amazon has approved it.
Conversational interfaces are the future of how people will engage with technology. The Capital One skill for Alexa has enabled us to serve our customers through a differentiated experience and, most importantly, to make it even easier for them to manage their money whenever and wherever they are. Using the skill, Capital One customers can manage their money easily (hands-free) and intuitively (through a conversational UX).
Are you ready to build your first (or next) Alexa skill? Build a custom skill or use one of our easy tutorials to get started quickly.
Share other innovative ways you’re using Alexa in your life. Tweet us @alexadevs with hashtag #AlexaDevStory.