A computer science major at the University of Massachusetts, Dartmouth, Cameron Sheedy has had a natural interest in voice technology for years. He had developed several skills for Alexa, including two trivia skills and a binary number converter. But even after this experience, Sheedy was hungry to learn more. So, when Amazon announced the Alexa Skills Challenge: Multimodal, he saw an opportunity to develop a rich multimodal user experience he could be really proud of. A few months later, Sheedy’s fun and engaging Crazy Conversation skill earned him the Bonus Prize for Best Multimodal Living Room Experience, as well as a total of $8,000 in prizes.
“I caught the bug for developing for voice and kept going with it, developing more skills, and learning everything I could,” said Sheedy. “I figured I'd throw my hat in the ring for the competition and give it a shot.”
When he saw the Multimodal challenge, Sheedy was excited to take it on. With the help of the Alexa Presentation Language (APL), he tackled his first visual interface for Alexa, creating an engaging voice-first game with visual enhancements that work on a variety of Alexa devices, including the Echo Spot, Echo Show, and Fire TV.
“Even if you’ve never developed anything before, Amazon offers everything you need for developing with Alexa: office hours, tutorials, and all kinds of community support,” said Sheedy. “The barrier for entry into voice technology development is really low, but the potential opportunities for developers are enormous.”
What intrigues Sheedy about voice technology is that it’s a relatively new interface that’s intuitive and easy to use. Unlike the familiar graphical user interface (GUI)—which in Sheedy’s words has been “done to death”—voice is fresh and full of possibilities.
“In the past, voice interfaces are something we've only seen in science fiction,” said Sheedy. “Talking to a computer and interacting with it through voice is a relatively new experience for the user. Yet voice is how people naturally interact with each other. It’s exciting to develop for it.”
After reading about another skill challenge in fall 2018, Sheedy first conceived the idea for a multiplayer word game in which players could use Echo Buttons to buzz in their answers. A heavy course load prevented him from pursuing that contest (the Echo Buttons Game Skills Contest with Hackster.io), but the timing of the Alexa Skills Challenge: Multimodal was perfect. It gave him a chance to bring his idea for Crazy Conversation to life and make it even more exciting by adding engaging visual effects.
“The Multimodal challenge was for a voice-first skill with visual components, and my idea for my Crazy Conversation skill seemed like a perfect fit,” said Sheedy. “It’s designed so you can still have a good time playing it without a screen. But like most games, there’s a visual aspect that adds to the enjoyment.”
Crazy Conversation is an engaging skill with a unique premise: the player has one minute to find a common phrase hidden within a group of words that seem to make no sense together. For instance, the clue “a maze on all hex ah” translates to “Amazon Alexa.” The clue appears on the device screen and Alexa says it out loud twice, while the user tries to solve it before the timer counts down to zero. Each game is made up of 10 individual clues, with enough clues for hundreds of games. Crazy Conversation can be played again and again, either by a single player or a group with everyone working together to get the right answer.
Sheedy developed the voice-first aspect of the skill and tested it on the Echo Dot before working on the visuals. He then turned to APL to create a complementary visual experience, showing the text of the spoken clue, a countdown timer, and a score tally on the Alexa device screen. Sheedy used APL to lay out the screen so it scales properly to each device's display, automate the timer, and create seamless transitions from one clue to the next. Even with no prior experience in visual development, Sheedy had all the resources he needed to make it happen with APL and the Alexa Skills Kit (ASK).
“APL made the layout easy,” said Sheedy. “It makes it simple to nest and stack components so that they'll move around where you want them on the screen. I also used APL to create an actual countdown timer, instead of just plugging in a video. I did that by adding the auto pager command so it would cycle through every second down to the next number. It creates kind of an appearance of a 60-second timer. That was a big breakthrough for me.”
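For readers curious how a timer like this might be put together, here is a minimal sketch in Python. It is not Sheedy's code. It assumes the "auto pager command" he mentions is APL's AutoPage command, applied to a Pager component that holds one Text page per remaining second. The function names, the `countdownPager` id, the `countdownToken` token, the APL version string, and the font size are all illustrative assumptions.

```python
import json


def build_countdown_document(seconds: int = 60, pager_id: str = "countdownPager") -> dict:
    """Build an APL document with one full-screen Text page per remaining second.

    The ids, sizes, and version string are illustrative, not from the skill.
    """
    pages = [
        {
            "type": "Text",
            "text": str(remaining),
            "fontSize": "20vh",
            "textAlign": "center",
            "textAlignVertical": "center",
            "width": "100%",
            "height": "100%",
        }
        for remaining in range(seconds, -1, -1)  # seconds, seconds - 1, ..., 0
    ]
    return {
        "type": "APL",
        "version": "1.0",  # assumed; use the version your target devices support
        "mainTemplate": {
            "items": [
                {
                    "type": "Pager",
                    "id": pager_id,
                    "width": "100vw",
                    "height": "100vh",
                    "navigation": "none",  # keep players from swiping the timer
                    "items": pages,
                }
            ]
        },
    }


def build_autopage_command(pager_id: str = "countdownPager") -> dict:
    """AutoPage command that waits 1000 ms on each page before advancing."""
    return {"type": "AutoPage", "componentId": pager_id, "duration": 1000}


def build_directives(seconds: int = 60, token: str = "countdownToken") -> list:
    """Pair a RenderDocument directive with an ExecuteCommands directive.

    Both directives share one token so the command targets the rendered document.
    """
    return [
        {
            "type": "Alexa.Presentation.APL.RenderDocument",
            "token": token,
            "document": build_countdown_document(seconds),
        },
        {
            "type": "Alexa.Presentation.APL.ExecuteCommands",
            "token": token,
            "commands": [build_autopage_command()],
        },
    ]


if __name__ == "__main__":
    # Print the directives a skill response could include for a 60-second round.
    print(json.dumps(build_directives(60), indent=2))
```

Because each page is only a number, the effect is a ticking clock rather than a video, which matches the approach Sheedy describes. A real skill would place these directives in its response alongside the spoken clue.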
Sheedy is looking ahead to a career in voice development for Alexa. His win in the competition—and seeing the types of skills that engage users—has opened him up to all kinds of possibilities.
“I love developing for voice, I’m passionate about it, and I’m already building my knowledge of it,” said Sheedy. “If I could start a business focused around voice development, that would be incredible. I’m also interested in voice developer jobs once I graduate. Really, my dream job is working on something with Alexa.”
Sheedy feels the key to developing a top-performing skill is to start with a simple yet engaging idea. He says a simple skill that users can quickly learn to interact with is going to be more popular than a complicated skill that’s difficult to learn. With so many uses for voice technology today, there are great ideas just waiting to be developed. No matter what the future holds, Sheedy plans to keep developing Alexa skills.
“I’ll keep developing for Alexa because I love doing it,” said Sheedy. “I can develop fun skills and have fun while I’m working on them. With all of Amazon’s resources available, voice technology is easy to learn. It’s the perfect project for me.”
Check out the APL resources below and get started with building your own multimodal skills today.