Why Alexa?
There’s no doubt Amazon's Alexa is currently one of the best consumer virtual assistants available. In general, it seems consumers feel more comfortable speaking to a virtual assistant at home than in public (for now, at least). More specifically, Alexa currently has over 8.2 million active users in the U.S., and awareness of the devices increased to 82% as of Dec. 31, 2016, up from 47% a year before that and 20% on March 31, 2015. Lastly, Alexa sees high demand for third-party service integrations. For these reasons and many more, I’d say it’s a smart move to integrate your service as well.
The Challenge
The challenge with integrating with Alexa when your service's interface is a bot is that Alexa isn’t really set up for conversation. Since most services don’t provide a conversational interface, Alexa provides an interaction model that offers basic pattern matching and NLP. If you’ve built your own NLP engine, this is really bad news. It means you’ll need to classify thousands of patterns in Alexa's interaction model to match what you've already achieved with your bot.
Luckily, I’ve found a workaround that's so simple it shouldn’t take you more than 10 minutes! All you need is a slot type called AMAZON.LITERAL. From the Alexa Skills Kit documentation: "AMAZON.LITERAL passes the recognized words for the slot value with no conversion. You must define a set of representative slot values as part of your sample utterances." A while back, Amazon deprecated this solution but recently announced its return for good, so have no worries!
How Alexa works
For this tutorial, I’ll use a weather bot called Weabo, configured with a custom HTTPS endpoint (you can also use an AWS Lambda function if you'd like). This is what happens when a user says something to Alexa:
- A voice message comes in from the user. The message pattern could be: “Alexa, ask Weabo what’s the weather in San Francisco”
- Alexa's NLP engine classifies the message intent and extracts entities defined in advance. For example, the intent here could be “weather.get” and the entities “location=San Francisco”.
- Results from the NLP engine are constructed into a request.
- The request is routed to a custom HTTPS endpoint of your choice, or to an AWS Lambda function.
- A message response is returned from the endpoint and spoken back to the user. A message response could be: "The weather in San Francisco is 70 degrees".
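To make the middle steps more concrete, here is a rough sketch of what the request might look like by the time it reaches your endpoint, and how the intent and entities can be read from it. The intent name `GetWeather`, the slot name `Location`, and the helper `describeRequest` are hypothetical names chosen for illustration, and the request is trimmed to the fields the sketch actually reads.

```javascript
// Example of a request Alexa might build for the weather utterance,
// trimmed to the fields this sketch reads.
// "GetWeather" and "Location" are hypothetical names.
var sampleRequest = {
  version: '1.0',
  request: {
    type: 'IntentRequest',
    intent: {
      name: 'GetWeather',
      slots: {
        Location: { name: 'Location', value: 'San Francisco' }
      }
    }
  }
};

// Collect the intent name and the slot values that Alexa already extracted.
function describeRequest(alexaRequest) {
  var intent = alexaRequest.request.intent;
  var slots = intent.slots || {};
  var entities = {};
  Object.keys(slots).forEach(function (slotName) {
    entities[slotName] = slots[slotName].value;
  });
  return { intent: intent.name, entities: entities };
}

console.log(describeRequest(sampleRequest));
```

Note that both the intent and its slots sit under the top-level `request` object, which matters later when we pull the raw text out of the request.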
Solution
We won't go into the details of how to set up your Alexa skill, since there are so many great tutorials out there. Take a look at this one by Amazon, on developing an Alexa skill in under 5 minutes with a Lambda function, or this one for developing with an HTTPS endpoint.
Let's assume this is our endpoint function (written in Node.js):
    function respond_to_incoming_message(req, res) {
      // Extract text message from request
      var text = req.body.text;
      // Extract text intent, entities and generate bot response
      var bot_response = respond_to_user_message(text);
      // Return response
      res.send(bot_response);
    }
What we'd like to do is make sure a message sent to Alexa arrives at the above endpoint as is, so your bot can take care of parsing it and returning a proper response.
Once you've reached the 'Interaction Model' area of your Alexa skill configuration, add the following to the 'Intent Schema' section:
    {
      "intents": [
        {
          "slots": [
            {
              "name": "MessageText",
              "type": "AMAZON.LITERAL"
            }
          ],
          "intent": "FreeText"
        }
      ]
    }
After that, add the following to the 'Sample Utterances' section:
    FreeText {this is a sentence|MessageText}
    FreeText {this is another sentence|MessageText}
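Since the documentation asks for a set of representative slot values, you'll probably want more than two sample utterances. Here is a small sketch that turns a list of example phrases into lines in the format above. The helper name `toSampleUtterances` and the phrases themselves are made up for illustration; mixing short and long phrases is my own guess at what "representative" means for a weather bot, not a documented rule.

```javascript
// Turn example phrases into Sample Utterances lines of the form:
//   IntentName {phrase|SlotName}
function toSampleUtterances(intentName, slotName, phrases) {
  return phrases.map(function (phrase) {
    return intentName + ' {' + phrase + '|' + slotName + '}';
  }).join('\n');
}

// Hypothetical phrases of different lengths that users might say to the bot.
var examplePhrases = [
  'hello',
  'what is the weather',
  'what is the weather in san francisco',
  'will it rain in new york tomorrow morning'
];

console.log(toSampleUtterances('FreeText', 'MessageText', examplePhrases));
```

You can paste the printed lines straight into the 'Sample Utterances' section.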
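Looking back at our endpoint function above, replace the line that extracts the text with the following (note that Alexa nests the intent inside the request object of the JSON body):
    var text = req.body.request.intent.slots.MessageText.value;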
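Putting it together, the endpoint might look something like the sketch below. A few assumptions here: `req.body` is the already-parsed Alexa request JSON (as you'd get from an Express-style JSON body parser), the intent sits under the body's `request` object as it does in Alexa's request JSON, `respond_to_user_message` is only a placeholder for your own bot, and `extract_message_text` is a helper name I made up. The sketch also wraps the reply in the JSON envelope a custom skill returns to Alexa, in case your bot returns plain text.

```javascript
// Placeholder for the bot's own NLP pipeline (hypothetical; swap in yours).
function respond_to_user_message(text) {
  return 'You said: ' + text;
}

// Safely read the free-text slot. Returns null when the request has no such
// slot, for example a LaunchRequest, which carries no intent at all.
function extract_message_text(body) {
  var request = (body && body.request) || {};
  var slots = (request.intent && request.intent.slots) || {};
  var slot = slots.MessageText;
  return slot && slot.value ? slot.value : null;
}

function respond_to_incoming_message(req, res) {
  var text = extract_message_text(req.body);
  var speech = text !== null
    ? respond_to_user_message(text)
    : 'What would you like to ask?';
  // Wrap the reply in the response envelope Alexa expects from a custom skill.
  res.send({
    version: '1.0',
    response: {
      outputSpeech: { type: 'PlainText', text: speech },
      // Keep the session open when we had to prompt the user for input.
      shouldEndSession: text !== null
    }
  });
}

// Try it without a server, using a minimal stand-in for the response object.
var fakeRes = {
  send: function (payload) { console.log(JSON.stringify(payload, null, 2)); }
};
respond_to_incoming_message({
  body: {
    request: {
      type: 'IntentRequest',
      intent: {
        name: 'FreeText',
        slots: { MessageText: { name: 'MessageText', value: 'what is the weather' } }
      }
    }
  }
}, fakeRes);
```

The guard in `extract_message_text` is there because not every request Alexa sends will carry the MessageText slot, and reading `.value` off a missing slot would throw.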
That’s it! This will allow your users to talk freely with Alexa while using your own NLP engine.