Ready to build a conversational bot for your business, but confused with the variety of platforms? Let's talk!
A Dialogue Example
Let’s look at the ways we can ask a system to find ‘Asian food near me.’ The variety of search phrases and utterances could look similar to this:
- Asian food near me please
- Food delivery place not far from here
- Thai restaurants in my neighborhood
- Indian restaurant nearby
- Sushi express places please
- Places with Asian cuisine
Dialogue Structure as NLP engineers see it
From the example above we can see that each user expression carries the intent to take some action. An Intent is the core concept in building a conversational UI in chat systems, so the first thing we do with an incoming message from the user is to understand its Intent. This means mapping the phrase to a specific action that we can actually provide. Along with the Intent, it’s necessary to extract the parameters of the action from the phrase. In the previous example with ‘Asian food’, the words ‘nearby’ or ‘near me’ correspond to the current location of the user. Parameters, also called entities, often belong to a particular type. Examples of entity types that are commonly supported in language understanding systems are:
- Enumeration (predefined list of named things)
To hold a conversation, a language understanding system has to:
- Understand the language in plain text (or voice converted into text) and extract the Intent with its Parameters.
- Process the Intent with its Parameters and execute the next action to continue the dialogue with the user. (The result is either a response or a follow-up question that gathers more data from the user and fills in the parameters needed to fulfill the action.)
- Maintain the Context and its state, with all parameters received during a single Session, in order to deliver the required result to the user.
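The three responsibilities above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the keyword lists, the `find_food` intent name, and the `Session` class are all hypothetical, and a real platform would use a trained model instead of keyword matching.

```python
from dataclasses import dataclass, field

# Hypothetical keyword lists; a real NLU model is trained on labeled utterances.
CUISINES = {"asian", "thai", "indian", "sushi"}
NEARBY_HINTS = ("near me", "nearby", "not far from here", "in my neighborhood")


@dataclass
class ParsedUtterance:
    intent: str
    parameters: dict = field(default_factory=dict)


def parse(utterance):
    """Step 1: map free text to an Intent plus Parameters (entities)."""
    text = utterance.lower()
    words = set(text.replace(",", " ").split())
    params = {}
    cuisines = sorted(CUISINES & words)
    if cuisines:
        params["cuisine"] = cuisines[0]
    if any(hint in text for hint in NEARBY_HINTS):
        params["location"] = "current_location"
    intent = "find_food" if params or "food" in words else "unknown"
    return ParsedUtterance(intent, params)


@dataclass
class Session:
    """Step 3: keep the Context with all parameters received so far."""
    context: dict = field(default_factory=dict)

    def handle(self, utterance):
        parsed = parse(utterance)
        if parsed.intent == "unknown":
            return "Sorry, I did not understand."
        self.context.update(parsed.parameters)
        # Step 2: either ask for a missing parameter or fulfill the action.
        missing = [p for p in ("cuisine", "location") if p not in self.context]
        if missing:
            return f"Which {missing[0]} do you want?"
        return f"Searching {self.context['cuisine']} food at {self.context['location']}"
```

A session that first receives "Thai restaurants" asks for the location, and a follow-up "near me" completes the action because the cuisine is still held in the context.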
Microsoft Language Understanding Intelligent Service (LUIS)
LUIS was introduced at the Microsoft Build 2016 event in San Francisco, together with Microsoft Bot Framework and Skype Developer Platform, which can be used to create Skype Bots. In this article we leave Bot Framework aside and look at the language understanding features of LUIS.
LUIS provides Entities that you define and then teach the system to recognize in free-text expressions. There are also Hierarchical Entities, which are helpful for recognizing different types or sub-groups: a parent entity can have child entities that are recognized separately. Currently, there is a limit of 10 Entities of each type per application, which should be enough for a mid-size service. Besides Intents and Entities, there is also the concept of Actions, which the system can trigger once the Intent and all required parameters are present.
Moving closer to automatically acting on a completed Intent and its parameters, there is another feature called Action Fulfilment. It is currently available only in preview mode, but you can already play with it and plan for the future. The idea is that once we have an Intent, the system can automatically execute predefined Channel Actions or your own request to an arbitrary API. Dialogue support, which is also available only in preview mode, helps organize the conversation and ask the user relevant questions in order to fill in the missing parameters for the intent.
To train the model with different utterances, LUIS provides a Web interface where we can type an expression, see the model’s output, and change labels or assign new intents. Additionally, LUIS stores all incoming expressions in the Logs section and provides semi-automatic learning through Suggestion, where the system tries to predict the correct intents from those already present in the model.
Once we have a trained model, we can use the API to send expressions and receive intents, entities and actions with parameters for each one. LUIS can export and import the trained model as plain JSON with all expressions and entity markup, which we can then repurpose in our code, or even use to replace LUIS completely if we later decide to build our own NLP engine. Currently, LUIS is in beta and free to use for up to 100k requests per month and up to 5 requests per second for each account. Next we will look at Wit.ai from Facebook.
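To make the "intents and entities per expression" idea concrete, here is a small Python sketch of consuming such a result. The JSON shape, the `FindFood` intent, and the `Cuisine` and `Location` entity types are assumptions made up for illustration; they are not taken from the LUIS documentation, so check the real schema before relying on any field name.

```python
import json

# Assumed response shape, for illustration only (not the documented schema).
SAMPLE_RESPONSE = """
{
  "query": "asian food near me",
  "intents": [
    {"intent": "FindFood", "score": 0.92},
    {"intent": "None", "score": 0.05}
  ],
  "entities": [
    {"entity": "asian", "type": "Cuisine"},
    {"entity": "near me", "type": "Location"}
  ]
}
"""


def top_intent(response, threshold=0.5):
    """Return the best-scoring intent name, or None if nothing is confident enough."""
    intents = response.get("intents", [])
    if not intents:
        return None
    best = max(intents, key=lambda item: item["score"])
    return best["intent"] if best["score"] >= threshold else None


def entities_by_type(response):
    """Group recognized entity values by their entity type."""
    grouped = {}
    for item in response.get("entities", []):
        grouped.setdefault(item["type"], []).append(item["entity"])
    return grouped


response = json.loads(SAMPLE_RESPONSE)
```

The confidence threshold is a design choice on the bot side: below it, the bot is usually better off asking the user to rephrase than acting on a weak guess.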
Facebook Wit.ai Bot Engine
Wit.ai, an AI startup that aims to help developers with Natural Language Processing tasks through an API, was acquired by Facebook in January 2015. During the F8 conference in April 2016, Facebook introduced a major update to the platform and rolled out its own version of Bot Engine, which extends the previous intent-oriented approach to a story-oriented approach. Building conversational interfaces around a story feels more natural and is easier to follow than separate intents strung together by a context variable. Under the hood, when implementing the logic, you still work extensively with the context and need to do everything required to keep the conversation in the correct state.
In Wit.ai we can use the concepts of Entities, Intents (an Intent is actually just a custom entity type here), Context and Actions, which together form a model based on Machine Learning and statistics that is later used for understanding the language. On the bot side, during the story definition, we can execute any action that we might need to fulfill the context or the user action, and to prepare data and/or states in the context. Effectively, the Wit.ai Converse API resolves the user utterance and the given state into the next state/action of your system, thus giving you a tool to build a Finite State Machine that describes sequences of speech acts. However, all actions are executed on our server; Wit.ai just orchestrates the process and suggests the next call or state mutation based on the model that we’ve trained.
Everything, from understanding user inputs to managing the training expressions and the list of entities, is available through the extensive Wit.ai API. Like other systems, Wit.ai provides a handy Inbox feature where you can access all incoming utterances from users and label them if they were not recognized correctly.
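The Finite State Machine view mentioned above can be sketched with a plain transition table. This is not the Converse API: the state names, intent names and action names below are hypothetical, and in Wit.ai the "next action" would come from the trained model rather than from a hand-written dictionary.

```python
# (current state, recognized intent) -> (next state, action to run on our server)
# All names are hypothetical and only illustrate the state machine idea.
TRANSITIONS = {
    ("start", "find_food"): ("awaiting_location", "ask_location"),
    ("awaiting_location", "give_location"): ("done", "show_results"),
}


def step(state, intent):
    """Resolve one user turn into the next state and the action to execute."""
    # Unknown combinations keep the current state and ask the user to rephrase.
    return TRANSITIONS.get((state, intent), (state, "ask_to_rephrase"))


def run(intents, state="start"):
    """Feed a sequence of recognized intents through the machine."""
    actions = []
    for intent in intents:
        state, action = step(state, intent)
        actions.append(action)
    return state, actions
```

Keeping the table as data, separate from the code that executes the actions, mirrors the split described above: the platform suggests the next step, while the actions themselves run on our own server.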
In one of the latest updates, Wit.ai introduced a chat UI for testing conversations, so we can see the steps that the system recognizes, which helps during both the creation and the debugging of the model. Wit.ai supports 50 different languages including English, Chinese, Japanese, Polish, Ukrainian and Russian. Projects can be Open or Private, without any apparent limitations. Open projects can be forked, so you can create your own version of the model on top of existing community projects. The Wit.ai API is completely free with no limitations on request rates, which makes it a good choice for your next bot experiments. UPDATE 2016-12-22: Wit.ai is continuously pushing new features and capabilities. Since the release of the first version of this article they’ve made a better builder for the Stories and added support for Quick Replies, Branches (if/else) and Jumps in Stories, which is great for describing complex flows.
Api.ai – conversational UX Platform
Api.ai was created by a team who had built a personal assistant app for major mobile platforms with speech and text-enabled conversations. Their CEO addressed how Api.ai differs from other platforms in an answer on Product Hunt. Indeed, the service provides all the features you might expect from a decent conversational platform, including support for Intents, Entities, Actions with parameters, Contexts, Speech to Text and Text to Speech capabilities, along with machine learning that works silently and trains your model.
Everything starts from Agents, which represent the model and rules for your application. The interesting thing is that Api.ai has built-in domains of knowledge (Intents with Entities and even suggested Replies) on topics like small talk, weather, apps and even wisdom. This means that your new Agent can recognize these Intents without any additional training, and can even provide the response text to use as the next thing your bot says. There are up to 35 different domains with full English support and partial support for the other six languages. When you create an Intent, you directly define which Context the Intent should expect and which Context it should produce as a result. You can also define several speech responses that the agent returns to your app through the API, so you don’t even need to store such variations in your app.
Api.ai provides integrations with different bot platforms including Slack, Facebook Messenger, Kik, Alexa and Cortana. For example, you can build the conversational flow completely on the platform and then deploy it automatically on Heroku, or use a pre-built Docker container with the app. There is also an embedded integration mode, so you can have an agent that works without a connection to the internet and is independent of any API. Just think about use cases like embedded hiking assistants or in-car assistants.
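The way an Intent declares the Context it expects and the Context it produces can be illustrated with a short Python sketch. Everything here is hypothetical (the `Intent` class, the intent names, the keyword matching, the response texts); it shows the gating idea only and is not how Api.ai is implemented or called.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Intent:
    name: str
    keywords: tuple                       # stand-in for a trained matcher
    responses: tuple                      # several speech response variations
    input_context: Optional[str] = None   # Context the Intent expects
    output_context: Optional[str] = None  # Context the Intent produces


# Hypothetical agent definition.
INTENTS = (
    Intent("order_food", ("order", "food"),
           ("What would you like to eat?", "Sure, what are you craving?"),
           output_context="ordering"),
    Intent("choose_dish", ("sushi", "curry"),
           ("Got it, adding that to your order.",),
           input_context="ordering"),
)


def match(utterance, active_contexts):
    """Return (matched intent or None, updated set of active contexts)."""
    text = utterance.lower()
    for intent in INTENTS:
        # An intent with an input context is eligible only while that context is active.
        if intent.input_context is not None and intent.input_context not in active_contexts:
            continue
        if any(keyword in text for keyword in intent.keywords):
            contexts = set(active_contexts)
            if intent.output_context:
                contexts.add(intent.output_context)
            return intent, contexts
    return None, set(active_contexts)
```

With this gating, "sushi please" matches nothing at the start of a conversation, but matches `choose_dish` once `order_food` has activated the `ordering` context.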
Api.ai looks like a decent solution that you can use for building sophisticated conversational interfaces. Like LUIS-beta from Microsoft or Wit.ai from Facebook, it’s free, with limits on bandwidth and on the speech recognition feature, though a Preferred plan without these limits is also available by request. UPDATE from Dec 1, 2016: Well, Google has bought Api.ai since the first version of this article. Good for the founders, but this means the community has lost a powerful independent NLP service, although a couple of other startups are emerging from stealth mode.
Amazon Alexa Skills Kit
It only works with Amazon Alexa. At first glance, this looks like the simplest language processing approach among all the systems covered here, but it’s deployed, tested and exposed to more than 3 million Amazon Alexa users who are already using conversational interfaces on a daily basis. With Amazon Alexa Skills Kit you can define Intents and Entities for your task. The Alexa system recognizes an intent correctly across wording variations only when you provide every possible example expression, matching exactly how users might say it to Alexa. It feels like they are still working on their own version of machine learning in order to simplify the work needed for model training. The great thing is that a whole new skill for Alexa can easily be built with AWS Lambda functions, which integrate seamlessly with the Alexa Skills Kit. In any case, Amazon Alexa Skills Kit is an outstanding system whose development you should keep following, because Amazon is currently a leading household platform for conversations and custom bot integrations, which they are aggressively pushing forward with new device offerings and features. UPDATE from Dec 1, 2016: Yesterday, Amazon revealed Amazon Lex, a conversational interface API with NLP features and tight integration with Amazon services such as Lambda, DynamoDB, SNS/SES and others. We’ll look into Amazon Lex internals once it becomes available.
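As a rough idea of what a Lambda-backed skill looks like, here is a minimal Python handler sketch. The intent name `FindFoodIntent` and the slot name `Cuisine` are hypothetical, and the request and response fields follow the Alexa JSON format as we understand it, so verify them against the current Alexa Skills Kit documentation before use.

```python
def build_response(text, end_session=True):
    """Wrap plain text in the JSON envelope that Alexa expects back."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def lambda_handler(event, context):
    """Entry point that AWS Lambda calls with the Alexa request as `event`."""
    request = event["request"]
    if request["type"] == "LaunchRequest":
        return build_response("What food are you looking for?", end_session=False)
    if request["type"] == "IntentRequest":
        intent = request["intent"]
        if intent["name"] == "FindFoodIntent":  # hypothetical intent name
            # Slots carry the parameters (entities) recognized by Alexa.
            cuisine = intent.get("slots", {}).get("Cuisine", {}).get("value")
            if cuisine:
                return build_response(f"Looking for {cuisine} food near you.")
            return build_response("Which cuisine would you like?", end_session=False)
    return build_response("Sorry, I did not understand that.")
```

Leaving the session open when the slot is missing lets the skill ask a follow-up question instead of ending the conversation.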
IBM Watson Developer Cloud Services
You probably remember IBM Watson’s famous game, when it won against two humans on the TV quiz show “Jeopardy” in 2011. The good news is that IBM moved the technology behind Watson into the cloud and released a set of APIs that you can use in your own conversational applications. The API set includes language understanding offerings ranging from a natural language classifier to concept insights and dialogue processing. There are a lot of building blocks that you can use in your application, but you will probably spend a fair amount of time integrating them into one solution. We’ve used IBM Alchemy Language for sentiment analysis and keyword extraction in our experiments and it worked well. We think that IBM’s solution is the ideal choice for enterprises that want to be 100% sure of their API provider. For a recent IBM Watson demonstration you can watch a fireside chat with Dr. John Kelly, who leads the Watson team at IBM, at TechCrunch Disrupt 2015 in San Francisco. IBM Watson is, however, a costly solution: you can expect to pay up to $0.02 per API call in the Dialogue API, so it may be too expensive for experimenting with Facebook Messenger bots while you still don’t have a working business model. The full list of available APIs from IBM Watson Developer Cloud is available here.
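To see how the per-call price adds up, here is a back-of-the-envelope Python estimate. Only the $0.02 ceiling comes from the text above; the user and message volumes are made-up inputs, and the sketch assumes one API call per user message.

```python
PRICE_PER_CALL_USD = 0.02  # upper bound per Dialogue API call quoted above


def monthly_cost_upper_bound(users, messages_per_user_per_day, days=30):
    """Worst-case monthly bill, assuming one API call per user message."""
    calls = users * messages_per_user_per_day * days
    return calls * PRICE_PER_CALL_USD


# Hypothetical volume: 1,000 users sending 5 messages a day.
estimate = monthly_cost_upper_bound(1000, 5)
```

With those hypothetical numbers the bot makes 150,000 calls a month, which at the quoted ceiling comes to roughly $3,000.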