Chatbot categories and their limitations

from the series: A practical approach between myth and reality

Especially over the last few years, many businesses have been considering the idea of building or acquiring their own AI-enabled conversational agent. However, despite the interest, research, and effort that is put in, there are still just a handful of successful attempts.

While developing a chatbot at Tremend, we came across some new information that not only cleared the path for developing such a project but also shed some light on the prospects and limitations of these artificially intelligent “creatures”.

First of all, I will address a common myth: the expectation that chatbots should be able to carry on complex conversations, regardless of the topic. There are some distinctions to be made when talking about chatbots, and each type comes with its own limitations.

Below we present two classifications that will make choosing the right chatbot easier.

Chit-Chat vs. Task-Oriented Chatbots

The first classification splits conversational agents into two groups: chit-chat bots, built solely to maintain a conversation by being interesting, creative, or fun, and task-oriented chatbots, which offer customer support or act as personal assistants, helping users complete a specific task.

Chit-Chat Bots

Chatbots in the first category do not try to reach an informational target. They focus on the generative aspect of the conversation: offering answers that are as creative as possible, not repeating themselves, and keeping the conversation interesting for the person they are chatting with.

You can develop one starting from a Sequence to Sequence model [1], usually trained end-to-end on question-answer pairs. A few relevant examples include a bot that talks like Shakespeare, initially trained on his original plays; Replika, an app that aims to impersonate you by learning from your own writing style; and Woebot, designed to help people through cognitive behavioral therapy and natural language processing, founded by Alison Darcy with Andrew Ng, a prominent figure in the AI world, joining as chairman.

Other use cases I can think of include telling imaginative stories about a certain place while the user walks around with headphones on, or generating soothing descriptions of nature while the user enjoys a spa treatment.
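To make the idea of training on question-answer pairs more concrete, here is a minimal sketch of how such pairs are commonly turned into inputs for a Sequence to Sequence model. It only covers data preparation, not the neural network itself. The toy pairs, the special token names, and all function names are assumptions made for illustration; they do not come from any particular library or from the bot we built.

```python
# Toy question-answer pairs; a real chit-chat bot needs a far larger corpus.
PAIRS = [
    ("how are you", "i am fine"),
    ("what is your name", "my name is bot"),
]

# Special tokens (the names are an assumption of this sketch).
PAD, SOS, EOS, UNK = "<pad>", "<sos>", "<eos>", "<unk>"


def build_vocab(pairs):
    """Map every word seen in the pairs to an integer id."""
    vocab = {PAD: 0, SOS: 1, EOS: 2, UNK: 3}
    for question, answer in pairs:
        for word in (question + " " + answer).split():
            vocab.setdefault(word, len(vocab))
    return vocab


def encode(text, vocab):
    """Turn a sentence into ids; unseen words fall back to the UNK id."""
    return [vocab.get(word, vocab[UNK]) for word in text.split()]


def make_example(question, answer, vocab):
    """Return (encoder_input, decoder_input, decoder_target) id lists."""
    answer_ids = encode(answer, vocab)
    encoder_input = encode(question, vocab)
    decoder_input = [vocab[SOS]] + answer_ids   # what the decoder reads
    decoder_target = answer_ids + [vocab[EOS]]  # what it must predict next
    return encoder_input, decoder_input, decoder_target


VOCAB = build_vocab(PAIRS)
EXAMPLES = [make_example(q, a, VOCAB) for q, a in PAIRS]
```

The decoder target is the decoder input shifted by one position, so at every step the model learns to predict the next word of the answer given the question and the words produced so far.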

Task-Oriented Bots

The chatbots in the second category, the task-oriented agents, are designed for dealing with specific scenarios, such as placing an order, scheduling an event, or helping with troubleshooting. Many of the giants in the industry offer frameworks to build your own chatbot, customized to your needs and easy to integrate with other known services and devices, such as cloud platforms or messenger apps.

Among the most popular platforms for building a bot from scratch are Google’s Dialogflow, Amazon Alexa, Microsoft’s Bot Framework, Facebook’s Wit.ai, and IBM’s Watson Assistant. In general, these options provide an interface for customizing a set of intents, entities, and actions triggered after reaching certain states within the conversation, as well as a tool for testing your newly developed bot.

The benefits of using one of these platforms are undeniable: you can obtain a quick minimum viable product (MVP) without much previous AI knowledge, you can design the chatbot to fit a specific domain, and you can integrate it with other services you probably already use. Moreover, some of them are free to use at a business level, and many are available in multiple languages.
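The intents, entities, and actions that these platforms let you configure can be illustrated with a small, platform-independent sketch. It does not use the API of any of the products named above: the keyword matching stands in for their trained language understanding models, and every intent name, entity pattern, and function here is hypothetical.

```python
import re

# Hypothetical intents, each described by a few trigger keywords.
INTENTS = {
    "place_order": {"order", "buy", "purchase"},
    "schedule_event": {"schedule", "book", "meeting"},
}

# Hypothetical entity patterns, written as regular expressions.
ENTITIES = {
    "quantity": re.compile(r"\b(\d+)\b"),
    "day": re.compile(r"\b(monday|tuesday|wednesday|thursday|friday)\b"),
}


def detect_intent(message):
    """Pick the intent that shares the most keywords with the message."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    best, best_score = None, 0
    for intent, keywords in INTENTS.items():
        score = len(words & keywords)
        if score > best_score:
            best, best_score = intent, score
    return best


def extract_entities(message):
    """Return the first match of every entity pattern found in the message."""
    found = {}
    for name, pattern in ENTITIES.items():
        match = pattern.search(message.lower())
        if match:
            found[name] = match.group(1)
    return found


def place_order(entities):
    return f"Ordering {entities.get('quantity', '1')} item(s)."


def schedule_event(entities):
    day = entities.get("day")
    return f"Event scheduled for {day}." if day else "Which day suits you?"


# Each intent is bound to the action that fulfils it.
ACTIONS = {"place_order": place_order, "schedule_event": schedule_event}


def reply(message):
    intent = detect_intent(message)
    if intent is None:
        return "Sorry, I can only help with orders and scheduling."
    return ACTIONS[intent](extract_entities(message))
```

For example, `reply("I want to order 3 pizzas")` resolves to the `place_order` intent with a `quantity` entity of `3`, while a message that matches no intent gets the fallback answer. A real platform replaces the keyword overlap with a trained classifier, but the division into intents, entities, and actions stays the same.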

Open-Domain vs. Closed-Domain Chatbots

Another classification of chatbots is based on the sort of information they are expected to provide.

On the one hand, there are open-domain chatbots, meant to retrieve all sorts of information for questions such as “What will the weather be like in three days?”, “What year was Salvador Dali born in?” or “How many species of frogs exist worldwide?”. One system that has become better over the years at finding answers to all sorts of questions is Google Search. Using the PageRank algorithm, along with NLP techniques like information retrieval and information extraction, this system basically acts as a conversational agent for each search.

These bots are hard to perfect: not only is language versatile, but understanding the world requires common sense, which is even harder for computers to grasp. Personal assistants such as Siri (from Apple), Cortana (from Microsoft), Alexa (from Amazon), or Google Assistant attempt to return an answer for each request they receive, at least by providing the corresponding Internet search results.

However, the above examples were designed as closed-domain chatbots, namely to perform certain tasks related to the features and applications available on a particular device. Closed-domain agents, also known as domain-specific agents, rely on information about a specific area of interest and aim to answer usually narrow scenarios, such as guiding visitors through a museum and providing specific types of information (e.g. the location of an exhibit or the year it was brought in). In general, these agents tend to perform well in real environments.
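The museum guide scenario shows why narrow domains are manageable: the whole world the bot needs to know fits in a small table. The sketch below is only an illustration under that assumption. The exhibit data, the question keywords, and the function name are invented for the example.

```python
# Hypothetical exhibit data; a real museum would load this from its catalogue.
EXHIBITS = {
    "sundial": {"location": "Room 2", "year": 1987},
    "steam engine": {"location": "Hall A", "year": 2004},
}

# Question keywords mapped to the exhibit field they ask about.
FIELDS = {"where": "location", "when": "year"}


def answer(question):
    """Answer from the exhibit table, or decline anything outside it."""
    text = question.lower()
    exhibit = next((name for name in EXHIBITS if name in text), None)
    field = next((f for keyword, f in FIELDS.items() if keyword in text), None)
    if exhibit is None or field is None:
        return "Sorry, I can only answer questions about the exhibits."
    value = EXHIBITS[exhibit][field]
    if field == "location":
        return f"The {exhibit} is in {value}."
    return f"The {exhibit} arrived in {value}."
```

An open-domain question such as “What year was Salvador Dali born in?” falls outside the table, so the bot declines instead of guessing. That explicit boundary is a large part of what lets closed-domain agents behave reliably.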

The wider the range of scenarios a domain-specific chatbot is supposed to handle, the closer it comes to the implementation difficulties of open-domain bots. Denny Britz, a former resident on the Google Brain team, explains some of them in his article, and later on we will present our findings, which were crucial to the domain-specific bot we built.

All that being said, the question is why we do not see many more chatbots performing large-scale tasks across all businesses.

The short answer is that these systems work almost flawlessly for simple scenarios that can be easily expressed through a small but close to complete set of examples. Whenever you want to add a layer of complexity, like processing a part of the conversation flow separately or marking words from the vocabulary as entities with semantic values other than the ones they normally carry in everyday conversation, things get harder to manage through a generic model.

Although the intended purpose is to replace the tedious work of human agents, our work has shown that it is not as simple as training a very complicated neural network model, as we will detail in a future article.

References

[1] Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems (pp. 3104-3112).