Dialog management

At Mercury.ai we are convinced that at the heart of conversational AI is dialog behavior that plays out the intricate moves of a conversation. These moves are not planned as a tree, whether pre-defined or trained; they emerge from the state of the conversation and related contextual knowledge. This yields more natural and flowing conversations.

Dynamic dialog behavior

Dialog behavior is the bot's strategy to determine its response to user input. In Mercury.ai bots, dialog behavior is dynamic in three main senses.

First, it is context-dependent, i.e. it takes into account what has happened before in the conversation, what is currently being talked about, as well as parameters like the time of day or the location of the user. This moves the dialog beyond simple pairs of one request and one response, towards the long threads of an actual conversation. For example, it enables the user to explore options step by step, as the bot keeps track of what has been asked for before and which parts of it are currently relevant. Moreover, rich context knowledge allows the bot to disambiguate user input. For example, if the user is presented with two recipe suggestions and asks "Is the first one vegetarian?", the bot can resolve which of the suggestions "the first one" refers to; if the user next asks "And the other one?", the bot can infer that the implicit question is whether the other suggested recipe is vegetarian.
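The recipe example can be illustrated with a minimal Python sketch. It is not Mercury.ai's implementation: the Context class, the interpret function, and the KNOWN_PROPERTIES list are hypothetical names introduced here, and simple substring matching stands in for real language understanding.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical vocabulary, for illustration only.
ORDINALS = {"first": 0, "second": 1, "third": 2}
KNOWN_PROPERTIES = ("vegetarian", "gluten-free")


@dataclass
class Context:
    """Conversation state the bot keeps between turns (assumed structure)."""
    suggestions: List[str] = field(default_factory=list)
    last_referent: Optional[str] = None
    last_property: Optional[str] = None


def interpret(utterance: str, ctx: Context) -> Optional[Tuple[str, str]]:
    """Resolve which suggestion and which property a question is about."""
    text = utterance.lower()

    # Use the property named in the utterance, else fall back to the
    # property asked about in the previous turn (the implicit question).
    prop = next((p for p in KNOWN_PROPERTIES if p in text), None)
    prop = prop or ctx.last_property

    referent = None
    for word, index in ORDINALS.items():
        if f"the {word} one" in text and index < len(ctx.suggestions):
            referent = ctx.suggestions[index]
            break
    # "the other one" is only unambiguous with exactly two suggestions
    # and a previously mentioned one.
    if (referent is None and "the other one" in text
            and len(ctx.suggestions) == 2
            and ctx.last_referent in ctx.suggestions):
        referent = next(s for s in ctx.suggestions if s != ctx.last_referent)

    if referent is None or prop is None:
        return None

    # Remember what was talked about so the next turn can build on it.
    ctx.last_referent, ctx.last_property = referent, prop
    return referent, prop
```

With two suggestions in the context, "Is the first one vegetarian?" resolves to the first suggestion and the property "vegetarian", and a follow-up "And the other one?" resolves to the second suggestion while reusing the remembered property.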

Second, dialog behavior is personalized, i.e. it takes into account knowledge about the user, ranging from the user's location to restrictions the user mentioned (such as dietary or financial) and preferences that have emerged over time. This allows the bot to zero in on suggestions and results that are most relevant for the user.

Finally, dialog behavior is data-driven, i.e. it takes into account the underlying data source and the results of user queries. For example, the bot can adapt its response to the number of relevant results a user query returns: if there are too many results, it can first try to narrow down the search space by asking more questions; if there are too few results, it can actively widen the search space again, e.g. by looking for similar matches.
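As a rough sketch of this idea in Python, the bot's next move can be chosen from the size of the result set. The thresholds and the function name choose_strategy are assumptions made for illustration; the actual criteria used by the platform are not stated here.

```python
# Hypothetical thresholds; real values would depend on the use case.
MAX_RESULTS = 5  # more than this is "too many" to present at once
MIN_RESULTS = 1  # fewer than this is "too few" to be useful


def choose_strategy(result_count: int) -> str:
    """Pick the bot's next move based on how many results a query returned."""
    if result_count > MAX_RESULTS:
        return "narrow"   # ask a follow-up question to reduce the result set
    if result_count < MIN_RESULTS:
        return "widen"    # relax constraints, e.g. look for similar matches
    return "present"      # a manageable number of results: show them
```

For example, choose_strategy(40) returns "narrow", choose_strategy(0) returns "widen", and choose_strategy(3) returns "present".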

Modular dialog behavior

The most important aspect of dialog behavior is that it results from the interplay of different modular dialog behaviors that we call dialog games. Different user stories call for different strategies for reaching the particular goal of that user story. For example, an explorative search for cars that fit the user's wishes can be much more open-ended and flexible than determining whether the user is eligible for a particular credit. A one-size-fits-all approach to dialog is doomed to fail. Dialog behavior is thus modularized in our approach: there are several different behaviors, bundled into templates, and a bot designer can pick the right behavior for each task.

Because these dialog behaviors are specific to a user story, a bot typically combines several different behaviors and can switch freely between them. In particular, there is no need for fixed, predefined paths through a conversation. Since the bot keeps track of the previous and current state of the conversation, it can flexibly switch between topics and easily resume previous threads. Dialog with Mercury.ai bots is therefore mixed-initiative: the conversation is not driven by the user alone or by the bot alone; rather, both work together.

Moreover, having conversational components that encapsulate particular dialog behaviors reduces the overall project complexity: editors only work with separate parts of a bot and do not run the risk of unintentionally making changes with unwanted effects on other parts of the bot. This not only improves maintainability but is also key to the re-use of conversational scenarios, which can be saved outside of a bot in the library for use in other projects.

Learn more about building dialog behavior in the Games section.

How dialog management works

When a user sends a message to your bot, the dialog manager will follow three steps:

  1. It runs the NLP pipeline, which comprises general language detection, spell checking, and named entity linking, and then asks all games to bid on the incoming message with their interpretation (if they have one).
  2. It then filters the bids and decides which interpretations to act on, based on confidence, contextual knowledge, and principles like dialog coherence.
  3. It collects the reactions from the winning games, checks for possible conflicts, and finally sends out the bot's response.

Note that games act independently of each other. You can thus add and remove user stories in a plug-and-play fashion.
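The bidding steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the dialog manager's actual API: the Bid class, the two example games, the confidence threshold, and the coherence bonus are all hypothetical. The sketch also omits NLP preprocessing and conflict checking, and it selects a single winning bid, whereas the real dialog manager may act on several interpretations.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Bid:
    game: str            # name of the game placing the bid
    interpretation: str  # how the game reads the message
    confidence: float    # how sure the game is, between 0 and 1


# A game is modeled as a function that returns a bid, or None if it
# has no interpretation of the message (assumed interface).
Game = Callable[[str], Optional[Bid]]

COHERENCE_BONUS = 0.2  # hypothetical boost for the currently active game


def recipe_game(message: str) -> Optional[Bid]:
    if "recipe" in message.lower():
        return Bid("recipes", "search_recipes", 0.9)
    return None


def smalltalk_game(message: str) -> Optional[Bid]:
    if "hello" in message.lower():
        return Bid("smalltalk", "greet", 0.8)
    return None


def select_bid(message: str, games: List[Game],
               active_game: Optional[str] = None,
               threshold: float = 0.5) -> Optional[Bid]:
    """Collect bids from all games and pick the one to act on."""
    # Step 1: every game gets the chance to bid on the message.
    bids = [bid for bid in (game(message) for game in games) if bid is not None]

    # Step 2: drop low-confidence bids, then rank the rest. Bids from the
    # game that is currently active get a bonus to keep the dialog coherent.
    candidates = [b for b in bids if b.confidence >= threshold]
    if not candidates:
        return None

    def score(bid: Bid) -> float:
        bonus = COHERENCE_BONUS if bid.game == active_game else 0.0
        return bid.confidence + bonus

    return max(candidates, key=score)
```

Because each game only sees the message and returns its own bid, adding or removing a game in this sketch amounts to changing the list passed to select_bid; no other game needs to be modified.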