Saturday, February 10, 2018

Don't Build your NLP Bot like a GUI


There has been an ongoing buzz about conversational user interfaces (CUI) and how they can enhance human-to-computer interaction. Many of us have experienced them to some extent both in play with voice interfaces such as Siri and Alexa and at work with messaging interfaces in tools like Slack. When CUIs are infused with a good dose of NLP (and context aware, machine learning..etc), they have the potential to become more than just a primitive command line or simple minded messaging/voice interface. The potential for building NLP powered virtual assistants and benefiting from the productivity that they can provide does exist, but you will have to get out of your comfort zone as a developer.

The Mental Leap to CUIs

Building CUIs using NLP and AI can take some getting used to when you have spent your career or education building graphical user interfaces (GUI). Be wary of getting caught in the trap of building your natural language interactions like you would build a GUI. A virtual assistant, like how some NLP bots function, can behave in many of the same ways you would interact with another human that would be working on your behalf. Now, a GUI is a very capable and functional interface, but is not a virtual human. And you have to keep reminding yourself that the bot you are building needs to behave like a virtual human, because it is easy to get caught in the trap of trying to build your NLP virtual assistant like you would a traditional GUI. It ends up being the worst of both worlds if you end up doing that.

Not that there is Anything Wrong with GUIs

Now there is nothing wrong with GUIs. For some tasks and functions they will always be superior to natural language. But for many things we do in our daily lives, language is a superior medium especially if the engaging entity is intelligent.

Fundamentally how you would ask (via text or voice) a virtual assistant to perform an action or make an inquiry on your behalf is different than using a GUI. The way you reach a decision or action point in a CUI can be different than a GUI. Let's take a very simple example to compare.

Say you are using a business application to manage pending requests that you must take action upon on a daily basis. And there are different types or classes of requests you must review. Some requests are specific to you, some to your direct team members, and other requests are company wide actions you must review, but you are required to review and make decision/action on all of them at some point during the day or week.

In a GUI you might navigate to the screen and see some quick filters that let's you select which requests types to view or all requests might be shown in some sorted of grouped order on a scrolling page. And you would navigate through the information to decide what to do first. There are many ways to represent the request and perform filtering and sorting. It also depends on the your preferences for what tasks you like to tackle first and how you manage your day. The ways to make the GUI optimal for your particular usage pattern depends on many factors, many of whom are specific to you. This is where GUIs begin to breakdown. They can sometimes overwhelm us with information or not adapt to our unique usage patterns - basically they lack the ability to easily adapt to the extreme personalization you can achieve with a CUI.

All Things being Equal CUIs Win

Now this all depends on your implementation of the GUI and CUI. You can obviously build a horrible CUI and a wonderful ultra personalized GUI, but all things being equal I claim that a CUI will always be superior. The advantage the CUI has is that the user experience does not have to change significantly to improve personalization. The inherent nature of conversational interaction is something natural to all humans. So as a developer, as you improve the conversational flow of your virtual assistant, and as you release new versions of the bot, the user can adapt much more easily because the flow is fundamentally the same, the interaction is still a conversation and it is the bot that is getting smarter/better and always assisting you through the same conversational experience to help you accomplish your objectives.

So let's go back to our request/action application we described earlier. In the CUI case, a smart virtual assistant might look at all the pending requests and tell you have many pending requests of different types and suggest you start with your direct reports first, since there is only one of those, if that was the case. The ability of the CUI to branch off in different directions is much more dynamic than what a GUI can do, and this can be done in a way that does not force the user to learn a brand new interface with each new release of the software, since the virtual assistant is your interface and your personal guide at the same time. The CUI sort of has built in help since it is a virtual assistant by design.

Mixed Mode CUI and GUI

Many of the popular bot platforms such as Slack and Facebook Messenger have incorporated some very convenient visual GUI components into their messaging and flows which allow developers to mix conversation question/answer dialog with short-cut GUI actions and interactions. This can be great way to meld conversational with GUI, but at the same time it can get CUI developers sucked back into the GUI world, and get developers focusing too much on injecting too many GUI interactions into the CUI flow. So use these tools wisely and keep the interactions focused on the conversational flow between the user and the human like virtual assistant.

Grow your Bot Like You are Raising a Child

So be careful, always think like you are building a smart virtual assistant not a GUI. You will not have all the smarts built into the bot from day one, but make that your mission to keep the conversational flow focused on the interactions and dialog between the human user and the virtual assistant, and everything else will follow as your bot gets smarter.

No comments: