

  • Admin • 11 October 2021

Chatbot vs. Voice AI: Is there a difference?

While chatbots have gained a bad reputation over the past few years, Voice AI seems to be the next big promise in conversational AI. As a tech lover, you might ask yourself: is it just a voice-enabled chatbot? Technically, you would be correct to some extent. The intention of this blog post is not to explore the technical depths of the similarities between the two. Instead, we wish to highlight the distinct challenges of voice automation and the practical benefits of getting it right.

First, let’s recall what we have seen of chatbots so far!

So, you open your browser and visit some website. You have some queries. Here comes the shiny little chat box, popping up at the bottom-right corner of your screen. It greets you politely and you type your first query out of curiosity. After less than 2 minutes of exchanging messages, you close either the chat window or the entire website out of frustration. Your experience is no different from that of other users of the website. In effect, you have given up on that company and its products!

The main problem with chatbots is the rigid, robotic responses that they are programmed to generate. They really struggle when users deviate from the “happy path” of a conversation, that is, the intended conversation flow as envisaged by the developer of the chatbot. If you look carefully at some useful (hence successful) implementations of chatbots on websites, you will sometimes notice that their success comes from clever manipulation of the front end rather than from AI techniques. A good example is front-end validation such as clickable options, graphical date pickers, etc. Here, the trick is to limit the input space and (gently) force users to provide predictable answers. You can also write a clever script so that the chatbot asks questions in a way that leads the user to give the type of answer you planned for. In summary, controlling the user’s input space without damaging the conversation too much seems to be a popular technique in successful chatbot implementations. Yes, it is tricky, but it works for that limited purpose! However, such constrained implementations fail completely when the user strays from the scripted path.
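The input-space trick described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the names `ChoicePrompt` and `parse_date` are hypothetical, and the strict date format stands in for a graphical date picker. The point is that anything outside the planned answers is rejected so the bot can re-ask rather than guess.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional, Tuple


@dataclass
class ChoicePrompt:
    """Hypothetical prompt whose answer must be one of a fixed set of options."""
    question: str
    options: Tuple[str, ...]

    def parse(self, raw: str) -> Optional[str]:
        # Accept either the option number ("2") or the option text ("billing").
        text = raw.strip().lower()
        if text.isdecimal() and 1 <= int(text) <= len(self.options):
            return self.options[int(text) - 1]
        for option in self.options:
            if text == option.lower():
                return option
        return None  # off-script input: the caller should re-ask


def parse_date(raw: str) -> Optional[date]:
    """Stand-in for a date picker: only one strict format (YYYY-MM-DD) is accepted."""
    try:
        return datetime.strptime(raw.strip(), "%Y-%m-%d").date()
    except ValueError:
        return None


if __name__ == "__main__":
    prompt = ChoicePrompt("How can we help?", ("Billing", "Support", "Sales"))
    print(prompt.parse("2"))
    print(prompt.parse("I want a refund"))
    print(parse_date("2021-10-11"))
```

Note that free text such as “I want a refund” simply returns `None` here. That is exactly the limitation discussed above: the approach works only while the user stays on the script.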

When you compare chatbots and voice AI, there is an important aspect of human behavior that we should factor in: we perceive information differently when reading than when listening, even in the same context. This means typical chatbots with rigid responses are going to sound even worse if we simply turn them into voice bots. You also cannot control the user’s input space with the same methods that you used in text-based conversational scenarios. On top of this, the highly dynamic nature of typical voice conversations is itself a huge challenge for a voice AI solution to tackle.

Let’s try to carefully look at how some of the products in the market try to deal with this situation.

As I explained in my previous articles, a truly conversational AI is still a distant goal. So it is not easy to handle random, unscripted user queries with rigid responses, nor to build a truly intelligent solution. We have to simulate “intelligence” rather than really having it in the machine. Besides certain improvements in Natural Language Understanding (NLU) techniques, a lot of voice AI products nowadays promote “human-like” voices. It is true that this bridges the gap between humans and machines to a certain extent. However, these “human-like” voices are an outcome of advancements in Text to Speech (TTS) technologies, and they leave the conversational AI piece with the same fundamental issues.

Voice AI is going to be a game-changing productivity tool if we do it the right way. The number of valuable man-hours a voice AI can save is much higher than what a text-based chatbot can save. By the very nature of contact centre operations, a human agent has to be completely occupied with one particular user during a voice engagement, while a conversational AI agent can effectively augment the operation by engaging with multiple users simultaneously. Even though telephony is nowadays considered old-fashioned technology, the demand for voice-based customer interactions is rising. In fact, a Salesforce survey claims that voice, along with email, accounts for more than 95% of customer service today. When you put all of these real-world facts together, it is clear that voice automation is inevitable!

Some of the interesting voice AI use cases of the last few years give us a peek into what the future might look like. Recently, the banking and finance company JP Morgan took an interesting initiative to use Alexa to provide research and analytics reports to its customers. Another early adopter of voice automation is Capital One Financial Corporation, which became the first bank to offer its services to customers through Alexa. One survey reveals that in 2017, 29% of online shoppers in the U.S. used voice and 41% were planning to do so. The recent launch of Google Duplex seems to be a game-changing update to the traditional Google Assistant, in which the AI can make reservations for you. Even though telephony automation is still at an early stage, it is more critical than smart-device-based voice AI use cases. It can reduce resolution time and increase customer satisfaction by letting human agents focus on more value-added conversations. One of the major but less discussed issues that telephony automation can solve is the silos that exist between voice- and text-based channels; breaking them down provides the data needed for a consistent and seamless customer experience.

As I mentioned before, there are certain hurdles on the journey towards making voice AI work in real-world applications. Our aim is to provide the definitive technology stack that equips voice automation implementers to change the business world. Last year we released version 1.0 of our conversational AI platform, Sofia, powered by our patent-pending technology, for exactly this purpose. Our innovation provides a no-code solution for developers to build highly interactive AI in less than half the time, or with half the complexity, of current techniques. Of course, our smart team is working every day to make it even better.

With the Sofia platform, your voice bot will no longer respond with “sorry I didn’t get you” when the user says something like “By the way, I want to change the topic…”.

The AI can deviate from the happy path of the conversation to talk about other things of interest to the user and, more importantly, Sofia will bring the user back to the original topic in a smooth manner.
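To make the idea of leaving and rejoining the happy path concrete, here is a toy sketch in Python. It is not how Sofia works internally, which this post does not describe; it only shows one common way to model the behavior, a stack of open topics. The class `DialogueManager` and its methods are hypothetical names chosen for illustration.

```python
class DialogueManager:
    """Toy dialogue manager that remembers interrupted topics on a stack."""

    def __init__(self, main_topic: str):
        # The bottom of the stack is the original (happy-path) topic.
        self.stack = [main_topic]

    @property
    def current(self) -> str:
        return self.stack[-1]

    def digress(self, topic: str) -> str:
        # Park the current topic and switch to the new one.
        self.stack.append(topic)
        return f"Sure, let's talk about {topic}."

    def finish_topic(self) -> str:
        # Close the active topic; resume the one underneath, if any.
        done = self.stack.pop()
        if self.stack:
            return f"That covers {done}. Now, back to {self.current}."
        return f"That covers {done}. Is there anything else?"


if __name__ == "__main__":
    dm = DialogueManager("your card replacement")
    print(dm.digress("branch opening hours"))
    print(dm.finish_topic())
```

Because interrupted topics are stacked rather than discarded, the bot can follow the user’s digression and still return to where the conversation left off. A real system would also need to detect the digression in the first place, which is the harder NLU problem.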

We are planning to release version 2.0 of the Sofia Conversational AI platform by early 2022. Our goal is to fill certain gaps in current telephony automation and completely change it with our innovative technology. With this new release, you will be able to develop a voice AI use case ridiculously fast and deploy it at scale. I will conclude this article with the promise of more exciting content on Sofia Platform version 2.0 soon.