In the first part of this series, we saw how top firms with their different assistants are vying to acquire a space in the dialogue market. In this second and final part of this blog-series on Conversational AI, I go more technical to discuss the fundamentals of the underlying concept behind building a Dialogue system i.e. the cornerstone of any Language Understanding tool. Moreover, I explain this by reviewing one such Language Understanding tool as an example that is available in the IBM Bluemix suite, called as IBM Conversation.
IBM Conversation within Bluemix
IBM Conversation was built on the lines of IBM Watson from the IBM Bluemix suite. It is now the for dialogue construction after IBM Dialog was deprecated.We start off by searching and then creating a dedicated environment in the console.
Conversation component in IBM Bluemix is based on the Intent, Entity and Dialogue architecture. And the same is the case with Microsoft LUIS (LUIS stands for Language Understanding Intelligent Service). One of the key components involves doing what is termed as Natural Language Understanding or NLU for short. It extracts words from a textual sentence to understand the grammar dependencies to construct high level semantic information that identifies the underlying intent and entity in the given utterance. It returns a confidence measure i.e. the top-most extracted intent out of the many pre-specified intents that gives us the most likely intent from the given utterance as per our trained model.
These are all statistically/machine learned based on the training data. Go over the demo, tutorial and documentation to get a more in-depth view of things at IBM Conversation.
The intent, entity and dialogue based architecture forms the crux of any SLU system to extract semantic information from speech and enables such a system to be generic across the various Language Understanding toolkits.
Another huge advantage that ASK provides for building such an architecture, is that it has multi-lingual support.
Intents can be thought of as classes where one classifies the input examples into one of them. For example,
Call Mark is mapped to the MOBILE class and Navigate to Munich is mapped to the ROUTE class
The entities are labels, so e.g. from above, you can have
Mark as a PERSON and Munich as a CITY.
Major advantage and drawback
Both Conversation and LUIS use a non-Machine Learning based approach for software developers or business users to create a fast prototype. It is definitely easy to begin with and gives a lot of options to create drag and drop based dialogue system. However, it can’t scale up to large data. A hybrid approach that can combine or build a dynamic system on top of this static approach is needed for scalable industry solutions.
Moreover, an end to end workflow can be built by plugging in components from Node-RED and introduction to the same can be viewed in the below video.
What’s good is that they have a component for Conversation as well. So, we can build a complete chatbot starting from a speech to text component to get the human commands translated to text, followed by a conversation component to build up the dialog and lastly by a text to speech component to translate this textual dialogue back to speech to be spoken by a humanoid or a mobile device!
Missing components and possible future work
It is not possible to add entities/intent dynamically through the UI after the initial workspace is constructed. The advanced response tab doesn’t allow to edit (add) the entities in the response field, like for example adding variables to the context. We can edit it (highlighted in orange) but it doesn’t save or get reflected.
“text”: “I understand you want me to turn on something. You can say turn on the wipers or switch on the lights.”
“toppings”: “<? context.toppings.append( ‘onions’ ) ?>”
“appliance”: “<? entities.appliance.append( ‘mobile’ ) ?>”
Moreover, the link which only mentions accessing intents and entities but not modifying them.
The only place to add the intent, entities is back in the work space and not programmatically at run time. Perhaps, a possible solution can be to use UI with DB data to save the intermediate and newly discovered intent/entity values and then update the workspace later.
As I end this blog, perhaps there would be another AI assistant released that has moved beyond its embryonic stage to conquer real life application scenarios. Conversational AI is hot property, so dive in to reap its benefits, both from an end user and developer’s perspective!
Note: Hope you enjoyed the read. I have deliberately kept the content a mix of non technical and technical to build the excitement and buzz going around this exciting field of conversational AI! Publishing this blog was on my list as I was compiling lot of facts since last few weeks but I had to hurry even more, given the recent news surrounding this upsurge. As always, any feedback as a comment below or through a message are more than welcome!