Steve Martinelli edited this page Aug 1, 2018 · 17 revisions

Short Name

Create a web based chatbot with voice input and output

Short Description

Use IBM's Watson Speech to Text, Watson Text to Speech and Watson Assistant services to build a web based chatbot that has audio as input and output.

Offering Type

Cognitive

Introduction

Chatbots are widely used to improve customer service and reduce costs, and they are available through many user interfaces and input forms. Previous code patterns have covered text input across several channels (Slack, a web interface, Facebook Messenger). In this code pattern, we'll use a web interface again, but with voice input and output instead of text. Watson Assistant controls the conversation dialog, while Watson Speech to Text and Watson Text to Speech handle speech recognition and playback.

Authors

By Poornima Trikkur Anantharaman and Ramesh Poomalai

Code

Video

Overview

This is a web-based application that uses Watson Text to Speech, Watson Speech to Text, and Watson Assistant. Previous code patterns have been built around these services: the Speech Sandbox Code Pattern, which uses a VR headset as input; the Watson Assistant and Slack Code Pattern, which uses text and a Slack interface; and the Facebook Messenger and Watson How-To, which uses text and Facebook. The main difference in this code pattern is that it uses voice input and audio output, and the interface is a web browser.

The main website is built using jQuery, and the API calls are made using Python Flask. A WebSocket connection is created to make the calls to the Watson services. A sample insurance conversation is used for the dialog.

When the reader has completed this code pattern they will understand how to:

  • Make a Watson Speech to Text call using a WebSocket connection
  • Make a Watson Text to Speech REST API call
  • Send messages to and receive responses from Watson Assistant using REST APIs
  • Integrate Watson Speech to Text, Watson Text to Speech, and Watson Assistant in a web app
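
To make the first and third points more concrete, here is a minimal sketch of the message shapes involved. It is not taken from this pattern's code. The helper names are hypothetical, and the JSON layouts are assumptions based on the publicly documented Speech to Text WebSocket interface (`action: start`/`stop` frames, `results[].alternatives[].transcript`) and the Watson Assistant v1 message API (`input.text`, `output.text`). No network calls are made.

```python
import json


def stt_start_frame(content_type="audio/wav", interim_results=False):
    """Build the JSON text frame that opens a recognition request (assumed shape)."""
    return json.dumps({
        "action": "start",
        "content-type": content_type,
        "interim_results": interim_results,
    })


def stt_stop_frame():
    """Build the JSON text frame that signals the end of the audio stream."""
    return json.dumps({"action": "stop"})


def extract_final_transcript(message):
    """Return the top transcript from the final results in one message, or None."""
    payload = json.loads(message)
    parts = []
    for result in payload.get("results", []):
        # Only final results with at least one alternative are used.
        if result.get("final") and result.get("alternatives"):
            parts.append(result["alternatives"][0]["transcript"].strip())
    return " ".join(parts) if parts else None


def assistant_request_body(text, context=None):
    """Build a message request body; context carries dialog state between turns."""
    body = {"input": {"text": text}}
    if context:
        body["context"] = context
    return body


def assistant_reply_text(response):
    """Join the text lines of an Assistant response into one string."""
    return " ".join(response.get("output", {}).get("text", []))
```

In this sketch, the string returned by `extract_final_transcript` would be passed to `assistant_request_body`, and the string returned by `assistant_reply_text` would be the input for the Text to Speech call.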

Flow

[Figure: architecture diagram]

  1. User selects the microphone option on the browser and speaks.
  2. The voice is passed on to Watson Speech to Text using a WebSocket connection.
  3. The text from Watson Speech to Text is extracted and sent as input to Watson Assistant.
  4. The response from Watson Assistant is passed on to Watson Text to Speech.
  5. The audio output is sent to the web application and played back to the user, while the UI also displays the same text.
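
The flow above can be sketched as a single function that chains the three services. This is an illustrative outline only, not the pattern's actual implementation: `handle_voice_turn`, `VoiceTurn`, and the three callables are hypothetical names, and each callable stands in for a real Watson client call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurn:
    """Everything the web UI needs for one exchange: text to display, audio to play."""
    transcript: str
    reply_text: str
    reply_audio: bytes


def handle_voice_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],   # stand-in for Speech to Text (step 2)
    converse: Callable[[str], str],       # stand-in for Watson Assistant (step 3)
    synthesize: Callable[[str], bytes],   # stand-in for Text to Speech (step 4)
) -> VoiceTurn:
    """Run one user utterance through the three services in order."""
    transcript = transcribe(audio)
    reply_text = converse(transcript)
    reply_audio = synthesize(reply_text)
    # Step 5: return both the audio and the text so the UI can play and display them.
    return VoiceTurn(transcript, reply_text, reply_audio)


if __name__ == "__main__":
    # Stub services so the sketch runs without any Watson credentials.
    turn = handle_voice_turn(
        b"fake-audio-bytes",
        transcribe=lambda audio: "what does my policy cover",
        converse=lambda text: "You asked: " + text,
        synthesize=lambda text: text.encode("utf-8"),
    )
    print(turn.reply_text)
```

Passing the services in as callables keeps the orchestration independent of any particular SDK version and makes it easy to test with stubs, as the `__main__` block does.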

Included components

  • Watson Assistant: Create a chatbot with a program that conducts a conversation via auditory or textual methods.
  • Watson Speech to Text: A service that converts human voice into written text.
  • Watson Text to Speech: Converts written text into natural sounding audio in a variety of languages and voices.

Featured technologies

  • Artificial Intelligence: Artificial intelligence can be applied to disparate solution spaces to deliver disruptive technologies.
  • Cloud: Accessing computer and information technology resources through the Internet.
  • Python: Python is a programming language that lets you work more quickly and integrate your systems more effectively.

Blog

Links