Q&A Blog with Sue Rudd - Advanced Speech Recognition and how Radisys Breaks Through the Barriers

February 19, 2019 Al Balasco

During a recent Fierce Wireless webinar, Sue Rudd, Director of Networks and Service Platforms for Strategy Analytics, joined Radisys to discuss Advanced Speech Recognition (ASR) and how Radisys is helping service providers break through some of the barriers in implementing speech recognition solutions. We took an opportunity to continue the conversation with Sue about the opportunities for service providers, and I am pleased to share it with you here.

Hi Sue, thanks for taking the time to talk with us about ASR. What do you consider to be the key milestones to date in this market?

Rudd: The ASR evolution has followed a path from large systems that were device and application dependent, and it is now moving toward natural language solutions that are independent of the device, the user, and the location. ASR has gone from interactive voice recognition (IVR) in PBXs and call centers in the 70s and 80s, to personal digital assistants (PDAs) in the 90s where the device would connect to the network to retrieve the information. And then, obviously, smartphones came with Siri and associated applications. Siri became a good user interface, and it freed up the device and the user. All of this gave rise to the home speaker market, with products like Amazon Echo and Google Home.

While the user interface has improved dramatically, a lot of complicated compute processing is required to do it at scale. The challenge for service providers is that end users now take the interface for granted, yet costs need to decline before high volume services can be deployed. To support natural language, ASR requires expensive infrastructure for actual semantic language processing. If you’re a service provider, you have to put a lot of CPU processing behind it to enable ASR to do the language processing. Both Amazon and Google have the processing power already, but for a service provider who wants to give a good, very fast response time to millions of users simultaneously, it will require a lot of hardware.

This is where Radisys’ solutions come into the picture. Its embedded advanced speech processing approach enables service providers to offer mass market ‘in-call’ speech recognition services at significantly lower cost.

Which markets are benefiting from ASR?

Rudd: There are some significant opportunities for service providers to develop ASR applications in a number of growing markets. In a recent Strategy Analytics forecast, we identified three ‘hot’ segments where we see ASR poised for significant growth and that offer opportunities for network providers: Smart Speakers, In-Vehicle UI/UX, and voice enabled IoT for verticals. The first two are very environment specific, while voice enabled IoT offers opportunities across many different vertical applications.

Smart speakers, like Amazon Echo and Google Home, have an installed base that already approached 100 million by Q3 2018. As smart home devices proliferate and are connected, ASR could play an increasingly important role.

The application of ASR for In-Vehicle user experience is growing as well. We are seeing lots of voice activated applications for cars as touchscreen interfaces decline. And because cars are very much a mobile application, there are significant opportunities for network providers to enable in-vehicle services.

The opportunities around voice- and video-enabled IoT are huge, but they will be especially useful for markets where live, hands-free network support is important, e.g. for field service workers repairing equipment. There are all sorts of wonderful opportunities here, like speech controlled access to service manuals or a voice controlled ‘heads up’ display for equipment diagnostics. ‘Hands free’ speech navigation is also likely to be integrated with video screens for many mobile market services. Many CSPs around the world are already committed to a VoLTE standard for IoT voice and video transmission called IoT, which will enable many new applications where speech recognition and related media analytics technologies like sentiment analysis and sound “understanding” add value: security, manufacturing, smart cities, etc.

Voice Enabled IoT is expected to experience significant growth according to the Strategy Analytics ‘IoT Cellular Connections’ report. Which verticals are positioned to benefit most from this significant IoT disruption / revolution?

Rudd: The list of Voice Enabled IoT opportunities is long: robotics for smart surgery, warehouse logistics, and transportation logistics where people are moving around and require mobile communications.

I think a good example would be in the healthcare vertical, where opportunities can range from things like a smart watch that keeps track of health metrics and talks back to you, to an application that calls you to check whether you, or a loved one, have taken your medication. Once you get into two-way speech communications, you’re into Telephony and that is where smart speakers are headed.

Ultimately, ASR will enable a unified ecosystem where multiple systems will be more integrated. Many verticals will evolve to use mobile communications as speech becomes the best User Interface (UI) to control functionality in the network. This is a great opportunity for service providers.

Looking at some of the new market opportunities, where do you see service providers having the greatest opportunity for growth?

Rudd: I think there are three good opportunities that are created by independent third-party solutions like those we’re seeing from Radisys.

The first is to move the call center into the network, which allows the traditional call center to achieve a scale that makes it cost effective for smaller businesses via a managed service from a network provider.

Another opportunity for service providers could center around in-home smart speakers, like Echo or Google Home. New third-party solutions could leverage lower cost, network-based speech recognition as part of a mobile app instead of a cloud-based service.

Today, home device apps are specific to and complement one particular cloud provider’s offerings, but adding a telecoms extension opens up new possibilities. For example, smart speakers could begin to support telephone calls for the elderly like “Call my grandson,” or “I can’t get up, please call emergency services,” which are hard to do today. Through network-based ASR applications, a smart speaker could open up third-party opportunities to customize voice calling and health monitoring.

For service providers that are considering expanding their ASR solutions, what are some of the things they must consider as they evaluate their current and future platforms?

Rudd: A good place to start would be with current small business users, perhaps with a hosted solution to trial a ‘virtual call center’. ASR offers a cost effective solution for a hosted pilot service.

Radisys has added some very innovative features to its network-based ASR platform, including automated statistics capture and mood detection, that add significant value for call center managers. And service providers can rapidly test and deploy a wide range of cost-effective add-on services as customers request them. These new scalable hosted solutions, which are no longer anchored to an expensive PBX or expensive IVR software, offer a very cost effective solution for small or highly mobile businesses.

Looking at the history of ASR evolution as a whole, as new applications and mass-market opportunities are introduced and benefit from embedded solutions, where do you see the next breakthrough taking place?

Rudd: The successful application of advanced speech recognition will ultimately be about creating a seamless user experience for any user application, whether a fitness app, or business video conferencing, or just a regular call that shares video instantly with friends.

Service providers will discover that solutions like Radisys’ MediaEngine allow for compute intensive natural language processing at low cost and provide higher quality results regardless of location and background noise. They can also offer faster response time to a wide range of vocabulary and languages without the need for the user to train the system. It is time for network service providers to offer highly scalable, low CPU cost-per-user solutions that are independent of device, user and location.

As the ASR evolution continues, service providers can offer fully integrated seamless fixed and mobile applications that follow users wherever they go, to put the network operator at the center of the speech recognition value chain.

Thank you, Sue. I really appreciate your insights into this exciting market opportunity for service providers.

About the Author

Al Balasco is the Head of Media, Core and Applications Business. Prior to his current role, he was the Sr. Director of Product Management for the Media Server business. Before joining Radisys in October 2010, Al was the Director of Product Management in Avaya Inc.’s Unified Communications business unit where he was responsible for the delivery of a variety of collaboration solutions and partnerships. Prior to Avaya, Mr. Balasco was the Vice President of Product Management at Spectel and was instrumental in defining the company’s VOIP conferencing and collaboration strategy. He also served as Director of Marketing for Sonexis Inc. and Director of Product Management at Brooktrout Software. Mr. Balasco has over 25 years of product management, business development and marketing experience in the telecommunications industry and has an MBA from Northeastern University in Boston.