When voice emerged in 2011 with Siri, no one could have predicted how it would drive tech innovation. More than a decade later, it’s estimated that 35% of Americans over the age of 12 own at least one smart speaker (e.g., Google Home, Amazon Echo), and mobile users are three times more likely to use voice search. Brands such as Amazon and Google continue to fuel this trend as they compete for market share, and voice assistants and interfaces are advancing rapidly across industries. The most notable growth has been in healthcare and banking, as companies race to release their own voice technology integrations to keep pace with consumer demand.
The main driver for this shift towards voice user interfaces is changing user demands. There is an increased overall awareness and a higher level of comfort, demonstrated specifically by Millennial consumers.
The mass adoption of artificial intelligence in users’ everyday lives is also fuelling the shift towards voice applications. The growing number of IoT (Internet of Things) devices – such as smart thermostats, appliances, and speakers – has given voice assistants more utility in a connected user’s life. Smart speakers have become the number one way voice is used, but they’re only the beginning.
Applications of this technology are everywhere, so where will it take us through 2022 and beyond? Let’s take a look at a few predictions.
Integrating voice-tech into mobile apps has become a hot trend. Voice-powered apps increase functionality and save users from complicated app navigation. They also make it easier for end-users to get around in general, even if they don’t know the exact name of the item they’re looking for or where to find it in the app’s menu.
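To illustrate how an app might find an item when the user doesn’t say its exact name, here is a minimal sketch. It assumes the speech has already been transcribed to text, and it uses Python’s standard difflib module as a simple stand-in for whatever search a real app would use. The function name best_match and the catalog entries are hypothetical.

```python
from difflib import get_close_matches


def best_match(utterance, item_names, cutoff=0.6):
    """Return the catalog item closest to the transcribed utterance, or None."""
    # Compare case-insensitively, but return the item's original spelling.
    lowered = {name.lower(): name for name in item_names}
    hits = get_close_matches(utterance.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[hits[0]] if hits else None


# Hypothetical catalog; a real app would load its own item names.
catalog = ["Wireless Headphones", "Running Shoes", "Coffee Maker"]
print(best_match("wireless headphone", catalog))
```

Raising the cutoff makes matching stricter. A production voice app would more likely rely on a search index or an intent model than on plain string similarity.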
In 2020, AI-powered chatbots and virtual assistants played a vital role in the fight against COVID-19. Chatbots helped screen and triage patients. Apple’s Siri can now even walk users through CDC COVID-19 assessment questions and then recommend telehealth apps. Voice and conversational AI made health services more accessible to those unable to leave their homes during COVID-19.
Now that patients have a taste for what is possible with voice and healthcare, behaviors are not likely to go back to pre-pandemic norms. Expect more investment in voice-tech integration in the healthcare industry over the years to come.
Voice search has been a hot topic of discussion. Visibility will undoubtedly be a challenge, because voice assistants lack a visual interface: users simply cannot see or touch a voice interface unless it is connected to the Alexa or Google Assistant app. Search behaviors, in turn, will see a big change. In fact, if Statista’s predictions are correct, the global speech recognition market should reach nearly $30 billion by 2026.
Brands have experienced a shift in which touchpoints have transformed into listening points, and organic search will be the main way in which brands have visibility. As voice search grows in popularity, advertising agencies and marketers expect Google and Amazon to open their platforms to additional forms of paid messages.
Voice assistants will also continue to offer more individualized experiences as they get better at differentiating between voices.
Google Home can support up to six user accounts and detect unique voices, which lets users customize many features. Users can ask “What’s on my calendar today?” or say “Tell me about my day” and the assistant will read out commute times, weather, and news information for that individual user. It also supports features such as nicknames, work locations, payment information, and linked accounts such as Google Play, Spotify, and Netflix.
Similarly, for those using Alexa, simply saying “learn my voice” will allow users to create separate voice profiles. That way the technology can detect who is speaking for a more individualized experience.
Advances in machine learning and GPU power are commoditizing custom voice creation and making synthetic speech more emotionally expressive. This, in turn, yields computer-generated voices that are indistinguishable from the real thing. You simply supply recorded speech, and voice conversion technology transforms your voice into another. Voice cloning is becoming an indispensable tool for advertisers, filmmakers, game developers, and other content creators.
In recent years, smart displays have been on the rise as they expand voice-tech’s functionality. Now, the demand for these devices is even higher, with consumers showing a preference for smart displays over regular smart speakers.
In the third quarter of 2020, the sales of smart displays rose year-on-year by 21 percent to 9.5 million units, while basic smart speakers fell by three percent. By 2023, we expect a lot of innovation in the world of smart displays, especially the integration of more advanced technology and more customization. Smart displays, like the Russian Sber portal or the Chinese smart screen Xiaodu, for example, are already equipped with a suite of upgraded AI-powered functions, including far-field voice interaction, facial recognition, hand gesture control, and eye gesture detection.
It takes a lot of time and effort to record spoken dialogue for every character in a game. Now, developers can use sophisticated neural networks to mimic human voices. Looking ahead, neural networks will even be able to create appropriate non-playable character (NPC) responses. Some game design studios and developers are working hard to build this kind of dialogue generation into their tools, so games with dynamic dialogue aren’t too far off.
Mobile phones are already personalized, more so than any website. Additionally, there is very little screen space on mobile, making it more difficult for users to search or navigate. With larger product directories and more information, voice applications allow consumers to use natural language. This eliminates or reduces manual effort, and makes accomplishing tasks much faster.
Rogers has introduced voice commands to its remotes, allowing customers to quickly browse and find their favorite shows or the latest movies with certain keywords, such as an actor’s name. Brands need to focus on better mobile experiences for their consumers, and voice is the way to do so. Users want quicker and more efficient ways of accomplishing tasks, and voice is quickly becoming the ideal channel for this.
Whether it’s finding out information, making a purchase, or achieving a task, voice is the new mobile experience.
With just these simple scenarios it’s easy to see why voice assistants are becoming the hubs of our connected homes.
Voice technology is becoming increasingly accessible to developers. For example, Amazon offers Transcribe, an automatic speech recognition (ASR) service that allows developers to add speech-to-text capability to their applications. Once voice capability is integrated, users can analyze audio files and receive a text file of the transcribed speech.
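As a rough sketch of what that integration can look like, the Python below prepares a request for Amazon Transcribe’s start_transcription_job operation (called here through the boto3 SDK) and reads the text out of a finished result. The helper names, the job name, and the S3 URI are hypothetical, and the result layout shown is an assumption based on Transcribe’s JSON output, so check it against the current AWS documentation before relying on it.

```python
import json


def build_transcription_request(job_name, media_uri,
                                media_format="mp3", language_code="en-US"):
    """Build keyword arguments for Transcribe's start_transcription_job call."""
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
        "MediaFormat": media_format,
        "LanguageCode": language_code,
    }


def extract_transcript(result_json):
    """Join the transcript text found in a Transcribe result document.

    Assumes the layout {"results": {"transcripts": [{"transcript": ...}]}}.
    """
    result = json.loads(result_json)
    return " ".join(t["transcript"] for t in result["results"]["transcripts"])


def start_job(request):
    """Submit the job. Needs the boto3 package and AWS credentials."""
    import boto3  # imported lazily so the helpers above work without it

    client = boto3.client("transcribe")
    return client.start_transcription_job(**request)


# Hypothetical job name and bucket, for illustration only.
request = build_transcription_request("demo-job", "s3://example-bucket/call.mp3")
```

Transcription runs asynchronously, so an application would typically poll the job status or react to a completion event before fetching and parsing the result file.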
Google has moved to make Assistant more ubiquitous by opening its software development kit through Actions, which allows developers to build voice into their own products that support artificial intelligence. Another of Google’s speech-recognition products is the AI-driven Cloud Speech-to-Text tool, which enables developers to convert audio to text using deep learning neural networks.
This is only the beginning of voice technology; we expect to see major developments in the user interface in the years to come. With the advancements in VUI, companies need to start educating themselves on how they can best leverage voice to better interact with their customers. It’s important to ask what the value of adding voice will be; it doesn’t necessarily make sense for every brand to adopt it. How can you provide value to your customers? How are you solving their pain points with voice? Will voice enhance the user experience or frustrate the user?
Every day, voice-enabled apps get better at not only understanding what we say, but how we say it, and most importantly, what we mean.
However, there are still a number of barriers to overcome before voice applications will see mass adoption. Technological advances are making voice assistants more capable, particularly in AI, natural language processing (NLP), and machine learning. To build a robust speech recognition experience, the artificial intelligence behind it must become better at handling challenges such as accents and background noise. As consumers become more comfortable with, and reliant upon, using voice to talk to their phones, cars, and smart home devices, voice technology will become a primary interface to the digital world. With it, expertise in voice interface design and voice app development will be in greater demand.
Advancements in a number of industries have helped digital voice assistants become more sophisticated and useful. Voice has now established itself as the ultimate mobile experience. However, a lack of skill and knowledge makes it particularly hard for companies to adopt a voice strategy. There is a lot of opportunity for much deeper and much more conversational experiences with customers. The question is, is your brand willing to jump on this opportunity?