Audio SoCs Prove Vital for Always-on Voice Activation Platforms and Edge Device AI

April 29, 2019 by Majeed Ahmad

As voice-activated digital assistant devices continue to gain popularity, AI- and audio-focused SoCs evolve.

Smart processing and sensing designs are proliferating with the adoption of voice-enabled applications such as smart assistants, including features like voice commands and voice search. From smartphones and wearables to a variety of Internet of Things (IoT) devices, voice-activated interfaces are becoming standard.

With demand rising for better power management and audio processing in these smart assistant devices, systems-on-chip (SoCs) need to keep up.


Continual Growth of the Virtual Assistant Market

Virtual assistants have been gaining traction for the last several years. Most of the tech giants have invested in some form of assistant: Apple has Siri, Google has Google Assistant, Microsoft has Cortana, and Samsung has Bixby. Some platforms are crowdfunded, like the one developed by MATRIX Labs, and some of those are even open-source, such as Mycroft. There are also international players such as Russia's Alice by Yandex and China's Duer by Baidu, both internet-focused juggernauts in their respective countries.

A more recent case comes from Chinese smartphone maker OPPO, which has integrated a voice-enabled digital assistant into its recently launched Reno handset. The new smartphone employs an intelligent assistant called “Breeno” to access services based on artificial intelligence (AI), cloud computing, and augmented reality (AR) technologies. Breeno, which OPPO launched at its developer conference in Beijing in December 2018, targets 5G-era application scenarios.

The always-on voice functionality in OPPO’s new smartphone is powered by DSP Group’s DBMD4 audio SoC, which enables device activation and operation via voice commands while maintaining ultra-low power consumption. The audio/voice processor allows the Reno phone to access Breeno through high-accuracy far-field and two-way voice algorithms for a natural user experience.


The DBMD4. Image from the data brief.


However, while always-on voice activation platforms open many doors for microphone-equipped devices such as smartphones by improving human-machine interaction, they can also be a drain on battery life. Likewise, executing far-field voice capabilities is easier said than done. This is where, for all of the end-user features offered on these devices, the underlying hardware becomes important.

The Need for Low Power and High Accuracy in Audio SoCs

There's already a healthy number of SoCs available for AI in mobile devices. From Qualcomm's Snapdragon collection of hardware to Huawei's Kirin 980 chipset, there are plenty of big players working towards smarter on-device processing. 

The situation changes, however, when talking specifically about audio devices that work with a voice command interface. Issues such as audio processing, language-recognition machine learning, and low-latency audio distribution become important considerations for device designers. Adding the challenges of always-on functionality and mobility introduces issues of power use and physical chip size.

Let's take a look at the example of Breeno, which uses the SmartVoice design platform built around the DBMD4 always-on audio processor. The DBMD4 device incorporates Sensory's Truly Handsfree and Google hotword technologies for a power-optimized implementation of always-on voice features.


A visual graphic of how Sensory sees the path to accurate voice interaction with a device. Image from Sensory


According to DSP Group, the DBMD4 is the first DSP with Sensory’s low-power always-listening hardware block, which allows it to deliver the processing efficiency that low-power voice activation platforms need. Consequently, the audio chip enables battery-operated devices to actively listen for and sense voice activity and commands while in ultra-low-power mode.
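
To make the idea of a cheap always-listening stage more concrete, here is a minimal sketch of an energy-based voice activity gate. It is not DSP Group's or Sensory's method, which the article does not describe; the function names, the 160-sample frame length, the 16 kHz sample rate, and the 0.05 threshold are all illustrative assumptions. The point is only that a very small computation can run continuously and flag the frames worth passing to a heavier wake-word or speech engine.

```python
import math


def frame_rms(frame):
    """Root-mean-square level of one frame of samples (floats in [-1, 1])."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_voice_frames(samples, frame_len=160, threshold=0.05):
    """Return the indices of frames whose RMS level exceeds the threshold.

    frame_len and threshold are hypothetical values chosen for illustration.
    """
    active = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        if frame_rms(samples[start:start + frame_len]) > threshold:
            active.append(start // frame_len)
    return active


# Synthetic input: a quiet frame, a loud 440 Hz tone frame, another quiet frame.
# The 16 kHz sample rate is an assumption, not a DBMD4 specification.
quiet = [0.001] * 160
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(160)]
signal = quiet + tone + quiet

active_frames = detect_voice_frames(signal)
print(active_frames)
```

In a real product the gate would likely be implemented in dedicated hardware or firmware and followed by a keyword-spotting model; the sketch only shows the staged wake-up principle.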

Next, DBMD4 incorporates a suite of voice enhancement algorithms that significantly improve user experience and accuracy of speech-driven applications, particularly in high-noise environments. It leverages these algorithms to achieve more effective isolation of voice from surrounding environmental sounds.

The DBMD4 chip also uses pre-processing algorithms to carry out tasks such as acoustic echo cancellation (AEC), automatic gain control (AGC), and beamforming. That, in turn, allows the audio processor to reduce noise in loud surroundings and to detect trigger words more accurately.
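
Of the pre-processing steps named above, beamforming is perhaps the easiest to illustrate. The following is a minimal delay-and-sum sketch, assuming two microphones and integer sample delays; it is a textbook technique, not a description of how the DBMD4 implements beamforming, and the function name and test signal are hypothetical.

```python
def delay_and_sum(channels, delays):
    """Align each microphone channel by its arrival delay and average them.

    channels: list of equal-length lists of samples, one per microphone
    delays:   per-channel arrival delay in whole samples (non-negative)
    """
    out_len = len(channels[0]) - max(delays)
    output = []
    for t in range(out_len):
        # Reading channel d samples later undoes that channel's arrival delay.
        total = sum(ch[t + d] for ch, d in zip(channels, delays))
        output.append(total / len(channels))
    return output


# A toy source signal that reaches the second microphone two samples late.
source = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
mic0 = source
mic1 = [0.0, 0.0] + source[:-2]

steered = delay_and_sum([mic0, mic1], [0, 2])    # delays match the source
unsteered = delay_and_sum([mic0, mic1], [0, 0])  # no alignment applied

print(steered)
print(unsteered)
```

When the delays match the direction of the talker, the channels add coherently and the source is recovered; without alignment, the same two channels partly cancel. Sound arriving from other directions is misaligned in the same way, which is what gives a beamformer its spatial selectivity.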

DSP Group’s audio chip comes with a complementary software framework and a complete suite of Android drivers for Lollipop 5.x. It also supports both digital and analog microphones and a variety of processor interfaces such as SPI, I2C, UART, and SLIMbus.

DSP Group claims that its ultra-low-power, always-on voice and audio processor, which has a 1.8 x 2.1 mm form factor, can facilitate a broad array of personalized intelligent services in mobile, wearable, and IoT designs.

Another example SoC in this space is the QCS400 from Qualcomm, announced in March 2019. Where Qualcomm's QCC5100 SoC (announced in January 2018 at CES) focused on low-power earbuds, the QCS400 is designed specifically for on-device AI and sound quality in smart speakers.


The Qualcomm QCS400. Image from Qualcomm


Other audio SoC options have included the BM94803AEKU from ROHM. TI's nearly decade-old TAS3308 digital audio SoC has also been in the running, although TI now recommends that designers replace it with the TLV320AIC3256 stereo codec where possible, because the TAS3308 remains in production only for existing customers.



What other audio SoCs are you familiar with? Do you agree that on-device AI is going to be the next big thing? What's your experience with designing high-accuracy audio systems? Share your thoughts in the comments below.