Health AI: vocal biomarkers enable smartphone medtech

Multiple medtech firms are exploring diagnostic prospects for vocal biomarkers using smartphone speech recordings and health AI.
19 October 2023



Telemedicine puts medical professionals just a call away from patients, but the microphones enabling those conversations have diagnostic potential of their own. Today, there’s huge interest in the use of vocal biomarkers – signatures in human speech – that can provide wellbeing warnings when decoded by health AI.

This month, researchers from Klick Labs in Toronto, Canada, demonstrated how vocal biomarkers in speech samples could provide an extremely affordable way of screening for type 2 diabetes. The approach, which makes use of health AI to find telltale features in audio submissions that point to the disease, costs just cents compared with dollars for conventional tests.

A prediction model for type 2 diabetes

“Using the participants’ smartphones, we recorded over 18,000 voice segments from individuals with type 2 diabetes and non-diabetic individuals,” said Jaycee Kaufman, first author of the vocal biomarker study published in Mayo Clinic Proceedings: Digital Health. “We then created a prediction model in order to predict type 2 diabetes status.”

The vocal biomarker approach was 86-89% accurate, compared with 85-92% for glucose-based screening, and could dramatically speed up diagnosis globally. Kaufman points out that an estimated 200 million people worldwide have undiagnosed diabetes. Rather than having to make their way to the clinic, people could instead record their voice on a smartphone as a simple first screening step.

Diabetes is just one of many medical conditions that developers believe could be highlighted using vocal biomarkers. The potential of health AI for speech-based patient monitoring soon becomes apparent when you consider that talking is a combination of muscle actions, cognitive control, and physiology.


Spectrogram starting point: free speech-analysis software such as Praat (shown here analysing the spoken phrase, “Welcome to TechHQ”) offers a glimpse into how vocal biomarkers can be identified using health AI.

If any of those speech production elements are impaired, there’s a chance that those changes will manifest as vocal biomarkers. Considering type 2 diabetes, it’s hypothesized that point-in-time glucose concentrations affect the elastic properties of the vocal cords.

Conditions that affect the voice directly include cancers of the vocal tract and respiratory infections. Parkinson’s disease, Alzheimer’s disease, and stroke – to give just a few examples – can diminish muscle control, which impairs articulation. And health screening and patient monitoring applications don’t stop there.

Vocal biomarkers can also signal a decline in mental health and well-being, giving early warning of stress, anxiety, and depression. Given the broad application of this health AI approach, the global vocal biomarker market is predicted to quadruple over the next decade. Fact.MR – a market research firm – estimates that the sector could be valued at more than US$9 billion by 2033.

“Voice is much more than just spoken words,” explains Dagmar Schuller – CEO and co-founder of Audeering. “We can identify a lot of states and traits throughout the speaking process.” Gender, height, age, cognitive load (whether we are alert or tired), and personality information gathered through psychological models are just a few of the details that can be deciphered.

List of medtech firms using vocal biomarkers and health AI to diagnose medical conditions:

Also, if you’d like to contribute to this area of research, you can donate voice samples to initiatives such as ColiveVoice – run by the Luxembourg Institute of Health. “We need many participants worldwide, speaking different languages and being either healthy or living with a chronic disease,” writes the ColiveVoice team.

How does health AI find vocal biomarkers in speech?

Deep learning techniques are ideal for pattern-matching and have broken records in image recognition. Like many teams working on vocal biomarker applications, the Klick Labs team performs its analysis by converting audio data into spectrograms, which – at a high level – can be thought of as pictures of sound.

In fact, you can start exploring this approach for yourself by downloading Praat, a free speech-analysis program that has long been a favorite tool for doing phonetics by computer. Praat is available for Windows, Macintosh, Linux, Raspberry Pi, and Chromebook. It can also be accessed from Python by installing the Parselmouth library, which is the route taken by the Klick Labs researchers to perform feature extraction.
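To give a feel for what feature extraction means in practice, here is a minimal sketch that estimates one common acoustic feature, the fundamental frequency (pitch) of a signal. It is an illustration only: it uses a simple autocorrelation search written with the Python standard library rather than Praat or Parselmouth, the synthetic 200 Hz tone stands in for a real voice recording, and the function name and pitch search range are assumptions made for this example. It is not the Klick Labs pipeline, and which features that model relies on is not covered here.

```python
import math


def estimate_pitch_hz(samples, sample_rate, fmin=75.0, fmax=500.0):
    """Estimate fundamental frequency from the strongest autocorrelation lag.

    fmin and fmax bound the search; the defaults are assumed values that
    roughly cover typical speaking pitch.
    """
    min_lag = int(sample_rate / fmax)  # shortest period considered
    max_lag = int(sample_rate / fmin)  # longest period considered
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the signal with a copy of itself shifted by `lag` samples.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


SAMPLE_RATE = 8000  # samples per second (assumed for this example)

# Synthetic stand-in for a voice clip: 0.1 s of a pure 200 Hz tone.
tone = [math.sin(2 * math.pi * 200 * i / SAMPLE_RATE) for i in range(800)]

pitch = estimate_pitch_hz(tone, SAMPLE_RATE)
print(f"Estimated pitch: {pitch:.1f} Hz")
```

Real speech is far messier than a pure tone, which is why dedicated tools such as Praat use more robust pitch trackers and add many other measurements on top.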

Spectrograms show time along the bottom with frequency data on the vertical axis. The darker the signals, the higher the amplitude or energy. Just by looking at the image, it’s possible to see patterns, and AI tools take that process to the next level.
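The idea behind a spectrogram can be sketched in a few lines of Python. The example below is a simplified illustration using only the standard library: it slices a signal into overlapping frames and applies a discrete Fourier transform to each one, giving a grid of energy values over time and frequency. The sample rate, frame size, hop size, and the synthetic 1 kHz test tone are all assumptions chosen for this example, and Praat's own analysis settings differ.

```python
import cmath
import math


def make_tone(freq_hz, sample_rate, duration_s):
    """Synthesize a pure sine tone as a stand-in for a recorded voice clip."""
    n = int(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def spectrogram(samples, frame_size, hop):
    """Return one list of magnitudes (per frequency bin) for each time frame."""
    # A Hann window tapers the edges of each frame to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        chunk = [samples[start + n] * window[n] for n in range(frame_size)]
        # Naive DFT, keeping only the non-negative frequency bins.
        mags = []
        for k in range(frame_size // 2 + 1):
            total = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                        for n in range(frame_size))
            mags.append(abs(total))
        frames.append(mags)
    return frames


SAMPLE_RATE = 8000  # samples per second (assumed for this example)
FRAME_SIZE = 256    # samples per analysis frame
HOP = 128           # step between successive frames

tone = make_tone(1000, SAMPLE_RATE, 0.2)
spec = spectrogram(tone, FRAME_SIZE, HOP)

# Each frequency bin spans SAMPLE_RATE / FRAME_SIZE hertz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_SIZE
print(f"{len(spec)} frames x {len(spec[0])} bins, strongest at {peak_hz:.0f} Hz")
```

Plotting `spec` with time on the horizontal axis, frequency on the vertical axis, and magnitude as shading gives the familiar picture. The naive transform keeps the example dependency-free; practical tools use a fast Fourier transform instead.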

Speech recognition has long made use of spectrograms, although new AI techniques such as voice cloning – something that Microsoft has shown is possible using just 3-second voice clips – work with audio codecs instead.

Given the ubiquity of microphones and sound recorders in a world with billions of mobile phones, the ease of capturing data for medical screening using voice biomarkers and health AI is striking.


Is voice the new blood?

Few of us would want to be under full audio surveillance, but sampling our speech periodically could – thanks to biomarkers and health AI – be beneficial if it provided early warning of feeling unwell.

Intriguingly, such algorithms may also shine a light on historical figures – where audio recordings are available – and discover much more information about speakers than is currently known.