Artificial intelligence could soon diagnose illness based on the sound of your voice

By: Carmen Molina Acosta | Lisa Weiner | NPR
Posted on: Monday, October 10, 2022

WASHINGTON, D.C. (NPR) — Voices offer lots of information. Turns out, they can even help diagnose an illness — and researchers are working on an app for that.

The National Institutes of Health is funding a massive research project to collect voice data and develop an AI that could diagnose people based on their speech.

Everything from your vocal cord vibrations to breathing patterns when you speak offers potential information about your health, says laryngologist Dr. Yael Bensoussan, the director of the University of South Florida’s Health Voice Center and a leader on the study.

“We asked experts: Well, if you close your eyes when a patient comes in, just by listening to their voice, can you have an idea of the diagnosis they have?” Bensoussan says. “And that’s where we got all our information.”

Yael Bensoussan, MD, is part of USF Health’s Department of Otolaryngology – Head & Neck Surgery. She’s leading an effort to collect voice data that can be used to diagnose illnesses. [Allison Long]

Someone who speaks low and slowly might have Parkinson’s disease. Slurring is a sign of a stroke. Scientists could even diagnose depression or cancer. The team will start by collecting the voices of people with conditions in five areas: neurological disorders, voice disorders, mood disorders, respiratory disorders and pediatric disorders like autism and speech delays.

The project is part of the NIH’s Bridge to AI program, which launched over a year ago with more than $100 million in funding from the federal government, with the goal of creating large-scale health care databases for precision medicine.

“We were really lacking large what we call open source databases,” Bensoussan says. “Every institution kind of has their own database of data. But to create these networks and these infrastructures was really important to then allow researchers from other generations to use this data.”

This isn’t the first time researchers have used AI to study human voices, but it’s the first time data will be collected on this level — the project is a collaboration between USF, Cornell and 10 other institutions.

“We saw that everybody was kind of doing very similar work but always at a smaller level,” Bensoussan says. “We needed to do something as a team and build a network.”

The ultimate goal is an app that could help close gaps in access to care for rural or underserved communities by helping general practitioners refer patients to specialists. Long term, iPhones or Alexa could detect changes in your voice, such as a cough, and advise you to seek medical attention.

To get there, researchers have to start by amassing data, since the AI can only get as good as the database it’s learning from. By the end of the four-year project, they hope to collect about 30,000 voices, with data on other biomarkers, like clinical data and genetic information, to match.

Dr. Olivier Elemento of Weill Cornell Medicine is the other co-principal investigator on the project. [Travis Curry | Olivier Elemento]

“We really want to build something scalable,” Bensoussan says, “because if we can only collect data in our acoustic laboratories and people have to come to an academic institution to do that, then it kind of defeats the purpose.”

There are a few roadblocks. HIPAA — the law that regulates medical privacy — isn’t really clear on whether researchers can share voices.

“Let’s say you donate your voice to our project,” says Yael Bensoussan. “Who does the voice belong to? What are we allowed to do with it? What are researchers allowed to do with it? Can it be commercialized?”

While other health data can be separated from a patient’s identity and used for research, voices are often identifiable. Every institution has different rules on what can be shared, and that opens all sorts of ethical and legal questions a team of bioethicists will explore.

In the meantime, here are three voice samples that can be shared:

Parkinson’s Disease

Glottic Cancer

Vocal Fold Paralysis

Credit for the Parkinson’s disease clip goes to SpeechVive, via YouTube.

The latter two clips come from the Perceptual Voice Qualities Database (PVQD), whose license can be found here. No changes were made to the audio.

Transcript:

STEVE INSKEEP, HOST:

This radio program is, among other things, a demonstration of the complexity of the human voice. People convey messages by what they say and also the way they say it. And the voice may also help doctors to diagnose conditions. Dr. Yael Bensoussan directs the Health Voice Center at the University of South Florida.

YAEL BENSOUSSAN: When we talk about the human voice, we talk about the sound that the vocal cords make, sounds like E, the way that the vocal cords vibrate, but also the way we speak. And speech is the way we articulate the sound, the way the sound goes into what we call our resonators – the nose cavity, the mouth cavity – and the way we use our breathing to talk.

LEILA FADEL, HOST:

Dr. Bensoussan says the voice gives her clues to five main categories of conditions.

BENSOUSSAN: The first one is neurological disorders, like Alzheimer’s, Parkinson’s, ALS. Then we have voice disorders, like laryngeal cancer. Then we have mood disorders. We know that we can find a lot of changes in voice in depression and mood disorders. Fourth categories is respiratory disorders. People with lung diseases can cough a different way. And then the last one is pediatric disorders, like autism and speech delays.

INSKEEP: So what do those variations sound like? Here’s an example – a voice with vocal fold paralysis from the Voice Foundation at St. John’s University.

UNIDENTIFIED PERSON #1: The blue spot is on the key again. How hard did he hit him?

FADEL: And here’s the voice of someone with cancer in their larynx, also from the Voice Foundation at St. John’s University.

UNIDENTIFIED PERSON #2: The blue spot is the key again. How hard did it hit him?

FADEL: And a voice of someone diagnosed with Parkinson’s from a company called SpeechVive.

UNIDENTIFIED PERSON #3: When sunlight strikes rainbows in the air, they act like a prism and form a rainbow.

BENSOUSSAN: Somebody with Parkinson’s disease not only has a lower voice so that usually the frequency of the voice is a little bit lower; the speed of the way they speak is a little bit lower, more monotonous.

FADEL: Now with funding from NIH, Dr. Bensoussan and her co-director on the project are attempting to build an app to help diagnose diseases by listening to the quality of someone’s voice.

BENSOUSSAN: We’re developing the database for researchers to have access and also the tool to capture the voice.

INSKEEP: Yeah. As more people upload their voices to this app and the database grows, the algorithm’s disease detection accuracy may improve. The researchers plan to include many voices to ensure their data is reflective of the population.

BENSOUSSAN: Training an algorithm on a group of 30 white males that are all 70 years old is not going to give good accuracy. And that’s why serving the remote communities and underserved communities – it’s so important. We want to capture the voices of these people as well that are underrepresented to make sure that the tools we develop are applicable to them.

FADEL: So can an app like this replace a regular medical screening in the future? Dr. Bensoussan says, no. She says the app will only flag signs of disease.

BENSOUSSAN: A family doctor, for example, that’s in a remote community could use our tools and record the voice of the patients, put in the history, and the app could say, there’s a very high chance this is cancer, actually. You should definitely have them seen by an expert in a very timely manner.

INSKEEP: Of course, if you’re giving clues about yourself through your voice, that does raise privacy concerns. So the doctor says they have a team of bioethicists on board.

BENSOUSSAN: We know that technology can do crazy things. So it’s kind of our job and our responsibility to make sure we put boundaries to that.

INSKEEP: Dr. Bensoussan says with the help of other researchers, it should take around four years to develop a tool that can help doctors with diagnosis and screening.

(SOUNDBITE OF MUSIC)

Transcript provided by NPR, Copyright NPR.