How Artificial Intelligence (AI) can detect diabetes with a 10-second voice sample

AI can analyze speech patterns to detect type 2 diabetes with astonishing accuracy. The method could prove to be a useful diagnostic tool. But it comes with a warning label.

Sep 07, 2024

Medical diagnostic tools using advanced voice analysis are becoming increasingly precise. Analyzing speech patterns can provide valuable insights, particularly for diseases such as Parkinson's or Alzheimer's.

Mental illness, depression, post-traumatic stress disorder and heart disease can also be detected using voice analysis.

Artificial intelligence (AI) can even detect signs of constricted blood vessels or exhaustion. This allows medical professionals to treat patients sooner and reduce any possible risks.

According to a study published in the Mayo Clinic Proceedings: Digital Health medical journal, a short voice recording is all it takes to determine with surprising accuracy whether an individual has type 2 diabetes.

Undetected disease

This technology is intended to help identify people living with undiagnosed diabetes. Worldwide, some 240 million adults have diabetes and don't know it.

Nearly 90% of cases are type 2 diabetes, according to the International Diabetes Federation.

People with type 2 diabetes have an elevated risk of cardiovascular diseases, such as heart attack, stroke and poor circulation in the legs and feet.

Diabetes screening tests using voice analysis would significantly improve detection.

Most other tests require a trip to a health care provider. These include the fasting blood glucose test (FBG), the oral glucose tolerance test (OGTT), or the glycated hemoglobin test (A1C). The latter is performed to measure average blood sugar levels over the course of two to three months.

How does voice analysis work?

Voice frequency analysis uses AI to detect changes in the voice that are inaudible to the human ear. Often, a recording of a phone conversation is all the software needs for the analysis.

It examines factors such as speech melody, cadence, pauses and pitch. Certain symptoms have characteristic phonetic traits, such as how the vowel A is pronounced over a period of five seconds.

The human voice can display up to 200,000 distinct characteristics. AI algorithms can filter through all of these to identify particular vocal patterns that match certain symptoms.

Remarkably accurate

The newly developed AI screens voice recordings lasting between six and 10 seconds, looking for differences in vocal pitch and intensity.

Combined with basic health data such as age, gender, height and weight, the program can gauge whether the speaker has type 2 diabetes.
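To make "vocal pitch and intensity" more concrete, here is a minimal Python sketch of how the two quantities could be measured from raw audio samples. It is an illustration only, not the researchers' software: the function names, the 75 to 400 Hz search range and the synthetic 200 Hz test tone are all assumptions made for this example, and real speech analysis is considerably more involved.

```python
import math

def rms_intensity(samples):
    """Root-mean-square amplitude, a simple proxy for vocal intensity."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def estimate_pitch_hz(samples, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate the fundamental frequency from the autocorrelation peak.

    fmin and fmax are assumed bounds for a speaking voice (hypothetical
    defaults chosen for this sketch).
    """
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Similarity between the signal and a copy of itself shifted by `lag`
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# Synthetic stand-in for a recording: a 200 Hz tone, 0.1 seconds at 8 kHz.
SAMPLE_RATE = 8000
tone = [0.5 * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE)
        for n in range(800)]

pitch = estimate_pitch_hz(tone, SAMPLE_RATE)
intensity = rms_intensity(tone)
print(f"pitch: {pitch:.1f} Hz, intensity (RMS): {intensity:.3f}")
```

A screening tool would presumably compute features like these across many recordings and pass them, together with age, gender, height and weight, to a trained model. That step is not shown here.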

Its results are remarkably accurate but vary slightly by gender. Because male and female voices differ, the tests were 89% accurate for women and 86% accurate for men.

Distinct acoustic features

In order to train the AI, Jaycee Kaufman and her team at Ontario Tech University in Canada recorded the voices of 267 individuals who either did not have diabetes or who had already been diagnosed with type 2 diabetes.

Over the course of two weeks, participants recorded a short sentence six times daily on their smartphones.

This generated over 18,000 voice samples, from which 14 acoustic features were singled out, as they differed between participants with and without diabetes.

"Current methods of detection can require a lot of time, travel, and cost," said Kaufman, a research scientist at Klick Labs. "Voice technology has the potential to remove these barriers entirely."

In the future, Klick Labs hopes to examine whether voice analysis can also assist in detecting other conditions, such as prediabetes or hypertension.

Dangers of voice analysis

Proponents of voice analysis as a diagnostic tool emphasize the speed and efficiency with which diseases could be detected using a patient's voice.

But even if AI-supported tools are able to provide very specific information, a handful of voice samples are not enough to develop a well-founded diagnosis.

The risk of false positives and overdiagnosis is also high. In the end, assessments should always be made based on human expertise.
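A short calculation shows why false positives matter even for a fairly accurate test. The study figures quoted in this article are overall accuracy rates. The sensitivity, specificity and prevalence values below are hypothetical numbers chosen purely for illustration, and the function name is likewise an assumption of this sketch.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Share of positive screening results that are true positives (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)

# Hypothetical inputs, NOT figures from the study:
# 89% sensitivity, 89% specificity, and 10% of those screened have the disease.
ppv = positive_predictive_value(sensitivity=0.89, specificity=0.89, prevalence=0.10)
print(f"Share of positive results that are correct: {ppv:.0%}")
```

Under these assumed numbers, fewer than half of the people flagged by the screening would actually have the disease, which is why a positive result can only be a prompt for proper medical testing.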

Indications, not medical diagnosis

This is particularly the case for psychological illnesses. A certain tone of voice might be an indication of depression, for example, but only a thorough examination by a trained human professional can provide certainty.

AI might be able to analyze a person's voice and recognize whether they are speaking more impulsively or in a less structured way than usual, but only a medical professional can determine whether this is related to something like attention-deficit hyperactivity disorder.

Misuse cannot be ruled out

Critics and data protection advocates have warned of the enormous risk of voice analysis software being used in bad faith, for example by employers or insurance call centers.

There is a risk voice analysis software could be used without explicit consent and that customers or employees could be disadvantaged on the basis of personal health information.

It would also be comparatively easy for sensitive medical information to be passed on, hacked, sold or otherwise misused.

However, clear regulations and limits on voice analysis as a diagnostic tool cannot be set by scientists. This falls squarely within the purview of politics.

This article was originally written in German.
